Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ dist/
.vscode/
.idea/
.claude/settings.local.json
.sdk-under-test/
1 change: 1 addition & 0 deletions AGENTS.md
@@ -52,6 +52,7 @@ Keep scenarios separate when they're genuinely independent features or when they
- **Same `id` for SUCCESS and FAIL.** A check should use one slug and flip `status` + `errorMessage`, not branch into `foo-success` vs `foo-failure` slugs.
- **Optimize for Ctrl+F on the slug.** Repetitive check blocks are fine — easier to find the failing one than to unwind a clever helper.
- Reuse `ConformanceCheck` and other types from `src/types.ts` rather than defining parallel shapes.
- **Don't reimplement the runner.** New subcommands that need to "select scenarios → run them → print summary → compute exit code" must go through the existing `client` / `server` commands (subprocess via `process.execPath` like `tier-check` and `sdk` do) or call shared helpers — never a parallel suite-map / summary loop.
- Include `specReferences` pointing to the relevant spec section.
- **Severity follows the spec keyword:** MUST / MUST NOT → `FAILURE`; SHOULD / SHOULD NOT → `WARNING`. (CI treats WARNING as a failure, so Tier-1 SDKs still need to satisfy SHOULDs — see #245.)

44 changes: 44 additions & 0 deletions README.md
@@ -210,6 +210,50 @@ Run `npx @modelcontextprotocol/conformance list --server` to see all available s
- **resources-\*** - Resource management scenarios
- **prompts-\*** - Prompt management scenarios

## Running Against an SDK at a Specific Ref

The `sdk` subcommand clones an SDK repository at a given ref, builds it, and runs the **local** conformance build against it. This is the inner-loop tool for scenario authors and the basis for cross-SDK CI. Examples below use `npm start --` so they run from source — no `npm run build` between edits.

`--mode client` or `--mode server` is required — each invocation tests exactly one side, so client and server are run (and pass/fail) independently.

```bash
# Run the client conformance suite against typescript-sdk @main (v2)
npm start -- sdk typescript-sdk --mode client
# Run the server conformance suite (separate invocation)
npm start -- sdk typescript-sdk --mode server
# A specific main-line SHA or branch (v2 monorepo)
npm start -- sdk typescript-sdk@abc123f --mode client
npm start -- sdk typescript-sdk@some-branch --mode server
# The published v1.x line — separate entry (npm build), defaults to the v1.x branch
npm start -- sdk typescript-sdk-v1 --mode client
npm start -- sdk typescript-sdk-v1@v1.29.0 --mode server
# Use an existing local checkout (no clone, no fetch)
npm start -- sdk --path ../typescript-sdk --skip-build --mode client
# Narrow to one scenario / suite
npm start -- sdk --path ../typescript-sdk --mode server --scenario server-initialize
npm start -- sdk typescript-sdk --mode client --suite auth
```

Build/run commands for each official SDK are looked up by name from [`src/sdk-runner/known-sdks.ts`](src/sdk-runner/known-sdks.ts) — no config file is required in the SDK repo. Resolution order is **CLI flag > built-in entry**, so any field can be overridden on the command line for refs that diverge from the built-in.
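The resolution order can be sketched as below. This is a minimal illustration, not the runner's actual code: `SdkFields`, `resolveFields`, and the sample commands are hypothetical names and values chosen for the example.

```typescript
// Hypothetical shape for illustration; the real types live in src/sdk-runner/.
interface SdkFields {
  build?: string;
  clientCommand?: string;
  serverCommand?: string;
}

// CLI flag > built-in entry: a flag wins only when it was actually passed.
function resolveFields(builtIn: SdkFields, cli: SdkFields): SdkFields {
  const resolved: SdkFields = { ...builtIn };
  for (const key of Object.keys(cli) as (keyof SdkFields)[]) {
    const value = cli[key];
    if (value !== undefined) {
      resolved[key] = value;
    }
  }
  return resolved;
}

// Placeholder commands, not taken from any real KNOWN_SDKS entry.
const builtInEntry: SdkFields = {
  build: 'npm run build',
  clientCommand: 'node client.js'
};
const fromFlags: SdkFields = { build: 'go build ./...' };

const resolvedFields = resolveFields(builtInEntry, fromFlags);
console.log(resolvedFields.build); // the flag value: go build ./...
console.log(resolvedFields.clientCommand); // falls back to: node client.js
```

Fields that were not passed on the command line keep the built-in value, which is why a one-off override only needs to name the fields that differ.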

An SDK can have more than one entry when its layout differs across major versions, e.g. `typescript-sdk` (v2, the `main` monorepo) and `typescript-sdk-v1` (the published npm v1.x line). An entry may set `defaultRef` (the branch used when you don't pass `@<ref>`) and `repo` (the real clone target when the entry name is an alias). To override fields for a one-off ref, pass them as flags:

```bash
npm start -- sdk owner/go-sdk@some-branch \
  --mode client \
  --build-cmd 'go build -tags mcp_go_client_oauth -o ./.conformance-client ./conformance/everything-client' \
  --client-cmd './.conformance-client'
```

To add a new SDK to the matrix, add an entry to `KNOWN_SDKS`.

Clones are cached under `.sdk-under-test/` and reused (fetched) on subsequent runs.
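As a rough sketch of how a cache directory is derived, the snippet below mirrors the name and ref sanitising in `ensureCheckout` (`src/sdk-runner/checkout.ts`). The helper name `cacheSegments` and the sample refs are hypothetical, and the cache root is assumed to be `.sdk-under-test`.

```typescript
// Mirrors the sanitising done in ensureCheckout; cacheSegments is a hypothetical name.
function cacheSegments(name: string, ref: string): string[] {
  // `owner/repo` names become a single directory name.
  const safeName = name.replace(/\//g, '__');
  // Anything outside [a-zA-Z0-9._-] in the ref becomes an underscore.
  const safeRef = ref.replace(/[^a-zA-Z0-9._-]/g, '_');
  return [safeName, safeRef];
}

const cacheRoot = '.sdk-under-test'; // assumed cache root

const mainDir = [cacheRoot, ...cacheSegments('typescript-sdk', 'main')].join('/');
const forkDir = [cacheRoot, ...cacheSegments('owner/go-sdk', 'feature/oauth')].join('/');

console.log(mainDir); // .sdk-under-test/typescript-sdk/main
console.log(forkDir); // .sdk-under-test/owner__go-sdk/feature_oauth
```

Because the ref is part of the path, two refs of the same repository do not share a working tree.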

## SDK Tier Assessment

The `tier-check` subcommand evaluates an MCP SDK repository against [SEP-1730](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1730) (the SDK Tiering System):
4 changes: 4 additions & 0 deletions src/index.ts
@@ -46,6 +46,7 @@ import {
} from './expected-failures';
import { createTierCheckCommand } from './tier-check';
import { createNewSepCommand } from './new-sep';
import { createSdkCommand } from './sdk-runner';
import packageJson from '../package.json';

// Note on naming: `command` refers to which CLI command is calling this.
@@ -544,6 +545,9 @@ program.addCommand(createTierCheckCommand());
// New SEP scaffolding command
program.addCommand(createNewSepCommand());

// SDK command - run local conformance against an SDK at a specific ref
program.addCommand(createSdkCommand());

// List scenarios command
program
  .command('list')
126 changes: 126 additions & 0 deletions src/sdk-runner/checkout.ts
@@ -0,0 +1,126 @@
import { spawn } from 'child_process';
import { promises as fs } from 'fs';
import path from 'path';

export interface SdkSpec {
  name: string;
  ref: string;
}

/**
 * A parsed `<name>[@<ref>]` argument. `ref` is left undefined when the user
 * omits `@<ref>` so the caller can fall back to a per-SDK default branch
 * (KNOWN_SDKS `defaultRef`) before settling on `main`.
 */
export interface ParsedSdkSpec {
  name: string;
  ref?: string;
}

const DEFAULT_ORG = 'modelcontextprotocol';

export function parseSdkSpec(spec: string): ParsedSdkSpec {
  const at = spec.lastIndexOf('@');
  if (at <= 0) {
    return { name: spec };
  }
  // A trailing `@` (empty ref) is treated as "no ref given" so the caller's
  // defaultRef/main fallback applies, rather than checking out the empty ref.
  const ref = spec.slice(at + 1);
  return ref ? { name: spec.slice(0, at), ref } : { name: spec.slice(0, at) };
}

function repoUrl(name: string): string {
  if (name.includes('/')) {
    return `https://github.com/${name}.git`;
  }
  return `https://github.com/${DEFAULT_ORG}/${name}.git`;
}

async function git(
  args: string[],
  cwd: string
): Promise<{ stdout: string; stderr: string }> {
  const cmd = 'git';
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, { cwd, stdio: ['ignore', 'pipe', 'pipe'] });
    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (d) => (stdout += d.toString()));
    child.stderr.on('data', (d) => (stderr += d.toString()));
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) {
        resolve({ stdout, stderr });
      } else {
        reject(
          new Error(
            `${cmd} ${args.join(' ')} exited with ${code}\n${stderr || stdout}`
          )
        );
      }
    });
  });
}

async function dirExists(dir: string): Promise<boolean> {
  try {
    const stat = await fs.stat(dir);
    return stat.isDirectory();
  } catch {
    return false;
  }
}

/**
 * Ensure an SDK is checked out at the requested ref under cacheDir.
 * Clones on first use; on subsequent calls fetches and checks out the ref
 * again (detached HEAD). Returns the absolute path to the checkout.
 */
export async function ensureCheckout(
  spec: SdkSpec,
  cacheDir: string
): Promise<string> {
  const safeName = spec.name.replace(/\//g, '__');
  // Key the checkout by ref as well, so different refs of the same repo (e.g.
  // the typescript-sdk `main` and typescript-sdk-v1 `v1.x` entries) get their
  // own directory instead of thrashing one checkout between refs/build systems.
  const safeRef = spec.ref.replace(/[^a-zA-Z0-9._-]/g, '_');
  const dir = path.resolve(cacheDir, safeName, safeRef);
  await fs.mkdir(path.dirname(dir), { recursive: true });

  if (await dirExists(path.join(dir, '.git'))) {
    console.error(`[sdk] Fetching ${spec.name} (cached at ${dir})`);
    await git(['fetch', '--tags', 'origin'], dir);
  } else {
    console.error(`[sdk] Cloning ${repoUrl(spec.name)} -> ${dir}`);
    await git(['clone', repoUrl(spec.name), dir], path.dirname(dir));
  }

  // Try the ref as a remote branch first, then fall back to a local-resolvable
  // ref (tag or SHA).
  const candidates = [`origin/${spec.ref}`, spec.ref];
  let resolved: string | undefined;
  for (const candidate of candidates) {
    try {
      await git(['rev-parse', '--verify', `${candidate}^{commit}`], dir);
      resolved = candidate;
      break;
    } catch {
      // rev-parse failure means this candidate doesn't exist; try the next form
    }
  }
  if (!resolved) {
    throw new Error(
      `Ref '${spec.ref}' not found in ${spec.name} (tried ${candidates.join(', ')})`
    );
  }

  console.error(`[sdk] Checking out ${spec.name}@${spec.ref} (${resolved})`);
  await git(['checkout', '--detach', resolved], dir);

  const { stdout } = await git(['rev-parse', '--short', 'HEAD'], dir);
  console.error(`[sdk] HEAD is ${stdout.trim()}`);

  return dir;
}
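
To illustrate how the `<name>[@<ref>]` parsing in `checkout.ts` combines with the ref fallback described in the `ParsedSdkSpec` comment, here is a small standalone sketch. `parseSdkSpec` is copied from the diff so the snippet runs on its own; `resolveRef` is a hypothetical helper that is not part of this change, and the caller's real fallback code may look different.

```typescript
// Copied from checkout.ts in this diff so the sketch runs standalone.
interface ParsedSdkSpec {
  name: string;
  ref?: string;
}

function parseSdkSpec(spec: string): ParsedSdkSpec {
  const at = spec.lastIndexOf('@');
  if (at <= 0) {
    return { name: spec };
  }
  const ref = spec.slice(at + 1);
  return ref ? { name: spec.slice(0, at), ref } : { name: spec.slice(0, at) };
}

// Hypothetical helper (not in the diff): explicit @ref first, then the
// entry's defaultRef, then `main`.
function resolveRef(parsed: ParsedSdkSpec, defaultRef?: string): string {
  return parsed.ref ?? defaultRef ?? 'main';
}

const pinned = parseSdkSpec('typescript-sdk@abc123f');
console.log(pinned.name, pinned.ref); // typescript-sdk abc123f

// No @ref: the entry's defaultRef applies (v1.x for typescript-sdk-v1 per the README).
console.log(resolveRef(parseSdkSpec('typescript-sdk-v1'), 'v1.x')); // v1.x

// A trailing `@` counts as "no ref given", so with no defaultRef this is main.
console.log(resolveRef(parseSdkSpec('typescript-sdk@'))); // main
```

Using `lastIndexOf('@')` together with the `at <= 0` guard means a spec that merely starts with `@` is treated as a bare name rather than an empty name plus a ref.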
25 changes: 25 additions & 0 deletions src/sdk-runner/config.ts
@@ -0,0 +1,25 @@
import { z } from 'zod';

export const SdkConfigSchema = z.object({
  // Clone this repo instead of the KNOWN_SDKS key — lets an alias entry
  // (e.g. typescript-sdk-v1) point at the real repo (typescript-sdk).
  repo: z.string().optional(),
  // Ref to check out when the SDK is named with no @ref (the "default branch").
  defaultRef: z.string().optional(),
  build: z.string().optional(),
  client: z
    .object({
      command: z.string()
    })
    .optional(),
  server: z
    .object({
      command: z.string(),
      url: z.string().url(),
      readyTimeoutMs: z.number().int().positive().optional()
    })
    .optional(),
  expectedFailures: z.string().optional()
});

export type SdkConfig = z.infer<typeof SdkConfigSchema>;
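
For orientation, an entry matching this schema might look like the sketch below. It uses a dependency-free interface that mirrors the schema's fields rather than zod itself. Every command, URL, and path is a placeholder, `cloneTarget` is a hypothetical helper, and the real `KNOWN_SDKS` entries in `known-sdks.ts` are not shown in this diff.

```typescript
// Dependency-free mirror of the fields in SdkConfigSchema (the real schema uses zod).
interface SdkConfigShape {
  repo?: string;
  defaultRef?: string;
  build?: string;
  client?: { command: string };
  server?: { command: string; url: string; readyTimeoutMs?: number };
  expectedFailures?: string;
}

// Hypothetical alias entry: all commands, the URL, and the paths are placeholders.
const exampleEntry: SdkConfigShape = {
  repo: 'typescript-sdk', // clone target when the entry name is an alias
  defaultRef: 'v1.x', // used when no @ref is given
  build: 'npm ci && npm run build',
  client: { command: 'node ./path/to/conformance-client.js' },
  server: {
    command: 'node ./path/to/conformance-server.js',
    url: 'http://localhost:3000/mcp',
    readyTimeoutMs: 30000
  },
  expectedFailures: './path/to/expected-failures.yml'
};

// Hypothetical helper: an alias entry clones `repo`; otherwise the entry name is the repo.
function cloneTarget(entryName: string, entry: SdkConfigShape): string {
  return entry.repo ?? entryName;
}

console.log(cloneTarget('typescript-sdk-v1', exampleEntry)); // typescript-sdk
console.log(cloneTarget('go-sdk', {})); // go-sdk
```

Every field is optional at the top level, so an entry can define only the client side, only the server side, or both.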