
Ambiguity in capability tests for SEP-2575 #279

@anubhav756

Problem

In the stateless scenario of the conformance suite (#273), the mock server runs a simple lifecycle test (negotiating version, listing tools, and cancelling a tool call). It does not execute any roots or elicitation flows.

Because these optional capabilities are never exercised on the wire during this test run, the mock server cannot compare a client's declaration with its behavior:

  • If the mock server sees that a client omits elicitation, how does it know whether the client is correct or is buggy and forgot to declare it?
  • If the mock server sees that a client declares elicitation, how does it know whether the client is correct or is lying and doesn't actually support it?

Background

Optional vs. Required Capabilities

  • Client capabilities like roots (filesystem access), sampling (AI model queries), and elicitation (user prompt inputs) are optional capabilities in the MCP spec.
  • These tests verify how a client declares its optional capabilities in the stateless MCP scenario.
  • Note: A client is conformant under the spec even if it only supports tools and omits the others entirely.

Implication Rule

  • The specification dictates: "Clients that support [optional capability, e.g., elicitation] MUST declare it."
  • If a client does not support a capability, it MUST NOT declare it (otherwise it misleads the server and fails when the server tries to use that capability).
  • Therefore: {Supports Capability} iff {Declares Capability}
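
The rule above can be written as a one-line predicate. This is only an illustrative sketch: the function name and the `supports` argument are hypothetical. In the stateless scenario the mock server sees only the declared capabilities and has no way to observe `supports`, which is exactly the ambiguity this issue describes.

```python
from typing import Any, Mapping


def declaration_is_consistent(
    supports: bool, declared: Mapping[str, Any], name: str
) -> bool:
    """Return True when 'supports' and 'declares' agree (the iff above).

    `supports` is ground truth that only the client author knows; it is a
    hypothetical input used here to state the rule, not something the mock
    server can read off the wire.
    """
    declares = name in declared
    return supports == declares
```

A forgetful client corresponds to `supports=True` with the key missing, and a lying client to `supports=False` with the key present; both make the predicate false.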

Potential Solutions

Option 1 (Recommended): Permissive Check

  • If the capability is present → verify it is a valid object ({}) → SUCCESS.

  • If absent → SUCCESS (since it is optional).

  • Pros:

    • Simple and client-blind.
    • Requires no extra config.
  • Cons:

    • Weak verification.
      • A buggy client that forgot to declare it passes anyway.
      • A lying client that erroneously declared it passes anyway.
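
A minimal sketch of what the permissive check could look like, assuming the declared capabilities arrive as a plain mapping. The function name, the result strings, and the FAILURE outcome for a non-object value are assumptions for illustration, not the conformance suite's actual API.

```python
from typing import Any, Dict, Mapping

# Capability names taken from the issue text.
OPTIONAL_CAPABILITIES = ("roots", "sampling", "elicitation")


def permissive_check(capabilities: Mapping[str, Any]) -> Dict[str, str]:
    """Check each optional capability permissively.

    Absent -> SUCCESS (the capability is optional).
    Present and an object -> SUCCESS.
    Present but not an object -> FAILURE (assumed handling of a malformed value).
    """
    results: Dict[str, str] = {}
    for name in OPTIONAL_CAPABILITIES:
        if name not in capabilities:
            results[name] = "SUCCESS"
        elif isinstance(capabilities[name], dict):
            results[name] = "SUCCESS"
        else:
            results[name] = "FAILURE"
    return results
```

Note that the function never needs to know whether the client really supports a capability, which is why it is client-blind and also why forgetful and lying clients both pass.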

Option 2: Strict Assertion + Conformance Baseline

  • The mock server strictly asserts that the client MUST declare all capabilities.

  • If any are missing → FAILURE.

  • Capability supported in the client under test?

    • Yes

      • Declares all → passes with SUCCESS.
      • If it forgets → FAILURE (build breaks).
    • No

      • Omits them → fails in harness.
      • The developer lists these failures in their expected failures file (conformance-baseline.yml).
      • CI passes cleanly.
      • If the client erroneously declares them → check passes → baseline becomes stale → runner fails the build (catches the lie!).
  • Pros:

    • Catches both forgetful and lying clients.
    • Keeps the mock server code clean and client-blind.
    • Uses standard conformance baseline mechanisms.
  • Cons:

    • Tools-only clients (like MCP Toolbox) are perfectly conformant with the stateless protocol, yet they would be forced to add entries to their conformance-baseline.yml files.
    • The MCP Working Group could find this annoying.
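
The interplay between the strict assertion and the baseline can be sketched as follows. This assumes a baseline keyed by individual capability check, which is a hypothetical granularity; the function names and the return shape are likewise illustrative and not taken from the conformance runner.

```python
from typing import Any, Mapping, Set, Tuple

# Capability names taken from the issue text.
OPTIONAL_CAPABILITIES = ("roots", "sampling", "elicitation")


def strict_failures(capabilities: Mapping[str, Any]) -> Set[str]:
    """Strict assertion: every capability that is not declared is a FAILURE."""
    return {name for name in OPTIONAL_CAPABILITIES if name not in capabilities}


def reconcile(
    actual: Set[str], expected: Set[str]
) -> Tuple[bool, Set[str], Set[str]]:
    """Compare observed failures with the baseline of expected failures.

    Returns (build_ok, unexpected, stale):
      unexpected: failed but not listed in the baseline (forgetful client)
      stale: listed in the baseline but now passing (lying client)
    """
    unexpected = actual - expected
    stale = expected - actual
    return (not unexpected and not stale, unexpected, stale)
```

For example, a tools-only client that baselines all three checks and then starts declaring elicitation would get `stale == {"elicitation"}` and a failed build.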

Reason for Rejection

  • The baseline file (conformance-baseline.yml) only accepts scenario names (e.g., - tools_call, - sse-retry, - auth/metadata-default).
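  • It does not support baseline tracking at the level of individual check slugs.
  • So Option 2 is not applicable with the current baseline format: a single capability check cannot be baselined without baselining the whole scenario.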

Option 3: Probe-and-Callback

  • The mock server actively probes the client during a tool call by returning an InputRequiredResult containing an ElicitRequest.

    • A client that supports it must resolve and retry (declaring support).
    • A client that doesn't must fail/abort.
  • Pros: Dynamic and client-blind.

  • Cons:

    • High runtime complexity.
    • Duplicates elicitation defaults scenario logic into the simple stateless scenario.
    • Risks race conditions and CI timeouts.
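
The decision the probe enables can be sketched as a comparison between what the client declared and how it reacted. The outcome labels `"retried"` and `"aborted"` and the function name are hypothetical; the real message exchange around InputRequiredResult and ElicitRequest is not modeled here.

```python
def probe_verdict(declared: bool, outcome: str) -> str:
    """Judge a client by comparing its declaration with its reaction to the probe.

    outcome is a hypothetical label for what the mock server observed:
      "retried": the client resolved the ElicitRequest and retried the call
      "aborted": the client failed or aborted the tool call
    """
    if outcome not in ("retried", "aborted"):
        raise ValueError(f"unknown probe outcome: {outcome!r}")
    handled = outcome == "retried"
    # Declaration and observed behavior must agree in both directions.
    return "SUCCESS" if handled == declared else "FAILURE"
```

Anything that is neither a retry nor an abort, such as a probe that never completes, is deliberately left out of this sketch; handling that case is where the timeout risk listed above would show up.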

Recommendation

I recommend adopting Option 2 because it provides a verification guarantee in both directions of the implication, adds no runtime complexity, and is client-blind and spec-compliant.
