RFC: Session-Scoped Architecture for vMCP by yrobla · Pull Request #38 · stacklok/toolhive-rfcs

yrobla · 2026-02-04T13:31:46Z

RFC: Session-Scoped Architecture for vMCP

This RFC refactors vMCP session management to fix per-request client creation and scattered session concerns.

Problem

Per-request overhead: Backend clients created/closed on every tool call, causing connection overhead and state loss for stateful
backends (Playwright, databases)
Scattered architecture: Session creation tangled across middleware, adapters, and SDK hooks with no clear ownership
Hard to extend: Current design makes it difficult to add features (optimizer-in-vmcp) or write integration tests without spinning up
the full server

Solution

Introduce two core interfaces:

Session - Domain object that owns backend clients, encapsulates routing logic, manages lifecycle
SessionFactory - Creates fully-formed sessions (capability discovery + client initialization)
SessionManager - Bridges domain logic to SDK protocol concerns

Key improvement: Clients created once during session init, reused for all requests, closed on expiration.

Copilot

Pull request overview

Adds an RFC proposing a shift from per-request/lazy MCP backend client creation to session-scoped client lifecycle management to simplify the current pooling approach while preserving backend state across tool calls.

Changes:

Introduces a new RFC describing a session-scoped backend client map owned by VMCPSession.
Specifies eager client initialization during session setup (AfterInitialize) and cleanup during session close.
Outlines phased implementation steps, testing strategy, and alternatives considered.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0034-session-scoped-client-lifecycle.md

The RFC captures the complete architectural vision from the discussion, providing a roadmap for simplifying the client pooling implementation while maintaining all current functionality.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

rfcs/THV-0034-session-scoped-client-lifecycle.md

rfcs/THV-0038-session-scoped-client-lifecycle.md

rfcs/THV-0034-session-scoped-client-lifecycle.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

jerm-dro

I left two bigger comments elaborating on my thoughts. To summarize, I think there are larger problems with how the session concept is handled within vMCP. Namely, we have session concerns scattered throughout the codebase and all of these concerns are tightly coupled. As a solution, I think we should create an interface that encapsulates these concerns:

// Session represents an active MCP session with all its capabilities and resources.
// This is a pure domain object - no protocol concerns and can be run without being spun up in a server.
type Session interface {
    ID() string

    // Capabilities - returns discovered tools/resources for this session
    // Perhaps these could be combined into one "GetCapabilities" method.
    Tools() []Tool
    Resources() []Resource

    // MCP operations - routing logic is encapsulated here
    CallTool(ctx context.Context, name string, arguments map[string]any) (*ToolResult, error)
    ReadResource(ctx context.Context, uri string) (*ResourceResult, error)
    GetPrompt(ctx context.Context, name string, arguments map[string]any) (*PromptResult, error)

    // Lifecycle
    Close() error
}

That would make solving this client-lifetime problem trivial.

rfcs/THV-0038-session-scoped-client-lifecycle.md

Address architectural feedback from PR #38 review, specifically jerm-dro's comments about the root cause being scattered session concerns rather than just client lifecycle management. Major changes: 1. **Expanded Problem Statement**: - Identified root cause: session concerns scattered across middleware, adapters, and hooks - Current VMCPSession is a passive data container, not a domain object - Session construction tangled with SDK lifecycle callbacks - Hard to create session-scoped resources (clients, optimizer features) 2. **Introduced Core Interfaces**: - Session interface: Domain object that owns clients, encapsulates routing, manages lifecycle (not just a data container) - SessionFactory interface: Creates fully-formed sessions with all dependencies (discovery, client init, routing setup) - SessionManager: Bridges between domain (Session) and protocol (SDK) 3. **Updated Goals**: - Primary goal: Encapsulate session behavior in clean interfaces - Decouple session creation from SDK wiring - Enable testing without full server (addresses toolhive#2852) - Enable future features through interface decoration 4. **Revised Implementation Plan**: - Phase 1: Introduce Session/SessionFactory interfaces - Phase 2: Wire SessionManager to SDK - Phase 3: Migrate existing code to use Session interface - Phase 4: Cleanup and observability 5. **Added Benefits Documentation**: - Concrete examples of decorator pattern (optimizer-in-vmcp) - Simplified capability refresh approach - Clear testing strategy without server dependency 6. **Fixed Formatting Issues** (from Copilot review): - Split Goals & Non-Goals into separate sections - Changed "Security" to "Security Considerations" - Restructured Compatibility section - Fixed References to use full repo-qualified format - Added credential refresh considerations The RFC now addresses the architectural root cause (scattered session concerns) rather than just the symptom (per-request client creation). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Refactor RFC to address scattered session architecture concerns Address architectural feedback from PR #38 review, specifically jerm-dro's comments about the root cause being scattered session concerns rather than just client lifecycle management. Major changes: 1. **Expanded Problem Statement**: - Identified root cause: session concerns scattered across middleware, adapters, and hooks - Current VMCPSession is a passive data container, not a domain object - Session construction tangled with SDK lifecycle callbacks - Hard to create session-scoped resources (clients, optimizer features) 2. **Introduced Core Interfaces**: - Session interface: Domain object that owns clients, encapsulates routing, manages lifecycle (not just a data container) - SessionFactory interface: Creates fully-formed sessions with all dependencies (discovery, client init, routing setup) - SessionManager: Bridges between domain (Session) and protocol (SDK) 3. **Updated Goals**: - Primary goal: Encapsulate session behavior in clean interfaces - Decouple session creation from SDK wiring - Enable testing without full server (addresses toolhive#2852) - Enable future features through interface decoration 4. **Revised Implementation Plan**: - Phase 1: Introduce Session/SessionFactory interfaces - Phase 2: Wire SessionManager to SDK - Phase 3: Migrate existing code to use Session interface - Phase 4: Cleanup and observability 5. **Added Benefits Documentation**: - Concrete examples of decorator pattern (optimizer-in-vmcp) - Simplified capability refresh approach - Clear testing strategy without server dependency 6. **Fixed Formatting Issues** (from Copilot review): - Split Goals & Non-Goals into separate sections - Changed "Security" to "Security Considerations" - Restructured Compatibility section - Fixed References to use full repo-qualified format - Added credential refresh considerations The RFC now addresses the architectural root cause (scattered session concerns) rather than just the symptom (per-request client creation). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Refactor RFC to address scattered session architecture concerns Address architectural feedback from PR #38 review, specifically jerm-dro's comments about the root cause being scattered session concerns rather than just client lifecycle management. Major changes: 1. **Expanded Problem Statement**: - Identified root cause: session concerns scattered across middleware, adapters, and hooks - Current VMCPSession is a passive data container, not a domain object - Session construction tangled with SDK lifecycle callbacks - Hard to create session-scoped resources (clients, optimizer features) 2. **Introduced Core Interfaces**: - Session interface: Domain object that owns clients, encapsulates routing, manages lifecycle (not just a data container) - SessionFactory interface: Creates fully-formed sessions with all dependencies (discovery, client init, routing setup) - SessionManager: Bridges between domain (Session) and protocol (SDK) 3. **Updated Goals**: - Primary goal: Encapsulate session behavior in clean interfaces - Decouple session creation from SDK wiring - Enable testing without full server (addresses toolhive#2852) - Enable future features through interface decoration 4. **Revised Implementation Plan**: - Phase 1: Introduce Session/SessionFactory interfaces - Phase 2: Wire SessionManager to SDK - Phase 3: Migrate existing code to use Session interface - Phase 4: Cleanup and observability 5. **Added Benefits Documentation**: - Concrete examples of decorator pattern (optimizer-in-vmcp) - Simplified capability refresh approach - Clear testing strategy without server dependency 6. **Fixed Formatting Issues** (from Copilot review): - Split Goals & Non-Goals into separate sections - Changed "Security" to "Security Considerations" - Restructured Compatibility section - Fixed References to use full repo-qualified format - Added credential refresh considerations The RFC now addresses the architectural root cause (scattered session concerns) rather than just the symptom (per-request client creation). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Update Last Updated date to 2026-02-09

rfcs/THV-0038-session-scoped-client-lifecycle.md

JAORMX

Hey, appreciate the work on this RFC. The per-request client creation problem is very real and worth solving (I filed stacklok/toolhive#2417 about exactly this pain point). The session-scoped client lifecycle and the Session-as-domain-object approach are solid improvements. That said, I have some concerns about how this interacts with the existing session storage architecture we built specifically for externalizing sessions to Redis/Valkey.

Let me dig in...

What we already have in place

Over PRs #1677, #1989, #1770, and #1771, we built a deliberate, phased session management architecture:

Storage interface (pkg/transport/session/storage.go): A 5-method pluggable abstraction (Store, Load, Delete, DeleteExpired, Close) designed specifically to swap between in-memory and distributed backends (Redis/Valkey)
LocalStorage: In-memory sync.Map implementation behind the Storage interface
Manager: Uses Storage + Factory, with NewManagerWithStorage() for injecting custom backends
serialization.go: JSON ser/deser already prepared for Redis/Valkey (currently nolint:unused, ready for when we implement a distributed backend)

The whole point of this layering was: when we're ready for distributed sessions, we implement a RedisStorage (or equivalent) satisfying the Storage interface, and nothing else changes. Not the Manager, not the proxies, not the vMCP server.

With this in mind, here are my concerns:

The parallel storage path problem

The RFC's SessionManager introduces its own sync.Map for session storage:

// From RFC: SessionManager
type sessionManager struct {
    sessions sync.Map  // bypasses the Storage interface
    factory  SessionFactory
}

So, basically, this creates a parallel session management path that doesn't go through transportsession.Manager or the Storage interface. The RFC says "TTL management and storage are handled by the existing session storage layer", but the proposed implementation stores sessions in its own sync.Map, not through the existing Storage interface.

This means when we implement Redis/Valkey for transport sessions, the vMCP session data managed by this new SessionManager won't be externalized. We'd need to separately refactor this sync.Map later, which is exactly the kind of rework the Storage interface was designed to prevent.

Suggestion: The SessionManager should delegate session storage to transportsession.Manager (which already uses the pluggable Storage interface) rather than maintaining its own sync.Map. The existing Manager.AddSession(), Manager.Get(), and Manager.Delete() methods already provide exactly what's needed here.

How do the two Session interfaces compose?

The RFC proposes a new Session interface (behavior-oriented: CallTool(), ReadResource(), etc.) while we already have transportsession.Session (storage-oriented: ID(), Type(), Touch(), GetData()). These serve different purposes, but the RFC doesn't show how they compose.

Note that currently VMCPSession embeds StreamableSession which implements transportsession.Session. That's what makes it compatible with the Storage interface and Manager. The new behavior-oriented Session needs to either:

Embed/implement transportsession.Session so it remains compatible with the existing storage layer, or
Clearly document the two-layer model: transport session (storage) wraps/references domain session (behavior)

Without this, the new Session type won't be storable through the Storage interface, which breaks the externalization path.

Partial initialization and serializability

The RFC allows sessions to continue with partial backends (some failed during MakeSession()). If we later add Redis/Valkey persistence, a session restored on a different instance wouldn't know which backends originally failed vs. which were never attempted. The session state model needs to account for this if it's going to be serializable.

What I think should change

Wire through transportsession.Manager instead of a standalone sync.Map. The Manager already supports custom factories and pluggable storage.
Show the composition model: How does the new behavior-oriented Session relate to transportsession.Session? Does it embed it? Wrap it? Replace it?
Add a section on session serializability: What part of the new Session is metadata (storable in Redis/Valkey) vs. runtime state (client connections, goroutines that must stay in-memory)? Issue stacklok/toolhive#2417 already sketched this dual-layer model.

The RFC solves a real problem and the domain-object approach is the right direction. I just want to make sure we don't accidentally create architectural debt that blocks the Redis/Valkey externalization path we've been building toward.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

jerm-dro · 2026-02-13T02:20:55Z

@JAORMX Thanks for the review and context around transportsession.Manager!

I'm not opposed to preserving the existing session storage mechanism as part of this work. The main thrust of this RFC is encapsulating the session concept and behavior into a coherent domain object, and, based on my read of the code, that can still be accomplished without changing the storage mechanism.

That said, it's worth acknowledging that we're currently paying a cognitive cost to maintain the storage abstraction layer ahead of a distributed backend that doesn't exist yet. I'd be curious to hear your thoughts on the timeline for that work, since it would help us calibrate how much we invest in preserving compatibility with it right now vs. keeping things simple.

In the future, I also think it's worth exploring whether we can simplify the transportsession.Manager API. For example, rather than the current get/set pattern on the Manager, an alternative factoring could have the Session object become the sole owner of its state and expose a Marshal() method that produces a serialized representation on demand. That would make the ownership of data clearer and the storage layer thinner. But that's beyond the scope of this RFC and something we can think through at a later date. When we design this in earnest, we should think deeper about what state can change, when it changes how we signal to update its stored representation, etc.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 11 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/THV-0038-session-scoped-client-lifecycle.md

Copilot · 2026-02-13T12:41:59Z

rfcs/THV-0038-session-scoped-client-lifecycle.md

+   - **Bounded concurrency**: Limit parallel goroutines (e.g., 10 concurrent initializations) to avoid resource exhaustion
+   - **Per-backend timeout**: Apply context timeout (e.g., 5s per backend) so one slow backend doesn't block session creation
+   - **Partial initialization**: If some backends fail, log warnings and continue with successfully initialized backends (failed backends not added to clients map)


The parallel client initialization uses bounded concurrency (e.g., 10 concurrent initializations) and per-backend timeouts (e.g., 5s), but these specific values are provided as examples without justification. The optimal values depend on expected network latency, backend responsiveness, and resource constraints. Consider providing guidance on how to tune these parameters based on deployment characteristics, or making them configurable. Additionally, document the trade-offs: higher concurrency reduces session creation latency but increases resource usage and potential backend load spikes.

Suggested change

- **Bounded concurrency**: Limit parallel goroutines (e.g., 10 concurrent initializations) to avoid resource exhaustion

- **Per-backend timeout**: Apply context timeout (e.g., 5s per backend) so one slow backend doesn't block session creation

- **Partial initialization**: If some backends fail, log warnings and continue with successfully initialized backends (failed backends not added to clients map)

- **Bounded concurrency**: Limit parallel goroutines (for example, 10 concurrent initializations) to avoid resource exhaustion.

- These values **MUST be configurable** (e.g., via server configuration or environment variables), not hard-coded constants in the implementation.

- Tuning guidance:

- Start with a baseline such as `min(numBackends, 8–16)` and adjust based on CPU usage, memory pressure, backend rate limits, and startup latency SLOs.

- Higher concurrency reduces end-to-end session creation latency but increases CPU/memory usage and can cause load spikes or connection storms against backends.

- Lower concurrency smooths backend load and reduces resource contention at the cost of slower session initialization when many backends are configured.

- **Per-backend timeout**: Apply a per-backend context timeout (for example, 5s per backend) so one slow or hung backend does not block overall session creation.

- This timeout SHOULD be deployment-specific: shorter in low-latency, co-located deployments; longer in high-latency or intermittently slow environments.

- Consider setting the per-backend timeout relative to observed p95–p99 connection + handshake latency (e.g., `2–3x p99`) and revisiting it as telemetry is collected in production.

- Trade-off: shorter timeouts improve resilience and responsiveness to unhealthy backends but may prematurely drop slow-yet-healthy backends; longer timeouts improve tolerance for slow backends but increase worst-case session startup time.

- **Partial initialization**: If some backends fail (e.g., time out or error during client creation), log warnings and continue with successfully initialized backends (failed backends are not added to the clients map).

Copilot · 2026-02-13T12:42:00Z

rfcs/THV-0038-session-scoped-client-lifecycle.md

+1. **`Generate()` phase**: Creates empty session (UUID only), always succeeds
+   - Stores empty session in map
+   - Returns session ID to SDK
+   - SDK sends `Mcp-Session-Id` header in `InitializeResult`
+
+2. **`OnRegisterSession` hook phase**: Calls `SessionFactory.MakeSession()` - failure possible here
+   - If `MakeSession()` fails completely (e.g., all backends unreachable):
+     - Log error with session ID and failure details
+     - Create session with **empty capabilities** (no tools, resources, or prompts)
+     - SDK registers empty capability list, client sees "no tools available"
+     - Keep session in map (allows graceful handling without special error checks)
+   - Client experience:
+     - Session exists but has no capabilities
+     - Client can query session but sees empty tool list
+     - Client may delete session and re-initialize, or backend recovery may populate capabilities later
+
+**Rationale**: The two-phase creation pattern (empty session + populate via hook) is necessary because the SDK's `Generate()` must return a session ID. Additionally, the SDK's `OnRegisterSession` hook does not allow returning an error, so failures during `MakeSession()` cannot be propagated directly. Instead of storing initialization errors that require checks in every method (`InitError()` pattern is error-prone - easy to forget checks when adding new methods), we create sessions with empty capabilities. Failed backends don't advertise tools/resources, so clients never try to call them. This is simpler and safer than requiring defensive `InitError()` checks throughout the codebase.
+


The error handling section describes creating sessions with empty capabilities when MakeSession fails, but this approach could lead to confusing user experience. A client successfully creates a session but then finds no tools available, without clear indication that initialization failed. Consider enhancing the error handling to provide explicit feedback mechanisms, such as adding a session status field that clients can query, or including initialization errors in the session metadata that can be retrieved via a dedicated endpoint.

Suggested change

1. **`Generate()` phase**: Creates empty session (UUID only), always succeeds

- Stores empty session in map

- Returns session ID to SDK

- SDK sends `Mcp-Session-Id` header in `InitializeResult`

2. **`OnRegisterSession` hook phase**: Calls `SessionFactory.MakeSession()` - failure possible here

- If `MakeSession()` fails completely (e.g., all backends unreachable):

- Log error with session ID and failure details

- Create session with **empty capabilities** (no tools, resources, or prompts)

- SDK registers empty capability list, client sees "no tools available"

- Keep session in map (allows graceful handling without special error checks)

- Client experience:

- Session exists but has no capabilities

- Client can query session but sees empty tool list

- Client may delete session and re-initialize, or backend recovery may populate capabilities later

**Rationale**: The two-phase creation pattern (empty session + populate via hook) is necessary because the SDK's `Generate()` must return a session ID. Additionally, the SDK's `OnRegisterSession` hook does not allow returning an error, so failures during `MakeSession()` cannot be propagated directly. Instead of storing initialization errors that require checks in every method (`InitError()` pattern is error-prone - easy to forget checks when adding new methods), we create sessions with empty capabilities. Failed backends don't advertise tools/resources, so clients never try to call them. This is simpler and safer than requiring defensive `InitError()` checks throughout the codebase.

1. **`Generate()` phase**: Creates an empty session shell (UUID only), always succeeds

- Stores session in map with status `initializing`

- Returns session ID to SDK

- SDK sends `Mcp-Session-Id` header in `InitializeResult`

2. **`OnRegisterSession` hook phase**: Calls `SessionFactory.MakeSession()` - failure possible here

- On **success**:

- Populate session capabilities (tools, resources, prompts, clients map, etc.)

- Update session status to `ready`

- Optionally store a brief "initialized at <timestamp>" metadata record

- If `MakeSession()` fails completely (e.g., all backends unreachable):

- Log error with session ID and failure details

- Do **not** attempt to fabricate capabilities; leave capabilities empty

- Update session status to `failed`

- Store an initialization error record in session metadata, e.g.:

- `error_message` (human-readable summary)

- optional `error_code` / `backend_errors` for structured inspection

- Keep session in map (so the ID remains valid and can be inspected/deleted)

**Client experience for failed initialization**:

- Session exists but is marked as `failed`

- Capabilities list is empty (no tools, resources, or prompts)

- A dedicated "session inspection" surface exposes status + error metadata, e.g.:

- via an existing `GetSession`/`ListSessions` API augmented with `status` and `init_error`

- or via a small MCP-specific endpoint / tool that returns session status + error info

- SDK clients / UIs are expected to:

- Check the session status when they observe an empty capability list

- Surface the initialization error message to users

- Allow users (or automation) to delete and re-initialize the session once backends recover

**Rationale**: The two-phase creation pattern (empty session shell + populate via hook) is necessary because the SDK's `Generate()` must return a session ID, and the SDK's `OnRegisterSession` hook does not allow returning an error, so failures during `MakeSession()` cannot be propagated directly in-band. Instead of silently treating initialization failures as "normal" sessions with empty capabilities, we centralize failure handling in the session's `status` and initialization error metadata. This avoids the `InitError()` pattern (which requires checks in every method and is easy to forget when adding new methods), while still giving clients an explicit, queryable signal that initialization failed. Failed backends don't advertise tools/resources, so clients never try to call them, and the additional status+metadata allow UIs and automated clients to distinguish "no tools configured" from "initialization failed."

Copilot · 2026-02-13T12:42:00Z

rfcs/THV-0038-session-scoped-client-lifecycle.md

+   - Store a cryptographic hash of the original authentication token (e.g., `SHA256(bearerToken)`) in the session during creation
+   - On each request, validate that the current auth token hash matches the session's bound token hash
+   - If mismatch, reject with "session authentication mismatch" error and terminate session
+   - This prevents stolen session IDs from being used with different credentials
+


The session hijacking prevention section recommends storing a cryptographic hash of the authentication token (e.g., SHA256(bearerToken)), but doesn't address token rotation scenarios. If a client's authentication token is legitimately refreshed (e.g., OAuth token refresh), the hash would no longer match, causing session validation to fail even for the legitimate client. Consider designing the token binding mechanism to support token refresh patterns, such as storing multiple valid token hashes or implementing a token family concept.

Suggested change

- Store a cryptographic hash of the original authentication token (e.g., `SHA256(bearerToken)`) in the session during creation

- On each request, validate that the current auth token hash matches the session's bound token hash

- If mismatch, reject with "session authentication mismatch" error and terminate session

- This prevents stolen session IDs from being used with different credentials

- Store a **token binding record** in the session during creation, which includes a cryptographic hash of the initial authentication token (e.g., `SHA256(bearerToken)`)

- Support **token refresh/rotation** by allowing the token binding record to track a small set of non-expired token hashes (e.g., the current token plus one or two recently rotated tokens with short grace periods)

- On each request, validate that the hash of the current auth token matches one of the non-expired hashes in the token binding record; if it matches a new token, update the record to rotate out old hashes

- If no match is found, reject with a "session authentication mismatch" error and terminate the session

- This prevents stolen session IDs from being used with different credentials while remaining compatible with OAuth-style token refresh and rotation

Copilot · 2026-02-13T12:42:01Z

rfcs/THV-0038-session-scoped-client-lifecycle.md

+- **Note**: Phase 1 will be done incrementally across multiple PRs if necessary, reusing existing implementation pieces. This allows us to introduce the bulk of the code without having to worry about refactoring the existing system.
+


The Phase 1 implementation mentions "reusing existing implementation pieces" and doing work "incrementally across multiple PRs if necessary," but doesn't provide clear guidance on what can be reused versus what needs to be written fresh. This ambiguity could lead to implementation inconsistencies or unnecessary refactoring. Consider providing a more detailed breakdown of which existing components can be leveraged (e.g., aggregator, client factory) and which require new implementations.

Suggested change

- **Note**: Phase 1 will be done incrementally across multiple PRs if necessary, reusing existing implementation pieces. This allows us to introduce the bulk of the code without having to worry about refactoring the existing system.

- **Execution model**: Phase 1 will be done incrementally across multiple PRs. Each PR should be strictly additive (no behavior changes for existing session types) and should keep the old and new paths side‑by‑side.

- **Reused components**:

- Existing session storage implementations used by `transportsession.Manager` (e.g., in‑memory/local, Redis), including their TTL/`Touch()` behavior and expiration cleanup.

- Existing client creation code paths where possible (e.g., HTTP/backend client factory, aggregators, or other shared client wiring logic), wrapped or adapted inside the new `SessionFactory` rather than re‑implemented.

- Existing logging, metrics, and configuration loading mechanisms for sessions and backends.

- **New components**:

- Domain‑level `Session` type, `SessionFactory`, and `SessionManager` interfaces plus their initial concrete implementations.

- Wiring from `SessionManager` into the existing MCP SDK integration points (e.g., hooks that call `SessionFactory.MakeSession()` and map SDK session IDs to stored domain sessions).

- Any glue code needed to adapt existing client factories/aggregators into the new session‑scoped lifecycle (without changing their external behavior).

Copilot · 2026-02-13T12:42:01Z

rfcs/THV-0038-session-scoped-client-lifecycle.md

+- **Owns backend clients**: Session stores pre-initialized MCP clients in an internal map
+- **Encapsulates routing**: `CallTool()` looks up tool in routing table, routes to correct backend client
+- **Manageable lifetime**: `Close()` cleans up clients and any other resources. The SessionManager/caller is decoupled from what exactly happens on close()
+- **Thread-safe**: All methods protected by internal RWMutex for concurrent access to prevent data races on internal state. Read operations (Tools, Resources, Prompts, CallTool, ReadResource, GetPrompt) use read locks; write operations (Close) use write locks. Methods like `Tools()`, `Resources()`, `Prompts()` return defensive copies (not references to internal slices) to prevent caller mutations. Internal maps and slices are never exposed directly. Note: The RWMutex prevents data races but does not prevent `CallTool()` from using a closed client after `Close()` completes—the session implementation must track closed state explicitly (see "Client Closed Mid-Request" in Error Handling). Proper encapsulation may allow removal of mutexes elsewhere (e.g., SessionManager, current VMCPSession) since state is now owned by the Session


The thread safety section (line 299) states that "Proper encapsulation may allow removal of mutexes elsewhere (e.g., SessionManager, current VMCPSession)," but this is speculative without concrete analysis. Removing mutexes in other components could introduce race conditions if those components have concurrent access patterns not covered by Session's internal locking. This statement should either be removed, made conditional ("may enable future investigation into..."), or backed by specific analysis of where mutex removal is safe.

Suggested change

- **Thread-safe**: All methods protected by internal RWMutex for concurrent access to prevent data races on internal state. Read operations (Tools, Resources, Prompts, CallTool, ReadResource, GetPrompt) use read locks; write operations (Close) use write locks. Methods like `Tools()`, `Resources()`, `Prompts()` return defensive copies (not references to internal slices) to prevent caller mutations. Internal maps and slices are never exposed directly. Note: The RWMutex prevents data races but does not prevent `CallTool()` from using a closed client after `Close()` completes—the session implementation must track closed state explicitly (see "Client Closed Mid-Request" in Error Handling). Proper encapsulation may allow removal of mutexes elsewhere (e.g., SessionManager, current VMCPSession) since state is now owned by the Session

- **Thread-safe**: All methods protected by internal RWMutex for concurrent access to prevent data races on internal state. Read operations (Tools, Resources, Prompts, CallTool, ReadResource, GetPrompt) use read locks; write operations (Close) use write locks. Methods like `Tools()`, `Resources()`, `Prompts()` return defensive copies (not references to internal slices) to prevent caller mutations. Internal maps and slices are never exposed directly. Note: The RWMutex prevents data races but does not prevent `CallTool()` from using a closed client after `Close()` completes—the session implementation must track closed state explicitly (see "Client Closed Mid-Request" in Error Handling). Proper encapsulation confines concurrency concerns within the Session and may enable future investigation into whether mutex usage in other components (e.g., SessionManager, current VMCPSession) can be simplified, but this RFC does not prescribe or assume any such changes.

rfcs/THV-0038-session-scoped-client-lifecycle.md

JAORMX

Thanks for addressing the feedback, Yolanda! The integration with transportsession.Manager (instead of a parallel sync.Map), the dual-layer storage model (metadata vs. runtime), and the composition model (defaultSession embedding transportsession.Session) all look good. This preserves the Redis/Valkey externalization path we built out in #1677/#1989/#1770/#1771 while delivering the session encapsulation improvements.

LGTM!

jerm-dro

The changes motivated by others' feedback LGTM.

yrobla requested a review from Copilot February 4, 2026 13:31

Copilot started reviewing on behalf of yrobla February 4, 2026 13:32 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

yrobla force-pushed the session-scoped-mcp-client branch from f935166 to 6e503d4 Compare February 4, 2026 13:40

Add RFC: Session-scoped MCP client lifecycle management

4f970e6

The RFC captures the complete architectural vision from the discussion, providing a roadmap for simplifying the client pooling implementation while maintaining all current functionality.

yrobla force-pushed the session-scoped-mcp-client branch from 6e503d4 to 4f970e6 Compare February 4, 2026 13:46

yrobla requested a review from Copilot February 4, 2026 13:49

Copilot started reviewing on behalf of yrobla February 4, 2026 13:49 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

yrobla force-pushed the session-scoped-mcp-client branch 3 times, most recently from 840b512 to ddf4273 Compare February 4, 2026 15:26

yrobla requested a review from Copilot February 4, 2026 15:26

Copilot started reviewing on behalf of yrobla February 4, 2026 15:26 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

yrobla force-pushed the session-scoped-mcp-client branch 2 times, most recently from 50f4742 to ec70d08 Compare February 4, 2026 15:51

yrobla requested a review from Copilot February 4, 2026 15:52

Copilot started reviewing on behalf of yrobla February 4, 2026 15:52 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

yrobla force-pushed the session-scoped-mcp-client branch 2 times, most recently from 54dd1b4 to d714819 Compare February 4, 2026 16:28

jerm-dro reviewed Feb 4, 2026

View reviewed changes

rfcs/THV-0038-session-scoped-client-lifecycle.md Show resolved Hide resolved

This was referenced Feb 5, 2026

vMCP: Implement backend session persistence with client pooling stacklok/toolhive#3062

Open

Create an RFC for vMCP session management architecture improvements #39

Open

jerm-dro reviewed Feb 6, 2026

View reviewed changes

rfcs/THV-0038-session-scoped-client-lifecycle.md Show resolved Hide resolved

yrobla force-pushed the session-scoped-mcp-client branch from d698c3e to 9654f19 Compare February 9, 2026 15:16

ChrisJBurns reviewed Feb 11, 2026

View reviewed changes

JAORMX reviewed Feb 12, 2026

View reviewed changes

fixes from review

1d0cbe0

yrobla force-pushed the session-scoped-mcp-client branch from 6f2f6a9 to d79bb82 Compare February 12, 2026 10:07

yrobla requested a review from Copilot February 12, 2026 10:07

Copilot started reviewing on behalf of yrobla February 12, 2026 10:08 View session

Copilot AI reviewed Feb 12, 2026

View reviewed changes

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

yrobla force-pushed the session-scoped-mcp-client branch 2 times, most recently from b05416f to 63c5d62 Compare February 12, 2026 10:20

yrobla requested review from ChrisJBurns, JAORMX, Copilot and jerm-dro February 12, 2026 10:21

Copilot started reviewing on behalf of yrobla February 12, 2026 10:22 View session

Copilot AI reviewed Feb 12, 2026

View reviewed changes

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

rfcs/THV-0038-session-scoped-client-lifecycle.md Outdated Show resolved Hide resolved

yrobla force-pushed the session-scoped-mcp-client branch from 63c5d62 to 6d03d71 Compare February 12, 2026 10:45

yrobla requested a review from Copilot February 12, 2026 10:46

Copilot started reviewing on behalf of yrobla February 12, 2026 10:46 View session

Copilot AI reviewed Feb 12, 2026

View reviewed changes

yrobla force-pushed the session-scoped-mcp-client branch from 6d03d71 to 6bcdb92 Compare February 12, 2026 10:53

yrobla requested a review from Copilot February 13, 2026 12:37

Copilot started reviewing on behalf of yrobla February 13, 2026 12:37 View session

Copilot AI reviewed Feb 13, 2026

View reviewed changes

JAORMX previously approved these changes Feb 13, 2026

View reviewed changes

fixes from review

1f73b30

yrobla dismissed JAORMX’s stale review via 1f73b30 February 13, 2026 12:51

yrobla force-pushed the session-scoped-mcp-client branch from 6bcdb92 to 1f73b30 Compare February 13, 2026 12:51

yrobla mentioned this pull request Feb 13, 2026

Per-session prompt tracking in vMCP stacklok/toolhive#2841

Open

jerm-dro approved these changes Feb 13, 2026

View reviewed changes

-   - **Bounded concurrency**: Limit parallel goroutines (e.g., 10 concurrent initializations) to avoid resource exhaustion
-   - **Per-backend timeout**: Apply context timeout (e.g., 5s per backend) so one slow backend doesn't block session creation
-   - **Partial initialization**: If some backends fail, log warnings and continue with successfully initialized backends (failed backends not added to clients map)
+   - **Bounded concurrency**: Limit parallel goroutines (for example, 10 concurrent initializations) to avoid resource exhaustion.
+     - These values **MUST be configurable** (e.g., via server configuration or environment variables), not hard-coded constants in the implementation.
+     - Tuning guidance:
+       - Start with a baseline such as `min(numBackends, 8–16)` and adjust based on CPU usage, memory pressure, backend rate limits, and startup latency SLOs.
+       - Higher concurrency reduces end-to-end session creation latency but increases CPU/memory usage and can cause load spikes or connection storms against backends.
+       - Lower concurrency smooths backend load and reduces resource contention at the cost of slower session initialization when many backends are configured.
+   - **Per-backend timeout**: Apply a per-backend context timeout (for example, 5s per backend) so one slow or hung backend does not block overall session creation.
+     - This timeout SHOULD be deployment-specific: shorter in low-latency, co-located deployments; longer in high-latency or intermittently slow environments.
+     - Consider setting the per-backend timeout relative to observed p95–p99 connection + handshake latency (e.g., `2–3x p99`) and revisiting it as telemetry is collected in production.
+     - Trade-off: shorter timeouts improve resilience and responsiveness to unhealthy backends but may prematurely drop slow-yet-healthy backends; longer timeouts improve tolerance for slow backends but increase worst-case session startup time.
+   - **Partial initialization**: If some backends fail (e.g., time out or error during client creation), log warnings and continue with successfully initialized backends (failed backends are not added to the clients map).

-   - Store a cryptographic hash of the original authentication token (e.g., `SHA256(bearerToken)`) in the session during creation
-   - On each request, validate that the current auth token hash matches the session's bound token hash
-   - If mismatch, reject with "session authentication mismatch" error and terminate session
-   - This prevents stolen session IDs from being used with different credentials
+   - Store a **token binding record** in the session during creation, which includes a cryptographic hash of the initial authentication token (e.g., `SHA256(bearerToken)`)
+   - Support **token refresh/rotation** by allowing the token binding record to track a small set of non-expired token hashes (e.g., the current token plus one or two recently rotated tokens with short grace periods)
+   - On each request, validate that the hash of the current auth token matches one of the non-expired hashes in the token binding record; if it matches a new token, update the record to rotate out old hashes
+   - If no match is found, reject with a "session authentication mismatch" error and terminate the session
+   - This prevents stolen session IDs from being used with different credentials while remaining compatible with OAuth-style token refresh and rotation

		- Note: Phase 1 will be done incrementally across multiple PRs if necessary, reusing existing implementation pieces. This allows us to introduce the bulk of the code without having to worry about refactoring the existing system.

-- **Note**: Phase 1 will be done incrementally across multiple PRs if necessary, reusing existing implementation pieces. This allows us to introduce the bulk of the code without having to worry about refactoring the existing system.
+- **Execution model**: Phase 1 will be done incrementally across multiple PRs. Each PR should be strictly additive (no behavior changes for existing session types) and should keep the old and new paths side‑by‑side.
+- **Reused components**:
+  - Existing session storage implementations used by `transportsession.Manager` (e.g., in‑memory/local, Redis), including their TTL/`Touch()` behavior and expiration cleanup.
+  - Existing client creation code paths where possible (e.g., HTTP/backend client factory, aggregators, or other shared client wiring logic), wrapped or adapted inside the new `SessionFactory` rather than re‑implemented.
+  - Existing logging, metrics, and configuration loading mechanisms for sessions and backends.
+- **New components**:
+  - Domain‑level `Session` type, `SessionFactory`, and `SessionManager` interfaces plus their initial concrete implementations.
+  - Wiring from `SessionManager` into the existing MCP SDK integration points (e.g., hooks that call `SessionFactory.MakeSession()` and map SDK session IDs to stored domain sessions).
+  - Any glue code needed to adapt existing client factories/aggregators into the new session‑scoped lifecycle (without changing their external behavior).

Conversation

yrobla commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jerm-dro left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

JAORMX left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

What we already have in place

The parallel storage path problem

How do the two Session interfaces compose?

Partial initialization and serializability

What I think should change

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

jerm-dro commented Feb 13, 2026

Uh oh!

Copilot AI left a comment

yrobla commented Feb 4, 2026 •

edited

Loading

JAORMX left a comment •

edited

Loading