stacklok · rdimitrov · May 21, 2026 · May 19, 2026
diff --git a/.github/upstream-projects.yaml b/.github/upstream-projects.yaml
@@ -35,7 +35,7 @@ projects:
 
   - id: toolhive
     repo: stacklok/toolhive
-    version: v0.27.2
+    version: v0.28.0
     # toolhive is a monorepo covering the CLI, the Kubernetes
     # operator, and the vMCP gateway. It also introduces cross-
     # cutting features that land in concepts/, integrations/,

diff --git a/docs/toolhive/faq.mdx b/docs/toolhive/faq.mdx
@@ -240,6 +240,21 @@ export TOOLHIVE_USAGE_METRICS_ENABLED=false
 Once you opt out, ToolHive stops collecting and sending usage metrics. You need
 to restart any running servers for the change to take effect.
 
+### How do I disable update checks?
+
+ToolHive periodically checks for new versions. To disable this check (and the
+usage-metrics collection it gates), set the `TOOLHIVE_SKIP_UPDATE_CHECK`
+environment variable to `true`:
+
+```bash
+export TOOLHIVE_SKIP_UPDATE_CHECK=true
+```
+
+The setting is honored by the CLI, the API server, and the Kubernetes operator
+telemetry service. For the operator, add it to the `operator.env` list in your
+Helm values. Update checks are also skipped automatically when ToolHive detects
+a CI environment.
+
 ## Security and permissions
 
 ### Is it safe to run MCP servers?

diff --git a/docs/toolhive/guides-cli/run-mcp-servers.mdx b/docs/toolhive/guides-cli/run-mcp-servers.mdx
@@ -248,6 +248,19 @@ specific proxy port instead, use the `--proxy-port` flag:
 thv run --proxy-port <PORT_NUMBER> <SERVER>
 ```
 
+### Override the session timeout
+
+ToolHive's proxy evicts idle MCP sessions after 2 hours by default. To raise or
+lower this inactivity timeout for a workload, pass `--session-ttl` with a Go
+duration string:
+
+```bash
+thv run --session-ttl 4h <SERVER>
+```
+
+Set a longer value when clients hold sessions open for long-running operations,
+or a shorter value to free resources faster.
+
 ### Run a server exposing only selected tools
 
 ToolHive can filter the tools returned to the client as result of a `tools/list`

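The `--session-ttl` hunk above takes a Go duration string (`4h`, `30m`, `168h`). For readers who generate or validate these values outside Go, here is a minimal sketch of how such strings map to seconds. The helper name `parse_go_duration` is hypothetical and is not part of ToolHive; the accepted unit suffixes follow Go's `time.ParseDuration`, but this simplified parser is an illustration rather than a faithful port.

```python
import re

# Seconds per unit, for the unit suffixes accepted by Go's time.ParseDuration.
_UNIT_SECONDS = {
    "ns": 1e-9,
    "us": 1e-6,
    "µs": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
    "m": 60.0,
    "h": 3600.0,
}

# One "<number><unit>" component, for example "1.5h" or "30m".
_COMPONENT = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|µs|ms|s|m|h)")


def parse_go_duration(text: str) -> float:
    """Return the seconds described by a Go-style duration string (hypothetical helper)."""
    if text == "0":
        return 0.0
    sign = 1.0
    body = text
    if body[:1] in ("+", "-"):
        sign = -1.0 if body[0] == "-" else 1.0
        body = body[1:]
    if not body:
        raise ValueError(f"invalid duration: {text!r}")
    pos, total = 0, 0.0
    while pos < len(body):
        match = _COMPONENT.match(body, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return sign * total


for value in ("4h", "30m", "1m0s", "168h"):
    print(value, parse_go_duration(value))
```

A bare number with a unit is required (except for `0`), so a value such as `2 hours` is rejected instead of being silently misread.
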
diff --git a/docs/toolhive/guides-k8s/auth-k8s.mdx b/docs/toolhive/guides-k8s/auth-k8s.mdx
@@ -779,12 +779,78 @@ from the configured endpoint, and the `fieldMapping` section maps
 provider-specific response fields to standard user identity fields (for example,
 GitHub returns `login` instead of the standard `name` field).
 
-When you omit `userInfo`, the embedded auth server runs in synthesis mode for
-this upstream: it derives a non-personally-identifying subject (with a `tk-`
-prefix) from the access token and leaves `name` and `email` empty. Use this
-configuration for OAuth 2.0 servers that don't expose a userinfo endpoint, such
-as MCP authorization servers that comply with the
+When you omit `userInfo` and `identityFromToken`, the embedded auth server runs
+in synthesis mode for this upstream: it derives a non-personally-identifying
+subject (with a `tk-` prefix) from the access token and leaves `name` and
+`email` empty. Use this configuration for OAuth 2.0 servers that don't expose a
+userinfo endpoint and don't return identity in the token response, such as MCP
+authorization servers that comply with the
 [MCP authorization specification](https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization).
+For OAuth 2.0 servers that return identity in the token response itself, see
+[Extract identity from the token response](#extract-identity-from-the-token-response).
+
+:::
+
+### Extract identity from the token response
+
+Some providers don't expose a userinfo endpoint but return user identity in the
+OAuth 2.0 token response itself. For these providers, set `identityFromToken` on
+`oauth2Config` instead of `userInfo`. The embedded auth server then skips the
+userinfo HTTP call and extracts identity from the token response body using
+[gjson dot-notation paths](https://github.com/tidwall/gjson#path-syntax):
+`username` extracts a top-level field, `authed_user.id` extracts a nested field,
+and the pipe operator chains modifiers like `@upstreamjwt`.
+
+For example, Slack's `oauth.v2.access` response includes the authenticated user
+ID at `authed_user.id`:
+
+```yaml title="oauth2Config snippet for Slack"
+oauth2Config:
+  # highlight-start
+  identityFromToken:
+    subjectPath: authed_user.id
+  # highlight-end
+```
+
+Snowflake returns the authenticated login name as a top-level `username` field
+in every authorization-code grant response, and does not expose a userinfo
+endpoint:
+
+```yaml title="oauth2Config snippet for Snowflake"
+oauth2Config:
+  # highlight-start
+  identityFromToken:
+    subjectPath: username
+    namePath: username
+  # highlight-end
+```
+
+For providers whose token response embeds identity inside a JWT-shaped access
+token, the `@upstreamjwt` modifier decodes the JWT payload so subsequent path
+segments can drill into it:
+
+```yaml title="oauth2Config snippet for JWT-embedded identity"
+oauth2Config:
+  # highlight-start
+  identityFromToken:
+    subjectPath: 'access_token|@upstreamjwt|sub'
+  # highlight-end
+```
+
+`subjectPath` is required; `namePath` and `emailPath` are optional. Omit
+`namePath` and `emailPath` rather than setting them to empty strings.
+
+If you set both `identityFromToken` and `userInfo`, `identityFromToken` takes
+precedence and the userinfo HTTP call is skipped. If `identityFromToken` is set
+and extraction fails (path missing or unexpected type), authentication fails for
+that login attempt. There is no fallback to `userInfo`.
+
+:::warning[Trust model]
+
+Claims read from the token response are trusted via TLS only and are not
+cryptographically verified. The `@upstreamjwt` modifier decodes the JWT payload
+without verifying its signature. Prefer OIDC ID tokens when you need
+cryptographically verifiable claims.
 
 :::
 

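The `identityFromToken` paths in the hunk above can be illustrated with a small sketch. It handles only plain dot paths and the `@upstreamjwt` stage, and it decodes the JWT payload without checking the signature, which is the trust-model caveat the warning describes. The function names (`extract`, `decode_jwt_payload`, `make_unsigned_jwt`) and the sample values are hypothetical; the real implementation uses gjson and is not reproduced here.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-segment JWT")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def extract(response: dict, path: str) -> str:
    """Resolve a simplified 'a.b|@upstreamjwt|c' path against a token response."""
    value = response
    for stage in path.split("|"):
        if stage == "@upstreamjwt":
            if not isinstance(value, str):
                raise ValueError("@upstreamjwt needs a string token")
            value = decode_jwt_payload(value)
            continue
        for key in stage.split("."):
            if not isinstance(value, dict) or key not in value:
                # Mirrors the documented behaviour: extraction failure is an
                # error, with no fallback to a userinfo call.
                raise ValueError(f"path {path!r} not found")
            value = value[key]
    if not isinstance(value, str):
        raise ValueError(f"path {path!r} did not resolve to a string")
    return value


def make_unsigned_jwt(claims: dict) -> str:
    """Build a JWT-shaped string with a junk signature, for demonstration only."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{b64({'alg': 'none'})}.{b64(claims)}.not-a-real-signature"


# Hypothetical token responses shaped like the three examples in the docs.
slack_like = {"authed_user": {"id": "U123"}}
snowflake_like = {"username": "ALICE", "access_token": "opaque"}
jwt_like = {"access_token": make_unsigned_jwt({"sub": "user-42"})}

print(extract(slack_like, "authed_user.id"))
print(extract(snowflake_like, "username"))
print(extract(jwt_like, "access_token|@upstreamjwt|sub"))
```

The junk signature in `make_unsigned_jwt` is deliberate: the sketch still returns the `sub` claim, which shows why claims read this way are only as trustworthy as the TLS connection to the provider.
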
diff --git a/docs/toolhive/guides-k8s/rate-limiting.mdx b/docs/toolhive/guides-k8s/rate-limiting.mdx
@@ -1,14 +1,14 @@
 ---
 title: Rate limiting
 description:
-  Configure per-user and shared rate limits on MCPServer resources to prevent
-  noisy neighbors and protect downstream services.
+  Configure per-user and shared rate limits on MCPServer and VirtualMCPServer
+  resources to prevent noisy neighbors and protect downstream services.
 ---
 
-Configure token bucket rate limits on MCPServer resources to control how many
-tool invocations users can make. Rate limiting prevents individual users from
-monopolizing shared servers and protects downstream services from traffic
-spikes.
+Configure token bucket rate limits on MCPServer and VirtualMCPServer resources
+to control how many tool invocations users can make. Rate limiting prevents
+individual users from monopolizing shared servers and protects downstream
+services from traffic spikes.
 
 ToolHive supports two scopes of rate limiting:
 
@@ -219,6 +219,55 @@ In this example:
   also count toward the 100 server-level limit).
 - All users combined can make 50 `shared_resource` calls per minute.
 
+## Rate limit a VirtualMCPServer
+
+VirtualMCPServer resources accept the same rate limit shape under
+`spec.config.rateLimiting`. The fields and token bucket semantics match the
+MCPServer examples above, but the prerequisites are stricter:
+
+- `spec.sessionStorage.provider` must be `redis`. The CRD rejects any
+  `rateLimiting` configuration without Redis-backed session storage.
+- `spec.incomingAuth.type` must be `oidc` when you configure any per-user
+  bucket, either at the server level or on a per-tool override.
+
+A request must pass both the server-level vMCP limit and the per-tool limit (if
+defined). Limits apply to the vMCP aggregator and are independent from any
+limits configured on the backend MCPServers it routes to.
+
+```yaml title="vmcp-ratelimit.yaml"
+apiVersion: toolhive.stacklok.dev/v1beta1
+kind: VirtualMCPServer
+metadata:
+  name: shared-toolkit
+  namespace: toolhive-system
+spec:
+  groupRef:
+    name: my-backends
+  incomingAuth:
+    type: oidc
+    oidcConfigRef:
+      name: my-oidc-config
+      audience: shared-toolkit
+  sessionStorage:
+    provider: redis
+    address: <YOUR_REDIS_ADDRESS>
+  config:
+    # highlight-start
+    rateLimiting:
+      shared:
+        maxTokens: 5000
+        refillPeriod: 1m0s
+      perUser:
+        maxTokens: 200
+        refillPeriod: 1m0s
+      tools:
+        - name: expensive_search
+          perUser:
+            maxTokens: 20
+            refillPeriod: 1m0s
+    # highlight-end
+```
+
 ## Next steps
 
 - [Token exchange](./token-exchange-k8s.mdx) to configure token exchange for

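The layered limits in the VirtualMCPServer example above (a shared bucket, a per-user bucket, and a per-user bucket for one tool) can be sketched as a set of token buckets that must all have a token before a call is allowed. This is a single-process illustration with hypothetical class names; it assumes continuous refill at `maxTokens` per `refillPeriod`, and it does not model the Redis-backed storage that the CRD requires.

```python
import time


class TokenBucket:
    """Holds up to max_tokens, refilled continuously at max_tokens per period."""

    def __init__(self, max_tokens, refill_period_s, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.rate = max_tokens / refill_period_s  # tokens per second
        self.clock = clock
        self.tokens = float(max_tokens)
        self.updated = clock()

    def available(self):
        """Refill for the elapsed time, then report whether a token is present."""
        now = self.clock()
        refilled = self.tokens + (now - self.updated) * self.rate
        self.tokens = min(self.max_tokens, refilled)
        self.updated = now
        return self.tokens >= 1

    def take(self):
        self.tokens -= 1


class RateLimiter:
    """Allows a call only if every applicable bucket has a token."""

    def __init__(self, shared, per_user, tool_per_user, clock=time.monotonic):
        self.clock = clock
        self.shared = TokenBucket(*shared, clock=clock)
        self.per_user = per_user            # (max_tokens, period_s)
        self.tool_per_user = tool_per_user  # {tool name: (max_tokens, period_s)}
        self.buckets = {}                   # (user, tool or None) -> TokenBucket

    def _bucket(self, key, config):
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(*config, clock=self.clock)
        return self.buckets[key]

    def allow(self, user, tool):
        applicable = [self.shared, self._bucket((user, None), self.per_user)]
        if tool in self.tool_per_user:
            config = self.tool_per_user[tool]
            applicable.append(self._bucket((user, tool), config))
        # Check every bucket first so a rejected call consumes nothing.
        if not all(bucket.available() for bucket in applicable):
            return False
        for bucket in applicable:
            bucket.take()
        return True


# Values mirror the YAML example; the fake clock keeps the demo deterministic.
fake_now = [0.0]
limiter = RateLimiter(
    shared=(5000, 60),
    per_user=(200, 60),
    tool_per_user={"expensive_search": (20, 60)},
    clock=lambda: fake_now[0],
)
allowed = sum(limiter.allow("alice", "expensive_search") for _ in range(25))
print("calls allowed for alice:", allowed)
```

Checking all buckets before consuming from any of them means a call rejected by the per-tool limit does not also drain the server-level buckets; whether ToolHive behaves the same way is not stated in the source.
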
diff --git a/docs/toolhive/guides-vmcp/authentication.mdx b/docs/toolhive/guides-vmcp/authentication.mdx
@@ -491,6 +491,17 @@ at `authed_user.access_token`). Add a `tokenResponseMapping` block to the
 
 :::
 
+:::tip[Identity in the token response]
+
+When an upstream returns user identity in the token response itself (Slack
+returns it at `authed_user.id`; Snowflake returns it as a top-level `username`
+field), set `identityFromToken` on the `oauth2Config` with gjson dot-notation
+paths for `subjectPath` (required), `namePath`, and `emailPath`. See
+[Extract identity from the token response](../guides-k8s/auth-k8s.mdx#extract-identity-from-the-token-response)
+for the full pattern and trust-model caveats.
+
+:::
+
 ### Incoming auth with the embedded auth server
 
 When using the embedded auth server, configure `incomingAuth` to validate the

diff --git a/docs/toolhive/guides-vmcp/local-cli.mdx b/docs/toolhive/guides-vmcp/local-cli.mdx
@@ -272,6 +272,7 @@ All `thv vmcp` flags, with their defaults:
 | `--optimizer-embedding` | `false`                                                    | Enable Tier 2 semantic optimizer (implies `--optimizer`)             |
 | `--embedding-model`     | `BAAI/bge-small-en-v1.5`                                   | HuggingFace model name for the managed TEI container                 |
 | `--embedding-image`     | `ghcr.io/huggingface/text-embeddings-inference:cpu-latest` | TEI container image                                                  |
+| `--session-ttl`         | `30m`                                                      | Session inactivity timeout as a Go duration (`30m`, `2h`, `168h`)    |
 
 ### `thv vmcp init`
 

diff --git a/docs/toolhive/guides-vmcp/scaling-and-performance.mdx b/docs/toolhive/guides-vmcp/scaling-and-performance.mdx
@@ -159,9 +159,11 @@ configure Redis session storage. Total capacity scales as `replicas × 1,000`.
 
 ### Session time-to-live (TTL)
 
-The vMCP server applies a **30-minute inactivity TTL** to session metadata. A
-session that receives no activity for 30 minutes expires, and the client must
-reinitialize it.
+The vMCP server applies a **30-minute inactivity TTL** to session metadata by
+default. A session that receives no activity for the TTL window expires, and the
+client must reinitialize it. When running locally with `thv vmcp serve`, pass
+`--session-ttl` (Go duration, for example `--session-ttl=2h`) to raise or lower
+this default.
 
 With Redis session storage, the TTL is a sliding window: every request
 atomically refreshes the key's expiry. Active sessions remain valid indefinitely

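The sliding-window behaviour described in the hunk above, where each request pushes the expiry forward by the full TTL, can be sketched with an in-memory store. The `SessionStore` class is hypothetical and runs in a single process; with Redis the refresh is an atomic update of the key's expiry, which this sketch does not model.

```python
import time


class SessionStore:
    """In-memory sessions whose expiry slides forward on every access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.expires_at = {}  # session id -> absolute expiry time

    def create(self, session_id):
        self.expires_at[session_id] = self.clock() + self.ttl

    def touch(self, session_id):
        """Return True and refresh the expiry if the session is still live."""
        now = self.clock()
        deadline = self.expires_at.get(session_id)
        if deadline is None or now >= deadline:
            # Expired or unknown: the client must reinitialize the session.
            self.expires_at.pop(session_id, None)
            return False
        self.expires_at[session_id] = now + self.ttl
        return True


# 30-minute default TTL from the docs, driven by a fake clock.
fake_now = [0.0]
store = SessionStore(ttl_seconds=30 * 60, clock=lambda: fake_now[0])
store.create("s1")
for elapsed in (1700, 1700, 1800):
    fake_now[0] += elapsed
    print(f"after {elapsed}s idle:", store.touch("s1"))
```

The first two accesses arrive inside the window and keep the session alive well past 30 minutes of total age; the third arrives after a full idle window and finds the session gone.
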
diff --git a/docs/toolhive/reference/cli/thv_client_register.md b/docs/toolhive/reference/cli/thv_client_register.md
@@ -28,6 +28,7 @@ Valid clients:
   - cline: VS Code Cline extension
   - codex: OpenAI Codex CLI
   - continue: Continue.dev IDE plugins
+  - copilot-cli: GitHub Copilot CLI
   - cursor: Cursor editor
   - factory: Factory.ai Droid CLI
   - gemini-cli: Google Gemini CLI

diff --git a/docs/toolhive/reference/cli/thv_client_remove.md b/docs/toolhive/reference/cli/thv_client_remove.md
@@ -28,6 +28,7 @@ Valid clients:
   - cline: VS Code Cline extension
   - codex: OpenAI Codex CLI
   - continue: Continue.dev IDE plugins
+  - copilot-cli: GitHub Copilot CLI
   - cursor: Cursor editor
   - factory: Factory.ai Droid CLI
   - gemini-cli: Google Gemini CLI

diff --git a/docs/toolhive/reference/cli/thv_run.md b/docs/toolhive/reference/cli/thv_run.md
@@ -178,6 +178,7 @@ thv run [flags] SERVER_OR_IMAGE_OR_PROTOCOL [-- ARGS...]
       --runtime-add-package stringArray             Add additional packages to install in the builder and runtime stages (can be repeated)
       --runtime-image string                        Override the default base image for protocol schemes (e.g., golang:1.24-alpine, node:20-alpine, python:3.11-slim)
       --secret stringArray                          Specify a secret to be fetched from the secrets manager and set as an environment variable (format: NAME,target=TARGET)
+      --session-ttl duration                        Session inactivity timeout (e.g., 30m, 2h); zero uses the default (2h)
       --stateless                                   Declare the server as stateless (POST-only, no SSE). Use for MCP servers implementing streamable-HTTP stateless mode.
       --target-host string                          Host to forward traffic to (only applicable to SSE or Streamable HTTP transport) (default "127.0.0.1")
       --target-port int                             Port for the container to expose (only applicable to SSE or Streamable HTTP transport)

diff --git a/docs/toolhive/reference/cli/thv_vmcp_serve.md b/docs/toolhive/reference/cli/thv_vmcp_serve.md
@@ -42,6 +42,7 @@ thv vmcp serve [flags]
       --optimizer                Enable FTS5 keyword optimizer (Tier 1): exposes find_tool and call_tool instead of all backend tools
       --optimizer-embedding      Enable managed TEI semantic optimizer (Tier 2); implies --optimizer
       --port int                 Port to listen on (default 4483)
+      --session-ttl duration     Session inactivity timeout (e.g., 30m, 2h); zero uses the default (30m)
 ```
 
 ### Options inherited from parent commands

diff --git a/docs/toolhive/reference/client-compatibility.mdx b/docs/toolhive/reference/client-compatibility.mdx
@@ -15,6 +15,7 @@ We've tested ToolHive with these clients:
 | Client                     | Supported | Auto-configuration | Skills support | Notes                                       |
 | -------------------------- | :-------: | :----------------: | :------------: | ------------------------------------------- |
 | GitHub Copilot (VS Code)   |    ✅     |         ✅         |       ✅       | v1.102+ or Insiders version ([see note][3]) |
+| GitHub Copilot CLI         |    ✅     |         ✅         |       ❌       |                                             |
 | Claude Code                |    ✅     |         ✅         |       ✅       | v1.0.27+                                    |
 | Cursor                     |    ✅     |         ✅         |       ✅       | v0.50.0+                                    |
 | Cline (VS Code)            |    ✅     |         ✅         |       ✅       | v3.17.10+                                   |
@@ -281,6 +282,28 @@ global MCP configuration file whenever you run an MCP server. You can also
 configure project-specific MCP servers by creating a
 `.continue/mcpServers/<name>.yaml` file in your project directory.
 
+### GitHub Copilot CLI
+
+The [GitHub Copilot CLI](https://docs.github.com/en/copilot/how-tos/copilot-cli)
+stores its MCP configuration in a JSON file in your home directory.
+
+- **All platforms**: `~/.copilot/mcp-config.json`
+
+Example configuration:
+
+```json
+{
+  "mcpServers": {
+    "github": { "url": "http://localhost:19046/mcp", "type": "http" },
+    "fetch": { "url": "http://localhost:43832/mcp", "type": "http" },
+    "sqlite": { "url": "http://localhost:51712/sse#sqlite", "type": "sse" }
+  }
+}
+```
+
+When you register the Copilot CLI as a client, ToolHive automatically updates
+this file whenever you run an MCP server.
+
 ## Manual configuration
 
 If your client doesn't support automatic configuration, you'll need to set up