
feat: improve memory management #2365

Merged
jensneuse merged 63 commits into main from feat/improve-memory-management
Feb 6, 2026

Conversation

@jensneuse (Member) commented Nov 26, 2025

How to use this version

There are two options:

a) Users with custom modules

Follow the router upgrade instructions here:
https://github.com/wundergraph/router-examples?tab=readme-ov-file#-upgrade-router

When referencing this PR from your branch, use the following commit hash:

e67bc7307d3fc15a483b026a02ec1530f4f0ff5e

b) Regular users

Use one of the following Docker images:

  • Root image
    ghcr.io/wundergraph/cosmo/router:pr-2365

  • Non-root image
    ghcr.io/wundergraph/cosmo/router:pr-2365-nonroot

Summary by CodeRabbit

  • New Features

    • Configurable header propagation across subgraphs; header-aware subscriptions and new inbound/request deduplication toggles (including force flags).
  • Performance

    • Increased default HTTP connection pooling and timeouts for higher concurrency.
  • Tests

    • Vastly expanded deduplication and subscription test coverage; added header-propagation unit tests and updated telemetry/metrics expectations.
  • Chores

    • Dependency upgrades and added telemetry attribute to mark resolver deduplicated requests; various test and cleanup refinements.

@github-actions Bot commented Jan 6, 2026

This PR was marked stale due to lack of activity. It will be closed in 14 days.


@github-actions github-actions Bot added Stale and removed Stale labels Jan 21, 2026
@github-actions Bot commented Feb 5, 2026

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added Stale and removed Stale labels Feb 5, 2026
@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router/core/router.go (1)

1934-1947: ⚠️ Potential issue | 🟡 Minor

Verify ResponseHeaderTimeout default won't break slow subgraphs.

The new 30s ResponseHeaderTimeout is a welcome safety net against indefinite hangs, but it's a behavioral change from the previous implicit 0 (no timeout). Subgraph queries that legitimately take >30s before sending the first response header will now fail with a timeout error, even though RequestTimeout allows 60s.

Users can override via per-subgraph config, but this could be a surprising default change for existing deployments with slow-responding subgraphs (e.g., complex aggregations, AI/ML backends).

Consider documenting this as a potentially breaking change in the release notes.

🧹 Nitpick comments (1)
router/core/http_server.go (1)

124-141: Consider documenting that SwapGraphServer must not be called after Shutdown.

The current "NOT SAFE FOR CONCURRENT USE" caveat addresses concurrent callers, but there's an additional lifecycle constraint: calling SwapGraphServer after Shutdown would overwrite the notReadyState sentinel with a live graph server, effectively reviving the server. If the caller enforces this ordering (which it likely does), a brief note would help future maintainers.

 // NOT SAFE FOR CONCURRENT USE.
+// Must not be called after Shutdown.
 func (s *server) SwapGraphServer(ctx context.Context, svr *graphServer) {

@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
router/pkg/mcpserver/server.go (1)

780-787: ⚠️ Potential issue | 🟡 Minor

Pre-existing bug: err is always nil when passed to NewToolResultErrorFromErr.

Inside the if err := json.Unmarshal(...); err == nil && ... block, the shadowed err is guaranteed to be nil (that's the condition for entering the block). Lines 781 and 787 pass this nil error to mcp.NewToolResultErrorFromErr, which does not append any error details when the error is nil, making the FromErr variant unnecessary. The errorMessage / combinedErrorMsg strings are already constructed — use mcp.NewToolResultError with the message string instead.

Suggested fix
 		if len(graphqlResponse.Data) == 0 || string(graphqlResponse.Data) == "null" {
-			return mcp.NewToolResultErrorFromErr("Response Error", err), nil
+			return mcp.NewToolResultError(errorMessage), nil
 		}

 		// If we have both errors and data, include data in the error message
 		dataString := string(graphqlResponse.Data)
 		combinedErrorMsg := fmt.Sprintf("Response error with partial success, Error: %s, Data: %s)", errorMessage, dataString)
-		return mcp.NewToolResultErrorFromErr(combinedErrorMsg, err), nil
+		return mcp.NewToolResultError(combinedErrorMsg), nil
🤖 Fix all issues with AI agents
In `@router/pkg/pubsub/nats/engine_datasource_factory_test.go`:
- Around lines 176-183: the mock setup for mockAdapter.Request registers four matchers, but the Adapter.Request signature has only three parameters. Update the On("Request", ...) expectation on mockAdapter to drop the trailing mock.Anything so it supplies exactly three matchers: ctx, cfg (MatchedBy for *PublishAndRequestEventConfiguration), and event (MatchedBy for datasource.StreamEvent), matching the real Request(ctx, cfg, event) call made by engine_datasource.go.
🧹 Nitpick comments (2)
router/pkg/pubsub/redis/engine_datasource.go (1)

199-199: Nit: header (singular) in Start vs headers (plural) in Load/LoadWithFiles.

This inconsistency exists across all three providers (kafka, nats, redis), so it likely comes from the interface definition. Consider unifying to one name in the interface for consistency.

router/core/header_rule_engine.go (1)

349-352: Avoid returning nil headers in the no-rule fast-path.
Returning a non-nil empty header keeps return values consistent and avoids downstream nil-map mutation risks.

♻️ Suggested change
-	if !h.hasRequestRulesForSubgraph(subgraphName) {
-		return nil, 0
-	}
+	if !h.hasRequestRulesForSubgraph(subgraphName) {
+		return http.Header{}, 0
+	}

Comment thread router/pkg/pubsub/nats/engine_datasource_factory_test.go Outdated


4 participants