Skip to content

performance_issues#26

Open
sergei-bronnikov wants to merge 6 commits into
mainfrom
17720_performance_issues
Open

performance_issues#26
sergei-bronnikov wants to merge 6 commits into
mainfrom
17720_performance_issues

Conversation

@sergei-bronnikov

@sergei-bronnikov sergei-bronnikov commented Jun 11, 2026

Copy link
Copy Markdown

https://bugtracker.codiodev.com/issue/codio-17720/eCornell-reported-Claude-Code-Performance-Issues-with-LLM-Proxy

Summary by CodeRabbit

  • New Features

    • Added tools authorization lookup capability for improved request validation.
  • Performance Improvements

    • Implemented HTTP connection pooling with optimized timeout configurations.
    • Enhanced event message consumer configuration for better throughput.
    • Optimized request handling with local variable caching to reduce repeated lookups.
  • Bug Fixes

    • Simplified utility helper functions for better maintainability.

@sergei-bronnikov

Copy link
Copy Markdown
Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
✅ Action performed

Full review finished.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack


Wait, I accidentally invented a range ID. Let me redo this correctly, placing range_a0cc85580c24 in the correct layer only once.

I keep making up range IDs. Let me be careful and only use what's in all_range_ids. The list is:

range_d881c3d182aa, range_7a25661e6a9c, range_be3686d2b0d0, range_94d4063657a4, range_9d4aecc1b108, range_850bff88efb8, range_0e1860a7aa09, range_47d312851312, range_4728347bbc6e, range_0758281058d0, range_e72c63438280, range_beaff7dcc4df, range_c4bb8aa893f9, range_0b86de6e0ed3, range_03f692ed69ee, range_a93eeb0b1efd, range_42e97eccb3a9, range_f2db3d9e0f14, range_ff9e6594cf7d, range_cb83cc80446b, range_4fdf23ffa9ff, range_08e667aaca88, range_8021d1b7fff3, range_a0cc85580c24, range_6c89b25f8e77, range_3e141d3ef842, range_c3a2b84ec212, range_5292026d3533

That's 28 range IDs. I need to assign each exactly once.

Walkthrough

The PR refactors getMiddleware to cache fullPath and method once per request, updates getProvider to accept fullPath as a parameter, adds an AllowedToolsSet map for O(1) tool authorization, conditionalizes response-logging writer initialization, tunes the upstream HTTP transport with explicit connection pooling, and increases the event consumer pool from 4 to 16 with a buffered channel of 1000.

Changes

Proxy Middleware and Infrastructure Improvements

Layer / File(s) Summary
AllowedToolsSet declaration and init population
internal/provider/openai/cost.go
Exports AllowedToolsSet map[string]struct{} and populates it from AllowedTools in init(), replacing the existing slice for O(1) tool membership checks.
getProvider refactor and getMiddleware fullPath/method caching
internal/server/web/proxy/middleware.go
getProvider accepts fullPath instead of reading c.FullPath() internally. getMiddleware caches fullPath and method at entry and uses them throughout; metadata header serialization changes from JSON-marshal to direct copy; responseWriter initialization is gated on kc.ShouldLogResponse; nil-blw guard added on non-stream response capture; contains simplified to slices.Contains.
Anthropic, Bedrock, and custom provider route branches
internal/server/web/proxy/middleware.go
Anthropic, Bedrock, and custom provider endpoint conditionals switched from c.FullPath()/c.Request.Method to fullPath/method; bodyStr caching added for gjson lookups in the custom provider branch.
OpenAI responses branch: AllowedToolsSet and container detection
internal/server/web/proxy/middleware.go
OpenAI responses routing uses fullPath and switches tool authorization to openai.AllowedToolsSet map lookup; container memory limit extracted by inspecting each tool's container payload.
OpenAI remaining endpoint branches and access-control checks
internal/server/web/proxy/middleware.go
All remaining OpenAI endpoint branches (vLLM, deepinfra, Azure, chat completions, embeddings, images, audio, video, assistants, threads, messages, runs, run steps, moderations, models, files) and access-control checks updated to use fullPath/method conditionals.
HTTP transport tuning and event consumer pipeline sizing
internal/server/web/proxy/proxy.go, cmd/bricksllm/main.go
NewProxyServer configures an explicit http.Transport with idle connection pooling limits and TLS handshake timeout; event channel buffered at 1000 and consumer pool uses numEventConsumers=16.
Minor correctness fixes
internal/server/web/proxy/anthropic.go, internal/server/web/proxy/video.go
Anthropic handler latency switches from time.Now().Sub(start) to time.Since(start); video handler drops a redundant trailing return.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • codio/BricksLLM#9: Directly modifies internal/provider/openai/cost.go and the OpenAI responses tool authorization logic in middleware.go, the same functions and variables changed in this PR.
  • codio/BricksLLM#14: Adds container memory-limit validation and cost estimation in the OpenAI responses flow, which this PR's revised container detection logic (now using AllowedToolsSet and payload inspection) builds upon.

Suggested reviewers

  • destitutus
  • AndreyNikitin
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'performance_issues' is vague and generic, using a non-descriptive term that doesn't convey the specific nature of the changes (consumer pool sizing, buffered channels, HTTP transport optimization, etc.). Use a more descriptive title that captures the main performance improvements, such as 'Optimize LLM proxy performance with configurable event consumers and connection pooling' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 17720_performance_issues

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/bricksllm/main.go`:
- Around line 366-372: The Consumer needs to ensure all worker goroutines
cleanly exit on shutdown using a sync.WaitGroup. Add a WaitGroup field to the
Consumer struct, then in the StartEventMessageConsumers method, add each worker
goroutine to the WaitGroup before spawning it (Add call), and have each worker
call Done() when it exits after receiving the stop signal. Finally, modify the
Stop() method to call wg.Wait() at the end so it blocks until all workers have
completed, ensuring the buffered eventMessageChan is fully drained before
returning and allowing the server shutdown to proceed.

In `@internal/server/web/proxy/proxy.go`:
- Around line 94-101: The http.Transport in the client variable is created from
scratch, which drops default transport behavior including environment proxy
routing (HTTP_PROXY/HTTPS_PROXY). Instead of creating a fresh http.Transport
literal, clone http.DefaultTransport as the base and then override only the
specific fields that need tuning: MaxIdleConns, MaxIdleConnsPerHost,
IdleConnTimeout, and TLSHandshakeTimeout. This preserves the default proxy
behavior while applying the desired connection pool optimizations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c00ed4c-ed76-42e4-9300-f57149d2e48b

📥 Commits

Reviewing files that changed from the base of the PR and between 0ddb013 and c1a8050.

📒 Files selected for processing (6)
  • cmd/bricksllm/main.go
  • internal/provider/openai/cost.go
  • internal/server/web/proxy/anthropic.go
  • internal/server/web/proxy/middleware.go
  • internal/server/web/proxy/proxy.go
  • internal/server/web/proxy/video.go
💤 Files with no reviewable changes (1)
  • internal/server/web/proxy/video.go

Comment thread cmd/bricksllm/main.go
Comment on lines +366 to 372
eventMessageChan := make(chan message.Message, 1000)
messageBus.Subscribe("event", eventMessageChan)

handler := message.NewHandler(rec, log, ace, ce, vllme, aoe, v, uv, m, um, rlm, accessCache, userAccessCache)

eventConsumer := message.NewConsumer(eventMessageChan, log, 4, handler.HandleEventWithRequestAndResponse)
eventConsumer := message.NewConsumer(eventMessageChan, log, numEventConsumers, handler.HandleEventWithRequestAndResponse)
eventConsumer.StartEventMessageConsumers()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect consumer stop/drain behavior and channel lifecycle.
ast-grep --lang go --pattern $'func (c *Consumer) Stop() { $$$ }'
rg -n -C4 'func \(c \*Consumer\) Stop\(|StartEventMessageConsumers\(|close\(c\.done\)|c\.done <-|WaitGroup|close\(.*event.*chan|close\(.*message.*chan'
rg -n -C3 'Subscribe\("event"|Publish\("event"'

Repository: codio/BricksLLM

Length of output: 50373


🏁 Script executed:

# Get full Consumer implementation
wc -l internal/message/consumer.go
head -80 internal/message/consumer.go

# Check shutdown sequence in main.go
grep -n "eventConsumer.Stop\|Stop()\|close\|defer" cmd/bricksllm/main.go | head -30

# Check for signal handling and graceful shutdown
rg -n "sigint|sigterm|signal\|interrupt|shutdown|graceful" cmd/bricksllm/main.go -i

Repository: codio/BricksLLM

Length of output: 2058


🏁 Script executed:

# Get context around shutdown in main.go
sed -n '385,415p' cmd/bricksllm/main.go

Repository: codio/BricksLLM

Length of output: 721


Verify graceful drain semantics for buffered event queue on shutdown.

Line 366 increases in-memory backlog to 1000 and Line 371 increases concurrency to 16 workers. The Consumer.Stop() method sends a signal on an unbuffered done channel, but only one worker receives it; the remaining 15 workers remain blocked in their select loops and never exit. Additionally, Stop() does not wait for workers to finish—it returns immediately—allowing the subsequent server shutdown (lines 401–408) to proceed without guaranteeing the 1000-buffered messageChan has been drained. Implement a sync.WaitGroup in the Consumer to coordinate all worker goroutines and ensure Stop() blocks until all workers acknowledge the shutdown signal.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/bricksllm/main.go` around lines 366 - 372, The Consumer needs to ensure
all worker goroutines cleanly exit on shutdown using a sync.WaitGroup. Add a
WaitGroup field to the Consumer struct, then in the StartEventMessageConsumers
method, add each worker goroutine to the WaitGroup before spawning it (Add
call), and have each worker call Done() when it exits after receiving the stop
signal. Finally, modify the Stop() method to call wg.Wait() at the end so it
blocks until all workers have completed, ensuring the buffered eventMessageChan
is fully drained before returning and allowing the server shutdown to proceed.

Comment on lines +94 to +101
client := http.Client{
Transport: &http.Transport{
MaxIdleConns: 200,
MaxIdleConnsPerHost: 100,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
},
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether runtime/deploy config relies on outbound proxy env vars.
rg -n -C2 'HTTP_PROXY|HTTPS_PROXY|NO_PROXY|http_proxy|https_proxy|no_proxy'

Repository: codio/BricksLLM

Length of output: 41


🌐 Web query:

In Go net/http, if I construct &http.Transport{...} without setting Proxy, does it still honor HTTP_PROXY/HTTPS_PROXY from the environment?

💡 Result:

No, if you explicitly construct an http.Transport and leave the Proxy field as nil, it does not honor the environment variables HTTP_PROXY and HTTPS_PROXY [1][2][3]. In Go's net/http package, the Proxy field on an http.Transport is a function of type func(*Request) (*url.URL, error) [1][2][3]. If this field is nil, the transport treats it as "no proxy is used" [1][2][3]. The environment variables are only honored if you explicitly set the Proxy field to http.ProxyFromEnvironment [4][5]. This behavior is seen in the default configuration: the standard library's http.DefaultTransport is initialized with Proxy: http.ProxyFromEnvironment [5][6], which is why it automatically respects those environment variables, but a custom &http.Transport{} initialization starts with a zero value (nil) for the Proxy field [4][7].

Citations:


🏁 Script executed:

head -150 internal/server/web/proxy/proxy.go | tail -60

Repository: codio/BricksLLM

Length of output: 3167


Preserve default transport behavior while tuning pooling.

Line 95 creates a fresh http.Transport literal. This unintentionally drops default behavior (notably env proxy routing via HTTP_PROXY/HTTPS_PROXY), causing upstream connectivity regressions in proxied deployments. The Proxy field defaults to nil when not explicitly set, disabling environment variable support.

Clone http.DefaultTransport and override only the tuned fields:

Suggested fix
-	client := http.Client{
-		Transport: &http.Transport{
-			MaxIdleConns:        200,
-			MaxIdleConnsPerHost: 100,
-			IdleConnTimeout:     90 * time.Second,
-			TLSHandshakeTimeout: 10 * time.Second,
-		},
-	}
+	baseTransport := http.DefaultTransport.(*http.Transport).Clone()
+	baseTransport.MaxIdleConns = 200
+	baseTransport.MaxIdleConnsPerHost = 100
+	baseTransport.IdleConnTimeout = 90 * time.Second
+	baseTransport.TLSHandshakeTimeout = 10 * time.Second
+
+	client := http.Client{
+		Transport: baseTransport,
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
client := http.Client{
Transport: &http.Transport{
MaxIdleConns: 200,
MaxIdleConnsPerHost: 100,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
},
}
baseTransport := http.DefaultTransport.(*http.Transport).Clone()
baseTransport.MaxIdleConns = 200
baseTransport.MaxIdleConnsPerHost = 100
baseTransport.IdleConnTimeout = 90 * time.Second
baseTransport.TLSHandshakeTimeout = 10 * time.Second
client := http.Client{
Transport: baseTransport,
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/server/web/proxy/proxy.go` around lines 94 - 101, The http.Transport
in the client variable is created from scratch, which drops default transport
behavior including environment proxy routing (HTTP_PROXY/HTTPS_PROXY). Instead
of creating a fresh http.Transport literal, clone http.DefaultTransport as the
base and then override only the specific fields that need tuning: MaxIdleConns,
MaxIdleConnsPerHost, IdleConnTimeout, and TLSHandshakeTimeout. This preserves
the default proxy behavior while applying the desired connection pool
optimizations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants