SGLang backend #549

Open
pmukeshreddy wants to merge 5 commits into OpenPipe:main from pmukeshreddy:sglang-backend

Conversation


pmukeshreddy commented on Feb 4, 2026

Summary

Adds SGLang as an alternative to the vLLM inference backend, optimized for RL training workloads. SGLang's RadixAttention provides automatic prefix caching, which significantly improves performance for multi-turn agent trajectories.

Motivation

In RL training loops, many rollouts share the same system prompt and conversation prefix. SGLang's RadixAttention automatically caches these common prefixes, providing:

  • ~29% higher throughput vs vLLM (external benchmarks)
  • ~22% lower TTFT (time to first token)
  • Persistent KV cache across training steps (no restart needed)

Architecture

┌─────────────────────────────────────────────────────────┐
│           Multi-GPU Split Architecture                   │
├─────────────────────────────────────────────────────────┤
│  GPU 0: SGLang Server    │  GPU 1+: Training (Unsloth)  │
│  - RadixAttention cache  │  - PEFT/LoRA model           │
│  - OpenAI-compatible API │  - GRPO optimizer            │
│  - LoRA hot-reload       │  - Gradient computation      │
└─────────────────────────────────────────────────────────┘
        Weight Sync: Hot-reload via HTTP API (~5-10s)

Key design decision: SGLang server runs in a separate Python environment to avoid torchao version conflicts (SGLang needs torchao==0.9, Unsloth needs torchao>=0.13). Communication happens via HTTP, not Python imports.
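
As a rough illustration of that HTTP-only coupling, weight sync amounts to saving the updated LoRA adapter to disk and asking the server to reload it over HTTP. The sketch below is illustrative only: the server URL and endpoint name are assumptions, and in this PR the call is wrapped inside SGLangBackend rather than made by users directly.

import requests

SGLANG_URL = "http://localhost:30000"  # assumed server address

def hot_reload_lora(adapter_dir: str) -> None:
    """Ask the inference server to reload LoRA weights saved at adapter_dir.

    The endpoint name and payload are hypothetical placeholders for whatever
    reload route the SGLang server exposes.
    """
    resp = requests.post(
        f"{SGLANG_URL}/update_weights_from_disk",  # hypothetical endpoint
        json={"model_path": adapter_dir},
        timeout=60,
    )
    resp.raise_for_status()  # reload takes roughly 5-10s; the KV cache is preserved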

Features

  • Drop-in replacement for LocalBackend
  • Auto GPU detection: uses split mode on 2+ GPUs, falls back to single-GPU mode otherwise (see the sketch after this list)
  • LoRA hot-reload: updates weights without a server restart (preserves the RadixAttention cache)
  • Two-environment architecture: keeps SGLang's and Unsloth's conflicting torchao requirements isolated
  • Comprehensive documentation: see docs/sglang-integration.md
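
The auto GPU detection above boils down to counting visible devices and choosing a layout. A minimal sketch of that decision, assuming the split described in the architecture diagram (the function name and return format are illustrative, not the PR's actual internals):

import torch

def choose_gpu_layout() -> dict:
    """Dedicate GPU 0 to the SGLang server when 2+ GPUs are visible,
    otherwise fall back to sharing a single GPU with training."""
    n_gpus = torch.cuda.device_count()
    if n_gpus >= 2:
        return {"mode": "split", "server_gpus": [0], "train_gpus": list(range(1, n_gpus))}
    return {"mode": "single", "server_gpus": [0], "train_gpus": [0]}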

Usage

from art.sglang_backend import SGLangBackend

# Auto-detects GPUs and configures optimally
backend = SGLangBackend()
await backend.register(model)
result = await backend.train(model, trajectory_groups)
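
For context, a slightly fuller snippet showing where the model comes from. This is a sketch: it assumes the model is declared the same way as with ART's existing LocalBackend (art.TrainableModel), and the names and base model below are placeholders.

import art
from art.sglang_backend import SGLangBackend

async def main():
    model = art.TrainableModel(
        name="agent-demo",                      # placeholder name
        project="sglang-demo",                  # placeholder project
        base_model="Qwen/Qwen2.5-7B-Instruct",  # placeholder base model
    )
    backend = SGLangBackend()      # auto-detects GPUs, split mode on 2+
    await backend.register(model)  # starts or attaches to the SGLang server
    # trajectory_groups come from your rollout/reward code
    result = await backend.train(model, trajectory_groups)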

Testing

Tested end-to-end on 4x H100 80GB:

# Fresh server test
git clone -b sglang-backend https://github.com/pmukeshreddy/ART.git
cd ART

# Install Python 3.11
apt install -y software-properties-common
add-apt-repository -y ppa:deadsnakes/ppa
apt update
apt install -y python3.11 python3.11-venv python3.11-dev

# Create main venv for ART package
python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip wheel
pip install -e ".[sglang]"
deactivate

# Create separate SGLang server venv (REQUIRED - different torch/cuda versions)
python3.11 -m venv .venv-sglang-server
source .venv-sglang-server/bin/activate
pip install --upgrade pip wheel
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install "sglang[all]>=0.4.0" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python
deactivate

# Run the E2E test
source .venv/bin/activate
python scripts/test_sglang_e2e.py

Test output:

[1/7] Importing modules...              ✓
[2/7] Checking SGLang server environment... ✓
[3/7] Initializing SGLangBackend...     ✓
[4/7] Registering model...              ✓
[5/7] Starting server and testing inference... ✓
[6/7] Running training step...          ✓
[7/7] Testing inference after training... ✓

ALL TESTS PASSED!
