feat: Python version matrix for Docker images + error-masking fix by deanq · Pull Request #76 · runpod-workers/flash

deanq · 2026-03-06T20:55:59Z

Summary

GPU Docker images pinned to Python 3.12 (PyTorch base image runpod/pytorch:1.0.3-cu1281-torch291-ubuntu2204 ships 3.12 only -- not parameterizable)
CPU Docker images parameterized with PYTHON_VERSION build arg for 3.10, 3.11, 3.12
Makefile version matrix: build-*-versioned, build-wip-versioned, smoketest-versioned targets
Produces 8 tagged images: GPU (3.12) x 2 modes + CPU (3.10, 3.11, 3.12) x 2 modes
CI matrix restructured: GPU jobs build 3.12 only, CPU jobs build 3.10/3.11/3.12 (no duplicate matrix rows)
Build-time Python version validation in GPU Dockerfiles (asserts base image Python matches expected)
Use python -m pip instead of bare pip in Dockerfiles for interpreter safety
Use sys.executable -m pip in dependency_installer.py fallback for correct interpreter targeting
Fix error-masking in handler.py: deployed mode raises RuntimeError instead of silently falling back to Live Serverless
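
The image count in the summary can be checked by enumerating the matrix. The sketch below is illustrative only: `GPU_PYTHON_VERSIONS` is named in a later commit message, while `CPU_PYTHON_VERSIONS`, the lowercase mode labels, and `image_matrix` are assumed names, not taken from the Makefile.

```python
from itertools import product

# Version lists as stated in the summary. The mode labels follow the QB/LB
# naming used in the review overview; their exact spelling is an assumption.
GPU_PYTHON_VERSIONS = ["3.12"]
CPU_PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]
MODES = ["qb", "lb"]


def image_matrix():
    """Return one (device, python_version, mode) tuple per image variant."""
    gpu = [("gpu", v, m) for v, m in product(GPU_PYTHON_VERSIONS, MODES)]
    cpu = [("cpu", v, m) for v, m in product(CPU_PYTHON_VERSIONS, MODES)]
    return gpu + cpu


matrix = image_matrix()
print(len(matrix))  # 1 x 2 GPU variants + 3 x 2 CPU variants
```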

Context

AE-2391: Python version mismatch between user build environment and worker runtime causes silent failures with binary packages (numpy, etc). This PR adds versioned Docker images so the SDK can select the correct runtime for the user's Python version.
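
To illustrate why binary packages break across Python versions: a CPython-specific wheel carries the interpreter version in its filename, so a wheel built for 3.10 is not installable or importable on 3.12. The following is a minimal sketch, not code from this PR. The helper names are hypothetical, the filename is illustrative, and the check deliberately ignores pure-Python (`py3`) and stable-ABI (`abi3`) wheels.

```python
import sys


def interpreter_tag(version_info=sys.version_info) -> str:
    """CPython tag for an interpreter version, e.g. (3, 12) -> 'cp312'."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_python_tag(filename: str) -> str:
    """Extract the python tag from a wheel filename.

    Wheel filenames end in '-{python tag}-{abi tag}-{platform tag}.whl',
    so the python tag is the third field from the end.
    """
    stem = filename[: -len(".whl")]
    return stem.split("-")[-3]


def wheel_matches(filename: str, version_info=sys.version_info) -> bool:
    """True if the wheel's python tag names this exact CPython version."""
    tags = wheel_python_tag(filename).split(".")
    return interpreter_tag(version_info) in tags


# Illustrative filename: a wheel built against CPython 3.10.
name = "numpy-2.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
print(wheel_matches(name, (3, 10)), wheel_matches(name, (3, 12)))
```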

GPU images are 3.12 only because the PyTorch base image ships a single Python version. CPU images use python:X.Y-slim bases which are version-parameterizable.

Companion PR: runpod/flash#261 (SDK-side version detection and image selection)

Test plan

make build-all-versioned builds all 8 images locally
make smoketest-versioned verifies correct Python version in each image
Existing make test passes unchanged
Handler error-masking fix verified: deployed mode raises on handler load failure

Parameterize all 4 Dockerfiles with PYTHON_VERSION build arg. GPU images also accept PYTORCH_BASE to select the correct PyTorch base image per Python version.

Add versioned Makefile targets:
- build-all-versioned: builds 10 images (GPU 3.11/3.12, CPU 3.10-3.12)
- build-wip-versioned: multi-platform push with latest alias
- smoketest-versioned: verify Python version in each image

GPU base image mapping:
- Python 3.11: pytorch/pytorch:2.9.1-cuda12.8-cudnn9-runtime
- Python 3.12: pytorch/pytorch:2.10.0-cuda12.8-cudnn9-runtime

Previously, _load_generated_handler() silently returned None on any failure (missing file, import error, syntax error), causing deployed endpoints to fall back to the FunctionRequest/Live Serverless handler. This masked real deployment issues like Python version mismatches. Now deployed mode (FLASH_RESOURCE_NAME set) treats handler loading failures as fatal RuntimeError. Live Serverless mode skips the generated handler entirely since it only uses FunctionRequest protocol.
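
The fail-fast behavior described above might look roughly like the sketch below. It is not the code in `src/handler.py`: the function name, the `handler_dir` parameter, and the `handler_<resource>.py` file naming (inferred from the `handler_gpu_config.py` test fixture) are assumptions, and the real function's `/app` path check is omitted.

```python
import importlib.util
import os
from pathlib import Path


def load_generated_handler(handler_dir: Path):
    """Return the generated `handler` callable, or None in Live Serverless mode.

    In deployed mode (FLASH_RESOURCE_NAME set) every failure raises
    RuntimeError instead of returning None.
    """
    resource_name = os.environ.get("FLASH_RESOURCE_NAME")
    if not resource_name:
        # Live Serverless mode: the generated handler is skipped entirely.
        return None

    # Assumed naming convention: handler_<resource name>.py
    handler_file = handler_dir / f"handler_{resource_name}.py"
    if not handler_file.is_file():
        raise RuntimeError(f"Generated handler not found: {handler_file}")

    try:
        spec = importlib.util.spec_from_file_location(handler_file.stem, handler_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    except Exception as exc:  # import errors, syntax errors, etc.
        raise RuntimeError(f"Failed to load {handler_file}: {exc}") from exc

    handler = getattr(module, "handler", None)
    if not callable(handler):
        raise RuntimeError(f"{handler_file} does not define a callable 'handler'")
    return handler
```

Chaining with `from exc` keeps the original traceback, so a Python version mismatch surfaces as the root cause rather than being hidden behind a fallback.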

Tests expected None returns but handler.py now raises RuntimeError in deployed mode. Updated all 8 TestLoadGeneratedHandler tests to use pytest.raises(RuntimeError). Also synced uv.lock to pick up latest runpod-flash version.

Copilot

Pull request overview

Adds a Python-versioned Docker image matrix (CPU/GPU, QB/LB) to prevent runtime Python mismatches with binary deps, and makes deployed endpoints fail fast when the generated handler can’t be loaded (instead of silently falling back to the Live Serverless protocol).

Changes:

Parameterizes Docker builds via PYTHON_VERSION / PYTORCH_BASE build args and introduces versioned image tags (e.g., :py3.11-...).
Adds Makefile targets to build/push/smoketest the multi-version image matrix.
Updates generated-handler loading behavior to raise in deployed mode, with corresponding unit test updates.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `Makefile` | Adds Python version matrix variables and versioned build/push/smoketest targets. |
| `Dockerfile` | Switches GPU base image selection to `PYTORCH_BASE` arg (and adds `PYTHON_VERSION` arg). |
| `Dockerfile-lb` | Same as GPU QB Dockerfile, but for LB image. |
| `Dockerfile-cpu` | Parameterizes CPU base image via `python:${PYTHON_VERSION}-slim`. |
| `Dockerfile-lb-cpu` | Parameterizes CPU LB base image via `python:${PYTHON_VERSION}-slim`. |
| `src/handler.py` | Makes deployed-mode handler loading failures fatal; keeps Live Serverless path separate. |
| `tests/unit/test_handler.py` | Updates `_load_generated_handler` tests to expect `RuntimeError` instead of `None`. |
| `uv.lock` | Updates lockfile with multiple dependency version bumps. |


Makefile

Dockerfile

Dockerfile-lb

Makefile

Include platform.python_version() in the worker boot banner for runtime version visibility during E2E testing.
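
A banner of this kind could be as small as the sketch below. The wording of the banner and the `boot_banner` name are hypothetical; only the use of `platform.python_version()` comes from the commit message.

```python
import platform


def boot_banner(worker_name: str = "flash-worker") -> str:
    """Build a one-line startup banner that includes the runtime Python version."""
    return f"{worker_name} starting (Python {platform.python_version()})"


print(boot_banner())
```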

- Add build-time Python version validation to GPU Dockerfiles
- Restructure build-all-versioned to run setup once via internal targets
- Add version assertion to smoketest-versioned (fail on mismatch)
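
Both the build-time validation and the smoketest assertion reduce to comparing the interpreter's major.minor version with an expected value and exiting non-zero on mismatch. A minimal sketch of that check, assuming a hypothetical helper name (the actual Dockerfile and Makefile commands are not shown on this page):

```python
import sys


def assert_python_version(expected: str, version_info=sys.version_info) -> None:
    """Exit with an error if the major.minor version differs from `expected`."""
    actual = f"{version_info[0]}.{version_info[1]}"
    if actual != expected:
        # SystemExit with a string prints it to stderr and exits with status 1,
        # which is enough to fail a `docker build` RUN step or a make target.
        raise SystemExit(f"Python version mismatch: expected {expected}, found {actual}")


# Example: inside a GPU image one would call assert_python_version("3.12").
```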

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.


Makefile

Dockerfile-lb

tests/unit/test_handler.py

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

tests/unit/test_handler.py:179

test_loads_generated_handler_from_file patches handler.Path to return a tmp_path file, but _load_generated_handler() enforces handler_file.resolve().is_relative_to(Path('/app').resolve()). Since tmp_path is not under /app, this test will raise the "resolves outside /app" RuntimeError instead of returning a loaded handler. Adjust the test to bypass the /app sandbox check (e.g., mock resolve()/is_relative_to or patch the /app root used by the function) or refactor _load_generated_handler to use a patchable constant for the app root.

    def test_loads_generated_handler_from_file(self, tmp_path):
        """With valid generated handler file, loads and returns handler function."""
        handler_file = tmp_path / "handler_gpu_config.py"
        handler_file.write_text(
            "async def handler(event):\n"
            "    return {'result': event.get('input', {}).get('prompt', 'default')}\n"
        )

        with patch.dict("os.environ", {"FLASH_RESOURCE_NAME": "gpu_config"}):
            with patch("handler.Path", return_value=handler_file):
                result = _load_generated_handler()
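
One of the options suggested in the comment, a patchable app root, could be sketched as follows. `APP_ROOT` and `ensure_inside_app_root` are hypothetical names, not identifiers from `handler.py`. The check itself mirrors the `resolve().is_relative_to(...)` expression quoted above.

```python
from pathlib import Path
from typing import Optional

# Hypothetical module-level constant; a test can override it by argument.
APP_ROOT = Path("/app")


def ensure_inside_app_root(handler_file: Path, app_root: Optional[Path] = None) -> Path:
    """Return the resolved path, or raise if it falls outside the app root."""
    root = (app_root or APP_ROOT).resolve()
    resolved = handler_file.resolve()
    if not resolved.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise RuntimeError(f"{resolved} resolves outside {root}")
    return resolved
```

With this shape, a test can pass `tmp_path` as `app_root` and keep the sandbox check active instead of mocking `resolve()`.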


tests/unit/test_handler.py

- Add default case to pytorch_base() shell function in all Makefile targets
- Guard test_handler.py import against FLASH_RESOURCE_NAME env var
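
The import guard presumably exists because a `FLASH_RESOURCE_NAME` left in the developer's shell would put the module into deployed mode while the test file is still importing. How the PR implements the guard is not shown here. One possible approach, with a hypothetical helper name, is to hide the variable for the duration of the import:

```python
import os
from unittest.mock import patch


def run_without_resource_name(fn):
    """Call `fn` with FLASH_RESOURCE_NAME removed, then restore the environment."""
    cleaned = {k: v for k, v in os.environ.items() if k != "FLASH_RESOURCE_NAME"}
    # clear=True empties os.environ before applying `cleaned`;
    # patch.dict restores the original contents on exit.
    with patch.dict(os.environ, cleaned, clear=True):
        return fn()


# Hypothetical usage in a test module:
# handler = run_without_resource_name(lambda: importlib.import_module("handler"))
```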

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated no new comments.


…ng running interpreter

GPU Dockerfiles symlinked /usr/local/bin/python to /usr/bin/python3.X, switching to the system Python, which lacks torch and other base image packages. Removed the symlink to preserve the pytorch base image's environment. Changed dependency_installer to use sys.executable instead of --system so runtime package installs go into the same site-packages as torch.
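
Targeting the running interpreter means building the install command around `sys.executable`, so packages land in the same site-packages the worker imports from. The sketch below shows the two command shapes mentioned in this PR (`uv pip` with `--python`, and a `python -m pip` fallback). The function names and the `shutil.which` selection logic are assumptions, not the contents of `dependency_installer.py`.

```python
import shutil
import subprocess
import sys


def uv_install_command(packages):
    """uv variant: --python points uv at the running interpreter's environment."""
    return ["uv", "pip", "install", "--python", sys.executable, *packages]


def pip_fallback_command(packages):
    """pip variant: invoke pip as a module of the running interpreter."""
    return [sys.executable, "-m", "pip", "install", *packages]


def install(packages):
    """Install with uv when it is on PATH, otherwise fall back to pip."""
    if shutil.which("uv"):
        cmd = uv_install_command(packages)
    else:
        cmd = pip_fallback_command(packages)
    subprocess.run(cmd, check=True)
```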

- CI matrix builds py3.10/3.11/3.12 for all image types
- Removed per-version pytorch base mapping (single runpod/pytorch image)
- GPU/LB builds amd64-only (pytorch base has no arm64 manifest)
- Added --break-system-packages to CPU Dockerfiles
- Auto-detect local Python version in Makefile for build-wip
- Bumped runpod-flash dependency to >=1.7.0
- Removed .python-version (version comes from build args)

Copilot

Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 8 comments.


Makefile

.github/workflows/ci.yml

Dockerfile

Dockerfile-lb

Makefile

GPU workers are pinned to Python 3.12 (torch/CUDA only installed for 3.12 in base image). CPU workers support 3.10-3.12. Added base image Python layout details to architecture doc.

- GPU Dockerfiles: remove PYTHON_VERSION ARG (base image is 3.12), add numpy install (excluded from tarballs by flash build)
- CPU Dockerfiles: default PYTHON_VERSION from 3.11 to 3.12
- Makefile: GPU_PYTHON_VERSIONS reduced to 3.12 only, remove --build-arg PYTHON_VERSION from GPU/LB build targets

PYTHON_VERSION was hardcoded to 3.12 default, ignoring the user's local Python. Restore auto-detection so `make build-wip` with Python 3.10 produces py3.10-wip CPU images as expected. GPU images remain fixed at 3.12.
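
The Makefile does this detection in shell, but the rule is easy to state in Python: CPU tags follow the local interpreter and GPU tags stay at 3.12. The sketch below is an illustration only. The `py3.10-wip` tag form comes from the commit message, applying the same form to GPU images is an assumption, and it does not handle local versions outside 3.10-3.12.

```python
import sys

GPU_PYTHON_VERSION = "3.12"  # GPU images are pinned to the base image's Python


def local_python_version(version_info=sys.version_info) -> str:
    """Major.minor of the given interpreter version, e.g. '3.10'."""
    return f"{version_info[0]}.{version_info[1]}"


def wip_tag(gpu: bool, version_info=sys.version_info) -> str:
    """Image tag for a work-in-progress build."""
    version = GPU_PYTHON_VERSION if gpu else local_python_version(version_info)
    return f"py{version}-wip"


print(wip_tag(gpu=False), wip_tag(gpu=True))
```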

Copilot

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 9 comments.


Makefile

src/dependency_installer.py

docs/System_Python_Runtime_Architecture.md

Dockerfile

.github/workflows/ci.yml

Dockerfile

Dockerfile-lb

README.md

…afety

- Fix CI matrix duplication: use include-only for GPU jobs (3.12 only)
- Remove duplicate python-version list from CPU job matrices
- Add Python version validation to GPU Dockerfiles
- Use python -m pip instead of bare pip in Dockerfiles
- Use sys.executable for pip fallback in dependency_installer.py
- Update docs to match uv pip --python sys.executable implementation
- Correct image count comment in Makefile (8 not 10)

deanq added 3 commits March 6, 2026 12:30

fix: update handler tests for RuntimeError + sync uv.lock

0c56199


deanq requested a review from Copilot March 6, 2026 21:32

Copilot started reviewing on behalf of deanq March 6, 2026 21:33

Copilot AI reviewed Mar 6, 2026

feat: add Python version to startup banner

f917b44


deanq requested a review from Copilot March 6, 2026 21:52

Copilot started reviewing on behalf of deanq March 6, 2026 21:53

fix(review): address PR #76 feedback

851c66a


Copilot AI reviewed Mar 6, 2026


deanq requested a review from Copilot March 6, 2026 22:00

Copilot started reviewing on behalf of deanq March 6, 2026 22:00

Copilot AI reviewed Mar 6, 2026

fix(review): address PR #76 round 2 feedback

37a90c0


deanq requested a review from Copilot March 6, 2026 23:22

Copilot started reviewing on behalf of deanq March 6, 2026 23:23

Copilot AI reviewed Mar 6, 2026

deanq added 3 commits March 7, 2026 10:49

fix: format dependency_installer.py

1dd5723

deanq requested a review from Copilot March 7, 2026 19:01

Copilot started reviewing on behalf of deanq March 7, 2026 19:01

Copilot AI reviewed Mar 7, 2026

deanq added 4 commits March 8, 2026 03:45

docs: document Python version constraints and base image layout

4b5e839


chore: update lockfile

75ead82

fix: auto-detect local Python version for CPU image builds

313aa4a


deanq requested a review from Copilot March 9, 2026 07:49

Copilot started reviewing on behalf of deanq March 9, 2026 07:50

Copilot AI reviewed Mar 9, 2026

runpod-Henrik mentioned this pull request Mar 9, 2026

feat: Python version-aware Docker image selection runpod/flash#261

Merged

5 tasks

jhcipar approved these changes Mar 9, 2026


chore: bump runpod-flash dependency to 1.8.0

d5de8a8

deanq merged commit a9845c6 into main Mar 10, 2026
11 checks passed

deanq deleted the deanq/ae-2391-python-versions-for-workers branch March 10, 2026 00:24

runpod-workers-release-please-bot bot mentioned this pull request Mar 10, 2026

chore(main): release 1.2.0 #74

Merged


3 participants
