fix: allow explicit CuDNN with CUDA minor versions #3070

Open
sarptandoven wants to merge 2 commits into replicate:main from sarptandoven:fix/cuda-minor-explicit-cudnn

@sarptandoven

Summary

  • Fix CUDA/CuDNN compatibility validation to treat user-specified CUDA minor versions like 11.6 as selectors for compatible patch-level base images such as 11.6.2.
  • Add a regression test that completes a GPU config with explicit cuda: "11.6" and cudnn: "8", then verifies the selected NVIDIA base image.

Why

latestCuDNNForCUDA and cudaBaseImageFor already use version.Matches, so cuda: "11.6" can resolve to nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04. The validation path used an exact string comparison, which meant the same config failed early when cudnn was set explicitly.
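
The mismatch described above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the code in this PR: `versionMatches` is a hypothetical stand-in for `version.Matches` (its real semantics may differ), and `compatEntry` and `isCompatible` are invented names for a simplified compatibility table.

```go
package main

import (
	"fmt"
	"strings"
)

// versionMatches is a hypothetical stand-in for the project's version.Matches.
// It reports whether candidate equals selector or extends it with further
// dot-separated components, so "11.6" matches "11.6.2" but not "11.60".
func versionMatches(selector, candidate string) bool {
	return candidate == selector || strings.HasPrefix(candidate, selector+".")
}

// compatEntry is an illustrative compatibility-table row, not the real type.
type compatEntry struct {
	CUDA  string
	CuDNN string
}

// isCompatible reports whether any table row pairs the requested CuDNN
// version with a CUDA version accepted by the supplied match function.
func isCompatible(table []compatEntry, cuda, cudnn string, match func(sel, cand string) bool) bool {
	for _, e := range table {
		if e.CuDNN == cudnn && match(cuda, e.CUDA) {
			return true
		}
	}
	return false
}

func main() {
	table := []compatEntry{{CUDA: "11.6.2", CuDNN: "8"}}

	// Exact string comparison: the behavior the validation path had before.
	exact := func(sel, cand string) bool { return sel == cand }

	fmt.Println(isCompatible(table, "11.6", "8", exact))          // false
	fmt.Println(isCompatible(table, "11.6", "8", versionMatches)) // true
}
```

With exact comparison, `cuda: "11.6"` never equals the table's `11.6.2`, so validation rejects a config that the base-image lookup would have resolved. Using the same matcher in both places keeps the two paths consistent.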

Test plan

  • git diff --check
  • go test ./pkg/config -run 'TestCUDABaseImageTag|TestLatestCuDNNForCUDA|TestValidateAndCompleteCUDA' -count=1
  • go test ./pkg/config -count=1

@sarptandoven sarptandoven requested a review from a team as a code owner June 22, 2026 10:10

@anish-sahoo anish-sahoo left a comment

Collaborator

lgtm

Comment thread pkg/config/config_test.go
@markphelps markphelps enabled auto-merge June 23, 2026 20:02
@markphelps markphelps disabled auto-merge June 23, 2026 20:27
3 participants