Build Automodel compiled dependencies in CI image #15737

Open
pzelasko wants to merge 30 commits into main from codex/automodel-compiled-deps

@pzelasko
Collaborator

Summary

  • Add a compiled-deps wheel stage to docker/Dockerfile.ci for NeMo Automodel packages.
  • Build and install TransformerEngine, DeepEP V1, nv-grouped-gemm, causal-conv1d, mamba-ssm, and flash-attn into CI Dockerfile images.
  • Add GPU_TARGET with default h100plus and a100 option; A100 applies a DeepEP V1 patch to disable NVSHMEM.
  • Keep CI resource settings and Docker runtime resource limits unchanged.
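
The GPU_TARGET switch in the summary can be pictured as a small piece of shell logic of the kind that might sit in a Dockerfile `RUN` heredoc. This is only a sketch under stated assumptions, not the PR's code: the function name `needs_deepep_nvshmem_patch` and the `apply_patch` variable are hypothetical, and the actual patch and build commands are omitted because the summary does not show them.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map GPU_TARGET to the "apply DeepEP V1 patch?" decision.
set -euo pipefail

# Prints "yes" when the NVSHMEM-disabling patch should be applied, "no" otherwise.
# Unknown targets return non-zero so a typo fails the build early.
needs_deepep_nvshmem_patch() {
    local target="${1:-h100plus}"   # default mirrors the PR summary
    case "$target" in
        h100plus) echo "no" ;;
        a100)     echo "yes" ;;
        *)
            echo "unsupported GPU_TARGET: $target" >&2
            return 1
            ;;
    esac
}

GPU_TARGET="${GPU_TARGET:-h100plus}"
# Plain assignment (not an inline substitution) so set -e sees a failure.
apply_patch="$(needs_deepep_nvshmem_patch "$GPU_TARGET")"
echo "GPU_TARGET=${GPU_TARGET} apply_patch=${apply_patch}"
```

In a Dockerfile, the value would typically arrive through an `ARG GPU_TARGET=h100plus` declaration and be overridden with `--build-arg GPU_TARGET=a100`.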

Validation

  • bash -n over Dockerfile heredoc snippets
  • git diff --check
  • docker buildx build --call=check -f docker/Dockerfile.ci .
  • Local constrained builds/import smoke tests were run before the final review cleanup for both GPU_TARGET=h100plus and GPU_TARGET=a100.
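
For readers unfamiliar with the first check: `bash -n` parses a script without executing it, so it catches syntax errors in heredoc bodies cheaply. A minimal sketch follows, assuming the snippet has already been extracted from the Dockerfile. The helper `check_snippet` and both sample snippets are hypothetical, and the extraction step itself is not shown.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Syntax-check a shell snippet passed as a string; nothing in it is executed.
# Returns bash's exit status: 0 if the snippet parses, non-zero otherwise.
check_snippet() {
    local snippet="$1"
    local tmp
    local status=0
    tmp="$(mktemp)"
    printf '%s\n' "$snippet" > "$tmp"
    bash -n "$tmp" 2>/dev/null || status=$?
    rm -f "$tmp"
    return "$status"
}

# Single quotes keep the snippets literal; they are parsed, never expanded here.
good='if [ -n "${GPU_TARGET:-}" ]; then echo "target set"; fi'
bad='if [ -n "${GPU_TARGET:-}" ]; then echo "missing fi"'

check_snippet "$good" && echo "good snippet: syntax ok"
check_snippet "$bad"  || echo "bad snippet: syntax error detected"
```

A syntax-only pass says nothing about whether the commands succeed at build time, which is why the local builds and import smoke tests listed above remain necessary.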

chtruong814 and others added 28 commits May 13, 2026 13:40
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
This reverts commit 8c5a48e.

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Use decoder vocab size when generating synthetic TDT transcript labels so duration outputs from the joint are not sampled as labels.

Move CUDA graph compile exception types into cuda_python_utils per review feedback.

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
@pzelasko pzelasko requested a review from a team as a code owner May 29, 2026 16:57
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Base automatically changed from chtruong/uv-lock to main May 29, 2026 19:35
…led-deps

# Conflicts:
#	docker/Dockerfile.ci
@github-actions github-actions Bot added the ASR label May 29, 2026
@pzelasko
Collaborator Author

/ok to test 14c7818

@github-actions
Contributor

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.


2 participants