MGMT-22635: add CaaS CI workflow and e2e-caas jobs for OSAC #79512
omer-vishlitzky wants to merge 4 commits into
@omer-vishlitzky: This pull request references MGMT-22635 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID:
📒 Files selected for processing (2)
Walkthrough
Adds a new `osac-project-cluster-tool-caas` CI workflow, adds a caas-agents provisioning step and its scripts, parameterizes the boot and test steps for CaaS, extends gather to collect CaaS diagnostics, and wires e2e-caas tests into service pipelines and schedules.
Changes
CaaS Workflow and Infrastructure
Sequence Diagram
sequenceDiagram
participant CI as CI System
participant Workflow as osac-project-cluster-tool-caas
participant AgentStep as caas-agents step
participant Boot as cluster-tool boot step
participant Test as cluster-tool test step
participant Gather as gather step
CI->>Workflow: trigger `osac-project-cluster-tool-caas`
Workflow->>AgentStep: run caas-agents ref
AgentStep->>Boot: prepare cluster/tool inputs (flavor, template)
Boot->>Test: boot cluster, pass E2E_NAMESPACE/FLAVOR/TEMPLATE
Test->>Gather: run tests from E2E_TEST_DIR and upload artifacts
Gather->>CI: collect CaaS/HyperShift diagnostics
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 11 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh (1)
84-86: ⚡ Quick win
Capture YAML for NodePools/Agents/InfraEnvs as well.
These three resources are currently captured only with `-o wide`; adding YAML snapshots would make post-failure triage much easier and keep this section consistent with the other CaaS resources.
Proposed patch
 oc get nodepools -A -o wide > "${ARTIFACT_DIR}/caas/nodepools.txt" 2>&1 || true
+oc get nodepools -A -o yaml > "${ARTIFACT_DIR}/caas/nodepools.yaml" 2>&1 || true
 oc get agents -A -o wide > "${ARTIFACT_DIR}/caas/agents.txt" 2>&1 || true
+oc get agents -A -o yaml > "${ARTIFACT_DIR}/caas/agents.yaml" 2>&1 || true
 oc get infraenvs -A -o wide > "${ARTIFACT_DIR}/caas/infraenvs.txt" 2>&1 || true
+oc get infraenvs -A -o yaml > "${ARTIFACT_DIR}/caas/infraenvs.yaml" 2>&1 || true
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh` around lines 84 - 86, Add YAML snapshots for the three CaaS resources in addition to the existing -o wide outputs by running `oc get nodepools -A -o yaml`, `oc get agents -A -o yaml`, and `oc get infraenvs -A -o yaml` and writing them to "${ARTIFACT_DIR}/caas/nodepools.yaml", "${ARTIFACT_DIR}/caas/agents.yaml", and "${ARTIFACT_DIR}/caas/infraenvs.yaml" respectively (preserve the existing -o wide > ... .txt lines and the "2>&1 || true" behavior); update the block containing the existing `oc get ... -o wide` commands so each resource also has a corresponding `-o yaml` capture using the same ARTIFACT_DIR and error-tolerant redirection.
ci-operator/step-registry/osac-project/cluster-tool/caas/osac-project-cluster-tool-caas-workflow.yaml (1)
21-21: ⚡ Quick win
Use a team-owned immutable flavor image reference.
Line 21 points to a personal namespace with a tag. This can make CI reproducibility and ownership brittle; prefer an org-owned image and digest pinning.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/osac-project/cluster-tool/caas/osac-project-cluster-tool-caas-workflow.yaml` at line 21, The CLUSTER_TOOL_FLAVOR_IMAGE value currently points to a personal namespace and a mutable tag ("quay.io/rh-ee-ovishlit/cluster-flavors:osac-caas"); change the value of CLUSTER_TOOL_FLAVOR_IMAGE to use a team/org-owned repository and pin to an immutable digest (e.g., quay.io/<team-or-org>/cluster-flavors@sha256:<digest>) so CI uses a stable, owned image; update any related image publishing workflow to publish the digested image into the team repo if needed.
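A minimal sketch of how the digest-pinning suggestion above could be checked in a script. The function names `is_digest_pinned` and `check_flavor_image` are hypothetical and not part of this PR; the check only inspects the shape of the reference string and does not contact a registry.

```shell
#!/bin/bash
# Sketch: report whether an image reference is pinned by digest.
# Both function names are hypothetical, introduced for illustration only.

is_digest_pinned() {
  local image="$1"
  # A digest-pinned reference ends in @sha256:<64 lowercase hex characters>.
  [[ "${image}" =~ @sha256:[0-9a-f]{64}$ ]]
}

check_flavor_image() {
  local image="$1"
  if is_digest_pinned "${image}"; then
    echo "pinned"
  else
    echo "mutable"
  fi
}
```

Calling `check_flavor_image "${CLUSTER_TOOL_FLAVOR_IMAGE}"` early in a step would surface a tag-based reference such as `quay.io/rh-ee-ovishlit/cluster-flavors:osac-caas` as `mutable`.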
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-commands.sh`:
- Line 67: The script prints a sensitive discovery ISO URL via the echo "ISO
URL: ${ISO_URL}" statement; remove this direct printing and either delete the
echo or replace it with a non-sensitive check (e.g., log that ISO_URL is
set/empty or print a redacted placeholder/length) in the same script section so
that the ISO_URL variable is not emitted to CI logs; locate and update the echo
line referencing ISO_URL in osac-project-cluster-tool-caas-agents-commands.sh to
implement this change.
In
`@ci-operator/step-registry/osac-project/cluster-tool/test/osac-project-cluster-tool-test-commands.sh`:
- Around line 49-50: Validate TEST_DIR explicitly instead of using loose
substring matching: check TEST_DIR is non-empty and equals the expected value or
matches a strict pattern (e.g., "^tests/caas(/.*)?$") before the KubeVirt
availability branch and before invoking pytest, and replace the unquoted usage
with a quoted expansion when passing to pytest (use "${TEST_DIR}"). Update the
if condition that currently uses [[ "${TEST_DIR}" != *"caas"* ]] to perform the
explicit validation and ensure later references to TEST_DIR (the pytest
invocation) are quoted to avoid word-splitting.
---
Nitpick comments:
In
`@ci-operator/step-registry/osac-project/cluster-tool/caas/osac-project-cluster-tool-caas-workflow.yaml`:
- Line 21: The CLUSTER_TOOL_FLAVOR_IMAGE value currently points to a personal
namespace and a mutable tag
("quay.io/rh-ee-ovishlit/cluster-flavors:osac-caas"); change the value of
CLUSTER_TOOL_FLAVOR_IMAGE to use a team/org-owned repository and pin to an
immutable digest (e.g., quay.io/<team-or-org>/cluster-flavors@sha256:<digest>)
so CI uses a stable, owned image; update any related image publishing workflow
to publish the digested image into the team repo if needed.
In
`@ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh`:
- Around line 84-86: Add YAML snapshots for the three CaaS resources in addition
to the existing -o wide outputs by running `oc get nodepools -A -o yaml`, `oc
get agents -A -o yaml`, and `oc get infraenvs -A -o yaml` and writing them to
"${ARTIFACT_DIR}/caas/nodepools.yaml", "${ARTIFACT_DIR}/caas/agents.yaml", and
"${ARTIFACT_DIR}/caas/infraenvs.yaml" respectively (preserve the existing -o
wide > ... .txt lines and the "2>&1 || true" behavior); update the block
containing the existing `oc get ... -o wide` commands so each resource also has
a corresponding `-o yaml` capture using the same ARTIFACT_DIR and error-tolerant
redirection.
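The paired `-o wide` and `-o yaml` captures requested for the gather step could also be written as a loop, so that each resource added later gets both outputs. This is a sketch under assumptions: `gather_caas_resources` is a hypothetical helper, not code from the PR, and it keeps the error-tolerant `2>&1 || true` behavior of the existing lines.

```shell
#!/bin/bash
# Sketch: capture both a wide listing and a YAML dump for each resource.
# gather_caas_resources is a hypothetical helper, not part of the PR.

gather_caas_resources() {
  local out_dir="$1"
  shift
  mkdir -p "${out_dir}"
  local resource
  for resource in "$@"; do
    # Each capture tolerates failure so the gather step never aborts.
    oc get "${resource}" -A -o wide > "${out_dir}/${resource}.txt" 2>&1 || true
    oc get "${resource}" -A -o yaml > "${out_dir}/${resource}.yaml" 2>&1 || true
  done
}

# Usage in a gather script:
#   gather_caas_resources "${ARTIFACT_DIR}/caas" nodepools agents infraenvs
```

The explicit per-resource lines in the proposed patch are equally valid; the loop only reduces the chance that a new resource is added with one output format and not the other.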
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 8e07b33e-6abb-494a-b1cc-229a609e0104
📒 Files selected for processing (17)
ci-operator/config/osac-project/fulfillment-service/osac-project-fulfillment-service-main.yaml
ci-operator/config/osac-project/osac-aap/osac-project-osac-aap-main.yaml
ci-operator/config/osac-project/osac-installer/osac-project-osac-installer-main.yaml
ci-operator/config/osac-project/osac-operator/osac-project-osac-operator-main.yaml
ci-operator/config/osac-project/osac-test-infra/osac-project-osac-test-infra-main.yaml
ci-operator/step-registry/osac-project/cluster-tool/boot/osac-project-cluster-tool-boot-commands.sh
ci-operator/step-registry/osac-project/cluster-tool/boot/osac-project-cluster-tool-boot-ref.yaml
ci-operator/step-registry/osac-project/cluster-tool/caas-agents/OWNERS
ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-commands.sh
ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-ref.metadata.json
ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-ref.yaml
ci-operator/step-registry/osac-project/cluster-tool/caas/OWNERS
ci-operator/step-registry/osac-project/cluster-tool/caas/osac-project-cluster-tool-caas-workflow.metadata.json
ci-operator/step-registry/osac-project/cluster-tool/caas/osac-project-cluster-tool-caas-workflow.yaml
ci-operator/step-registry/osac-project/cluster-tool/test/osac-project-cluster-tool-test-commands.sh
ci-operator/step-registry/osac-project/cluster-tool/test/osac-project-cluster-tool-test-ref.yaml
ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh
  sleep 5
done
[[ -z "${ISO_URL}" ]] && { echo "Timed out waiting for ISO URL"; exit 1; }
echo "ISO URL: ${ISO_URL}"
Avoid logging the discovery ISO URL.
Line 67 prints a URL derived from cluster state; this can expose sensitive cluster URL/token material in CI logs.
Proposed fix
-echo "ISO URL: ${ISO_URL}"
+echo "Discovery ISO URL acquired."
As per coding guidelines, "Protect sensitive information in step registry scripts - never echo or print passwords, tokens, API keys, cluster URLs, or kubeconfig contents".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-echo "ISO URL: ${ISO_URL}"
+echo "Discovery ISO URL acquired."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-commands.sh`
at line 67, The script prints a sensitive discovery ISO URL via the echo "ISO
URL: ${ISO_URL}" statement; remove this direct printing and either delete the
echo or replace it with a non-sensitive check (e.g., log that ISO_URL is
set/empty or print a redacted placeholder/length) in the same script section so
that the ISO_URL variable is not emitted to CI logs; locate and update the echo
line referencing ISO_URL in osac-project-cluster-tool-caas-agents-commands.sh to
implement this change.
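As an illustration of the "set/empty or length" alternative mentioned in the prompt above, a step could log a description of the value instead of the value itself. This is a sketch; `describe_value` is a hypothetical helper name and does not exist in the PR.

```shell
#!/bin/bash
# Sketch: describe a sensitive variable without printing its contents.
# describe_value is a hypothetical helper, introduced for illustration.

describe_value() {
  local name="$1"
  local value="$2"
  if [[ -z "${value}" ]]; then
    echo "${name} is empty"
  else
    # Only the length is logged; the value never reaches the CI log.
    echo "${name} is set (${#value} characters)"
  fi
}

# Usage after the wait loop:
#   describe_value ISO_URL "${ISO_URL}"
```

A fixed message such as the one in the proposed fix is the simplest option; logging the length can help distinguish a truncated value from a complete one during triage.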
if [[ "${TEST_DIR}" != *"caas"* ]]; then
  echo "Waiting for KubeVirt to be Available..."
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n 'E2E_TEST_DIR:' ci-operator/config/osac-project/**/osac-project-*-main.yaml
Repository: openshift/release
Length of output: 305
🏁 Script executed:
# View the full script to understand context around lines 49-83
cat -n ci-operator/step-registry/osac-project/cluster-tool/test/osac-project-cluster-tool-test-commands.sh | sed -n '1,90p'
Repository: openshift/release
Length of output: 3739
🏁 Script executed:
# Search for other references to E2E_TEST_DIR in the repository to see if it's sourced elsewhere
rg -n 'E2E_TEST_DIR|TEST_DIR=' --type yaml --type sh ci-operator/ | head -40
Repository: openshift/release
Length of output: 2558
Harden TEST_DIR handling with explicit validation.
Line 49 uses substring matching and line 83 passes ${TEST_DIR} unquoted to pytest. While current configuration strictly sets E2E_TEST_DIR: tests/caas, explicit validation makes the intent clearer and prevents future misconfiguration.
🔧 Proposed fix
 TEST_DIR="$7"
+
+case "${TEST_DIR}" in
+  tests/caas) ;;
+  *)
+    echo "ERROR: unsupported E2E_TEST_DIR: ${TEST_DIR}"
+    exit 1
+    ;;
+esac
@@
-if [[ "${TEST_DIR}" != *"caas"* ]]; then
+if [[ "${TEST_DIR}" != "tests/caas" ]]; then
@@
-  pytest ${TEST_DIR}/ -v --junitxml=/tmp/test-results/junit_e2e.xml
+  pytest -- "${TEST_DIR%/}/" -v --junitxml=/tmp/test-results/junit_e2e.xml
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/osac-project/cluster-tool/test/osac-project-cluster-tool-test-commands.sh`
around lines 49 - 50, Validate TEST_DIR explicitly instead of using loose
substring matching: check TEST_DIR is non-empty and equals the expected value or
matches a strict pattern (e.g., "^tests/caas(/.*)?$") before the KubeVirt
availability branch and before invoking pytest, and replace the unquoted usage
with a quoted expansion when passing to pytest (use "${TEST_DIR}"). Update the
if condition that currently uses [[ "${TEST_DIR}" != *"caas"* ]] to perform the
explicit validation and ensure later references to TEST_DIR (the pytest
invocation) are quoted to avoid word-splitting.
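A sketch of the strict-pattern variant described in the prompt above. Because the test step is shared between VMaaS and CaaS jobs, a check that accepts only `tests/caas` would likely reject VMaaS runs, so this version uses an allow-list. The path `tests/vmaas` is a hypothetical placeholder, since the VMaaS test directory is not shown in this conversation, and both function names are illustrative.

```shell
#!/bin/bash
# Sketch: validate a test directory against an allow-list pattern.
# "tests/vmaas" is a hypothetical placeholder for the VMaaS suite path.

validate_test_dir() {
  local dir="${1%/}"   # drop a single trailing slash, if present
  # Allowed roots, optionally followed by plain subdirectory names.
  local pattern='^tests/(caas|vmaas)(/[A-Za-z0-9_-]+)*$'
  [[ -n "${dir}" && "${dir}" =~ ${pattern} ]]
}

is_caas_suite() {
  local dir="${1%/}"
  # Exact match or a subdirectory of tests/caas; no loose substring match.
  [[ "${dir}" == "tests/caas" || "${dir}" == tests/caas/* ]]
}

# Usage in the test step:
#   validate_test_dir "${TEST_DIR}" || { echo "ERROR: unsupported E2E_TEST_DIR"; exit 1; }
#   is_caas_suite "${TEST_DIR}" || echo "Waiting for KubeVirt to be Available..."
```

Keeping validation and suite detection in separate functions lets the KubeVirt wait stay conditional without relying on a `*"caas"*` substring test.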
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh (1)
90-90: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Change `ipaddresspool` to `ipaddresspools` on line 90.
The MetalLB resource name is `ipaddresspools` (plural), not `ipaddresspool` (singular). The current command fails silently due to `|| true` and does not collect MetalLB diagnostics. Use `oc get ipaddresspools -A -o yaml > "${ARTIFACT_DIR}/caas/metallb-pools.yaml" 2>&1 || true` instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh` at line 90, Replace the incorrect MetalLB resource name in the gather command: change the oc invocation that currently uses "ipaddresspool" to the plural "ipaddresspools" so the command becomes oc get ipaddresspools -A -o yaml > "${ARTIFACT_DIR}/caas/metallb-pools.yaml" 2>&1 || true; update the line that writes to "${ARTIFACT_DIR}/caas/metallb-pools.yaml" to use the corrected resource name so MetalLB diagnostics are actually collected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In
`@ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh`:
- Line 90: Replace the incorrect MetalLB resource name in the gather command:
change the oc invocation that currently uses "ipaddresspool" to the plural
"ipaddresspools" so the command becomes oc get ipaddresspools -A -o yaml >
"${ARTIFACT_DIR}/caas/metallb-pools.yaml" 2>&1 || true; update the line that
writes to "${ARTIFACT_DIR}/caas/metallb-pools.yaml" to use the corrected
resource name so MetalLB diagnostics are actually collected.
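Independently of which resource name resolves on a given cluster, the `|| true` pattern hides every failure of a gather command. A minimal sketch of a wrapper that keeps the step non-fatal but records what failed is shown here; `gather_or_warn` and the `failed-commands.txt` file name are hypothetical and not part of the PR.

```shell
#!/bin/bash
# Sketch: run a diagnostic command, keep its output, and record failures
# instead of discarding them. gather_or_warn is a hypothetical helper.

gather_or_warn() {
  local out_file="$1"
  shift
  if ! "$@" > "${out_file}" 2>&1; then
    echo "WARNING: '$*' failed; see ${out_file}" >&2
    # Keep a list of failed commands next to the artifacts for triage.
    echo "$*" >> "$(dirname "${out_file}")/failed-commands.txt"
  fi
  return 0   # never fail the gather step itself
}

# Usage in a gather script:
#   gather_or_warn "${ARTIFACT_DIR}/caas/metallb-pools.yaml" oc get ipaddresspools -A -o yaml
```

With this shape, a wrong resource name or a missing CRD shows up in the artifacts as a listed failure rather than as a file that only contains an error message.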
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2e889a68-b836-46dd-99af-c48c28fe9d4a
📒 Files selected for processing (2)
ci-operator/step-registry/osac-project/cluster-tool/caas-agents/osac-project-cluster-tool-caas-agents-commands.sh
ci-operator/step-registry/osac-project/gather/osac-project-gather-commands.sh
4aa7f6f to fe0113f
/pj-rehearse
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-caas pull-ci-osac-project-osac-test-infra-main-e2e-caas pull-ci-osac-project-osac-aap-main-e2e-caas pull-ci-osac-project-fulfillment-service-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-osac-project-fulfillment-service-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-caas pull-ci-osac-project-osac-test-infra-main-e2e-caas pull-ci-osac-project-osac-aap-main-e2e-caas pull-ci-osac-project-fulfillment-service-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-caas pull-ci-osac-project-osac-test-infra-main-e2e-caas pull-ci-osac-project-osac-aap-main-e2e-caas pull-ci-osac-project-fulfillment-service-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/retest
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-caas pull-ci-osac-project-osac-test-infra-main-e2e-caas pull-ci-osac-project-osac-aap-main-e2e-caas pull-ci-osac-project-fulfillment-service-main-e2e-caas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@omer-vishlitzky: The following tests failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Parameterize cluster-tool boot and test steps to support both VMaaS and CaaS. Add CaaS agent setup step, CaaS workflow, e2e-caas presubmit jobs for all OSAC repos, and CaaS periodic jobs. Update gather step with CaaS/HyperShift diagnostics.
…ut, add nodepool yaml to gather
Remove hardcoded CLONE_NAME variables (vmaas-kustomize, ci-test) and use CLUSTER_TOOL_FLAVOR_NAME directly. This ensures the clone name matches the flavor: vmaas-kustomize for vmaas, caas-kustomize for caas.
Co-Authored-By: Claude Code <noreply@anthropic.com>
fe0113f to a68d221
/pj-rehearse pull-ci-osac-project-osac-installer-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-caas pull-ci-osac-project-osac-test-infra-main-e2e-caas pull-ci-osac-project-osac-aap-main-e2e-caas pull-ci-osac-project-fulfillment-service-main-e2e-caas pull-ci-osac-project-osac-operator-main-e2e-vmaas pull-ci-osac-project-fulfillment-service-main-e2e-vmaas pull-ci-osac-project-osac-test-infra-main-e2e-vmaas pull-ci-osac-project-osac-installer-main-e2e-vmaas pull-ci-osac-project-osac-aap-main-e2e-vmaas
@omer-vishlitzky: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
[REHEARSALNOTIFIER]
Interacting with pj-rehearse
Comment: Once you are satisfied with the results of the rehearsals, comment:
Summary
- `CLUSTER_TOOL_FLAVOR_NAME` and `E2E_CLUSTER_TEMPLATE` env vars, replace hardcoded flavor name
- `E2E_TEST_DIR` env var, make KubeVirt wait conditional
- (`osac-project-cluster-tool-caas-agents`): boots agent VM for HyperShift provisioning
- (`osac-project-cluster-tool-caas`): boot + agents + test + cleanup
- `e2e-caas` presubmit job to all 5 OSAC repos (fulfillment-service, osac-operator, osac-installer, osac-test-infra, osac-aap)
Test plan
Overview
This PR adds CaaS (Cluster-as-a-Service) support to the OSAC OpenShift CI infra by introducing a new parameterizable CaaS workflow and the steps required to boot, provision HyperShift agents, run CaaS e2e tests, collect CaaS diagnostics, and clean up agent VMs. Changes are additive and preserve existing VMaaS behavior via environment variable defaults.
Practical impact — which CI/infrastructure is affected
New workflow and steps
New workflow: osac-project-cluster-tool-caas
New step: caas-agents
Parameterization & compatibility
Boot step:
Test step:
These defaults maintain backward compatibility with existing e2e-vmaas jobs.
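The backward-compatibility claim above rests on every new parameter having a VMaaS default. In ci-operator such defaults are normally declared in the step's ref YAML; an equivalent in-script guard is sketched here under assumptions. The default `vmaas-kustomize` is inferred from the commit message in this PR, `tests/vmaas` is a hypothetical path, and `resolve_e2e_params` is an illustrative name.

```shell
#!/bin/bash
# Sketch: default the new parameters so existing VMaaS jobs need no changes.
# Default values are assumptions for illustration, not taken from the PR.

resolve_e2e_params() {
  # Flavor name doubles as the clone name, so no separate CLONE_NAME is kept.
  CLUSTER_TOOL_FLAVOR_NAME="${CLUSTER_TOOL_FLAVOR_NAME:-vmaas-kustomize}"
  E2E_TEST_DIR="${E2E_TEST_DIR:-tests/vmaas}"   # hypothetical default path
  echo "flavor=${CLUSTER_TOOL_FLAVOR_NAME} test_dir=${E2E_TEST_DIR}"
}

# A CaaS job would override both values, for example:
#   CLUSTER_TOOL_FLAVOR_NAME=caas-kustomize E2E_TEST_DIR=tests/caas
```

A job that sets neither variable resolves to the VMaaS values, which is the behavior the existing e2e-vmaas jobs depend on.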
Diagnostics & cleanup
CI job changes
Other notes / fixes in commit
Testing notes (from PR)