Skip to content

Add v1.1 Space parameters and addon version validation warning#421

Open
aws-brianxia wants to merge 3 commits into
aws:mainfrom
aws-brianxia:space_template_v1_1
Open

Add v1.1 Space parameters and addon version validation warning#421
aws-brianxia wants to merge 3 commits into
aws:mainfrom
aws-brianxia:space_template_v1_1

Conversation

@aws-brianxia
Copy link
Copy Markdown
Contributor

Before/After UX

Adds 6 new parameters to the Space v1.1 template to match the upstream workspace operator CRD (workspace.jupyter.org/v1alpha1):

  • --access-type — who can connect to the workspace (Public/OwnerOnly)
  • --env — environment variables for the workspace container
  • --access-strategy — reference to a WorkspaceAccessStrategy
  • --pod-security-context — pod-level security context
  • --container-security-context — container-level security context
  • --init-containers — init containers to run before the workspace container

Also adds an addon version compatibility warning: when users create/update a space, the CLI checks the installed amazon-sagemaker-spaces addon version via eks:DescribeAddon and warns if it's below the minimum required version (0.1.6 for v1.1 template).

Before:

New workspace fields (accessType, env, etc.) were not available through the CLI or SDK. Users had no way to set them without manually crafting K8s YAML.

After:

hyp create hyp-space \
    --name my-space \
    --display-name "My Space" \
    --access-type OwnerOnly \
    --env '[{"name": "MY_VAR", "value": "hello"}]' \
    --access-strategy name=my-strategy,namespace=default \
    --pod-security-context '{"runAsNonRoot": true}' \
    --container-security-context '{"allowPrivilegeEscalation": false}' \
    --init-containers '[{"name": "setup", "image": "busybox", "command": ["sh", "-c", "echo init"]}]'

If the addon version is too old, users see:

UserWarning: The installed 'amazon-sagemaker-spaces' addon version is 0.1.1,
but this operation requires addon version >= 0.1.6. Some space parameters may
be ignored or rejected by the operator.

How was this change tested?

  • Unit tests: 135 space-related tests pass
  • Manual CLI validation: confirmed all 6 parameters parse correctly and serialize to the expected K8s spec fields
  • Addon version warning tested against a live v0.1.1 cluster in eu-north-1

Are unit tests added?

Yes - 14 new tests added to test/unit_tests/test_space_utils.py covering _parse_version, get_spaces_addon_version, and warn_if_addon_version_incompatible.

Are integration tests added?

No - integration test not required for this change

Reviewer Guidelines

‼️ Merge Requirements: PRs with failing integration tests cannot be merged without justification.

One of the following must be true:

  • All automated PR checks pass
  • Failed tests include local run results/screenshots proving they work
  • Changes are documentation-only

| `--lifecycle` | TEXT | No | Lifecycle specifies actions that the management system should take in response to container lifecycle events (JSON string) |
| `--app-type` | TEXT | No | AppType specifies the application type for this workspace |
| `--service-account-name` | TEXT | No | ServiceAccountName specifies the name of the ServiceAccount to use for the workspace pod |
| `--queue-name` | TEXT | No | Queue name for space scheduling (1-63 characters, alphanumeric with hyphens). Required when task governance is enabled on HyperPod EKS clusters. Sets the `kueue.x-k8s.io/queue-name` label. |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The queue name is based on the namespace, right? Should we just fill this in for them? If they submit a workspace without a queue name it won't get scheduled.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline and decided to keep it as-is in case users do not use task governance.

| **memory-limit** | TEXT | No | Memory resource limit |
| **gpu** | TEXT | No | GPU resource request |
| **gpu-limit** | TEXT | No | GPU resource limit |
| **accelerator-partition-type** | TEXT | No | Fractional GPU partition type (e.g., 'mig-3g.20gb') |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these have been removed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, I think we don't need to list exhaustively all the parameters in the getting_started doc.

@@ -2,6 +2,9 @@
from typing import Optional, List, Dict, Literal, Any
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean users with older add on versions like 0.1.1 will still be able to use this hyperpod cli version? Another option would be to say this HP CLI version is not compatible with add on <0.1.6, and customer can either downgrade HP CLI or upgrade their add on. Was this considered?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the customer is using addon 0.1.1 and trying to use the new v1.1 parameters, it will warn them of this issue because its MIN_ADDON_VERSION is 0.1.6. However, it's a warning, not a hard enforcement.

@mollyheamazon
Copy link
Copy Markdown
Collaborator

Will there be integ test added for v1.1, or should the integ test for v1.0 be updated to import v1.1 for sanity check?

@aws-brianxia
Copy link
Copy Markdown
Contributor Author

Will there be integ test added for v1.1, or should the integ test for v1.0 be updated to import v1.1 for sanity check?

can I add integ tests for the v1.1 afterward because I will need to upgrade the addon for the test infra?

Comment thread hyperpod-space-template/hyperpod_space_template/v1_1/schema.json
Comment thread test/unit_tests/test_space_utils.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants