Skip to content

Drop the numeric-team_id Trino org-id validator#672

Merged
bill-ph merged 1 commit into
mainfrom
trino-drop-teamid-validator
Jun 4, 2026
Merged

Drop the numeric-team_id Trino org-id validator#672
bill-ph merged 1 commit into
mainfrom
trino-drop-teamid-validator

Conversation

@bill-ph
Copy link
Copy Markdown
Collaborator

@bill-ph bill-ph commented Jun 4, 2026

What

The Trino opt-in path rejected any non-numeric org id, on a false Org.Name == posthog team_id assumption. duckgres has no team_idOrg.Name is the primary key, a DNS-1123 label (e.g. ben-iceberg-cnpg), validated by ducklingOrgIDPattern. This removes that wrong validator and aligns the Trino opt-in with the rest of the org lifecycle.

Why it's safe

The numeric check was redundant. The Trino catalog / group / resource-group / OPA identifiers are already derived via trinoSanitize(Org.Name) (lowercase, then non-[a-z0-9_]_), which is injective over the DNS-1123 charset (only hyphens remap), so two distinct org names can't collide into one catalog. The DNS-1123 charset also can't inject into password.db / group.db lines or the resource-groups / OPA JSON. /provision validates the org id as DNS-1123 before the trino block, so the block can rely on it.

Reviewed twice by codex (xhigh) — both passes confirmed the removal is behaviorally safe; the only findings were comment-accuracy (raw-vs-sanitized placeholders), which are addressed.

Changes

  • Remove trinoOrgIDPattern (^[0-9]+$) and its two call sites (the /provision trino block + enableTrino).
  • enableTrino now validates with ducklingOrgIDPattern — the same DNS-1123 gate provisionWarehouse already applies — so invalid names like Bad_Org still 400.
  • Tests: invert the two "rejects non-numeric" assertions to "enables"; the standalone-endpoint test uses the real ben-iceberg-cnpg (the one shape where raw ≠ sanitized); a repurposed test still rejects non-DNS-1123 input.
  • Comment-only: drop the team_id framing across the whole Trino path (provisioner + OPA policy.rego / types.go), and make resource-group / group placeholders show the sanitized form (root.tenants.ben_iceberg_cnpg, org_ben_iceberg_cnpg).

No behavior change beyond dropping the wrong validator. This unblocks opting a DNS-1123-named org (the real dev shape, e.g. ben-iceberg-cnpg) into Trino.

Test

go build / go vet / go test -tags kubernetes across controlplane/provisioner, controlplane/provisioner/opa, and controlplane/provisioning — all pass.

🤖 Generated with Claude Code

The Trino opt-in surfaces (POST /orgs/:id/trino and the trino block on
/provision) rejected any non-numeric org id, on the false premise that
Org.Name == a posthog team_id. duckgres has no team_id concept: Org.Name
is the primary key and a DNS-1123 label (e.g. "ben-iceberg-cnpg"),
validated everywhere by ducklingOrgIDPattern.

The numeric check was both wrong and redundant. The Trino catalog,
group, resource-group, and OPA projections already derive their
identifiers via trinoSanitize(Org.Name) (lowercase, then non-[a-z0-9_]
to '_'), which is injective over the DNS-1123 charset (only hyphens
remap), so two distinct org names can't collide into one catalog. The
collision concern that motivated the validator is already handled by
sanitization plus the DNS-1123 gate.

- Remove trinoOrgIDPattern and its two call sites.
- enableTrino now validates with ducklingOrgIDPattern -- the same
  DNS-1123 gate provisionWarehouse already applies -- so an invalid
  name like "Bad_Org" still 400s.
- Invert/rename the tests that asserted non-numeric ids are rejected;
  the standalone-endpoint test now uses the real dev org name
  "ben-iceberg-cnpg" (the one shape where raw != sanitized).
- Comment-only: replace the misleading "team_id" framing across the
  Trino path (provisioner + OPA policy/types) with "org name
  (DNS-1123)", and make resource-group / group placeholders show the
  sanitized form (root.tenants.ben_iceberg_cnpg, org_ben_iceberg_cnpg).

No behavior change beyond dropping the wrong validator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bill-ph bill-ph merged commit c5e0733 into main Jun 4, 2026
24 checks passed
@bill-ph bill-ph deleted the trino-drop-teamid-validator branch June 4, 2026 01:44
bill-ph added a commit that referenced this pull request Jun 4, 2026
buildCatalogProperties emitted `fs.s3.enabled=true`, which Trino 476
rejects at CREATE CATALOG ("Configuration property 'fs.s3.enabled' was
not used"), cascading s3.region + s3.iam-role into "unused" and breaking
every per-org Iceberg catalog.

Trino removed the legacy Hadoop S3 filesystem; since Trino 458 the native
S3 filesystem must be enabled per-catalog with `fs.native-s3.enabled=true`.
The prop set had drifted from the proven-working maintenance script
(charts/trino/files/trino_maintenance.py) the doc comment claims to
mirror -- it uses fs.native-s3.enabled + s3.region + s3.iam-role. This
restores that match (one property name).

Latent until now: no org could be Trino-enabled until the numeric-org-id
validator was removed (#672), so CREATE CATALOG had never executed
end-to-end. Found live enabling the first Trino org (ben-cnpg-ice) in
mw-dev, where the reconcile loops on this error every ~10s.

- buildCatalogProperties: fs.s3.enabled -> fs.native-s3.enabled
- Correct the doc comment + the unit test that asserted the wrong name

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant