diff --git a/design-proposals/application-definition-versioning/README.md b/design-proposals/application-definition-versioning/README.md new file mode 100644 index 0000000..07837e6 --- /dev/null +++ b/design-proposals/application-definition-versioning/README.md @@ -0,0 +1,410 @@ +# ApplicationDefinition multi-version conversion + +- **Title:** ApplicationDefinition multi-version conversion +- **Author(s):** @kvaps +- **Date:** 2026-04-30 +- **Status:** Draft + +## Overview + +Cozystack ships managed applications (Postgres, Kubernetes, virtual +machines, buckets, etc.) as Helm charts wrapped by an +`ApplicationDefinition`. The chart's `values.yaml` is the long-lived +storage form for each instance. Today there is no API versioning around +that storage form: the only representation of an instance is its current +`values.yaml` shape, and changing that shape is a breaking change for +every existing tenant. + +This proposal defines a versioned API surface in front of `values.yaml`. +Each `ApplicationDefinition` may carry multiple `versions[]`, with one +designated as `storage`. Cross-version conversion is expressed as a +small pair of go-template snippets per non-storage version. A migration +controller normalizes existing HelmReleases to the current storage +version in the background, and a chart-side guard turns version +mismatches into human-readable errors. + +The goal is to let us improve `values.yaml` shapes (rename fields, +switch maps to lists, flatten nested objects) without breaking tenants +and without operating per-application conversion webhooks. + +## Context + +### Today + +- Each application is a Helm chart under `packages/apps//` plus a + `cozyrds/.yaml` (`kind: ApplicationDefinition`) that embeds an + OpenAPI schema as a single serialized JSON string. +- The schema today is generated from `values.yaml` comments by + `cozyvalues-gen`. The result is functional but not human-readable, and + the schema is the only versioning surface. 
+- Tenants interact with the cozystack-api-server, which is a thin layer + in front of HelmRelease objects (Flux). It strips `_namespace`, + `_cluster` and similar service fields from `values.yaml` when + projecting them as spec. +- Tenants do not interact with HelmRelease directly. The api-server is + the only public surface; HelmRelease is private storage. + +### The problem + +Concrete shapes we want to fix in `values.yaml` but cannot fix safely +today: + +- `users: { kvaps: { admin: true } }` should become + `users: [ { name: kvaps, admin: true } ]`. Object-as-map prevents + ordering, validation of duplicate names, and consistent UI rendering. +- `postgresql.parameters.max_connections` is a deeply nested path for + one scalar; it should be flattened to `maxConnections` on the spec + surface. +- `databases..roles.{admin,readonly}` should be flattened to + `databases[*].{admins,readers}`. + +Any of these changes today is a breaking change for every cluster that +already has these resources. + +## Goals + +- Allow each `ApplicationDefinition` to expose more than one API + version without operating a Kubernetes-style conversion webhook per + application. +- Keep `values.yaml` as the storage form. No additional storage layer. +- Express each non-storage version as a pair of go-template snippets + that convert that version's values to and from the storage version's + values. +- Migrate existing HelmReleases to the current storage version in the + background, transparently, with no required tenant action. +- Surface chart/version mismatches with a clear error message rather + than a template parse error. +- Tooling generates JSON Schema, conversion artifacts, and chart + annotations from the per-version source files. Authors do not + maintain conversion code by hand for the trivial cases. + +### Non-goals + +- Replacing `cozyvalues-gen` in this proposal. The new format can + coexist during transition. 
+- Designing the UI metadata (groups, widgets, conditional rendering). + That is a follow-up; this proposal only locks down `schema` and + `conversion`. See open question on UI placement. +- Cross-cluster or per-tenant migration policy. Migration is + per-cluster, driven by the controller of the same release. +- A general bidirectional transformation language. The pair of + templates is intentionally not auto-inverted; authors write both + halves. + +## Design + +### File layout + +Per application package: + +``` +packages/apps// +├── values.yaml # storage form, single source of truth +├── api/ +│ ├── v1alpha1.yaml # legacy served version +│ └── v1.yaml # storage version (served) +└── templates/ + └── _cozystack-version.tpl # auto-injected guard +``` + +`api/v*.yaml` is the human-edited source. `cozyrds/.yaml` +(`ApplicationDefinition`) becomes a tooling output, generated by +`cozyschema build`. + +### ApplicationSchema (per version) + +```yaml +apiVersion: cozystack.io/v1 +kind: ApplicationSchema +metadata: { name: postgres } +spec: + version: v1alpha1 + served: true + storage: false + + schema: + type: object + properties: + replicas: { type: integer, default: 2 } + ... + + conversion: + # values_ -> values_storage + to: | + replicas: {{ .replicas }} + maxConnections: {{ index .postgresql.parameters "max_connections" }} + users: + {{- range $name, $u := .users | default dict }} + - name: {{ $name | quote }} + {{- with $u.password }} + password: {{ . | quote }} + {{- end }} + {{- end }} + ... + + # values_storage -> values_ + from: | + replicas: {{ .replicas }} + postgresql: + parameters: + max_connections: {{ .maxConnections }} + users: + {{- range .users | default list }} + {{ .name }}: + {{- with .password }} + password: {{ . | quote }} + {{- end }} + {{- end }} + ... +``` + +The storage version (`spec.storage: true`) has no `conversion:` block. + +### `_version` field + +The api-server stamps `_version: ` into each HelmRelease's +`values.yaml`. 
Other `_*` fields (`_namespace`, `_cluster`) follow the +existing convention. Tenants never see `_*` fields through the api +surface; the api-server adds them on write and strips them on read. + +### Conversion flow + +Storage version is the hub. The api-server orchestrates: + +``` +Write spec_X (any served X): + if X == storage: + write values_X + else: + write versions[X].to(values_X) + stamp _version = storage + +Read as Y (any served Y): + read values_storage from HelmRelease + if Y == storage: + return values_storage minus _* fields + else: + return versions[Y].from(values_storage) minus _* fields +``` + +`spec` and `values` for a given version are the same object minus +service `_*` fields. There is no separate spec-to-values projection +per version: the api-server does this generically. + +### Migration controller + +A controller watches HelmReleases. When a HelmRelease's `_version` +does not match the current storage version (this happens after a +Cozystack upgrade that bumps the storage version), the controller +applies the non-storage version's `to` template and rewrites the +HelmRelease. + +``` +for hr in helmreleases of this kind: + cv := hr.values._version + if cv == storageVersion: continue + if cv not in servedVersions: + emit Warning: ManualMigrationRequired + continue + valuesNew := apply(versions[cv].to, valuesOld) + valuesNew._version = storageVersion + patch(hr, valuesNew) +``` + +The migration is idempotent and self-healing. Helm reconcile failures +during the migration window (chart already updated, values not yet +migrated) are expected and recover once the controller catches up. 
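The loop above can be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not the controller's implementation: the `Version` and `Action` types and the `plan` and `renderTo` helpers are hypothetical names. The template uses only `text/template` built-ins (no Sprig). Parsing the rendered YAML, stamping `_version`, and patching the HelmRelease are left out.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Action is the outcome of inspecting one HelmRelease's _version (hypothetical type).
type Action string

const (
	ActionSkip    Action = "skip"    // already at the storage version
	ActionManual  Action = "manual"  // _version is not served: emit ManualMigrationRequired
	ActionMigrate Action = "migrate" // apply the version's `to` template
)

// Version is a hypothetical in-memory form of one non-storage versions[] entry.
type Version struct {
	Name string
	To   string // go-template source: values of this version -> storage values
}

// plan mirrors the branching of the pseudocode loop for a single HelmRelease.
func plan(current, storage string, served map[string]Version) Action {
	if current == storage {
		return ActionSkip
	}
	if _, ok := served[current]; !ok {
		return ActionManual
	}
	return ActionMigrate
}

// renderTo executes a `to` template against the old values and returns the
// rendered text. missingkey=error turns a reference to an absent map key into
// an error instead of silently rendering "<no value>".
func renderTo(v Version, values map[string]any) (string, error) {
	tmpl, err := template.New(v.Name).Option("missingkey=error").Parse(v.To)
	if err != nil {
		return "", fmt.Errorf("parse %s.to: %w", v.Name, err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, values); err != nil {
		return "", fmt.Errorf("render %s.to: %w", v.Name, err)
	}
	return buf.String(), nil
}
```

A caller would iterate over HelmReleases, call `plan` with each `_version`, and run `renderTo` only for `ActionMigrate`. How a missing `_version` on a legacy resource is handled is not covered by this sketch.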
+ +### Chart-side guard + +To turn the migration window's failures into human-readable errors, +`cozyschema build` injects a small template: + +```gotemplate +{{- define "cozystack.versionGuard" -}} +{{- $expected := index .Chart.Annotations "cozystack.io/expectsVersion" -}} +{{- $actual := .Values._version | default "" | toString -}} +{{- if ne $actual $expected -}} +{{- fail (printf "chart %s@%s expects _version=%s, got _version=%s" + .Chart.Name .Chart.Version $expected $actual) -}} +{{- end -}} +{{- end -}} +``` + +The guard is included in a template that always renders, and the +build tool writes `cozystack.io/expectsVersion: ` to +`Chart.yaml`'s annotations. Charts target exactly one version (the +current storage). They do not branch on `_version` internally. + +### `cozyschema build` + +Inputs: `api/v*.yaml`, the chart directory. + +Outputs: + +- `ApplicationDefinition` (replacement for the current + `cozyrds/*.yaml`). +- `values.schema.json` for Helm validation. +- `templates/_cozystack-version.tpl` and the include site. +- `Chart.yaml` annotation update. + +CI checks: + +- All keys in `values.yaml` are referenced by at least one served + version's schema. +- Each non-storage served version provides both `to` and `from`. +- Round-trip identity on golden samples: + `from(to(values_X))` equals `values_X` for every non-storage + served X (the storage version has no conversion templates). +- `helm template` renders without error on the storage-form sample. +- A version cannot be removed from `served` while the controller + still observes HelmReleases at that `_version`. + +## User-facing changes + +- Tenants see new API versions appearing on cozystack-api-server + resources. Old versions remain served for at least one full + deprecation cycle. +- `kubectl get` of a managed resource may transparently report values + in whichever version was requested via `apiVersion`. +- No change to chart authoring conventions for application maintainers + who do not introduce a new API version. 
The new `api/` directory is + required only when a version bump is desired. + +## Upgrade and rollback compatibility + +- Upgrading Cozystack ships the new chart and the updated + `ApplicationDefinition` together. The migration controller + normalizes existing resources in the background. +- During the migration window, Helm reconcile may fail on + not-yet-migrated resources with the guard's explicit message. + Flux's retry loop drives convergence. +- Rollback to a previous Cozystack release ships the previous chart + and previous `ApplicationDefinition`. Resources already migrated + to the new storage version are converted back via the new version's + `to` template (now `from` from the previous storage's perspective). + This requires that conversion templates be symmetric, which the + round-trip CI check enforces. +- A single Cozystack release should not bump storage version twice + in a row. Two consecutive storage bumps would require a transitive + conversion path that is not supported here. + +## Security + +- Templates are stored in `ApplicationDefinition`. Template functions + that reach outside the template input, such as environment, + filesystem, network, or cluster access (`env`, `readFile`, `lookup`, + `getHostByName`, etc.), are denied during conversion-template + execution. Only data-shaping functions (`toJson`, `toYaml`, + `quote`, `nindent`, `default`, arithmetic) and the built-in + `range` and `with` actions are permitted. +- The migration controller writes through the api-server's normal + authorization path and obeys per-tenant RBAC. +- `_version` and other `_*` fields remain server-side metadata. They + cannot be set or read by tenants through the api surface. + +## Failure and edge cases + +- Missing `_version` on a HelmRelease (legacy resource): the + `ApplicationDefinition` may declare a `legacyDefault: `. + The api-server treats absent `_version` as that default on the + next read or migration. 
+- `_version` set to a value not in `served`: the api-server returns + 410 Gone with a message pointing the operator at the cozystack + upgrade documentation. The migration controller skips the resource + and emits a warning event. +- Concurrent writes during migration: the migration controller uses + optimistic concurrency on HelmRelease's `resourceVersion` and + retries on conflict. +- Conversion-template render produces invalid YAML: caught at CI + time on golden samples; at runtime the api-server returns 500 + with the template error and refuses the write or read. +- A field exists in version Y but not in version X (downgrade-style + read): the field is omitted from the X projection. If round-trip + identity is required for that field, the field's data is preserved + in the storage form and re-projected on the next upgrade-style + read. + +## Testing + +- Unit tests on `cozyschema build`: parse `api/v*.yaml`, render + conversion templates on synthetic inputs, assert round-trip + identity. +- Integration test on a real cluster: create a v1alpha1 resource, + upgrade ApplicationDefinition to make v1 storage, observe + migration, read back as both v1alpha1 and v1, assert structural + equivalence. +- Helm template smoke test against the storage-form sample for every + application package. +- Chart guard fail-fast test: render chart with mismatched + `_version`, assert the guard's error message. + +## Rollout + +1. Land `cozyschema build` and the per-version file format. No + ApplicationDefinition needs to use it yet; existing + `cozyvalues-gen`-generated definitions continue to work. +2. Migrate one application package (`postgres`) to the new format + with only one served version. Verify the generated + `ApplicationDefinition` is byte-equivalent to the existing one. +3. Introduce a second served version on the same package, exercising + the conversion templates and the migration controller end-to-end + on a test cluster. +4. 
Migrate the rest of the application packages incrementally. +5. Once all packages use the new format, remove `cozyvalues-gen`. + +## Open questions + +- **Where does UI metadata live?** This proposal deliberately leaves + `ui:` (groups, widgets, `showIf`, `itemTitle`) out of + `ApplicationSchema`. The leading suggestion is to move it to a + separate resource, e.g. `ApplicationView` or `ApplicationForm`, + bound to an `ApplicationSchema` by name and version. Reasons in + favor of separation: schema and UI rev independently; UI may want + multiple variants (admin form vs tenant form); UI metadata is + consumed only by the dashboard, while schema and conversion are + consumed by the api-server. Reasons against: two files per version + to keep in sync; cross-references add CI surface. A follow-up + proposal should pick one path and define the resource shape. +- Should `legacyDefault` (for `_version`-less legacy resources) be a + per-application setting, a cluster-wide policy, or both? +- Where is the right include site for the chart-side guard, given + the variety of templates across packages? A first-rendered + template, `_helpers.tpl`, or a synthetic `templates/_init.yaml`? +- Should `cozyschema build` be a separate binary or a subcommand of + an existing tool (`cozypkg`, `cozyvalues-gen`)? +- Beyond `to`/`from`, do we need a `validate` template per version + for cross-field validation that JSON Schema cannot express? + +## Alternatives considered + +- **Single jq expression per direction.** jq is JSON-native and + compact, especially for `to_entries`/`from_entries` work. Rejected + because go-template plus Sprig is already the lingua franca in + Cozystack charts; introducing a new runtime adds a dependency and + a learning surface for marginal gain on the most common shapes. +- **Lens-based bidirectional DSL** (rename, mapToArray, flatten, + prune). Author writes one description, both directions are + derived. 
Rejected for now: the inverse-derivation rules are + subtle, the DSL becomes a new language to learn, and the savings + over two go-templates are not large for the shapes we actually + rewrite. +- **Conversion webhooks (Kubernetes style).** Each application would + ship a webhook deployment with TLS. Rejected because it adds an + operating burden disproportionate to the value, and because + `values.yaml` is already a natural hub. +- **Comments-as-DSL inside `values.yaml`** (extend + `cozyvalues-gen`). Reads as a working example, but multi-line + `@values` blocks become YAML-in-comment, lint becomes brittle, + and DRY across versions is hard. Kept as a possible documentation + surface; not the source of truth. +- **JSON Patch (RFC 6902) as the conversion language.** Operations + are obvious for renames and moves but cannot express `map` ↔ + `array` with a key projection without iterating over + instance-specific keys. Rejected as too narrow. + +--- + +