
Add TigeraStatus warnings for ignored resources and override correlation #4649

Open
caseydavenport wants to merge 14 commits into tigera:master from caseydavenport:casey-override-warnings

caseydavenport commented Apr 7, 2026

Description

Builds on #4644 and #4645 to add two more status manager improvements.

Unsupported ignore annotation warning: When a resource has the unsupported.operator.tigera.io/ignore annotation, the operator silently skips managing it. Previously this was invisible in TigeraStatus. Now a warning surfaces in the Available condition message so users know the operator isn't managing that resource.

Override correlation hints: When the operator applies user-specified probe timing or resource overrides and the corresponding pod is failing, the status manager now includes a hint in the diagnostic message. For example, if a pod is failing readiness and the user has custom readiness probe configuration, the message says "Pod X is running but not ready; custom readiness probe configuration is in effect". The same applies to possible liveness probe failures (exit code 137) when custom liveness probe configuration is set, and to OOMKilled containers when custom resource limits are set.

The override correlation works via an annotation (operator.tigera.io/custom-overrides) that the render package sets on workloads when applying overrides. The status manager reads this annotation when diagnosing pod failures.
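As a rough illustration of the read side, the status manager could parse the annotation into a set and append a hint only when the relevant override is present. This is a minimal sketch, not the code in this PR: the function names and the override type tokens (`readinessProbe`, `livenessProbe`, `resources`) are assumptions, since the description does not list the actual values.

```go
package main

import "strings"

// customOverridesAnnotation is the annotation key named in the PR description.
const customOverridesAnnotation = "operator.tigera.io/custom-overrides"

// Hypothetical override type tokens; the real values are not given in the PR text.
const (
	overrideReadinessProbe = "readinessProbe"
	overrideLivenessProbe  = "livenessProbe"
	overrideResources      = "resources"
)

// parseOverrides splits the comma-separated annotation value into a set.
// A missing or empty annotation yields an empty set.
func parseOverrides(annotations map[string]string) map[string]bool {
	set := map[string]bool{}
	for _, tok := range strings.Split(annotations[customOverridesAnnotation], ",") {
		if tok = strings.TrimSpace(tok); tok != "" {
			set[tok] = true
		}
	}
	return set
}

// withHint appends hint to msg only when the given override type is in effect.
func withHint(msg string, overrides map[string]bool, overrideType, hint string) string {
	if overrides[overrideType] {
		return msg + "; " + hint
	}
	return msg
}
```

Keeping the hint conditional on the annotation means workloads without overrides produce the same diagnostic text as before.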

Example TigeraStatus messages

Pod crash looping with OOMKilled and custom resource limits:

Pod calico-system/calico-node-abc has crash looping container: calico-node (OOMKilled, exit code 137); custom resource limits are in effect

Pod failing readiness with custom probe config:

Pod calico-system/calico-node-xyz is running but not ready; custom readiness probe configuration is in effect

Possible liveness failure with custom liveness config:

Pod calico-system/compliance-server-abc has crash looping container: compliance-server (exit code 137, possible liveness probe failure); custom liveness probe configuration is in effect

Unsupported ignore annotation (in Available condition message):

All objects available; DaemonSet "calico-system/calico-node" has the unsupported ignore annotation; the operator is not managing this resource
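The crash-loop messages above suggest a classification along the following lines. This is a hedged sketch that reproduces the example strings, not the PR's `diagnosePods` implementation: the function names, the override tokens, and the precedence (OOMKilled checked before the generic exit code 137 case) are assumptions. Exit code 137 corresponds to a process killed by SIGKILL (128 + 9), which is why it is only a possible liveness probe failure.

```go
package main

import "fmt"

// terminationDetail classifies a terminated container. It returns the text for
// the parenthesised detail, the hypothetical override token to correlate with,
// and the hint to append when that override is in effect.
func terminationDetail(reason string, exitCode int32) (detail, overrideType, hint string) {
	switch {
	case reason == "OOMKilled":
		return fmt.Sprintf("OOMKilled, exit code %d", exitCode),
			"resources", "custom resource limits are in effect"
	case exitCode == 137:
		// SIGKILL without an OOM reason; a liveness probe kill is one possible cause.
		return fmt.Sprintf("exit code %d, possible liveness probe failure", exitCode),
			"livenessProbe", "custom liveness probe configuration is in effect"
	default:
		return fmt.Sprintf("exit code %d", exitCode), "", ""
	}
}

// crashLoopMessage builds the diagnostic line for a crash looping container,
// appending the hint only when the matching override is set on the workload.
func crashLoopMessage(pod, container, reason string, exitCode int32, overrides map[string]bool) string {
	detail, overrideType, hint := terminationDetail(reason, exitCode)
	msg := fmt.Sprintf("Pod %s has crash looping container: %s (%s)", pod, container, detail)
	if overrideType != "" && overrides[overrideType] {
		msg += "; " + hint
	}
	return msg
}
```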

Depends on #4645.

None

…Issues

Wire diagnosePods and summarizeIssues into syncState, replacing the
old podsFailing/containerErrorMessage functions. Each workload type
now reports not-found as a degraded condition instead of silently
continuing. DaemonSets and Deployments pass revision info so
diagnosePods can distinguish old-revision pods from current ones.

When an object has the unsupported.operator.tigera.io/ignore annotation,
surface a warning through TigeraStatus so users know the operator is not
managing the resource. Clear the warning if the annotation is later removed.
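The set-and-clear behaviour described here could be sketched as a small tracker keyed by object. This is an illustration under assumptions, not the PR's code: the `ignoreWarnings` type and its methods are hypothetical, and treating only the value `"true"` as "ignored" is an assumption about how the annotation is evaluated.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ignoreAnnotation is the annotation key named in the PR description.
const ignoreAnnotation = "unsupported.operator.tigera.io/ignore"

// ignoreWarnings is a hypothetical tracker of per-object warnings.
type ignoreWarnings struct {
	warnings map[string]string
}

// observe records a warning when the object carries the ignore annotation
// (assumed here to require the value "true") and clears it otherwise.
func (w *ignoreWarnings) observe(kind, namespacedName string, annotations map[string]string) {
	if w.warnings == nil {
		w.warnings = map[string]string{}
	}
	key := kind + "/" + namespacedName
	if annotations[ignoreAnnotation] == "true" {
		w.warnings[key] = fmt.Sprintf(
			"%s %q has the unsupported ignore annotation; the operator is not managing this resource",
			kind, namespacedName)
		return
	}
	delete(w.warnings, key)
}

// availableMessage builds the Available condition message, appending any
// warnings in a stable (sorted) order.
func (w *ignoreWarnings) availableMessage() string {
	keys := make([]string, 0, len(w.warnings))
	for k := range w.warnings {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := []string{"All objects available"}
	for _, k := range keys {
		parts = append(parts, w.warnings[k])
	}
	return strings.Join(parts, "; ")
}
```

Calling `observe` on every reconcile, whether or not the annotation is present, is what lets the warning disappear once the annotation is removed.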
…applied

When the render package applies probe timing or resource overrides to a
workload, set an operator.tigera.io/custom-overrides annotation with a
comma-separated list of which override types were applied. This will be
used by diagnosePods to correlate pod failures with user overrides.
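On the write side, recording the applied override types might look like the sketch below. The helper name `recordOverrides` and the type tokens passed to it are hypothetical; sorting and de-duplicating the list is a design choice assumed here so that the annotation value stays stable across reconciles, not something the commit message states.

```go
package main

import (
	"sort"
	"strings"
)

// customOverridesAnnotation is the annotation key named in the commit message.
const customOverridesAnnotation = "operator.tigera.io/custom-overrides"

// recordOverrides merges the applied override types into the annotation as a
// sorted, de-duplicated, comma-separated list. With nothing to record, the
// annotations are returned unchanged.
func recordOverrides(annotations map[string]string, applied ...string) map[string]string {
	seen := map[string]bool{}
	for _, t := range strings.Split(annotations[customOverridesAnnotation], ",") {
		if t != "" {
			seen[t] = true
		}
	}
	before := len(seen)
	for _, t := range applied {
		if t != "" {
			seen[t] = true
		}
	}
	if len(seen) == before {
		return annotations // no new override types to add
	}
	types := make([]string, 0, len(seen))
	for t := range seen {
		types = append(types, t)
	}
	sort.Strings(types)
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[customOverridesAnnotation] = strings.Join(types, ",")
	return annotations
}
```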
caseydavenport force-pushed the casey-override-warnings branch from 2e1de6e to 6ddf9a8 on April 7, 2026 23:21