Skip to content

Resolve dict-via-variable trigger serialize() in static check#67300

Open
dkranchii wants to merge 1 commit into
apache:mainfrom
dkranchii:trigger-serialize-dict-var-resolver
Open

Resolve dict-via-variable trigger serialize() in static check#67300
dkranchii wants to merge 1 commit into
apache:mainfrom
dkranchii:trigger-serialize-dict-var-resolver

Conversation

@dkranchii
Copy link
Copy Markdown

Summary

Follow-up to #66960. Extends the static check scripts/ci/prek/check_trigger_serialize_init.py to resolve trigger serialize() methods built via a local variable — the most common pattern that the original check explicitly skipped to avoid false positives. Also adds the first test file for the hook.

Triggers like WorkflowTrigger and DagStateTrigger (standard provider) that were silently skipped before this change are now actually validated.

What's resolved

The new resolver handles this shape:

def serialize(self):
    data = {"a": self.a}                       # literal-dict init (Assign or AnnAssign)
    data["b"] = self.b                         # subscript assign with string-constant key
    data.update({"c": self.c, "d": self.d})    # .update() with a literal-dict argument
    if cond:
        data["e"] = self.e                     # conditional branches are unioned
    return "<classpath>", data

All key-adding paths (including conditional branches) are unioned — a key that could appear in the output is treated as preserved on the round-trip.

Conservative-skip preserved

The check still returns None (skip rather than guess) on every dynamic shape it cannot prove statically. New skip cases:

  • Reassignment of the return variable to a non-dict expression
  • Multiple literal-dict initializations of the same variable
  • Subscript assignments with a non-string-constant key
  • .update() with a non-literal-dict argument or with kwargs
  • Variable never initialized in the function

Plus a small structural improvement: return statements are now collected via a new _walk_skip_nested helper, so a return inside a nested helper function inside serialize() no longer trips the "exactly one return" check (it never was a real second return for serialize).

Verification

  • Static check on providers/ tree: exit 0 — no false positives introduced.
  • WorkflowTrigger (11 init params) and DagStateTrigger (5 init params) now resolve cleanly — both were silently skipped before.
  • 19 unit tests covering both resolvers (literal-dict and dict-via-variable), 6 parametrized unresolvable shapes, nested-helper isolation, and 2 still-skipped sanity shapes — all pass.
  • Synthetic-bug test (a WorkflowTrigger with a deliberately-dropped param) is correctly flagged.

Test plan

  • uv run --project scripts pytest scripts/tests/ci/prek/test_check_trigger_serialize_init.py -xvs passes
  • prek run --from-ref upstream/main --stage pre-commit passes
  • prek run --from-ref upstream/main --stage manual passes
  • Static check on providers tree (uv run --project scripts python scripts/ci/prek/check_trigger_serialize_init.py) returns 0

Was generative AI tooling used to co-author this PR?
  • Yes — Cursor (Claude Opus 4.7)

Generated-by: Cursor (Claude Opus 4.7) following the guidelines

Extends the trigger __init__/serialize() consistency check to handle
serialize() methods that build their return dict via a local variable
(literal-dict init, optional subscript assignments, and .update() with
literal-dict args, with conditional branches unioned). Triggers like
WorkflowTrigger and DagStateTrigger that were previously silently
skipped are now actually validated.

Also adds the first test file for this hook.
@boring-cyborg boring-cyborg Bot added area:dev-tools backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch labels May 21, 2026
@boring-cyborg
Copy link
Copy Markdown

boring-cyborg Bot commented May 21, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example Dag that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant