Skip to content

AIP-76: Propagate partition_date to consumers of partitioned assets#67285

Open
nathadfield wants to merge 1 commit into
apache:mainfrom
king:propagate-partition-date-to-consumers
Open

AIP-76: Propagate partition_date to consumers of partitioned assets#67285
nathadfield wants to merge 1 commit into
apache:mainfrom
king:propagate-partition-date-to-consumers

Conversation

@nathadfield
Copy link
Copy Markdown
Contributor

@nathadfield nathadfield commented May 21, 2026

What this does

Makes partition_date a first-class template variable on the consumer side of AIP-76 partitioned assets, so authors can write:

WHERE dt = "{{ partition_date | ds }}"

instead of slicing strings out of partition_key.

How

  • AssetEvent.partition_date is a new column (migration 0116); the producer stamps it from ti.dag_run.partition_date at emission, so the partition's datetime form rides every event the consumer sees.
  • The consumer's target date is computed alongside target_partition_key in assets/manager.py:_queue_partitioned_dags and frozen on a new AssetPartitionDagRun.partition_date column. The scheduler simply copies that value into the consumer DagRun, so the date stays consistent with the mapper that produced the key even if mapper code or config is later edited.
  • IdentityMapper passes the source date through; the StartOf*Mapper family normalises via a new internal to_downstream_normalized (no public mapper-protocol change). Other mappers (ProductMapper, ChainMapper, AllowedKeyMapper, custom) leave partition_date=None and consumers fall back to partition_key.
  • partition_date is output-only on trigger payloads, so a manual trigger cannot create an inconsistent (partition_key, partition_date) pair.
  • Surfaced through to Context["partition_date"] (coerced to tz-aware), the execution-API DagRun / AssetEventResponse / DagRunAssetReference (Cadwyn-versioned to strip the field for older Task SDK clients), the core-API DAGRunResponse, OTel span attributes, and the standard provider's PythonVirtualenvOperator / ExternalPythonOperator serializable-context allow-list.

Known limitations / follow-ups

  • For ProductMapper, ChainMapper, AllowedKeyMapper, and custom mappers, partition_date is None. A public to_downstream_partition_date extension hook would close that gap but expands the mapper protocol — intentionally deferred.
  • If two source events resolve to the same target_partition_key with different normalised datetimes (e.g., two upstream assets with different timezone-configured mappers), the first event wins and a warning is logged in _get_or_create_apdr.

closes: #67239


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Claude Code (Opus 4.7) following the guidelines

@boring-cyborg boring-cyborg Bot added area:airflow-ctl area:API Airflow's REST/HTTP API area:db-migrations PRs with DB migration area:providers area:Scheduler including HA (high availability) scheduler area:task-sdk area:UI Related to UI/UX. For Frontend Developers. backport-to-airflow-ctl/v0-1-test kind:documentation provider:standard labels May 21, 2026
@nathadfield nathadfield force-pushed the propagate-partition-date-to-consumers branch 4 times, most recently from f06372a to 27b7c87 Compare May 22, 2026 07:53
Consumers of partitioned assets receive partition_key (str) but
partition_date (datetime) is None on the consumer DagRun, so templates
have to parse the key string. Propagate the datetime form alongside
the string so consumers can use the canonical filter idiom
`{{ partition_date | ds }}` and friends.

The producer's partition_date is stamped onto AssetEvent at emission;
the consumer's partition_date is computed alongside its target
partition_key at APDR creation (in assets/manager.py:_queue_partitioned_dags)
and stored on AssetPartitionDagRun. The scheduler copies apdr.partition_date
into the consumer DagRun. IdentityMapper passes the source date through;
the StartOf*Mapper family normalizes via to_downstream_normalized; other
mappers leave partition_date None and consumers fall back to partition_key.

closes: apache#67239
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:airflow-ctl area:API Airflow's REST/HTTP API area:db-migrations PRs with DB migration area:providers area:Scheduler including HA (high availability) scheduler area:task-sdk area:UI Related to UI/UX. For Frontend Developers. backport-to-airflow-ctl/v0-1-test kind:documentation provider:standard

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AIP-76: propagate partition_date to consumers

1 participant