feat: support TRIGGER ON UPDATE and SCHEDULE EVERY for MV and ST #1434

Open
sd-db wants to merge 4 commits into 1.12.latest from sd-db/feature/mv-st-trigger-schedule-syntax

Collaborator

@sd-db sd-db commented Apr 30, 2026

Summary

Adds the full Databricks refresh-schedule grammar on materialized views and streaming tables: `SCHEDULE CRON`, `SCHEDULE EVERY`, and `TRIGGER ON UPDATE [AT MOST EVERY INTERVAL ...]`.

Closes #1293.

User-facing config:

config:
  schedule:
    cron: '0 0 * * * ? *'
    time_zone_value: 'America/Los_Angeles'
    # OR
    every: '2 HOURS'
    # OR
    on_update: true
    at_most_every: '15 MINUTES'
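
The `# OR` markers above suggest that `cron`, `every`, and `on_update` are alternatives. A minimal sketch of that rule follows, assuming the three keys are mutually exclusive and that `time_zone_value` and `at_most_every` only pair with `cron` and `on_update` respectively. `resolve_schedule_mode` is a hypothetical helper for illustration, not the adapter's actual `RefreshConfig`.

```python
from typing import Any, Mapping

# Hypothetical helper for illustration; not the adapter's RefreshConfig.
_MODE_KEYS = ("cron", "every", "on_update")


def resolve_schedule_mode(schedule: Mapping[str, Any]) -> str:
    """Return the single refresh mode that a `schedule` mapping selects."""
    # A key counts as selected only when it is present and truthy,
    # so `on_update: false` does not select the on_update mode.
    selected = [key for key in _MODE_KEYS if schedule.get(key)]
    if len(selected) > 1:
        raise ValueError(f"schedule keys are mutually exclusive, got: {selected}")
    mode = selected[0] if selected else "manual"

    # Assumed pairing rules: each modifier is only valid with its own mode.
    if "time_zone_value" in schedule and mode != "cron":
        raise ValueError("time_zone_value requires cron")
    if "at_most_every" in schedule and mode != "on_update":
        raise ValueError("at_most_every requires on_update")
    return mode
```

Under these assumptions an empty mapping resolves to `manual`, and a mapping that sets both `cron` and `every` is rejected before any SQL would be built.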

Test plan

  • Unit tests for parser, validator, diff, and refresh-schedule macros (873 unit tests passing).
  • Functional round-trip tests per mode for both MV and ST against live UC SQL endpoint.
  • Functional lifecycle tests walking each relation through MANUAL → CRON → ON_UPDATE → EVERY → (non-refresh component change) → MANUAL.
  • `pre-commit run --all-files` clean (ruff, ruff-format, mypy).

Adds the full Databricks refresh-schedule grammar on materialized
views and streaming tables: SCHEDULE CRON, SCHEDULE EVERY, and
TRIGGER ON UPDATE [AT MOST EVERY INTERVAL ...].

- RefreshConfig is a discriminated config with parse-time validation
  on the new modes.
- Idempotent runs no longer emit spurious ALTERs across user/server
  unit differences.
- Auto-REFRESH is suppressed for every/on_update modes.

Coverage:
- Unit tests for parser, validator, diff, and refresh-schedule macros.
- Functional round-trip tests per mode for both MV and ST.
- Functional lifecycle tests walking each relation through every mode
  transition plus a non-refresh-component change.
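
The commit message above mentions that idempotent runs no longer emit spurious ALTERs across user/server unit differences. One way to get that behaviour is to normalise both sides to seconds before comparing, sketched below. The unit table, `interval_to_seconds`, and `needs_alter` are assumptions made for illustration; the PR's actual diff logic may work differently.

```python
import re

# Assumed unit table; the units the adapter actually accepts may differ.
_SECONDS_PER_UNIT = {
    "SECOND": 1,
    "MINUTE": 60,
    "HOUR": 3600,
    "DAY": 86400,
    "WEEK": 604800,
}
# "<count> <unit>" with an optional plural "S", matched case-insensitively.
_INTERVAL = re.compile(r"^\s*(\d+)\s+([A-Za-z]+?)S?\s*$", re.IGNORECASE)


def interval_to_seconds(text: str) -> int:
    """Convert a value such as '15 MINUTES' to a number of seconds."""
    match = _INTERVAL.match(text)
    if match is None:
        raise ValueError(f"not an interval: {text!r}")
    count, unit = int(match.group(1)), match.group(2).upper()
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError(f"unknown unit in {text!r}")
    return count * _SECONDS_PER_UNIT[unit]


def needs_alter(user_value: str, server_seconds: int) -> bool:
    """True only when the two sides differ after normalisation."""
    return interval_to_seconds(user_value) != server_seconds
```

With this normalisation, a user value of `15 MINUTES` and a server value of 900 seconds compare as equal, so no ALTER would be needed.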

github-actions Bot commented Apr 30, 2026

Coverage report

| File | Lines missing |
| --- | --- |
| dbt/adapters/databricks/relation_configs/refresh.py | 46, 55, 66, 134, 157, 180, 211, 220 |
| dbt/adapters/databricks/relation_configs/streaming_table.py | |

This report was generated by python-coverage-comment-action

Drop log-line and SQL-emission assertions, keep only mode-value checks
on the resulting relation. Remove tests that only inspected dbt logs.
@tejassp-db
Collaborator

/integration-test

github-actions Bot commented May 4, 2026

Integration tests dispatched for PR #1434 by @tejassp-db. Track progress in the Actions tab.

github-actions Bot commented May 4, 2026

Integration results for PR #1434 — UC cluster ❌ cancelled · SQL warehouse ❌ cancelled · All-purpose cluster ❌ failure

Run details.

Collaborator Author

sd-db commented May 8, 2026

/integration-test

github-actions Bot commented May 8, 2026

Integration tests dispatched for PR #1434 by @sd-db. Track progress in the Actions tab.

github-actions Bot commented May 8, 2026

Integration results for PR #1434 — UC cluster ⏩ skipped · SQL warehouse ⏩ skipped · All-purpose cluster ⏩ skipped · Shard coverage ⏩ skipped

Run details.

Collaborator Author

sd-db commented May 11, 2026

/integration-test

@github-actions

Integration tests dispatched for PR #1434 by @sd-db. Track progress in the Actions tab.

@github-actions

Integration results for PR #1434 — UC cluster ✅ success · SQL warehouse ✅ success · All-purpose cluster ✅ success · Shard coverage ✅ success

Run details.

Comment thread dbt/adapters/databricks/relation_configs/refresh.py
ON_UPDATE = "on_update"


CRON_REGEX = re.compile(r"^CRON '(.*)' AT TIME ZONE '(.*)'$")
Collaborator

Do we account for trailing comments that start with a hash?

Collaborator Author

CRON_REGEX is used to parse the relation config from the server side, i.e. the result of DESCRIBE EXTENDED. The format there is heavily sanitised and quite simple. CRON_REGEX is also the existing regex that is already used in production.
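
As a quick illustration of the reply above, the snippet below applies the quoted `CRON_REGEX` to a sanitised value and to the same value with a trailing hash comment. The sample strings are made up to fit the pattern; they are not captured `DESCRIBE EXTENDED` output.

```python
import re

# Pattern copied from the diff excerpt in this thread.
CRON_REGEX = re.compile(r"^CRON '(.*)' AT TIME ZONE '(.*)'$")

# Illustrative inputs (assumed shapes, not real server output).
clean = "CRON '0 0 * * * ? *' AT TIME ZONE 'America/Los_Angeles'"
with_comment = clean + "  # nightly"

match = CRON_REGEX.match(clean)
cron_expr, time_zone = match.groups()

# The pattern is anchored with `$` right after the closing quote,
# so anything following the time zone prevents a match.
rejected = CRON_REGEX.match(with_comment) is None
```

So the regex does not tolerate trailing comments, which is only acceptable if the server-side value never carries one, as the reply states.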

CRON_REGEX = re.compile(r"^CRON '(.*)' AT TIME ZONE '(.*)'$")
EVERY_REGEX = re.compile(r"^EVERY (\d+) (HOURS?|DAYS?|WEEKS?)$", re.IGNORECASE)
TRIGGER_REGEX = re.compile(
    r"^TRIGGER ON UPDATE(?: AT MOST EVERY INTERVAL (\d+) SECONDS?)?$",
Collaborator

It can be any valid time qualifier, such as MINUTES or HOURS. See the docs. Example:

CREATE OR REFRESH STREAMING TABLE catalog.schema.customer_orders
  TRIGGER ON UPDATE AT MOST EVERY INTERVAL 5 MINUTES
AS SELECT
    o.customer_id,
    o.name,
    o.order_id
FROM catalog.schema.orders o;

Collaborator Author

@sd-db sd-db May 11, 2026

This regex parses server-side values returned from DESCRIBE EXTENDED. The syntax is derived from real tests against a warehouse. While the YAML config can use multiple units, the server side always stores the interval in seconds.
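
To make the distinction in this thread concrete, here is a small sketch that runs the quoted `EVERY_REGEX` and `TRIGGER_REGEX` over a few strings. The diff excerpt cuts off before the end of the `TRIGGER_REGEX` call, so closing it without extra flags is an assumption, and `parse_server_schedule` is a hypothetical helper rather than the adapter's parser.

```python
import re
from typing import Optional

EVERY_REGEX = re.compile(r"^EVERY (\d+) (HOURS?|DAYS?|WEEKS?)$", re.IGNORECASE)
# The excerpt is truncated after the pattern; no extra flags is an assumption.
TRIGGER_REGEX = re.compile(
    r"^TRIGGER ON UPDATE(?: AT MOST EVERY INTERVAL (\d+) SECONDS?)?$"
)


def parse_server_schedule(text: str) -> Optional[dict]:
    """Hypothetical parser for a server-reported schedule string."""
    if m := EVERY_REGEX.match(text):
        return {"mode": "every", "every": f"{m.group(1)} {m.group(2).upper()}"}
    if m := TRIGGER_REGEX.match(text):
        # The interval group is absent for a bare TRIGGER ON UPDATE.
        seconds = int(m.group(1)) if m.group(1) else None
        return {"mode": "on_update", "at_most_every_seconds": seconds}
    return None  # not recognised by these two patterns


in_seconds = parse_server_schedule(
    "TRIGGER ON UPDATE AT MOST EVERY INTERVAL 900 SECONDS"
)
in_minutes = parse_server_schedule(
    "TRIGGER ON UPDATE AT MOST EVERY INTERVAL 15 MINUTES"
)
```

Here `in_seconds` parses while `in_minutes` is `None`. That is the behaviour the reviewer is asking about, and it is harmless only if the server really does report the interval in seconds every time.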
