Restore Sort unparser guard for correct ORDER BY placement (v2)#20923
Closed
alamb wants to merge 4 commits intoapache:mainfrom
Closed
Restore Sort unparser guard for correct ORDER BY placement (v2)#20923alamb wants to merge 4 commits intoapache:mainfrom
alamb wants to merge 4 commits intoapache:mainfrom
Conversation
26 tasks
Contributor
|
Apologies for not updating original PR - was busy last couple days. Thanks @alamb! |
Contributor
Author
|
I am hoping we can get the original PR in (I can't merge my own pull request without another committer). |
github-merge-queue bot
pushed a commit
that referenced
this pull request
Mar 21, 2026
## Which issue does this PR close? - closes #20905 - closes #20923 Restore the `already_projected()` guard in the Sort case of `select_to_sql_recursively`. Without it, queries with ORDER BY expressions generate invalid SQL when the Sort node follows an outer Projection. ## Problem ## Two regressions in the DuckDB federation unparser after upgrading to DataFusion 52: 1. clickbench q25: Queries like `SELECT "SearchPhrase" ... ORDER BY to_timestamp("EventTime") LIMIT 10` produce ORDER BY outside the subquery, referencing table names ("hits") that are out of scope. 2. tpcds q36: ORDER BY with `GROUPING()` expressions loses the `derived_sort` subquery alias, placing sort references outside their valid scope. ## Root Cause ## DF52's optimizer merges `Limit` into `Sort` as a fetch parameter, changing the plan shape from `Projection → Limit → Sort → ...` to `Projection → Sort(fetch) → ....` The Sort case previously had a guard that wrapped the sort into a `derived_sort` subquery when `already_projected()` was true. This guard was removed in DF52, causing ORDER BY and LIMIT to be placed on the outer query where inner table references are invalid. ## Fix ## Re-add the guard at the top of the Sort match arm in select_to_sql_recursively: ```rust if select.already_projected() { return self.derive_with_dialect_alias("derived_sort", plan, relation, false, vec![]); } ```
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
Fixes a regression in the unparser introduced in Datafusion 52, as described by @krinart in #20658
I can't fix #20658 because I can't push to the fork, so I made a new PR to resolve a merge conflict and apply some suggestions
What changes are included in this PR?
Are these changes tested?
By CI and new tests
Are there any user-facing changes?
Bug fix