Skip to content

Restore Sort unparser guard for correct ORDER BY placement (v2)#20923

Closed
alamb wants to merge 4 commits intoapache:mainfrom
alamb:viktor/fix-unparser-origin
Closed

Restore Sort unparser guard for correct ORDER BY placement (v2)#20923
alamb wants to merge 4 commits intoapache:mainfrom
alamb:viktor/fix-unparser-origin

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Mar 13, 2026

Which issue does this PR close?

Rationale for this change

Fixes a regression in the unparser introduced in Datafusion 52, as described by @krinart in #20658

I can't fix #20658 because I can't push to the fork, so I made a new PR to resolve a merge conflict and apply some suggestions

What changes are included in this PR?

  1. Start with Restore Sort unparser guard for correct ORDER BY placement #20658
  2. Merge up from main and resolve a conflict

Are these changes tested?

By CI and new tests

Are there any user-facing changes?

Bug fix

@alamb alamb changed the title Restore Sort unparser guard for correct ORDER BY placement (replacement) Restore Sort unparser guard for correct ORDER BY placement (v2) Mar 13, 2026
@krinart
Copy link
Contributor

krinart commented Mar 14, 2026

Apologies for not updating original PR - was busy last couple days.

Thanks @alamb!

@alamb
Copy link
Contributor Author

alamb commented Mar 18, 2026

I am hoping we can get the original PR in

(I can't merge my own pull request without another committer).

@alamb alamb closed this Mar 18, 2026
github-merge-queue bot pushed a commit that referenced this pull request Mar 21, 2026
## Which issue does this PR close?

- closes #20905
- closes #20923

Restore the `already_projected()` guard in the Sort case of
`select_to_sql_recursively`. Without it, queries with ORDER BY
expressions generate invalid SQL when the Sort node follows an outer
Projection.

## Problem ##

Two regressions in the DuckDB federation unparser after upgrading to
DataFusion 52:

1. clickbench q25: Queries like `SELECT "SearchPhrase" ... ORDER BY
to_timestamp("EventTime") LIMIT 10` produce ORDER BY outside the
subquery, referencing table names ("hits") that are out of scope.
2. tpcds q36: ORDER BY with `GROUPING()` expressions loses the
`derived_sort` subquery alias, placing sort references outside their
valid scope.

## Root Cause ##

DF52's optimizer merges `Limit` into `Sort` as a fetch parameter,
changing the plan shape from `Projection → Limit → Sort → ...` to
`Projection → Sort(fetch) → ....` The Sort case previously had a guard
that wrapped the sort into a `derived_sort` subquery when
`already_projected()` was true. This guard was removed in DF52, causing
ORDER BY and LIMIT to be placed on the outer query where inner table
references are invalid.

## Fix ##

Re-add the guard at the top of the Sort match arm in
select_to_sql_recursively:

```rust
 if select.already_projected() {
     return self.derive_with_dialect_alias("derived_sort", plan, relation, false, vec![]);
 }
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect ORDER BY in unparser after DF 52 upgrade

2 participants