[SPARK-56328][SQL] Fix inline table collation handling for INSERT VALUES and DEFAULT COLLATION #55160
Open
ilicmarkodb wants to merge 1 commit into apache:master
What changes were proposed in this pull request?
This PR fixes two related issues with how collations interact with inline tables (VALUES clauses):
1. Eager evaluation bypasses DEFAULT COLLATION for CREATE TABLE/VIEW
Inline tables are eagerly evaluated during parsing for performance. But when an inline table appears inside `CREATE TABLE ... DEFAULT COLLATION UTF8_LCASE AS SELECT * FROM VALUES ('a') AS T(c1)`, the default collation must be applied to string literals during analysis. Since eager evaluation happens before analysis, the collation was lost.
The fix adds `canEagerlyEvaluateInlineTable`, which prevents eager evaluation when the inline table is inside a CREATE TABLE/VIEW statement and contains string literals that need collation resolution.
2. INSERT INTO VALUES fails with INCOMPATIBLE_TYPES_IN_INLINE_TABLE for collated columns
When using `INSERT INTO ... VALUES` with collated columns, the inline table resolution could fail because values in the same column end up with different collations. This happens when:
- `ResolveColumnDefaultInCommandInputQuery` resolves `DEFAULT` to a typed null with the target column's collation, which differs from other literals' collation
- `COLLATE` on values produces mismatched collations across rows
The fix adds an `ignoreCollation` parameter to `EvaluateUnresolvedInlineTable` that strips collations from input types before finding the common type. This is safe for INSERT because the INSERT coercion will cast each value to the target column's type, including collation.
The collation stripping is applied only when the inline table is the direct VALUES clause of an INSERT statement:
- `isInlineTableInsideInsertValuesClause` walks up the parser context tree to detect `INSERT INTO t VALUES (...)` vs `INSERT INTO t SELECT * FROM VALUES (...) AS T`
- `ResolveInlineTables` pattern-matches `InsertIntoStatement` with a direct `UnresolvedInlineTable` query child
Standalone `SELECT * FROM VALUES (...)` and CTAS with conflicting explicit collations continue to fail as expected.
Why are the changes needed?
Without this fix:
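To make the second failure mode concrete, here is a minimal toy model in plain Scala with no Spark dependency. It is not the code in this PR: `StringT`, `InlineTableTypes`, and its methods are hypothetical names, the "all values must agree" rule is a deliberate simplification of Spark's real collation rules, and treating `UTF8_BINARY` as the collation-free type is an assumption of the sketch. It only illustrates why mismatched collations in one column block common-type resolution, and why stripping them is harmless when every value is later cast to the target column's type.

```scala
// Toy model only: a string type tagged with a collation name.
final case class StringT(collation: String)

object InlineTableTypes {
  // Assumption of this sketch: stripping a collation yields UTF8_BINARY.
  val Uncollated: StringT = StringT("UTF8_BINARY")

  // Simplified common-type check: every value in the column must have the
  // same collation, otherwise resolution fails (an empty column also fails).
  def commonType(column: Seq[StringT]): Either[String, StringT] =
    column.distinct match {
      case Seq(single) => Right(single)
      case _           => Left("INCOMPATIBLE_TYPES_IN_INLINE_TABLE")
    }

  // Models the ignoreCollation idea: strip collations before comparing.
  def commonTypeIgnoringCollation(column: Seq[StringT]): Either[String, StringT] =
    commonType(column.map(_ => Uncollated))

  // Models the later INSERT coercion: each value is cast to the target type,
  // so the collation stripped above is restored from the target column.
  def coerceForInsert(column: Seq[StringT], target: StringT): Seq[StringT] =
    column.map(_ => target)
}
```

In this model, a column holding `StringT("UTF8_LCASE")` (for example, a typed null produced for `DEFAULT`) next to `StringT("UNICODE")` fails `commonType`, passes `commonTypeIgnoringCollation`, and ends up entirely in the target column's collation after `coerceForInsert`.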
Does this PR introduce any user-facing change?
Yes.
- `CREATE TABLE/VIEW ... DEFAULT COLLATION ... AS SELECT * FROM VALUES (...)` now correctly applies the default collation to inline table literals.
- `INSERT INTO ... VALUES` with collated columns now succeeds in cases that previously failed with `INCOMPATIBLE_TYPES_IN_INLINE_TABLE`.
How was this patch tested?
New tests in `CollationSuite` covering both eager and non-eager evaluation paths (`EAGER_EVAL_OF_UNRESOLVED_INLINE_TABLE_ENABLED = true/false`):
New single-column inline table test variants in `DefaultCollationTestSuite` for CTAS and CREATE VIEW with DEFAULT COLLATION.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code
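Supplementary sketch: the description distinguishes `INSERT INTO t VALUES (...)` from `INSERT INTO t SELECT * FROM VALUES (...) AS T` by walking up the parser context tree. The toy model below shows one way such a walk could look. It does not use Spark's real parser classes: `Ctx`, the rule names `"insert"` and `"select"`, and `isDirectInsertValues` are all hypothetical stand-ins chosen for illustration.

```scala
// Hypothetical parse-tree node: a rule name plus an optional parent context.
final case class Ctx(rule: String, parent: Option[Ctx] = None)

object InsertValuesDetection {
  // Walk up from the inline table's context. Return true only if an INSERT
  // context is reached before any SELECT context; a SELECT in between means
  // the VALUES clause is a FROM-clause relation, not the direct INSERT source.
  @annotation.tailrec
  def isDirectInsertValues(ctx: Ctx): Boolean = ctx.parent match {
    case None                          => false
    case Some(p) if p.rule == "insert" => true
    case Some(p) if p.rule == "select" => false
    case Some(p)                       => isDirectInsertValues(p)
  }
}
```

With this model, an inline table whose ancestors reach `"insert"` without passing through `"select"` is treated as the direct VALUES clause, while one nested under a `"select"` node is not, which mirrors the rule that collation stripping applies only to the former.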