-
Notifications
You must be signed in to change notification settings - Fork 307
Fix: map_from_arrays() with NULL inputs causes native crash #3356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 22 commits
768b3e9
c68c342
d887555
231aa90
9500bbb
9577481
3791557
7c2f082
609a605
a151b2c
ad3e7f5
ea92e4b
8dfeca3
559741e
ebda14e
408152e
d7857b2
aef41be
5ac1c58
9ae8e23
55aa1fe
faffcbc
d82900a
5ca3888
160a817
88fc313
ffa68f6
e14c180
e353c82
4b015fa
b5458ab
c831844
16d50e2
4f9e165
b9b9d2e
bf7ea20
47e4aa2
ff6eb1c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -26,9 +26,7 @@ INSERT INTO test_map_from_arrays VALUES (array('a', 'b', 'c'), array(1, 2, 3)), | |
| query spark_answer_only | ||
| SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NOT NULL | ||
|
|
||
| -- Comet bug: map_from_arrays(NULL, NULL) causes native crash "map key cannot be null" | ||
| -- https://github.com/apache/datafusion-comet/issues/3327 | ||
| query ignore(https://github.com/apache/datafusion-comet/issues/3327) | ||
| query spark_answer_only | ||
| SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NULL | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you also add tests where
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it will be the same behavior: in the original data, value == null when key == null. statement
INSERT INTO test_map_from_arrays VALUES (array('a', 'b', 'c'), array(1, 2, 3)), (array(), array()), (NULL, NULL)
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ran some additional tests locally and found that the null check needs to cover both inputs, not just the keys. The Spark docs say I added the row Here's a suggested updated test file that covers the missing cases. Note that I did use AI to generate these tests, so it is possible that there are mistakes. -- ConfigMatrix: parquet.enable.dictionary=false,true
statement
CREATE TABLE test_map_from_arrays(k array<string>, v array<int>) USING parquet
statement
INSERT INTO test_map_from_arrays VALUES
(array('a', 'b', 'c'), array(1, 2, 3)),
(array(), array()),
(NULL, NULL),
(array('x'), NULL),
(NULL, array(99))
-- basic functionality
query spark_answer_only
SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NOT NULL AND v IS NOT NULL
-- both inputs NULL should return NULL
query
SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NULL AND v IS NULL
-- keys not null but values null should return NULL (Spark behavior)
query
SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NOT NULL AND v IS NULL
-- keys null but values not null should return NULL (Spark behavior)
query
SELECT map_from_arrays(k, v) FROM test_map_from_arrays WHERE k IS NULL AND v IS NOT NULL
-- all rows including nulls
query spark_answer_only
SELECT map_from_arrays(k, v) FROM test_map_from_arrays
-- literal arguments
query spark_answer_only
SELECT map_from_arrays(array('a', 'b'), array(1, 2))
-- literal null arguments
query
SELECT map_from_arrays(NULL, array(1, 2))
query
SELECT map_from_arrays(array('a'), NULL)
query
SELECT map_from_arrays(NULL, NULL)To fix the serde, the condition should check both inputs for null. Something like
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tanks @andygrove, fixed this. |
||
|
|
||
| -- literal arguments | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fully native?