
fix: support literal sha2() with 'Unsupported argument types' #3466

Open

0lai0 wants to merge 4 commits into apache:main from 0lai0:native_engine_crashes_on_sha2

Conversation


@0lai0 commented Feb 10, 2026

Which issue does this PR close?

Closes #3340
Part of #3328

Rationale for this change

When Spark's ConstantFolding optimizer rule is disabled, an all-literal call such as sha2('test', 256) reaches the native engine with its arguments as ScalarValue inputs rather than Arrow arrays, and the native sha2 implementation crashes with 'Unsupported argument types'.
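
For illustration only (not code from this PR), here is a minimal sketch of the shape such a call can take on the native side, assuming the standard DataFusion `ColumnarValue`/`ScalarValue` types and that the bit-length literal arrives as an `Int32`:

```rust
// Sketch: an all-literal sha2('test', 256) call as a DataFusion scalar
// function might see it when ConstantFolding is disabled. Both arguments
// are scalars rather than single-row Arrow arrays.
use datafusion_common::ScalarValue;
use datafusion_expr::ColumnarValue;

fn main() {
    let args = vec![
        ColumnarValue::Scalar(ScalarValue::Utf8(Some("test".to_string()))),
        ColumnarValue::Scalar(ScalarValue::Int32(Some(256))),
    ];

    for arg in &args {
        match arg {
            ColumnarValue::Scalar(s) => println!("scalar argument: {s}"),
            ColumnarValue::Array(_) => println!("array argument"),
        }
    }
}
```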

What changes are included in this PR?

Added a dedicated scalar execution path to handle literal inputs, preventing crashes when ConstantFolding is disabled.
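
A minimal sketch of one possible scalar path is shown below. It is not the code in this PR; the `sha2_scalar` helper name is hypothetical, and the use of the `sha2` crate and the Spark-style NULL handling are assumptions for illustration:

```rust
// Sketch (not the PR's actual code): a dedicated scalar path for sha2.
// When both arguments are literals, hash the scalar value directly and
// return a scalar result instead of assuming Arrow array inputs.
use datafusion_common::{DataFusionError, ScalarValue};
use datafusion_expr::ColumnarValue;
use sha2::{Digest, Sha224, Sha256, Sha384, Sha512};

fn sha2_scalar(value: &ScalarValue, bit_length: i32) -> Result<ColumnarValue, DataFusionError> {
    // Spark's sha2 returns NULL for a NULL input; mirror that here.
    let bytes: &[u8] = match value {
        ScalarValue::Utf8(Some(s)) => s.as_bytes(),
        ScalarValue::Binary(Some(b)) => b.as_slice(),
        ScalarValue::Utf8(None) | ScalarValue::Binary(None) => {
            return Ok(ColumnarValue::Scalar(ScalarValue::Utf8(None)))
        }
        other => {
            return Err(DataFusionError::Execution(format!(
                "Unsupported argument type for sha2: {other}"
            )))
        }
    };

    // Spark treats 0 as an alias for SHA-256 and returns NULL for other,
    // unsupported bit lengths.
    let digest = match bit_length {
        0 | 256 => Sha256::digest(bytes).to_vec(),
        224 => Sha224::digest(bytes).to_vec(),
        384 => Sha384::digest(bytes).to_vec(),
        512 => Sha512::digest(bytes).to_vec(),
        _ => return Ok(ColumnarValue::Scalar(ScalarValue::Utf8(None))),
    };

    // Spark's sha2 returns a lowercase hex string, not raw binary.
    let hex: String = digest.iter().map(|b| format!("{b:02x}")).collect();
    Ok(ColumnarValue::Scalar(ScalarValue::Utf8(Some(hex))))
}
```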

How are these changes tested?

cargo test -p datafusion-comet-spark-expr hash_funcs::sha2::tests

@andygrove
Member

Test failure example:

2026-02-13T16:58:50.3294698Z - hash functions *** FAILED *** (392 milliseconds)
2026-02-13T16:58:50.3299944Z   org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 3753.0 failed 1 times, most recent failure: Lost task 4.0 in stage 3753.0 (TID 12908) (localhost executor driver): org.apache.comet.CometNativeException: could not cast array of type Binary to arrow_array::array::byte_array::GenericByteArray<arrow_array::types::GenericStringType<i32>>.
2026-02-13T16:58:50.3302951Z This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues
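
For context (an illustration added here, not code from the PR), this is the kind of message a failed Arrow downcast produces, for example when a kernel yields a BinaryArray where the caller expects a Utf8 StringArray:

```rust
// Sketch of the failure mode behind the error above: a BinaryArray cannot be
// downcast to a StringArray, because Binary and Utf8 are distinct Arrow types
// even though both store variable-length bytes.
use std::sync::Arc;

use arrow::array::{Array, ArrayRef, BinaryArray, StringArray};

fn main() {
    let binary: ArrayRef = Arc::new(BinaryArray::from(vec![&b"test"[..]]));

    // Returns None; DataFusion's downcast helpers turn this into an error like
    // "could not cast array of type Binary to ... GenericStringType<i32>".
    let as_string = binary.as_any().downcast_ref::<StringArray>();
    assert!(as_string.is_none());
}
```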

@andygrove
Member

Thanks for looking into this @0lai0. I don't think we should implement a new version of Sha2. The version in this PR has less functionality than the upstream version that it is replacing (fewer data types supported). I looked at the latest upstream code and it does appear to support scalar arguments. I wonder if there were improvements in DF v52.0.0 (we have a PR for upgrading to that). If not, perhaps it would be better just to fall back to Spark if all arguments are scalar, by updating the Scala-side getSupportLevel checks.

Development

Successfully merging this pull request may close these issues.

Native engine crashes on literal sha2() with 'Unsupported argument types'
