[AURON #2076] Fix typos and unnecessary memory allocation in native engine#2077
[AURON #2076] Fix typos and unnecessary memory allocation in native engine#2077Deegue wants to merge 2 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses minor quality issues in the native engine by correcting typos in protobuf deserialization and removing eager string allocations caused by expect(&format!(...)) patterns.
Changes:
- Fix typoed error message strings in
pb_deserializer.rs(“scheam” → “schema”, “exits” → “exists”). - Replace
expect(&format!(...))withunwrap_or_else(...panic!(...))to avoid unconditional string allocation. - Apply the same allocation-avoidance pattern in shuffle and ORC exec paths.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| native-engine/datafusion-ext-plans/src/shuffle/buffered_data.rs | Avoids eager allocation when reporting hash evaluation failures. |
| native-engine/datafusion-ext-plans/src/orc_exec.rs | Avoids eager allocation in a test when resolving schema column indices. |
| native-engine/datafusion-ext-plans/src/flink/serde/pb_deserializer.rs | Fixes typoed diagnostics and removes eager allocation from multiple expect(&format!(...)) calls. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| schema | ||
| .index_of(col_name) | ||
| .expect(&format!("Column '{col_name}' not found")), | ||
| .unwrap_or_else(|_| panic!("Column '{col_name}' not found")), |
There was a problem hiding this comment.
schema.index_of(col_name) returns an error that’s currently discarded (|_|). Consider including the error value in the panic message to make test failures easier to diagnose.
| .unwrap_or_else(|_| panic!("Column '{col_name}' not found")), | |
| .unwrap_or_else(|e| panic!("Column '{col_name}' not found: {e}")), |
| .unwrap_or_else(|| { | ||
| panic!("nested field {msg_field_name} not exists in message_descriptor") | ||
| }); |
There was a problem hiding this comment.
Grammar: the panic message says "not exists". Consider changing to "does not exist" for clearer/grammatically correct diagnostics (e.g., "nested field ... does not exist in message_descriptor").
| .unwrap_or_else(|| { | ||
| panic!( | ||
| "nested innermost field {field_name} not exists in message_descriptor" | ||
| ) | ||
| }); |
There was a problem hiding this comment.
Grammar: the panic message says "not exists". Consider changing to "does not exist" for clearer/grammatically correct diagnostics.
| .get_field_by_name(field.name()) | ||
| .expect(&format!("{} not exits in message_descriptor", field.name())); | ||
| .unwrap_or_else(|| panic!("{} not exists in message_descriptor", field.name())); | ||
| pb_schema_fields.push(create_arrow_field(msg_field_desc.clone(), skip_fields)); |
There was a problem hiding this comment.
Grammar: the panic message says "not exists". Consider changing to "does not exist" for clearer/grammatically correct diagnostics.
| .expect(&format!( | ||
| "Field {field_name} not exits in message_descriptor", | ||
| )); | ||
| .unwrap_or_else(|| panic!("Field {field_name} not exists in message_descriptor",)); |
There was a problem hiding this comment.
Grammar: the panic message says "not exists". Consider changing to "does not exist" for clearer/grammatically correct diagnostics.
| .unwrap_or_else(|| panic!("Field {field_name} not exists in message_descriptor",)); | |
| .unwrap_or_else(|| panic!("Field {field_name} does not exist in message_descriptor",)); |
| let hashes = evaluate_hashes(partitioning, &batch) | ||
| .expect(&format!("error evaluating hashes with {partitioning}")); | ||
| .unwrap_or_else(|_| panic!("error evaluating hashes with {partitioning}")); | ||
| evaluate_partition_ids(hashes, partitioning.partition_count()) |
There was a problem hiding this comment.
The error from evaluate_hashes is being discarded (|_|). Consider capturing the error (|e|) and including it in the panic message so failures have actionable context without reintroducing an eager allocation.
ShreyeshArangath
left a comment
There was a problem hiding this comment.
panicking seems a little aggresive, can you explain the rationale a little more here?
| let msg_field_desc = message_descriptor | ||
| .get_field_by_name(msg_field_name) | ||
| .unwrap_or_else(|| { | ||
| panic!("nested field {msg_field_name} not exists in message_descriptor") |
There was a problem hiding this comment.
This applies to the other instances too
| panic!("nested field {msg_field_name} not exists in message_descriptor") | |
| panic!("nested field {msg_field_name} does not exist in message_descriptor") |
Which issue does this PR close?
Closes #2076
Rationale for this change
Fix typos, improve unnecessary memory usage.
What changes are included in this PR?
Fix typos.
Use
unwrap_or_else(|| panicinstead ofexpect(&format.Are there any user-facing changes?
No.
How was this patch tested?
UT & manual tests