[common] Deep copy BinaryString in createDeepFieldGetter to fix use-after-free#3045
Closed
YannByron wants to merge 1 commit intoapache:mainfrom
…fter-free

IndexedRow.getString() returns a zero-copy BinaryString view into the underlying MemorySegment. When DefaultCompletedFetch.drain() releases the backing Netty ByteBuf, these BinaryString views become dangling references, causing data corruption on platforms that eagerly reclaim freed memory (e.g., Linux CI).

This fixes the flaky test RemoteLogScannerITCase#testScanFromRemoteAndProject by adding a STRING case to createDeepFieldGetter and createDeepElementGetter that calls BinaryString.copy() to materialize the string data into independent heap memory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
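The failure mode described above can be reproduced in miniature without Netty. The sketch below is only an illustration: `StringView` and `DanglingViewDemo` are hypothetical stand-ins (not the project's `BinaryString` or `MemorySegment`), and overwriting a plain `byte[]` simulates a released buffer being handed to another writer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a zero-copy string view; NOT the real BinaryString.
final class StringView {
    private final byte[] segment; // backing memory, possibly shared with a buffer pool
    private final int offset;
    private final int length;

    StringView(byte[] segment, int offset, int length) {
        this.segment = segment;
        this.offset = offset;
        this.length = length;
    }

    // Deep copy: materialize the viewed bytes into a private heap array.
    StringView copy() {
        byte[] own = Arrays.copyOfRange(segment, offset, offset + length);
        return new StringView(own, 0, length);
    }

    @Override
    public String toString() {
        return new String(segment, offset, length, StandardCharsets.UTF_8);
    }
}

final class DanglingViewDemo {
    // Returns {text seen through the view, text seen through the copy}
    // after the shared buffer has been recycled.
    static String[] readAfterRecycle() {
        byte[] buffer = "hello".getBytes(StandardCharsets.UTF_8);
        StringView view = new StringView(buffer, 0, buffer.length);
        StringView copy = view.copy();

        // Simulate the released buffer being reused by another writer.
        Arrays.fill(buffer, (byte) 'X');

        return new String[] {view.toString(), copy.toString()};
    }

    public static void main(String[] args) {
        String[] result = readAfterRecycle();
        System.out.println("view=" + result[0] + " copy=" + result[1]);
        // prints "view=XXXXX copy=hello"
    }
}
```

In this toy model the corruption is deterministic. With a real pooled buffer it depends on when the memory is reused, which is consistent with the test failing only intermittently.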
luoyuxia
reviewed
Apr 10, 2026
 * array/map/row types.
 *
 * <p>NOTE: Currently, it is only used for deep copying {@link ColumnarRow} for Arrow which
 * avoid the arrow buffer is released before accessing elements. It doesn't deep copy STRING and
Contributor
Sorry for missing this part. From the comment, it seems STRING does not need to be copied. I'm also afraid that copying strings will degrade performance.
Contributor
Author
Thanks for the review @luoyuxia and @fresh-borzoni!
@luoyuxia #3008 has a more detailed explanation on why
@fresh-borzoni Thanks for pointing out #3008! Your approach with the
Summary
- Fixes the flaky test RemoteLogScannerITCase#testScanFromRemoteAndProject (Issue #2992: [test] Unstable test RemoteLogScannerITCase.testScanFromRemoteAndProject).
- IndexedRow.getString() returns a zero-copy BinaryString view into the underlying MemorySegment. When DefaultCompletedFetch.drain() releases the backing Netty ByteBuf, these views become dangling references, causing data corruption on platforms that eagerly reclaim freed memory (e.g., Linux CI).
- Adds a STRING case to createDeepFieldGetter and createDeepElementGetter that calls BinaryString.copy() to materialize string data into independent heap memory.

Test plan
- RemoteLogScannerITCase#testScanFromRemoteAndProject passes consistently (3/3 runs).
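
A rough sketch of the getter-level change, under heavy assumptions: the names `TypeTag`, `Utf8View`, and the `Object[]` row model are hypothetical and chosen only for illustration, and the real createDeepFieldGetter signature is not shown on this page. The idea is that the STRING branch wraps the shallow getter and copies the value, while other types pass through unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Function;

final class DeepGetterSketch {
    // Hypothetical type tags; the real code dispatches on the column's data type.
    enum TypeTag { INT, STRING }

    // Hypothetical zero-copy string view over shared bytes (stand-in for BinaryString).
    static final class Utf8View {
        final byte[] bytes;

        Utf8View(byte[] bytes) {
            this.bytes = bytes;
        }

        // Copies the bytes so the result no longer depends on the shared array.
        Utf8View copy() {
            return new Utf8View(Arrays.copyOf(bytes, bytes.length));
        }

        @Override
        public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    // A row is modeled as Object[] purely for illustration.
    static Function<Object[], Object> createDeepFieldGetter(TypeTag type, int pos) {
        Function<Object[], Object> shallow = row -> row[pos];
        switch (type) {
            case STRING:
                // Copy on read so the value survives release of the backing buffer.
                return row -> {
                    Object value = shallow.apply(row);
                    return value == null ? null : ((Utf8View) value).copy();
                };
            default:
                // Other types in this sketch are returned as-is.
                return shallow;
        }
    }
}
```

The copy costs one allocation per string field read, which is the performance trade-off raised in the review.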