Make parquet/datafusion memory settings configurable#21547
Make parquet/datafusion memory settings configurable#21547gaurav-amz wants to merge 2 commits intoopensearch-project:mainfrom
Conversation
PR Reviewer Guide 🔍(Review updated until commit a40037f)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to a40037f Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit feaa5f0
Suggestions up to commit 1b1eb34
|
|
❌ Gradle check result for 1b1eb34: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Persistent review updated to latest commit feaa5f0 |
|
❌ Gradle check result for feaa5f0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
- Add parquet.min_native_allocation / max_native_allocation_ceiling companions for the existing parquet.max_native_allocation percentage setting; ArrowBufferPool now clamps the resolved value when configured as a percentage. - Add parquet.arrow.child_allocation, replacing the hardcoded root/10 per-VSR cap in ArrowBufferPool. - Convert datafusion.memory_pool_limit_bytes and spill_memory_limit_bytes from longSetting(maxMemory/N) heap-based defaults to simpleString percentage of non-heap memory, with .min / .max companions for floor/ceiling clamping. - Mirrors the indices.memory.native_index_buffer_size pattern in IndexingMemoryController, the only existing OpenSearch precedent for non-heap percentage memory settings. Signed-off-by: Gaurav Singh <snghsvn@amazon.com>
- Apply spotless formatting to ArrowBufferPool and DataFusionPlugin to fix the gradle check failure (lines that exceeded the column budget were re-flowed by spotless). - Add a setting-time validator on parquet.max_native_allocation, datafusion.memory_pool_limit_bytes, and datafusion.spill_memory_limit_bytes so malformed values (e.g. "abc%", "10xb") are rejected at update time instead of surfacing as parser exceptions on the next read. - Snapshot the static memory_pool_limit min/max companions at plugin startup; the dynamic-update listener now resolves new limits against this snapshot rather than re-reading ClusterService#getSettings(), which can return stale or default values during transient cluster-state propagation. Signed-off-by: Gaurav Singh <snghsvn@amazon.com>
feaa5f0 to
a40037f
Compare
|
Persistent review updated to latest commit a40037f |
|
❌ Gradle check result for a40037f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
[Describe what this change achieves]
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.