Fix bounded-dask proximity crash on skinny rasters#2877
Merged
Conversation
_halo_depth() can return a pixel search radius larger than the raster height or width. That depth was passed straight to da.map_overlap, which rejects a depth larger than the array along an axis and raised "ValueError: The overlapping depth ... is larger than your array ...". Add _fit_halo_to_chunks(): when the halo is deeper than the smallest chunk on an axis, fold that axis into a single chunk and drop its depth to zero. Every chunk then sees the full axis, so no target within max_distance is missed and the result still matches the numpy backend. Applied to both the dask+numpy and dask+cupy map_overlap call sites. Add a regression test covering the 3x100 skinny raster across proximity, allocation, and direction, pinned against the numpy backend.
brendancol
commented
Jun 2, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Fix bounded-dask proximity crash on skinny rasters
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- xrspatial/proximity.py:307
_fit_halo_to_chunksrechunks an over-deep axis to a single chunk instead of just clamping the depth. That is the correct call: clamping the depth while keeping multiple chunks would silently drop targets that live in non-adjacent chunks. The cost is that the folded axis is fully materialised per task, which is cheap for the skinny axis that triggers it. Worth a one-line docstring note that the fold trades chunking on that axis for correctness, so a later reader does not "optimise" it back to a bare depth clamp.
What looks good
- The fix lives at the two map_overlap call sites (dask+numpy at proximity.py:1431, dask+cupy at proximity.py:614) and leaves
_halo_depthalone. The depth math is unchanged; only the chunking moves, and only when it has to. - Folding the over-deep axis to a single chunk and zeroing its depth is correct. Every chunk then sees the whole axis, so no target within max_distance is lost. Checked against numpy for the skinny case, a multi-chunk skinny case, and a great-circle skinny case near the pole.
- The common non-skinny path is untouched. A halo smaller than the chunk leaves the chunking alone, so there is no extra materialisation in the normal case.
- The numpy path reassigns
raster.data, which matches the unbounded branch right above it. The cupy path uses a localraster_datato avoid mutating the cupy-backed input. The asymmetry is deliberate. - Regression test covers proximity, allocation, and direction on the 3x100 raster, pinned against numpy with equal_nan.
Checklist
- Algorithm correctness: depth math unchanged; chunking fold is correct.
- Backend parity: dask+numpy and dask+cupy both fixed; numpy/cupy paths untouched. The dask+cupy fold is not exercised in CI (no GPU) but mirrors the numpy path.
- NaN handling: unchanged; boundary=np.nan still used.
- Edge cases: skinny, multi-chunk skinny, and great-circle skinny near the pole all checked.
- Dask chunk boundaries: handled by the rechunk-to-single-chunk fold.
- No premature materialisation in the common path.
- Docstring present on the new helper.
- README/benchmark: not applicable (pure bug fix, no new public API).
brendancol
commented
Jun 2, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after 9f2c39e)
The single nit from the first pass is addressed: _fit_halo_to_chunks now carries a docstring note that the single-chunk fold is deliberate and must not be swapped for a bare depth clamp (which would drop targets in non-adjacent chunks).
No new findings. No blockers or suggestions outstanding. The skinny-raster regression test still passes and flake8 is clean on the changed file.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2854
What
_halo_depth()can return a pixel search radius larger than the raster height or width. That depth went straight intoda.map_overlap, which rejects a depth larger than the array along an axis and raisedValueError: The overlapping depth ... is larger than your array .... Reproduced on a 3x100 raster with one target andmax_distance=10._fit_halo_to_chunks(): when the halo is deeper than the smallest chunk on an axis, fold that axis into a single chunk and set its depth to 0. Every chunk then sees the full axis, so no target withinmax_distanceis dropped and the result still matches the numpy backend. Applied to both the dask+numpy and dask+cupymap_overlapcall sites.Backend coverage
Fix touches the dask+numpy and dask+cupy paths. numpy and cupy (non-dask) paths are unaffected. The normal (non-skinny) dask path keeps its existing chunking, so there is no extra materialisation in the common case.
Test plan
test_bounded_dask_skinny_raster_matches_numpycovers the 3x100 skinny raster across proximity, allocation, and direction, pinned against the numpy backend.test_proximity.pypasses (282 tests).