Make allocation/direction tie-breaking deterministic across backends#2881
Merged
Conversation
…2856) When two or more targets are equidistant from a pixel, allocation and direction now pick the target with the lowest flat (row-major) index on every backend. This is the policy the brute-force CPU path and the CUDA kernel already used; the NumPy line-sweep and the Dask KDTree path disagreed with it. - Route ALLOCATION and DIRECTION through the exact brute-force search on the numpy and dask+numpy backends. The line-sweep broke ties as a side effect of its four-pass propagation order. PROXIMITY keeps the faster line-sweep since the distance is identical for tied targets. - Pin the Dask KDTree query to the lowest target index on a tie. cKDTree.query does not promise which equidistant neighbour it returns; the new helper queries the two nearest and keeps the smaller index when they tie. Targets are stored in row-major order, so the smaller index is the lowest flat index. - Document the tie-break policy in the allocation and direction docstrings. - Add cross-backend tests pinning the tie outcome for allocation and direction, plus a test for the eager tiled KDTree fallback.
brendancol
commented
Jun 2, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Make allocation/direction tie-breaking deterministic across backends
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- Performance tradeoff on the unbounded numpy/dask+numpy path:
_process_numpynow sends ALLOCATION and DIRECTION through_process_numpy_bruteforce, which is O(N*T): every pixel scans every target. The line-sweep it replaces is roughly O(N). For a raster with many target pixels and nomax_distancebound, that is a real slowdown. It buys a deterministic cross-backend tie-break and matches what the CUDA kernel already does, so the tradeoff is reasonable, but a one-line code comment would tell the next reader the cost was deliberate. (xrspatial/proximity.py,_process_numpy)
Nits (optional improvements)
_kdtree_query_lowest_indexonly resolves 2-way ties, since it queries k=2. A pixel equidistant to 3 or more targets could still land on a non-minimal index if cKDTree returns the 2nd and 3rd nearest as the top two. The global KDTree path stores targets in row-major order and cKDTree tends to return the lower index first, so this is unlikely to bite in practice, but a short comment noting the k=2 limitation would set expectations. (xrspatial/proximity.py,_kdtree_query_lowest_index)
What looks good
- The tie-break policy (lowest flat row-major index) matches the brute-force and CUDA backends that already existed, so this aligns the two stragglers rather than inventing a new rule.
- Both KDTree entry points (global lazy
_kdtree_chunk_fnand eager tiled_tiled_chunk_query) go through the same helper, so the fallback path is covered. - The
np.isfinite(dists2[:, 1])guard avoids treating an out-of-range second neighbour (inf distance underdistance_upper_bound) as a tie. - Tests pin the actual tie outcome across all four backends for both allocation and direction, plus a dedicated test for the tiled fallback. The direction test pins to the numpy reference rather than a hand-written grid, which is reasonable for an angle.
- Docstrings on both public functions now state the policy.
Checklist
- Algorithm matches reference/policy (lowest flat index, matches brute-force and CUDA)
- All implemented backends produce consistent results
- NaN / out-of-range handling is correct (isfinite guard on the tie mask)
- Edge cases covered by tests (tie on a 3x3, tiled fallback)
- Dask chunk boundaries handled correctly (targets collected in row-major order)
- Performance: brute-force on unbounded numpy alloc/direction is slower than the line-sweep; deliberate tradeoff for determinism
- Benchmark not needed (bug fix, no new function)
- README feature matrix unchanged (no new function, no backend change)
- Docstrings present and accurate
- Note in _process_numpy that brute-force is O(N*T) versus the line-sweep's O(N), a deliberate trade for a cross-backend tie-break. - Note in _kdtree_query_lowest_index that the k=2 query resolves 2-way ties and relies on row-major target order for 3-or-more-way ties.
brendancol
commented
Jun 2, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review
Both items from the first pass are addressed in 8ddf938 (comments only, no behavior change):
- The brute-force performance trade is now spelled out in
_process_numpy: O(N*T) versus the line-sweep's O(N), chosen so allocation and direction break ties the same way on every backend. _kdtree_query_lowest_indexnow states that the k=2 query covers 2-way ties and leans on the row-major target order for the rarer 3-or-more-way case.
No new findings. Behavior is unchanged from the first pass; the proximity suite still passes (288 tests). No blockers.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2856
When two or more targets are equidistant from a pixel,
allocationanddirectioncould return different results depending on the backend. The distance is the same for tied targets, but the chosen target was not stable. This pins one policy everywhere: on a tie, the target with the lowest flat (row-major) index wins. That is what the brute-force CPU path and the CUDA kernel already did.Changes
cKDTree.querydoes not promise which of several equidistant neighbours it returns, so a helper queries the two nearest and keeps the smaller index when they tie. Targets are stored in row-major order, so the smaller index is the lowest flat index. This covers both the global (lazy) and tiled (eager fallback) KDTree paths.allocationanddirectiondocstrings.Backend coverage
numpy, cupy, dask+numpy, dask+cupy all resolve ties the same way. The unbounded dask+cupy path converts to dask+numpy and uses the KDTree; the bounded dask+cupy path uses the CUDA kernel, which already followed the policy.
Test plan
allocation(exact expected grid) anddirection(pinned to the numpy reference) over all four backends.test_proximity.pysuite passes (288 tests).