Skip to content

Make allocation/direction tie-breaking deterministic across backends#2881

Merged
brendancol merged 2 commits into
mainfrom
issue-2856
Jun 3, 2026
Merged

Make allocation/direction tie-breaking deterministic across backends#2881
brendancol merged 2 commits into
mainfrom
issue-2856

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2856

When two or more targets are equidistant from a pixel, allocation and direction could return different results depending on the backend. The distance is the same for tied targets, but the chosen target was not stable. This pins one policy everywhere: on a tie, the target with the lowest flat (row-major) index wins. That is what the brute-force CPU path and the CUDA kernel already did.

Changes

  • ALLOCATION and DIRECTION now run through the exact brute-force search on the numpy and dask+numpy backends. The line-sweep broke ties as a side effect of its four-pass propagation order. PROXIMITY keeps the faster line-sweep, since the distance is identical for tied targets.
  • The Dask KDTree query is pinned to the lowest target index on a tie. cKDTree.query does not promise which of several equidistant neighbours it returns, so a helper queries the two nearest and keeps the smaller index when they tie. Targets are stored in row-major order, so the smaller index is the lowest flat index. This covers both the global (lazy) and tiled (eager fallback) KDTree paths.
  • The policy is documented in the allocation and direction docstrings.

Backend coverage

numpy, cupy, dask+numpy, dask+cupy all resolve ties the same way. The unbounded dask+cupy path converts to dask+numpy and uses the KDTree; the bounded dask+cupy path uses the CUDA kernel, which already followed the policy.

Test plan

  • New cross-backend tests pin the tie outcome for allocation (exact expected grid) and direction (pinned to the numpy reference) over all four backends.
  • New test forces the eager tiled KDTree fallback and checks the tie outcome.
  • Full test_proximity.py suite passes (288 tests).

…2856)

When two or more targets are equidistant from a pixel, allocation and
direction now pick the target with the lowest flat (row-major) index on
every backend. This is the policy the brute-force CPU path and the CUDA
kernel already used; the NumPy line-sweep and the Dask KDTree path
disagreed with it.

- Route ALLOCATION and DIRECTION through the exact brute-force search on
  the numpy and dask+numpy backends. The line-sweep broke ties as a side
  effect of its four-pass propagation order. PROXIMITY keeps the faster
  line-sweep since the distance is identical for tied targets.
- Pin the Dask KDTree query to the lowest target index on a tie.
  cKDTree.query does not promise which equidistant neighbour it returns;
  the new helper queries the two nearest and keeps the smaller index when
  they tie. Targets are stored in row-major order, so the smaller index is
  the lowest flat index.
- Document the tie-break policy in the allocation and direction docstrings.
- Add cross-backend tests pinning the tie outcome for allocation and
  direction, plus a test for the eager tiled KDTree fallback.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Make allocation/direction tie-breaking deterministic across backends

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • Performance tradeoff on the unbounded numpy/dask+numpy path: _process_numpy now sends ALLOCATION and DIRECTION through _process_numpy_bruteforce, which is O(N*T): every pixel scans every target. The line-sweep it replaces is roughly O(N). For a raster with many target pixels and no max_distance bound, that is a real slowdown. It buys a deterministic cross-backend tie-break and matches what the CUDA kernel already does, so the tradeoff is reasonable, but a one-line code comment would tell the next reader the cost was deliberate. (xrspatial/proximity.py, _process_numpy)

Nits (optional improvements)

  • _kdtree_query_lowest_index only resolves 2-way ties, since it queries k=2. A pixel equidistant to 3 or more targets could still land on a non-minimal index if cKDTree returns the 2nd and 3rd nearest as the top two. The global KDTree path stores targets in row-major order and cKDTree tends to return the lower index first, so this is unlikely to bite in practice, but a short comment noting the k=2 limitation would set expectations. (xrspatial/proximity.py, _kdtree_query_lowest_index)

What looks good

  • The tie-break policy (lowest flat row-major index) matches the brute-force and CUDA backends that already existed, so this aligns the two stragglers rather than inventing a new rule.
  • Both KDTree entry points (global lazy _kdtree_chunk_fn and eager tiled _tiled_chunk_query) go through the same helper, so the fallback path is covered.
  • The np.isfinite(dists2[:, 1]) guard avoids treating an out-of-range second neighbour (inf distance under distance_upper_bound) as a tie.
  • Tests pin the actual tie outcome across all four backends for both allocation and direction, plus a dedicated test for the tiled fallback. The direction test pins to the numpy reference rather than a hand-written grid, which is reasonable for an angle.
  • Docstrings on both public functions now state the policy.

Checklist

  • Algorithm matches reference/policy (lowest flat index, matches brute-force and CUDA)
  • All implemented backends produce consistent results
  • NaN / out-of-range handling is correct (isfinite guard on the tie mask)
  • Edge cases covered by tests (tie on a 3x3, tiled fallback)
  • Dask chunk boundaries handled correctly (targets collected in row-major order)
  • Performance: brute-force on unbounded numpy alloc/direction is slower than the line-sweep; deliberate tradeoff for determinism
  • Benchmark not needed (bug fix, no new function)
  • README feature matrix unchanged (no new function, no backend change)
  • Docstrings present and accurate

- Note in _process_numpy that brute-force is O(N*T) versus the line-sweep's
  O(N), a deliberate trade for a cross-backend tie-break.
- Note in _kdtree_query_lowest_index that the k=2 query resolves 2-way ties
  and relies on row-major target order for 3-or-more-way ties.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review

Both items from the first pass are addressed in 8ddf938 (comments only, no behavior change):

  • The brute-force performance trade is now spelled out in _process_numpy: O(N*T) versus the line-sweep's O(N), chosen so allocation and direction break ties the same way on every backend.
  • _kdtree_query_lowest_index now states that the k=2 query covers 2-way ties and leans on the row-major target order for the rarer 3-or-more-way case.

No new findings. Behavior is unchanged from the first pass; the proximity suite still passes (288 tests). No blockers.

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 2, 2026
@brendancol brendancol merged commit f63fcd5 into main Jun 3, 2026
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allocation/direction tie-breaking differs across backends

1 participant