Reject non-monotonic 1D coords in proximity/allocation/direction#2875
Conversation
The endpoint-based max distance, the dask map_overlap halo, the NumPy line-sweep, and the tiled KDTree convergence check all assume monotonic 1D coordinates. Non-monotonic coords were accepted silently and returned wrong results. Validate both axes once in the shared _process entrypoint and raise a clear ValueError naming the offending axis. A single-element axis is still allowed. Document the requirement in the docstrings and reference page, and add cross-backend tests for the raised error, the descending-axis accept path, and the single-element accept path.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Reject non-monotonic 1D coords in proximity/allocation/direction
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- proximity.py:271-277. The check also rejects equal adjacent coordinates (a duplicate axis value makes both
ascendinganddescendingfalse) and coords that contain NaN. Both are reasonable to reject, but the error always says "non-monotonic", which is a bit off for the duplicate/NaN case. Not worth a code change unless you want a more specific message. Flagging it so it is a conscious call.
Nits (optional improvements)
- test_proximity.py:1700.
test_single_element_axis_allowedcovers numpy and dask+numpy but not cupy. The single-element path runs before dispatch so it is backend-agnostic and this is fine, but a cupy parametrization would match the other two new tests.
What looks good
- The validation lives in
_process, the single chokepoint for all three public functions, so one check covers all four backends. Right place for it: it runs after dask coords are computed to numpy and before any backend-specific path. - Length-1 axes are explicitly allowed, matching the existing
_halo_depthsingle-coordinate handling. - Tests cover both axes, all three functions, all four backends (GPU skipped when unavailable), plus the descending-accept and single-element-accept paths.
- Docstrings and the reference page document the new contract.
Checklist
- Reject is the defensible stance (the whole module assumes monotonic coords)
- All backends consistent (check runs pre-dispatch)
- NaN handling correct (NaN coords rejected, NaN raster data unaffected)
- Edge cases covered (single-element axis, descending axis)
- No premature materialization (coords already computed upstream)
- Benchmark not needed; README matrix not applicable; docstrings accurate
#2851) - Spell out that duplicate and NaN coordinate values also fail the strict monotonic check, so the message is accurate for those cases. - Parametrize test_single_element_axis_allowed over cupy and dask+cupy to match the other new tests.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after 5c0cd23)
Both findings from the first pass are addressed:
- Suggestion (error message): the message now spells out that duplicate and NaN coordinate values also fail the strict-monotonic check, so it reads accurately for those cases. proximity.py:278-283.
- Nit (test coverage):
test_single_element_axis_allowedis now parametrized over cupy and dask+cupy, matching the other two new tests. test_proximity.py:1700.
No new findings. No blockers. Ready to merge once CI is green.
|
CI note: the |
# Conflicts: # xrspatial/tests/test_proximity.py
Closes #2851
What
proximity,allocation, anddirectionassume monotonic 1D x and y coordinates but never enforced it. Several internal steps rely on that assumption:max_possible_distanceis taken from the coordinate endpoints only.map_overlaphalo and the NumPy line-sweep treat array adjacency as spatial adjacency._min_boundary_distance).None of those hold for a non-monotonic axis, so such input was accepted silently and produced wrong proximity/allocation/direction versus the brute-force NumPy result.
Non-monotonic coords are outside the module's contract, so this rejects them up front rather than trying to make the tiled path provably correct on data the rest of the module already cannot handle.
Change
_processentrypoint and raise aValueErrornaming the non-monotonic axis. A single-element axis is still allowed (no order to violate).Backends
The check lives in
_process, the single chokepoint for all three functions, so it applies to numpy, cupy, dask+numpy, and dask+cupy alike.Test plan
ValueErrorfor all three functions across all four backends (GPU/dask+cupy skip when unavailable).test_proximity.pysuite passes (309 tests).