Keep MFD dask output assembly lazy by brendancol · Pull Request #2886 · xarray-contrib/xarray-spatial

brendancol · 2026-06-02T22:00:17Z

Closes #2865.

What changed

The dask assembly step in flow_accumulation_mfd, flow_length_mfd, and watershed_mfd used to .compute() each tile and wrap the numpy result with da.from_array / da.block, which materialized the whole output raster during the function call.
Rebuilt the assembly with da.map_blocks over the lazy fractions array. The converged boundary snapshot and fraction strips are small, so they ride along in a closure and the per-tile kernel runs at compute time. The returned DataArray is now genuinely lazy.
Added a rechunk guard so the 8 direction bands stay in one chunk (every tile kernel needs all 8), and aligned the pour-points chunks to the fractions tile grid in watershed_mfd.

The boundary-convergence sweep that runs before assembly still reads tile data eagerly. That is inherent to the algorithm and is tracked separately in #2885, along with the same assembly rework for the stream and hand MFD modules.

Backend coverage

numpy and cupy paths are unchanged. The lazy rework applies to dask+numpy and dask+cupy (the cupy variant converts to numpy, runs the same iterative path, and converts back).

Test plan

flow_accumulation_mfd, flow_length_mfd, watershed_mfd dask results match numpy
New test asserts the assembly kernel runs at compute time, not during the call
New test covers an input chunked along the 8-band axis (rechunk guard)
New test covers pour points chunked differently from fractions
Full hydro suite passes (860 tests)

The dask assembly step in flow_accumulation_mfd, flow_length_mfd, and watershed_mfd computed each tile and wrapped the numpy result with da.from_array / da.block, materializing the full output raster during the API call. Rebuild the result with da.map_blocks over the lazy fractions array, passing the small converged boundary snapshot into the block function so the per-tile kernel runs at compute time. The boundary-convergence sweep still reads tile data eagerly by design; lazifying that phase is tracked separately.

brendancol

PR Review: Keep MFD dask output assembly lazy

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None blocking. The change is well scoped and the tests cover the laziness contract directly.

Nits (optional improvements)

The three _tile closures in flow_accumulation_mfd.py, flow_length_mfd.py, and watershed_mfd.py are nearly identical, including the searchsorted tile-index lookup. A shared helper could remove the duplication, but the duplication is small and the sibling rockout PRs touch these same files, so factoring it out now would raise the conflict risk. Reasonable to leave as is for this PR.

What looks good

The map_blocks rework keeps the returned DataArray lazy while leaving the numpy and cupy paths untouched. The boundary-convergence sweep stays eager by necessity, and that limitation is documented in the docstrings and tracked in the follow-up issue #2885.
block_info[0]['array-location'] reads from the input fractions coordinates, which is the right reference for the tile index even after drop_axis=0 collapses the band axis. The watershed path confirms this works with a second 2-D input array.
The rechunk guard for the 8-band axis and the pour-points chunk alignment are both covered by new tests, including a deliberate chunk mismatch.
Correctness is checked against the numpy path across single-chunk, many-chunk, and band-axis-chunked inputs, with equal_nan comparisons.

Checklist

Algorithm matches reference: unchanged, kernels reused as is
Backends consistent: numpy/cupy unchanged; dask+numpy verified against numpy in tests. dask+cupy shares the same iterative path but needs a GPU runner to verify directly.
NaN handling correct: unchanged from prior tile kernels; equal_nan tests pass
Edge cases covered: single tile, many tiles, band-axis chunked, pour-points mismatch
Dask chunk boundaries handled: rechunk guard plus pour-points alignment
No premature materialization: that is the point of this PR; assembly is now lazy
Benchmark: not needed, no new public API
README: not applicable, no new function
Docstrings present and accurate: assembly helpers document the laziness contract

brendancol commented Jun 2, 2026

View reviewed changes

github-actions Bot added the performance PR touches performance-sensitive code label Jun 2, 2026

brendancol merged commit 8b89e6e into main Jun 3, 2026
10 of 12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Keep MFD dask output assembly lazy#2886

Keep MFD dask output assembly lazy#2886
brendancol merged 1 commit into
mainfrom
issue-2865

brendancol commented Jun 2, 2026

Uh oh!

brendancol left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

brendancol commented Jun 2, 2026

What changed

Backend coverage

Test plan

Uh oh!

brendancol left a comment

Choose a reason for hiding this comment

PR Review: Keep MFD dask output assembly lazy

Blockers (must fix before merge)

Suggestions (should fix, not blocking)

Nits (optional improvements)

What looks good

Checklist

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant