Skip to content

geotiff: align direct backend mask_nodata default with open_geotiff#2980

Merged
brendancol merged 2 commits into
mainfrom
issue-2976
Jun 6, 2026
Merged

geotiff: align direct backend mask_nodata default with open_geotiff#2980
brendancol merged 2 commits into
mainfrom
issue-2976

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2976

What

The three direct GeoTIFF backend entry points defaulted to mask_nodata=True while the public open_geotiff defaults to masked=False. A bare backend call returned a different dtype and NaN-substituted values than the public path, silently.

  • Flip _read_geotiff_dask, _read_geotiff_gpu, and _read_vrt to mask_nodata=False so a direct call with no mask_nodata argument matches open_geotiff (unmasked, rioxarray-compatible).
  • The public wrapper and the internal GPU-to-dask / dask-to-VRT callers all forward mask_nodata explicitly, so only direct callers are affected.

Backend coverage

The dask and VRT entry points cover numpy and cupy graphs; the GPU entry point covers eager and dask+cupy. The default flip applies the same way on all of them.

Test plan

  • New regression tests: each direct backend with no mask_nodata produces the same unmasked result (dtype + sentinel preserved, no NaN) as open_geotiff. The GPU test is guarded by requires_gpu.
  • Updated existing tests that relied on the old masked default. Where a test verifies the masking path it now passes mask_nodata=True; where a test pinned the bare-call behavior it asserts the new unmasked default.
  • Full geotiff suite green locally (6155 passed, GPU included).
  • flake8 clean on touched files.

…2976)

The three direct backend entry points (_read_geotiff_dask,
_read_geotiff_gpu, _read_vrt) defaulted to mask_nodata=True while the
public open_geotiff defaults to masked=False. A bare backend call
returned a different dtype and NaN-substituted values than the public
path. Flip the three defaults to False so a direct call with no
mask_nodata argument matches open_geotiff (unmasked, rioxarray-style).

The public wrapper and the internal GPU-to-dask / dask-to-VRT callers
all forward mask_nodata explicitly, so only direct callers are affected.

Tests: add direct-backend-vs-open_geotiff parity regression tests; update
existing tests that relied on the old masked default to request masking
explicitly (mask_nodata=True) or assert the new unmasked default.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 5, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: align direct backend mask_nodata default with open_geotiff

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

One note, not blocking. The internal helpers that aren't public entry points kept mask_nodata: bool = True: _read_geotiff_gpu_chunked_gds in gpu.py, plus _vrt_chunk_read and the chunked dispatch in vrt.py. That's correct. They always get an explicit mask_nodata forwarded from the entry point and aren't a caller's direct surface. The tests in test_reader.py and test_kernels_and_kwargs.py now pass mask_nodata=True to both sides when they compare an entry point against an internal helper, so those comparisons stay apples-to-apples. Flagging it only so the next reader doesn't mistake the asymmetry for an oversight.

Nits (optional improvements)

None.

What looks good

  • The source change is small: three default flips and the matching default True to default False docstring edits. Everything else in the diff is test updates.
  • The public wrapper forwards mask_nodata=masked on every dispatch path (GPU, dask, VRT, eager), and the two internal cross-backend callers (gpu.py:1422 to _read_geotiff_dask, dask.py:227 to _read_vrt) forward it explicitly. So the default only affects direct callers. Confirmed by reading the call sites.
  • The new regression tests check direct-backend-vs-open_geotiff parity for all three backends (GPU guarded by requires_gpu): dtype, sentinel survival, no NaN, bit-equal pixels.
  • Existing tests that pinned the old masked default were split the right way. Masking-path tests now request mask_nodata=True; default-behavior tests assert the new unmasked default and were renamed for #2976.
  • Full geotiff suite passes locally including GPU (6155 passed).

Checklist

  • Behavior change matches the issue: direct backend default now matches open_geotiff
  • All three backends produce consistent unmasked results with no mask_nodata arg
  • NaN / sentinel handling correct: masking still available via mask_nodata=True
  • Edge cases covered: int sentinel hit, float sentinel, dtype preservation, dask + GPU graphs
  • Internal helper defaults left untouched (correct: they receive explicit values)
  • No premature materialization or new copies (no compute path changed)
  • Benchmark not needed (no perf-critical path changed)
  • README feature matrix not applicable (no new public function)
  • Docstrings updated to default False on all three backends

# Conflicts:
#	xrspatial/geotiff/tests/read/test_rioxarray_compat_2961.py
#	xrspatial/geotiff/tests/vrt/test_metadata.py
@brendancol brendancol merged commit 857517f into main Jun 6, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Direct reader backends default to masking nodata while open_geotiff defaults to unmasked

1 participant