geotiff: align open_geotiff parameters with rioxarray's open_rasterio (#2961)#2963
Merged
Conversation
Rename mask_nodata->masked (default flips True->False) and name->default_name with deprecation aliases; add mask_and_scale, parse_coordinates, lock, cache. mask_and_scale/parse_coordinates implemented on eager+dask CPU paths; gated for gpu/vrt. lock/cache accept+warn.
brendancol
commented
Jun 5, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: align open_geotiff parameters with rioxarray's open_rasterio
Reviewed the four source files in full (__init__.py, _attrs.py, _backends/dask.py, _runtime.py) plus the test and doc changes. Full geotiff suite is green locally (6092 passed) with GPU present, and the new compat tests cover the renames, the default flip, both-passed errors, mask_and_scale eager/dask, parse_coordinates, the lock/cache warnings, and the GPU/VRT gates.
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- Round-trip double-apply with
mask_and_scale. In_attrs.py(_finalize_eager_read) and_backends/dask.py, after applyingdata * scale + offsetthe source'sattrs['gdal_metadata']still carries the rawSCALE/OFFSETitems. If a caller reads withmask_and_scale=Trueand then writes the result withto_geotiff, the writer re-embeds those items, so a latermask_and_scale=Trueread applies the scale a second time. rioxarray sidesteps this by moving the values toencoding. Either dropSCALE/OFFSETfrom the propagatedgdal_metadataonce they have been applied, or document the round-trip caveat on themask_and_scaleparameter. - Resolution edge case in
open_geotiff(__init__.py). The "both passed" guard isif mask_nodata is not <sentinel>: if masked is not False: raise. Becausemaskedhas a real default ofFalse, an explicitmasked=Falsecannot be told apart from the default, soopen_geotiff(path, masked=False, mask_nodata=True)does not raise and silently resolves to masking on. This matches the documented stance inread_geotiff_gpu, but a one-line comment in the resolution block noting the case would save the next reader the trace.
Nits (optional improvements)
-
_extract_scale_offset(_attrs.py) applies one scale/offset pair to all bands and falls back to band 0 for per-band metadata. The docstring already says this; consider a warning when per-band values actually disagree so a multi-band source with mismatched scaling does not silently use band 0. -
__init__.pycomputes_is_vrt_source_earlyfor the new gate and then_is_vrt_sourceagain a few lines later. Harmless, but the second could reuse the first.
What looks good
- The deprecation aliases reuse the existing sentinel pattern from the
gpu->on_gpu_failurerename rather than inventing a new mechanism, including the both-passedTypeError. mask_and_scaleis verified to match between the eager and dask paths, and thedtype=<int>+ scale combination raises through the existing float-to-int guard.- The default flip is the risk here, and the test migration is thorough: masking-behavior tests now opt in with
masked=True, and the two tests that pinned the old default were rewritten to pin the new one. - Scoping
mask_and_scale/parse_coordinates=Falseto the CPU eager and dask paths with an explicitValueErrorfor GPU and VRT keeps the change inside the parity-gated tiers instead of threading untested code through the GPU and VRT readers.
Checklist
- Behavior matches rioxarray's
open_rasteriofor the aligned parameters - Renames thread through all four backends; new behavioral options gated where unsupported
- NaN / sentinel masking correct under the new default
- Edge cases covered (no-metadata no-op, dtype clash, both-passed, deprecation warnings)
- Dask path matches eager for
mask_and_scale - No premature materialization (dask scale/offset stays lazy)
- README feature matrix not applicable (no new function)
- Docstrings present and updated
brendancol
commented
Jun 5, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after commit a59264c)
Disposition of the findings from the first pass:
- Fixed: round-trip double-apply with
mask_and_scale. The reader sets bothattrs['gdal_metadata']andattrs['gdal_metadata_xml'], andto_geotiffprefers the XML, so stripping only the dict would not have prevented re-embedding and stripping the serialized XML pulls the writer path into this read-focused change. Documented the caveat on themask_and_scaleparameter instead (drop the scale/offset tags andscale_factor/add_offsetbefore writing if a clean round-trip matters). Filed as the proportionate fix; a writer-side strip can be a follow-up if it comes up in practice. - Fixed: added a comment in the
open_geotiffresolution block spelling out themasked=False+mask_nodata=Trueedge case. - Fixed (nit): unified the duplicated
_is_vrt_sourcecomputation; the gate and the later VRT branch now share one binding. - Dismissed (nit): warning when per-band
SCALE/OFFSETdisagree. The band-0 fallback is already documented, the case is uncommon, and emitting warnings from the low-level attrs helper would expand the change without much payoff.
Full xrspatial/geotiff/tests suite is green after the follow-up (6114 passed, 81 skipped, 1 xfailed) with GPU present.
# Conflicts: # xrspatial/geotiff/__init__.py # xrspatial/geotiff/_backends/dask.py # xrspatial/geotiff/tests/read/test_nodata.py # xrspatial/geotiff/tests/release_gates/test_stable_features.py # xrspatial/geotiff/tests/vrt/test_metadata.py
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2961
What changed
Aligns
open_geotiff's read parameters with rioxarray'sopen_rasterioso callers can move between the two with minimal edits.mask_nodata->maskedand flipped the default fromTruetoFalseto matchopen_rasterio. A bareopen_geotiff(path)no longer promotes the nodata sentinel to NaN; passmasked=Truefor the old behavior.mask_nodatastays as a deprecated alias (DeprecationWarning; passing both raisesTypeError).name->default_name, withnamekept as a deprecated alias on the same terms.mask_and_scale: reads GDALSCALE/OFFSETfromattrs['gdal_metadata'], returnsdata * scale + offset, masks the sentinel, and recordsattrs['scale_factor']/attrs['add_offset']. No-op when the source carries no scale/offset.parse_coordinates:Falseskips thex/ycoordinate arrays while keepingattrs['transform']/attrs['crs'].lock/cachefor signature compatibility. xrspatial's reader re-opens the source per window, so these have no analog; passing a non-default value emits aGeoTIFFFallbackWarningrather than silently ignoring the kwarg.chunksalready matched rioxarray and is unchanged, as are the xrspatial-only parameters (gpu,window,bbox,max_pixels,allow_*, tier flags).The deprecation aliases reuse the existing sentinel pattern from
read_geotiff_gpu'sgpu->on_gpu_failurerename.Scope note
maskedanddefault_namethread through all four backends (eager, dask, GPU, VRT).mask_and_scaleandparse_coordinatesare implemented on the CPU eager and dask paths; combining either withgpu=Trueor a.vrtsource raisesValueErrorwith a message pointing at the supported path, matching how the dispatcher already gateson_gpu_failure/missing_sources/max_cloud_bytesper backend. This keeps the change inside the tiers where behavior is parity-gated rather than threading untested code through the GPU and VRT-mosaic readers.Backend coverage
mask_and_scale/parse_coordinates=Falserejected with a clear errorTest plan
xrspatial/geotiff/tests/read/test_rioxarray_compat_2961.py: default flip, deprecation warnings, both-passedTypeError, alias equivalence,mask_and_scale(eager + dask, no-metadata no-op,dtype=intraises),parse_coordinates,lock/cachewarnings, GPU/VRT gating.masked=True; canonical names replace deprecated-alias usage).xrspatial/geotiff/testssuite green (GPU available in the dev env, so cupy/dask+cupy/gpu params ran).xrspatial/tests/test_reproject.py,test_rasterize.py,test_accessor.pygreen (the reproject datum-grid fallback now matches its rasterio primary path, which also does not mask).The user-guide notebook and README feature-matrix steps were skipped: this is parameter alignment on an existing function, not a new spatial operation or backend-support change.