7 changes: 3 additions & 4 deletions README.md
@@ -162,15 +162,14 @@ VRT is supported as a conservative advanced feature for simple GeoTIFF mosaics,
| Name | Description | NumPy | Dask | CuPy GPU | Dask+CuPy GPU | Cloud |
|:-----|:------------|:-----:|:----:|:--------:|:-------------:|:-----:|
| [open_geotiff](xrspatial/geotiff/__init__.py) | Read GeoTIFF / COG / VRT | ✅ | ✅ | 🧪 | 🧪 | 🔼 |
| [to_geotiff](xrspatial/geotiff/__init__.py) | Write DataArray as GeoTIFF / COG | ✅ | ✅ | 🧪 | 🧪 | 🔼 |
| [build_vrt](xrspatial/geotiff/__init__.py) | Generate VRT mosaic from existing GeoTIFFs | 🔼 | | | | |
| [to_geotiff](xrspatial/geotiff/__init__.py) | Write DataArray as GeoTIFF / COG / VRT | ✅ | ✅ | 🧪 | 🧪 | 🔼 |

`open_geotiff` and `to_geotiff` select the backend from their parameters
(`gpu=`, `chunks=`, `.vrt` path); GPU read/write is reached with `gpu=True`,
not a separate function:

```python
from xrspatial.geotiff import build_vrt, open_geotiff, to_geotiff
from xrspatial.geotiff import open_geotiff, to_geotiff

open_geotiff('dem.tif') # NumPy
open_geotiff('dem.tif', chunks=512) # Dask
@@ -187,7 +186,7 @@ to_geotiff(data, 'cog.tif', cog=True) # COG with auto overviews
to_geotiff(data, 'cog.tif', cog=True, # COG with explicit levels
overview_levels=[2, 4, 8],
overview_resampling='nearest')
build_vrt('mosaic.vrt', ['tile1.tif', 'tile2.tif']) # mosaic existing tiles
to_geotiff(data, 'mosaic.vrt') # write a tiled VRT mosaic

open_geotiff('dem.tif', dtype='float32') # half memory
open_geotiff('dem.tif', dtype='float32', chunks=512) # Dask + half memory
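The README text above says the backend is chosen from the call's parameters rather than from separate functions. A minimal sketch of that idea follows. `pick_read_backend` is a hypothetical helper written for illustration only, not part of `xrspatial.geotiff`, and the precedence it uses (the `.vrt` suffix is checked first) is an assumption, not something the diff confirms.

```python
from pathlib import PurePath


def pick_read_backend(source, *, gpu=False, chunks=None):
    """Hypothetical helper: name the backend a read call would select."""
    # Assumption: a .vrt source is routed to the VRT reader before
    # any other parameter is considered.
    if PurePath(str(source)).suffix.lower() == ".vrt":
        return "vrt"
    # gpu=True combined with chunks= corresponds to the Dask+CuPy column.
    if gpu and chunks is not None:
        return "dask+cupy"
    if gpu:
        return "cupy"
    if chunks is not None:
        return "dask"
    # No backend-selecting parameters: eager NumPy read.
    return "numpy"
```

For example, `pick_read_backend('dem.tif', chunks=512)` names the Dask path, mirroring `open_geotiff('dem.tif', chunks=512)` in the README snippet.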
29 changes: 14 additions & 15 deletions docs/source/reference/geotiff.rst
@@ -220,13 +220,13 @@ Writing
=======
``to_geotiff`` is the single write entry point (``gpu=True`` or CuPy data
selects the GPU path; a ``.vrt`` output path writes tiles plus an index).
``build_vrt`` mosaics a list of existing GeoTIFF files into a VRT.
Writing to a ``.vrt`` path is how you produce a VRT mosaic; the underlying
index emitter is internal.

.. autosummary::
:toctree: _autosummary

xrspatial.geotiff.to_geotiff
xrspatial.geotiff.build_vrt

COG validator CI gate
=====================
@@ -400,9 +400,9 @@ VRT support matrix (issue #2321)

VRT reads sit at the ``advanced`` tier in
:data:`xrspatial.geotiff.SUPPORTED_FEATURES` (``reader.vrt``).
``open_geotiff`` (on a ``.vrt`` source), ``to_geotiff`` (to a ``.vrt``
output), and ``build_vrt`` all target the same narrow subset of GDAL's VRT
spec. The reference below is the canonical contract; the docstrings echo it.
``open_geotiff`` (on a ``.vrt`` source) and ``to_geotiff`` (to a ``.vrt``
output) both target the same narrow subset of GDAL's VRT spec. The
reference below is the canonical contract; the docstrings echo it.

Supported
---------
@@ -452,23 +452,22 @@ Non-goals (intentionally unsupported)
Safe usage
----------

A simple mosaic over two compatible GeoTIFF tiles, read eagerly with
the fail-closed defaults:
Write a chunked DataArray to a ``.vrt`` path (``to_geotiff`` tiles the
array and emits the index), then read it back eagerly with the
fail-closed defaults:

.. code-block:: python

from xrspatial.geotiff import build_vrt, open_geotiff
from xrspatial.geotiff import open_geotiff, to_geotiff

# Write a VRT that mosaics two tiles. Both tiles share CRS,
# pixel size, dtype, and band count.
vrt_path = build_vrt(
'mosaic.vrt',
source_files=['tile_west.tif', 'tile_east.tif'],
)
# ``da`` is a 2D dask-backed DataArray with crs / transform set.
# Writing to a .vrt produces a directory of tiled GeoTIFFs plus
# the VRT index that references them.
to_geotiff(da, 'mosaic.vrt')

# Read with the defaults: missing_sources='raise',
# band_nodata=None (fail closed on disagreeing per-band sentinels).
da = open_geotiff(vrt_path)
back = open_geotiff('mosaic.vrt')
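To make "tiles plus an index" concrete, here is a rough sketch of what such an index could look like for a 200 x 300 array split into 100 x 150 tiles. It uses only the standard library and the element names from GDAL's VRT format (`VRTDataset`, `VRTRasterBand`, `SimpleSource`, `SrcRect`, `DstRect`). `sketch_vrt_index` is a hypothetical function, not the library's internal `_build_vrt`. The `Float32` data type is an assumption, the tile naming is borrowed from the notebook output in this PR, and the sketch omits the CRS and geotransform that a real index carries.

```python
import xml.etree.ElementTree as ET


def sketch_vrt_index(height, width, tile_h, tile_w, tiles_dir="mosaic_tiles"):
    """Build a GDAL-style VRT index for a regular tile grid (illustrative only)."""
    root = ET.Element("VRTDataset", rasterXSize=str(width), rasterYSize=str(height))
    # Assumption: a single Float32 band.
    band = ET.SubElement(root, "VRTRasterBand", dataType="Float32", band="1")
    for row, y0 in enumerate(range(0, height, tile_h)):
        for col, x0 in enumerate(range(0, width, tile_w)):
            # Edge tiles may be smaller than the nominal tile size.
            h = min(tile_h, height - y0)
            w = min(tile_w, width - x0)
            src = ET.SubElement(band, "SimpleSource")
            name = ET.SubElement(src, "SourceFilename", relativeToVRT="1")
            name.text = f"{tiles_dir}/tile_{row:02d}_{col:02d}.tif"
            ET.SubElement(src, "SourceBand").text = "1"
            # SrcRect: the whole tile. DstRect: where it lands in the mosaic.
            ET.SubElement(src, "SrcRect", xOff="0", yOff="0",
                          xSize=str(w), ySize=str(h))
            ET.SubElement(src, "DstRect", xOff=str(x0), yOff=str(y0),
                          xSize=str(w), ySize=str(h))
    return ET.tostring(root, encoding="unicode")


xml_text = sketch_vrt_index(200, 300, 100, 150)
```

The index only references the tile files by relative path, which is why the `.vrt` stays small while the pixel data lives in the sibling tiles directory.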

Intentionally raises
--------------------
4 changes: 2 additions & 2 deletions docs/source/reference/geotiff_internals.md
@@ -28,7 +28,7 @@ public API. Files referenced live under `xrspatial/geotiff/`.
| -------------------- | --------------------------------- | ---------------------- |
| `to_geotiff` | `xrspatial/geotiff/_writers/eager.py` | NumPy / Dask DataArray (auto-dispatches to GPU when input is CuPy-backed) |
| `_write_geotiff_gpu` | `xrspatial/geotiff/_writers/gpu.py` | CuPy DataArray |
| `build_vrt` | `xrspatial/geotiff/_writers/vrt.py` | list of GeoTIFF paths (XML emitter) |
| `_build_vrt` | `xrspatial/geotiff/_writers/vrt.py` | list of GeoTIFF paths (XML emitter; internal, reached by `to_geotiff`'s `.vrt` path) |

## Contract steps

@@ -142,7 +142,7 @@ write analogue; `to_geotiff` and `_write_geotiff_gpu` always emit
Orientation = 1 and rely on the writer assembler (`_writer.write`) for
photometric handling.

| Step | `to_geotiff` (CPU eager / dask) | `_write_geotiff_gpu` | `build_vrt` |
| Step | `to_geotiff` (CPU eager / dask) | `_write_geotiff_gpu` | `_build_vrt` |
| ---- | ------------------------------- | ------------------- | ----------- |
| 1. source / kwarg validation | shared (`_validate_tile_size_arg`, `_validate_3d_writer_dims`, `_validate_writer_spatial_shape`, `_validate_nodata_arg`, `_validate_no_rotated_affine`); duplicated inline compression / `compression_level` / `cog` / `overview_levels` / `bigtiff` / `streaming_buffer_bytes` / `max_z_error` / `photometric` / `allow_internal_only_jpeg` / `allow_experimental_codecs` value rejections | shared (`_validate_tile_size_arg`, `_validate_3d_writer_dims`, `_validate_writer_spatial_shape`, `_validate_nodata_arg`, `_validate_no_rotated_affine`); duplicated inline GPU-specific kwarg rejections (`predictor`, `compression`, `cog`, etc.) | shared (`_validate_nodata_arg`); duplicated inline `path` / `vrt_path` shim, `crs` / `crs_wkt` shim, source path validation |
| 2. metadata parse | N/A (no source to parse; reads attrs off the DataArray) | N/A | duplicated (reads geokeys from the first source file to inherit CRS / nodata; lives in `_vrt.write_vrt`) |
5 changes: 3 additions & 2 deletions docs/source/reference/release_gate_geotiff.rst
@@ -557,9 +557,10 @@ VRT supported subset
- ``xrspatial/geotiff/tests/release_gates/test_stable_features.py``
(VRT presence meta-gate)
- `#2321`_
* - ``build_vrt``
* - VRT write (``.vrt`` output)
- advanced
- Writer rejects source-incompatibility cases at the writer boundary.
- Writer rejects source-incompatibility cases at the writer boundary
(``to_geotiff`` to a ``.vrt`` path via the internal ``_build_vrt``).
- ``xrspatial/geotiff/tests/vrt/test_validation.py``
- `#2342`_

3 changes: 0 additions & 3 deletions docs/source/user_guide/geotiff_safe_io.rst
@@ -54,9 +54,6 @@ the read and write paths:
CuPy-backed data) for the GPU writer (tier: ``experimental``);
use the CPU path for anything you round-trip through external
tools.
* - :func:`xrspatial.geotiff.build_vrt`
- Emit a GDAL ``.vrt`` over a list of existing local GeoTIFF
sources. Tier: ``advanced``.

A dask-backed read is just ``open_geotiff(source, chunks=...)`` -- there
is no separate ``read_geotiff_dask`` name on the public surface. The
67 changes: 26 additions & 41 deletions examples/user_guide/39_GeoTIFF_IO.ipynb
@@ -13,25 +13,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tier note\n",
"\n",
"`open_geotiff` and `to_geotiff` against local files, the lossless codecs (`none`, `deflate`, `lzw`, `zstd`, `packbits`), and axis-aligned 2D / 3D rasters are tagged `stable` in `xrspatial.geotiff.SUPPORTED_FEATURES`. Dask reads (`reader.dask`) and dask streaming writes (covered by `writer.local_file`) are stable too. The VRT mosaic section at the bottom exercises `build_vrt` and reads the mosaic back with `open_geotiff`, which sit at the `advanced` tier (`reader.vrt`): the supported subset is a flat mosaic of compatible GeoTIFF tiles, not the full GDAL VRT spec.\n",
"\n",
"**See also:** the GeoTIFF / COG reference page at `docs/source/reference/geotiff.rst` lists every feature in `xrspatial.geotiff.SUPPORTED_FEATURES` against its tier (`stable`, `advanced`, `experimental`, `internal_only`) and links the release gate that locks each promise.\n"
"### Tier note\n\n`open_geotiff` and `to_geotiff` against local files, the lossless codecs (`none`, `deflate`, `lzw`, `zstd`, `packbits`), and axis-aligned 2D / 3D rasters are tagged `stable` in `xrspatial.geotiff.SUPPORTED_FEATURES`. Dask reads (`reader.dask`) and dask streaming writes (covered by `writer.local_file`) are stable too. The VRT mosaic section at the bottom writes a `.vrt` with `to_geotiff` and reads the mosaic back with `open_geotiff`, which sit at the `advanced` tier (`reader.vrt`): the supported subset is a flat mosaic of compatible GeoTIFF tiles, not the full GDAL VRT spec.\n\n**See also:** the GeoTIFF / COG reference page at `docs/source/reference/geotiff.rst` lists every feature in `xrspatial.geotiff.SUPPORTED_FEATURES` against its tier (`stable`, `advanced`, `experimental`, `internal_only`) and links the release gate that locks each promise.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### What you'll build\n",
"\n",
"1. [Write and read back a GeoTIFF](#Write-and-read-back) with `to_geotiff` and `open_geotiff`\n",
"2. [Write from a DataArray accessor](#Accessor-write) using `da.xrs.to_geotiff()`\n",
"3. [Windowed read via Dataset accessor](#Windowed-read-via-Dataset) using `ds.xrs.open_geotiff()` to crop a large file to an existing spatial extent\n",
"4. [Stitch tiles with build_vrt](#VRT-mosaic) to build a virtual mosaic from multiple GeoTIFFs\n",
"\n",
"![GeoTIFF I/O preview](images/geotiff_io_preview.png)"
"### What you'll build\n\n1. [Write and read back a GeoTIFF](#Write-and-read-back) with `to_geotiff` and `open_geotiff`\n2. [Write from a DataArray accessor](#Accessor-write) using `da.xrs.to_geotiff()`\n3. [Windowed read via Dataset accessor](#Windowed-read-via-Dataset) using `ds.xrs.open_geotiff()` to crop a large file to an existing spatial extent\n4. [Write a VRT mosaic with to_geotiff](#VRT-mosaic) by writing a chunked DataArray to a `.vrt` path\n\n![GeoTIFF I/O preview](images/geotiff_io_preview.png)"
]
},
{
@@ -64,7 +53,7 @@
"import matplotlib.pyplot as plt\n",
"\n",
"import xrspatial\n",
"from xrspatial.geotiff import open_geotiff, to_geotiff, build_vrt"
"from xrspatial.geotiff import open_geotiff, to_geotiff"
]
},
{
@@ -272,7 +261,7 @@
"}\n",
"\n",
".xr-group-name::before {\n",
" content: \"\ud83d\udcc1\";\n",
" content: \"📁\";\n",
" padding-right: 0.3em;\n",
"}\n",
"\n",
@@ -335,7 +324,7 @@
"\n",
".xr-section-summary-in + label:before {\n",
" display: inline-block;\n",
" content: \"\u25ba\";\n",
" content: \"►\";\n",
" font-size: 11px;\n",
" width: 15px;\n",
" text-align: center;\n",
@@ -346,7 +335,7 @@
"}\n",
"\n",
".xr-section-summary-in:checked + label:before {\n",
" content: \"\u25bc\";\n",
" content: \"▼\";\n",
"}\n",
"\n",
".xr-section-summary-in:checked + label > span {\n",
@@ -918,7 +907,7 @@
"source": [
"## VRT mosaic\n",
"\n",
"`build_vrt` writes a lightweight XML file that stitches multiple GeoTIFFs into one virtual raster. The tiles aren't copied, just referenced."
"Writing a chunked DataArray to a `.vrt` path tiles it into separate GeoTIFFs and writes a lightweight XML index that stitches them into one virtual raster. The tiles aren't copied into the index, just referenced."
]
},
{
@@ -937,48 +926,44 @@
"name": "stdout",
"output_type": "stream",
"text": [
"nw: (100, 150) -> 54,041 bytes\n",
"ne: (100, 150) -> 54,052 bytes\n",
"sw: (100, 150) -> 54,090 bytes\n",
"se: (100, 150) -> 54,051 bytes\n",
"VRT: 2,452 bytes\n",
"tile_00_00.tif: 53,127 bytes\n",
"tile_00_01.tif: 53,140 bytes\n",
"tile_01_00.tif: 53,174 bytes\n",
"tile_01_01.tif: 53,162 bytes\n",
"\n",
"VRT: 2,178 bytes\n",
"Mosaic shape: (200, 300)\n",
"Matches original: True\n"
]
}
],
"source": [
"# Split into 4 tiles and write each\n",
"tiles = [\n",
" ('nw', da[:100, :150]),\n",
" ('ne', da[:100, 150:]),\n",
" ('sw', da[100:, :150]),\n",
" ('se', da[100:, 150:]),\n",
"]\n",
"tile_paths = []\n",
"for name, tile in tiles:\n",
" p = os.path.join(tmpdir, f'tile_{name}.tif')\n",
" to_geotiff(tile, p, compression='deflate')\n",
" tile_paths.append(p)\n",
" print(f'{name}: {tile.shape} -> {os.path.getsize(p):,} bytes')\n",
"# Writing to a .vrt path tiles the array and emits an XML index that\n",
"# references the tiles. Chunk the array first so the mosaic spans\n",
"# multiple tiles (a 2x2 grid here).\n",
"chunked = da.chunk({'y': 100, 'x': 150})\n",
"\n",
"# Stitch into a VRT\n",
"vrt_path = os.path.join(tmpdir, 'mosaic.vrt')\n",
"build_vrt(vrt_path, tile_paths)\n",
"print(f'\\nVRT: {os.path.getsize(vrt_path):,} bytes')\n",
"to_geotiff(chunked, vrt_path)\n",
"print(f'VRT: {os.path.getsize(vrt_path):,} bytes')\n",
"\n",
"# The tiles land in a sibling *_tiles directory next to the .vrt\n",
"tiles_dir = os.path.join(tmpdir, 'mosaic_tiles')\n",
"for name in sorted(os.listdir(tiles_dir)):\n",
" size = os.path.getsize(os.path.join(tiles_dir, name))\n",
" print(f'{name}: {size:,} bytes')\n",
"\n",
"# Read the mosaic back\n",
"mosaic = open_geotiff(vrt_path)\n",
"print(f'Mosaic shape: {mosaic.shape}')\n",
"print(f'\\nMosaic shape: {mosaic.shape}')\n",
"print(f'Matches original: {np.allclose(mosaic.values, da.values)}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The VRT is a few hundred bytes of XML. `open_geotiff` assembles the tiles when you read it."
"The VRT is a small XML index. `open_geotiff` assembles the tiles when you read it."
]
},
{
31 changes: 14 additions & 17 deletions xrspatial/geotiff/__init__.py
@@ -15,12 +15,10 @@
through the GPU (nvCOMP) path, a ``.vrt`` output path writes a directory
of tiled GeoTIFFs plus a VRT index, and the default is an eager CPU
write.
build_vrt(path, source_files, ...)
Generate a VRT mosaic XML from a list of existing GeoTIFF files. This
is the one read/write helper that does not fold into ``to_geotiff``
because it has no DataArray to write -- it indexes files that already
exist. ``vrt_path`` is kept as a deprecated alias for ``path``; passing
both ``path`` and ``vrt_path`` raises ``TypeError``.

VRT mosaics are written by passing a ``.vrt`` path to ``to_geotiff``; the
underlying index emitter (``_build_vrt``) is internal and not part of the
public surface.

The backend functions ``_read_geotiff_gpu``, ``_read_geotiff_dask``,
``_read_vrt``, and ``_write_geotiff_gpu`` are private. ``open_geotiff`` and
@@ -95,7 +93,10 @@
# resolves for tests that monkeypatch it and callers bypassing auto-dispatch.
# ``to_geotiff`` reaches it via ``_writers.eager``; not called here directly.
from ._writers.gpu import _write_geotiff_gpu # noqa: F401
from ._writers.vrt import build_vrt
# Re-export only: the internal VRT-index emitter. ``to_geotiff``'s ``.vrt``
# path reaches it via ``_writers.eager``; bound here so tests and internal
# callers can import ``xrspatial.geotiff._build_vrt``. Not public API.
from ._writers.vrt import _build_vrt # noqa: F401

# All names below are part of the supported public API. ``plot_geotiff``
# is intentionally omitted: it is deprecated in favour of ``da.xrs.plot()``
@@ -126,7 +127,6 @@
'UnsafeURLError',
'UnsupportedGeoTIFFFeatureError',
'VRTStableSourcesOnlyError',
'build_vrt',
'open_geotiff',
'to_geotiff',
]
@@ -847,15 +847,12 @@ def open_geotiff(source: str | BinaryIO, *,

Examples
--------
Safe VRT usage. Mosaic two compatible tiles and read with the
fail-closed defaults:

>>> from xrspatial.geotiff import open_geotiff, build_vrt
>>> vrt_path = build_vrt( # doctest: +SKIP
... 'mosaic.vrt',
... source_files=['tile_west.tif', 'tile_east.tif'],
... )
>>> da = open_geotiff(vrt_path) # doctest: +SKIP
Safe VRT usage. Write a ``.vrt`` mosaic with ``to_geotiff`` and read
it back with the fail-closed defaults:

>>> from xrspatial.geotiff import open_geotiff, to_geotiff
>>> to_geotiff(data, 'mosaic.vrt') # doctest: +SKIP
>>> da = open_geotiff('mosaic.vrt') # doctest: +SKIP

Intentionally raises. A VRT whose source tiles disagree on their
per-band nodata sentinels is rejected by the default
4 changes: 2 additions & 2 deletions xrspatial/geotiff/_backends/vrt.py
@@ -341,8 +341,8 @@ def _read_vrt(source: str, *,
Safe usage. Mosaic two compatible tiles and read with the
fail-closed defaults:

>>> from xrspatial.geotiff import open_geotiff, build_vrt
>>> vrt_path = build_vrt( # doctest: +SKIP
>>> from xrspatial.geotiff import open_geotiff, _build_vrt
>>> vrt_path = _build_vrt( # doctest: +SKIP
... 'mosaic.vrt',
... source_files=['tile_west.tif', 'tile_east.tif'],
... )
8 changes: 4 additions & 4 deletions xrspatial/geotiff/_crs.py
@@ -3,7 +3,7 @@
``_wkt_to_epsg`` and ``_resolve_crs_to_wkt`` are pure leaves over
``pyproj`` (lazy-imported inside) and the strict-mode / fallback-warning
machinery from ``_runtime``. They are called from ``to_geotiff``,
``_write_geotiff_gpu``, and ``build_vrt`` to normalise the EPSG / WKT /
``_write_geotiff_gpu``, and ``_build_vrt`` to normalise the EPSG / WKT /
PROJ kwarg they each accept.
"""
from __future__ import annotations
@@ -231,10 +231,10 @@ def _resolve_crs_to_wkt(crs) -> str | None:
Mirrors ``to_geotiff`` / ``_write_geotiff_gpu``'s ``crs`` kwarg semantics
so callers can pass an int EPSG code, a WKT string, or a PROJ string
interchangeably. Returns the canonical WKT string (or ``None`` if
``crs`` is ``None``) for forwarding to ``_vrt.build_vrt``, which only
``crs`` is ``None``) for forwarding to ``_vrt.write_vrt``, which only
speaks WKT.

Used by ``build_vrt`` to close the parameter-naming
Used by ``_build_vrt`` to close the parameter-naming
drift versus the eager and GPU writer entry points.

Parameters
@@ -275,7 +275,7 @@ def _resolve_crs_to_wkt(crs) -> str | None:
f"got {type(crs).__name__}")
if isinstance(crs, str):
# Empty string is a common "no CRS" sentinel from upstream
# GeoTIFFs; preserve the existing _vrt.build_vrt semantics (it
# GeoTIFFs; preserve the existing _vrt.write_vrt semantics (it
# falls back to the first source's CRS for empty strings too).
if not crs:
return None