We already have partial support for Iceberg v3 in the codebase (e.g., Row Lineage and partition statistics). However, a number of key v3 features are still missing.
This issue serves as a parent tracking issue for the remaining v3 work we plan to (partially) include in the 0.3.0 release.
If you are interested in working on any of the items below, please open a dedicated sub-issue and link it back to this one.
1. Schema-Level Support for New v3 Types
Add support for all newly introduced v3 types at the schema level, including:
- timestamp_ns, timestamptz_ns
- unknown
- variant
- geometry, geography
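As a rough illustration of what "schema-level support" could mean in practice, the sketch below rejects the types listed above when the table's format version is lower than 3. The function names and the `V3_ONLY_TYPES` set are hypothetical and not taken from the codebase. The parameter stripping assumes parameterized spellings such as `geometry(...)`.

```python
# Types listed in this issue; each of them requires table format version 3.
V3_ONLY_TYPES = {
    "timestamp_ns",
    "timestamptz_ns",
    "unknown",
    "variant",
    "geometry",
    "geography",
}


def base_type_name(type_str: str) -> str:
    """Strip optional parameters, e.g. 'geometry(OGC:CRS84)' -> 'geometry'."""
    return type_str.split("(", 1)[0].strip().lower()


def check_type_allowed(type_str: str, format_version: int) -> None:
    """Raise ValueError if the type needs v3 but the table is older."""
    if base_type_name(type_str) in V3_ONLY_TYPES and format_version < 3:
        raise ValueError(
            f"type '{type_str}' requires format version 3, got {format_version}"
        )
```

A real implementation would also need typed representations, JSON serialization, and visitor support for each type. This only shows the version gate.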
2. Column Default Values (Schema Evolution)
Enable schema evolution with default values:
- Support initial-default and write-default
- Apply defaults correctly during read and write paths
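The intended read and write behavior can be sketched as follows, assuming the usual v3 semantics: `initial-default` fills a column for data files written before the column existed, and `write-default` is used when a writer does not supply a value. The `Field` class and both helper functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str
    initial_default: Optional[Any] = None  # used on the read path
    write_default: Optional[Any] = None    # used on the write path


def project_row(stored: dict, fields: list) -> dict:
    """Read path: `stored` maps field id -> value found in the data file.

    A field id absent from the file gets `initial_default`; a stored null
    (key present, value None) is kept as null.
    """
    return {
        f.name: stored[f.field_id] if f.field_id in stored else f.initial_default
        for f in fields
    }


def fill_row(provided: dict, fields: list) -> dict:
    """Write path: `provided` maps column name -> value given by the writer.

    A column the writer did not supply gets `write_default`.
    """
    return {
        f.field_id: provided[f.name] if f.name in provided else f.write_default
        for f in fields
    }
```

Distinguishing "missing" from "explicitly null" is the important part: only a missing column should pick up a default.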
3. Multi-Argument Transforms
Upgrade partitioning and sort order logic to support multi-column transforms:
- Extend transform representation
- Update parsing and validation logic
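One possible shape for the parsing change is to normalize a partition or sort field to a list of source column ids, accepting either a single `source-id` or a `source-ids` list. This is a sketch only. The function name is hypothetical, and requiring exactly one of the two keys is an assumed validation rule that should be checked against the spec before adoption.

```python
def source_ids(field_json: dict) -> list:
    """Return the source column ids of a partition/sort field as a list.

    Accepts {"source-id": 4} (single-argument transform) or
    {"source-ids": [4, 5]} (multi-argument transform).
    """
    has_single = "source-id" in field_json
    has_multi = "source-ids" in field_json
    # Assumed rule: exactly one of the two keys must be present.
    if has_single == has_multi:
        raise ValueError("expected exactly one of 'source-id' or 'source-ids'")
    if has_single:
        return [int(field_json["source-id"])]
    ids = [int(i) for i in field_json["source-ids"]]
    if not ids:
        raise ValueError("'source-ids' must not be empty")
    return ids
```

Representing every transform's inputs as a list internally keeps single-column transforms as the one-element case, so existing call sites need less special handling.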
4. Binary Deletion Vectors
Add support for deletion vectors (DV):
- Puffin-based DV parsing
- Scan-time application of DV
- Writer-side support for generating DV
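Scan-time application reduces to skipping row positions that the deletion vector marks as deleted. The sketch below uses a plain Python `set` as a stand-in for the bitmap that a real DV would decode from a Puffin blob. The function name and the `first_row_pos` parameter are hypothetical.

```python
from typing import Iterable, Iterator, Set


def apply_deletion_vector(
    rows: Iterable, deleted_positions: Set[int], first_row_pos: int = 0
) -> Iterator:
    """Yield only rows whose position in the data file is not deleted.

    `first_row_pos` is the file position of the first row in `rows`, so the
    same helper works when a file is read in batches.
    """
    for pos, row in enumerate(rows, start=first_row_pos):
        if pos not in deleted_positions:
            yield row
```

For example, `list(apply_deletion_vector(["a", "b", "c", "d"], {1, 3}))` keeps the rows at positions 0 and 2.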
5. Table-Level Encryption Metadata (Not sure whether this should be included in 0.4.0)
Support encryption-related metadata defined in v3:
- Parse and preserve encryption metadata
6. Puffin-Based Statistics
Support table and partition statistics stored in Puffin files:
- Metadata parsing
- Optional integration into scan planning
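For the metadata parsing step, a minimal footer reader might look like the sketch below. It assumes the Puffin layout `Magic Blob* Footer`, where the footer is `Magic FooterPayload FooterPayloadSize Flags Magic`, the magic is the four bytes `PFA1`, and the payload size is a 4-byte little-endian integer. Compressed footers and compressed blobs are deliberately not handled, and the function names are hypothetical.

```python
import json
import struct

MAGIC = b"PFA1"


def read_puffin_footer(data: bytes) -> dict:
    """Parse the uncompressed JSON footer payload of a Puffin file."""
    # Smallest possible file: head magic + footer magic + size + flags + magic.
    if len(data) < 20 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("not a Puffin file: bad magic")
    flags = data[-8:-4]
    (payload_size,) = struct.unpack("<i", data[-12:-8])
    if flags[0] & 0x01:
        # Lowest bit of the first flag byte marks a compressed footer payload.
        raise NotImplementedError("compressed footer payload is not handled here")
    payload_end = len(data) - 12
    payload_start = payload_end - payload_size
    if payload_size < 0 or payload_start < 8:
        raise ValueError("invalid footer payload size")
    if data[payload_start - 4:payload_start] != MAGIC:
        raise ValueError("footer does not start with magic")
    return json.loads(data[payload_start:payload_end].decode("utf-8"))


def read_blob(data: bytes, blob_meta: dict) -> bytes:
    """Slice one blob out of the file using its footer metadata."""
    if blob_meta.get("compression-codec"):
        raise NotImplementedError("compressed blobs are not handled here")
    start = blob_meta["offset"]
    return data[start:start + blob_meta["length"]]
```

Scan planning could then select blobs by their `type` and `snapshot-id` entries in the footer without reading the blob bodies.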
If anything is missing or you have suggestions, feel free to comment or open additional sub-issues.