Skip to content

feat: add options support to Store for format-specific write options#58

Merged
sfishel18 merged 3 commits intowherobots:mainfrom
james-willis:feat/store-options-v2
Mar 3, 2026
Merged

feat: add options support to Store for format-specific write options#58
sfishel18 merged 3 commits intowherobots:mainfrom
james-willis:feat/store-options-v2

Conversation

@james-willis
Copy link
Contributor

@james-willis james-willis commented Mar 2, 2026

Summary

  • Adds an options: dict[str, str] | None field to the Store dataclass, allowing users to pass format-specific Spark DataFrameWriter options (e.g. header, delimiter, ignoreNullFields) when storing query results to cloud storage.
  • Options are defensively copied, empty dicts are normalized to None, and the field is omitted from the WebSocket request when not set (backward compatible).
  • Updates Store.for_download() to accept an optional options parameter.
  • Includes 15 pytest tests covering options handling, serialization, and JSON round-tripping.
  • Adds "Store options" section to README with usage examples.

PEP 249 (DB-API 2.0) compliance

This change does not alter the PEP 249 interface. The store keyword parameter on cursor.execute() was introduced upstream in #55/#56 as a vendor extension, which PEP 249 explicitly permits:

"During the lifetime of DB API 2.0, module authors have often extended their implementations beyond what is required by this DB API specification."

The standard execute(operation [, parameters]) signature is preserved — store is keyword-only and optional, so callers using the PEP 249 API are unaffected. This PR only adds an attribute (options) to the vendor-specific Store dataclass; no cursor, connection, or driver signatures change.

Usage

from wherobots.db import Store, StorageFormat

# CSV without headers and a custom delimiter
store = Store.for_download(
    format=StorageFormat.CSV,
    options={"header": "false", "delimiter": "|"},
)

# GeoJSON preserving null fields
store = Store.for_download(
    format=StorageFormat.GEOJSON,
    options={"ignoreNullFields": "false"},
)

Related

@james-willis james-willis requested a review from a team as a code owner March 2, 2026 21:42
@james-willis james-willis requested review from haizhou-zhao and removed request for a team March 2, 2026 21:42
Copy link

@salty-hambot salty-hambot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean addition — the options field is well-designed with defensive copying, empty-dict normalization, and backward-compatible wire format. PEP 249 compliance is unaffected: store was already a vendor extension kwarg on execute(), and this PR only adds a field to the vendor-specific Store dataclass. No cursor, connection, or driver signatures change.

Two suggestions worth considering as the Store surface area grows: extracting serialization into Store.to_dict() (eliminates duplication in both connection.py and tests), and eventually locking down mutability.

Pre-existing note (not caused by this PR): cursor.fetchone() / fetchall() will TypeError when store is configured, because __get_results() returns None and the callers index into it unconditionally. Worth a follow-up ticket.

@sfishel18 sfishel18 self-requested a review March 3, 2026 04:35
Address review feedback:
- Add to_dict() method on Store for WebSocket request serialization
- Use store.to_dict() in Connection.__execute_sql instead of inline dict
- Replace duplicated _serialize_store() helper in tests with to_dict()
- Remove test_options_dict_is_mutable to avoid enshrining mutability
@sfishel18 sfishel18 merged commit cf7a177 into wherobots:main Mar 3, 2026
6 checks passed
@james-willis james-willis deleted the feat/store-options-v2 branch March 4, 2026 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants