
feat: GSheets compatibility testing suite #1624

Closed
svallory wants to merge 27 commits into handsontable:master from g2i-ai:feat/gsheets-compat-testing
Conversation

@svallory

Summary

  • Adds a complete Google Sheets vs HyperFormula compatibility test suite under test/gsheets-compat/
  • Populates expected values in formula-compat-tests.json from the existing formula-compat-gsheets.csv reference export (490/512 tests now have reference values)
  • Adds script/compat-report.ts — a standalone script that evaluates all formulas and reports overall compatibility % + untested functions without running Jest
  • Documents the full compat test workflow in DEV_DOCS.md

Current compatibility score: 60.8% (298/490 comparable functions match Google Sheets)

Test plan

  • npm run gsheets:compat-report — prints compatibility report with per-category breakdown and untested function list
  • npm run test:gsheets-compat — runs the Jest spec; fails only if a function in MUST_MATCH_FUNCTIONS mismatches
  • npm run gsheets:patch-values — re-patches expected values after a new GSheets CSV export

svallory and others added 27 commits February 21, 2026 22:36
* fix: track cell dependencies inside inline array literals

Add missing AstNodeType.ARRAY case to collectDependencies, so cell
references in inline arrays (={A1, B1}) are registered as dependencies
and updated when referenced cells change.

Also adds design docs and implementation plan for the upcoming Google
Sheets compatibility mode feature.

* chore: remove AI planning

Signed-off-by: Saulo Vallory <me@saulo.engineer>

---------

Signed-off-by: Saulo Vallory <me@saulo.engineer>

* fix: track cell dependencies inside inline array literals

Add missing AstNodeType.ARRAY case to collectDependencies, so cell
references in inline arrays (={A1, B1}) are registered as dependencies
and updated when referenced cells change.

Also adds design docs and implementation plan for the upcoming Google
Sheets compatibility mode feature.

* feat: add compatibilityMode config with Google Sheets preset

Add `compatibilityMode: 'default' | 'googleSheets'` config option that
activates Google Sheets-compatible defaults:
- dateFormats: ['MM/DD/YYYY', 'MM/DD/YY', 'YYYY/MM/DD']
- localeLang: 'en-US'
- currencySymbol: ['$', 'USD']

User-provided values always override the preset.

When enabled, auto-registers TRUE and FALSE named expressions (so
`=TRUE` works without parentheses, matching GSheets behavior). Guards
against overwriting user-defined named expressions with the same names.
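The override rule described above ("user-provided values always override the preset") can be illustrated with a minimal sketch. The preset values come from the commit message; the type names and the `resolvePreset` helper are hypothetical and are not taken from the HyperFormula source.

```typescript
// Hypothetical option shape; field names follow the commit message only.
type CompatibilityMode = 'default' | 'googleSheets';

interface PresetOptions {
  dateFormats: string[];
  localeLang: string;
  currencySymbol: string[];
}

type UserConfig = Partial<PresetOptions> & { compatibilityMode?: CompatibilityMode };

// Preset values as listed in the commit message.
const GOOGLE_SHEETS_PRESET: PresetOptions = {
  dateFormats: ['MM/DD/YYYY', 'MM/DD/YY', 'YYYY/MM/DD'],
  localeLang: 'en-US',
  currencySymbol: ['$', 'USD'],
};

// Start from the preset (if the mode asks for it), then let every
// explicitly provided user value replace the preset value.
function resolvePreset(user: UserConfig): Partial<PresetOptions> {
  const preset: Partial<PresetOptions> =
    user.compatibilityMode === 'googleSheets' ? GOOGLE_SHEETS_PRESET : {};
  const resolved: Partial<PresetOptions> = { ...preset };
  if (user.dateFormats !== undefined) resolved.dateFormats = user.dateFormats;
  if (user.localeLang !== undefined) resolved.localeLang = user.localeLang;
  if (user.currencySymbol !== undefined) resolved.currencySymbol = user.currencySymbol;
  return resolved;
}
```

With this sketch, `resolvePreset({ compatibilityMode: 'googleSheets', localeLang: 'pl-PL' })` keeps the caller's locale while taking the date formats and currency symbols from the preset.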

* chore: remove AI planning

Signed-off-by: Saulo Vallory <me@saulo.engineer>

* fix: respect scope when auto-registering TRUE/FALSE

Only global named expressions now block Google Sheets TRUE/FALSE defaults, so sheet-scoped TRUE/FALSE entries no longer prevent global registration.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: apply Google Sheets defaults when switching mode

Ensure updateConfig({ compatibilityMode: 'googleSheets' }) reapplies Google Sheets defaults for dateFormats, localeLang, and currencySymbol unless explicitly provided by the caller.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: restore default preset when leaving googleSheets mode

Ensure updateConfig() re-applies default date, locale, and currency values when compatibilityMode changes from googleSheets to default unless users explicitly override them.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: drop internal TRUE/FALSE when leaving googleSheets mode

Mark compatibility-mode auto-registered TRUE/FALSE named expressions as internal and exclude them during rebuild when switching to default mode. This preserves expected default-mode NAME errors while keeping user-defined named expressions intact.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Signed-off-by: Saulo Vallory <me@saulo.engineer>
Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: track cell dependencies inside inline array literals

Add missing AstNodeType.ARRAY case to collectDependencies, so cell
references in inline arrays (={A1, B1}) are registered as dependencies
and updated when referenced cells change.

Also adds design docs and implementation plan for the upcoming Google
Sheets compatibility mode feature.

* feat: add Google Sheets plugin override infrastructure

Create `src/interpreter/plugin/googleSheets/` folder for dedicated
GSheets function override plugins. When `compatibilityMode` is
`'googleSheets'`, FunctionRegistry loads these plugins after the
defaults, silently replacing overridden functions via Map.set.

The plugins array is currently empty — individual override plugins
(SPLIT, statistical, financial, etc.) will be added in subsequent PRs.
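As a rough illustration of the "load overrides after the defaults" idea, the sketch below registers plugins in order into a `Map`, so a later `set` for the same function id replaces the earlier entry. `PluginSketch` and `buildRegistrySketch` are hypothetical names; the real FunctionRegistry is more involved.

```typescript
type FunctionImpl = () => string;

// Hypothetical, simplified plugin shape: function id -> implementation.
interface PluginSketch {
  functions: Record<string, FunctionImpl>;
}

// Default plugins are registered first, override plugins last.
// Map.set on an existing key silently replaces the previous value.
function buildRegistrySketch(
  defaults: PluginSketch[],
  overrides: PluginSketch[],
): Map<string, FunctionImpl> {
  const registry = new Map<string, FunctionImpl>();
  for (const plugin of [...defaults, ...overrides]) {
    for (const id of Object.keys(plugin.functions)) {
      registry.set(id, plugin.functions[id]);
    }
  }
  return registry;
}
```

Because the override list is simply appended, an empty list (as in this commit) leaves the default registry unchanged.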

* chore: remove docs/plans

Signed-off-by: Saulo Vallory <me@saulo.engineer>

* fix: reapply compatibility presets on mode switches

Ensure compatibility preset values are re-applied when updateConfig changes compatibilityMode, while still honoring explicit overrides in the same update call.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: remove duplicate compatibility transition tests

Keep compatibility-mode coverage focused by deleting redundant updateConfig transition tests and documenting the cleanup in the changelog, while adding the development Karma config entry point.

---------

Signed-off-by: Saulo Vallory <me@saulo.engineer>
Co-authored-by: Cursor <cursoragent@cursor.com>

Implement Google Sheets SPLIT function signature:
  SPLIT(text, delimiter, [split_by_each], [remove_empty_text])

Differences from HyperFormula's built-in SPLIT(string, index):
- Returns a horizontal spilled array instead of a single word
- split_by_each (default true): treats each delimiter character as a
  separate delimiter vs. using the whole string
- remove_empty_text (default true): filters out empty strings from result
- Regex-special characters in delimiter are properly escaped

Uses splitArraySize for static size prediction required by HF's array
spill mechanism. Falls back to config.maxColumns for non-literal text
arguments (documented trade-off: over-prediction pads with EmptyValue).
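The splitting rules listed above can be sketched in plain TypeScript, independent of HyperFormula's plugin and spill machinery. `splitLikeGSheets` and `escapeRegExp` are hypothetical helper names, and rejecting an empty delimiter is an assumption made here, since the commit message does not cover that case.

```typescript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function splitLikeGSheets(
  text: string,
  delimiter: string,
  splitByEach: boolean = true,
  removeEmptyText: boolean = true,
): string[] {
  if (delimiter === '') {
    // Assumption: the empty-delimiter behavior is not described in the source.
    throw new RangeError('delimiter must not be empty');
  }
  // splitByEach: every delimiter character is its own separator;
  // otherwise the whole delimiter string is one separator.
  const source = splitByEach
    ? Array.from(delimiter).map(escapeRegExp).join('|')
    : escapeRegExp(delimiter);
  const parts = text.split(new RegExp(source));
  return removeEmptyText ? parts.filter((part) => part !== '') : parts;
}
```

For example, `splitLikeGSheets('a,b;c', ',;')` splits on either character, while passing `false` as the third argument only splits where the two-character sequence `,;` appears.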
Copy formula catalog, test manifest (with GSheets expected values),
and GSheets CSV export from ghee-sheets for independent compat testing.

papaparse is used for CSV generation and parsing in the compatibility
testing pipeline scripts.

Defines allowlist for gated functions and classification sets
for volatile and GSheets-only functions.

Value comparison with numeric tolerance, cell data builders for
HyperFormula sheet arrays, and compatibility report printer.

- Import DetailedCellError from src/CellValue (verified path)
- Use .value property for error string representation matching GSheets format
- parseA1: converts A1 notation to [col, row] indices
- buildSheetData: constructs 2D array with cell data and formula placement
- getFormulaAddress: returns A1 address where formula is placed
- cellValueToString: normalizes HyperFormula values for comparison
- valuesMatch: compares values with case-insensitive matching and numeric tolerance
- printReport: formatted compatibility report with per-category breakdown
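To make the A1 conversion concrete, here is a minimal sketch of what a helper like `parseA1` might do. The zero-based `[col, row]` result and the name `parseA1Sketch` are assumptions; the commit message only says the helper returns `[col, row]` indices.

```typescript
// Convert an A1-style address such as "C701" into [col, row] indices.
// Assumption: both indices are zero-based, so "A1" maps to [0, 0].
function parseA1Sketch(address: string): [number, number] {
  const match = /^([A-Za-z]+)([1-9][0-9]*)$/.exec(address);
  if (match === null) {
    throw new Error('Invalid A1 address: ' + address);
  }
  const letters = match[1].toUpperCase();
  let col = 0;
  for (let i = 0; i < letters.length; i++) {
    // Column letters form a base-26 number where 'A' (char code 65) is 1.
    col = col * 26 + (letters.charCodeAt(i) - 64);
  }
  return [col - 1, Number(match[2]) - 1];
}
```

Under these assumptions `parseA1Sketch('C701')` yields column index 2 and row index 700, which is how a cell data block "at row 700+" would be addressed in a 2D sheet array.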
Evaluates all ~512 formula functions through HyperFormula with
compatibilityMode: 'googleSheets' and compares results against
Google Sheets reference values. Prints a compatibility report
table and fails on allowlisted function mismatches.

- Skips volatile functions (NOW, RAND, etc.) and GSheets-only
  functions (GOOGLEFINANCE, IMPORTDATA, etc.)
- Uses buildSheetData/getFormulaAddress helpers for cell data placement
- 60s timeout to handle large formula evaluation runs
- MUST_MATCH_FUNCTIONS gate: only fails on explicitly allowlisted functions

When no cellData is present, the formula was incorrectly placed at
row 2 while getFormulaAddress returned A1. Now formulas without
cellData are placed directly at A1, fixing evaluation for ~480
self-contained formulas.

Converts formula-functions.json into formula-compat-tests.json with
self-contained formulas using inline arrays and cell data maps.
Adapted from ghee-sheets with updated paths for HyperFormula.

Reads the Google Sheets CSV export and writes expectedValue fields
back into formula-compat-tests.json for use by the compat test.

Converts formula-compat-tests.json into a CSV file for import into
Google Sheets. Formulas in column C are evaluated by GSheets, and
cell data blocks are placed at row 700+.

- test:gsheets-compat: run the compatibility test
- gsheets:generate-tests: regenerate test manifest from function catalog
- gsheets:generate-csv: generate CSV for Google Sheets import
- gsheets:patch-values: patch expected values from GSheets CSV export

Add 80+ functions confirmed to match GSheets output across Math,
Logical, Text, Engineering, Statistical, Lookup, and Date categories.
The test will fail if any of these regress.

- Import DetailedCellError from '../../src' (public API) instead of
  the internal '../../src/CellValue' path
- Remove empty/zero cross-equivalence that could mask real mismatches
- Restrict case-insensitive comparison to boolean literals only

GSheets returns #N/A for AVERAGEIFS with inline arrays (same as
SUMIFS/MAXIFS/MINIFS) and #VALUE! for NETWORKDAYS/WORKDAY/NETWORKDAYS.INTL/
WORKDAY.INTL when holidays are passed as inline arrays.

Move all five functions to CELL_REF_FORMULAS so compatibility tests verify
actual function behavior against real cell ranges instead of comparing two
error values.

- AVERAGEIFS: uses C701:C710 and D701:D710 cell ranges
- NETWORKDAYS, NETWORKDAYS.INTL, WORKDAY, WORKDAY.INTL: new HOLIDAY_CELL_DATA
  block at rows 781-782 with date strings for the holidays parameter

Add generate-formula-compat-tests.spec.ts to assert that these functions
produce cell-ref-based test cases in the generated fixture.

The CSV generator was missing the HOLIDAY_CELL_DATA block (rows 781-782)
added for NETWORKDAYS/WORKDAY functions. Without it, cells A781:A782 would
be empty when imported into Google Sheets, producing incorrect reference
values for all four date functions.

Add holiday data block at rows 780-782 to match the cell layout defined
in generate-formula-compat-tests.ts. Also extend the CSV test to assert
this block is emitted at the correct row position.

Two related fixes:

1. parseGSheetsValue did not handle GSheets-formatted output like `$6.25`,
   `-$1,848.51`, or `9%`. These were falling through to the string path,
   causing financial functions (DB, DDB, FV, IPMT, IRR, MIRR, NPV, PMT,
   PPMT, PV, RATE, SLN, SYD, VDB) to store string expectedValues instead
   of numbers. Extract the function into script/parse-gsheets-value.ts with
   currency and percent parsing, and re-export it from patch-expected-values.ts.

2. formula-compat-gsheets.csv had stale inline-array formulas for NETWORKDAYS,
   NETWORKDAYS.INTL, WORKDAY, and WORKDAY.INTL with #VALUE! expected values.
   Update formula text to match the new cell-ref formulas (A781:A782) and
   clear expected values so they are treated as pending until re-run with
   real GSheets output.

Add patch-expected-values.spec.ts with 29 tests covering all GSheets output
formats including edge cases.
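The currency and percent handling described in fix 1 might look roughly like the sketch below. `parseFormattedNumber` is a hypothetical stand-in for `parseGSheetsValue`, and storing `9%` as the fraction `0.09` is an assumption, since the commit message does not state the numeric form. The leading-zero handling added in a later commit is deliberately left out.

```typescript
// Turn a Google Sheets display string into a number when it is clearly
// numeric (plain, currency, or percent); otherwise return it as a string.
function parseFormattedNumber(raw: string): number | string {
  const text = raw.trim();
  if (text === '') return text;

  // Currency such as "$6.25" or "-$1,848.51": strip "$" and thousands commas.
  const currency = /^(-?)\$([0-9][0-9,]*(?:\.[0-9]+)?)$/.exec(text);
  if (currency !== null) {
    return Number(currency[1] + currency[2].replace(/,/g, ''));
  }

  // Percent such as "9%". Assumption: stored as a fraction (0.09).
  const percent = /^(-?[0-9]+(?:\.[0-9]+)?)%$/.exec(text);
  if (percent !== null) {
    return Number(percent[1]) / 100;
  }

  const plain = Number(text);
  return isNaN(plain) ? text : plain;
}
```

Error strings such as `#N/A` fall through every numeric branch and are returned unchanged, so they still compare as strings.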
Two low-severity fixes:

1. valuesMatch used parseFloat() for numeric comparison, which accepts
   partial numeric prefixes: parseFloat("4-9i") and parseFloat("4+9i")
   both return 4, and parseFloat("7/20/1969") returns 7, causing complex
   number and date strings to falsely match. Switch to Number(), which
   returns NaN whenever a non-empty string is not numeric as a whole,
   preventing these false positives.

2. Remove the unused re-export of parseGSheetsValue from
   patch-expected-values.ts. The function is defined in and imported
   directly from parse-gsheets-value.ts; the re-export was dead code
   in a CLI script that should not export anything.

Add helpers.spec.ts with 15 tests covering exact match, boolean,
numeric tolerance, and the complex/date false-positive edge cases.
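The difference between the two parsing functions is easy to reproduce in isolation. The two comparison helpers below are hypothetical and only contrast prefix parsing with whole-string parsing; they are not the actual `valuesMatch` code.

```typescript
// parseFloat stops at the first character it cannot parse and keeps the prefix.
function matchWithParseFloat(a: string, b: string): boolean {
  return parseFloat(a) === parseFloat(b);
}

// Number() requires the whole string to be numeric, otherwise it yields NaN.
function matchWithNumber(a: string, b: string): boolean {
  const x = Number(a);
  const y = Number(b);
  return !isNaN(x) && !isNaN(y) && x === y;
}

// Both complex numbers share the prefix "4", so parseFloat sees them as equal.
const falsePositive = matchWithParseFloat('4-9i', '4+9i');
// Number() returns NaN for both, so the numeric branch is skipped.
const rejected = matchWithNumber('4-9i', '4+9i');
```

Here `falsePositive` is `true` and `rejected` is `false`, while genuinely numeric strings such as `'3.50'` and `'3.5'` still match under `matchWithNumber`.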
Number("010011") returns 10011, stripping leading zeros from base-conversion
function output (BASE, BIN2HEX, BIN2OCT, DEC2BIN, DEC2HEX, DEC2OCT, HEX2OCT,
OCT2BIN, OCT2HEX). These 9 functions return zero-padded strings that must stay
as strings to compare correctly against HyperFormula output.

Add early return for strings with length > 1 that start with "0", before
the plain numeric parsing step. Add 6 leading-zero test cases to
patch-expected-values.spec.ts.

The previous check (startsWith("0") && length > 1) incorrectly treated
decimal values like "0.5", "0.0051", "0.004677734981" as strings instead
of numbers, affecting ~20+ formula expected values in the CSV.

The correct guard for base-conversion output is: starts with "0" AND
the second character is not "." (i.e., not a decimal point). This
correctly preserves "010011" and "01100100" as strings while still
parsing "0.5" as the number 0.5.

Add 3 decimal edge-case tests to patch-expected-values.spec.ts.
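The corrected guard can be written as a small predicate. `isZeroPaddedString` and `parsePlainValue` are hypothetical names used only for this sketch, which covers the plain numeric step and nothing else.

```typescript
// True for zero-padded output such as "010011", false for decimals like "0.5".
function isZeroPaddedString(text: string): boolean {
  return text.length > 1 && text.charAt(0) === '0' && text.charAt(1) !== '.';
}

// Keep zero-padded strings as strings; parse everything else numerically
// when the whole string is a number.
function parsePlainValue(text: string): number | string {
  if (text === '' || isZeroPaddedString(text)) return text;
  const parsed = Number(text);
  return isNaN(parsed) ? text : parsed;
}
```

With this guard `parsePlainValue('010011')` stays the string `'010011'`, whereas `parsePlainValue('0.5')` becomes the number `0.5`.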
Number("") evaluates to 0 in JavaScript, not NaN. Without protection,
valuesMatch("", "0") returns true because both sides pass the !isNaN check
and compare equal. This causes a false positive when HyperFormula returns
null (converted to "" by cellValueToString) and the GSheets expected value
is "0".

Treat empty strings as NaN before the numeric comparison branch, matching
the existing guard already present in parseGSheetsValue. Add 3 tests for
the empty-string/zero edge case in helpers.spec.ts.
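Putting the comparison rules from these commits together, a matcher along the following lines would avoid the empty-string false positive. This is a sketch, not the code in the helpers module: `valuesMatchSketch`, `toNumberOrNaN`, and the relative tolerance of `1e-9` are all assumptions.

```typescript
// Number("") is 0 in JavaScript, so map empty or blank strings to NaN first.
function toNumberOrNaN(text: string): number {
  return text.trim() === '' ? NaN : Number(text);
}

function valuesMatchSketch(actual: string, expected: string, tolerance: number = 1e-9): boolean {
  if (actual === expected) return true;

  // Numeric branch: only taken when both sides are numeric as a whole.
  const a = toNumberOrNaN(actual);
  const b = toNumberOrNaN(expected);
  if (!isNaN(a) && !isNaN(b)) {
    const scale = Math.max(1, Math.abs(a), Math.abs(b));
    return Math.abs(a - b) <= tolerance * scale;
  }

  // Case-insensitive comparison is restricted to boolean literals.
  const isBooleanLiteral = (s: string): boolean => /^(true|false)$/i.test(s);
  if (isBooleanLiteral(actual) && isBooleanLiteral(expected)) {
    return actual.toLowerCase() === expected.toLowerCase();
  }
  return false;
}
```

In this sketch `valuesMatchSketch('', '0')` is `false`, because the empty side never enters the numeric branch.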
The previous formulas appended 5 to INLINE_DATA.numeric ({10;20;...;55}),
producing 11 unique values with no statistical mode. Both HyperFormula
and GSheets return #N/A for all-unique data, making the compatibility
test meaningless — it only verifies error-vs-error, not actual behavior.

Replace 5 with 10 (already in the numeric set) so the mode is 10 and
the functions return a real numeric result. Add tests to
generate-formula-compat-tests.spec.ts asserting that MODE formulas
contain a duplicate value.
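Why the appended value matters can be seen with a tiny mode calculation. `modeOrNull` is a hypothetical helper, and the sample arrays are made up for illustration; they are not the contents of INLINE_DATA.numeric. Tie-breaking between several equally frequent values is not addressed by the commit and is arbitrary here.

```typescript
// Return the most frequent value, or null when every value is unique
// (the case in which spreadsheets report #N/A for MODE).
function modeOrNull(values: number[]): number | null {
  const counts = new Map<number, number>();
  let best: number | null = null;
  let bestCount = 1; // a value must occur at least twice to be a mode
  for (const value of values) {
    const count = (counts.get(value) ?? 0) + 1;
    counts.set(value, count);
    if (count > bestCount) {
      best = value;
      bestCount = count;
    }
  }
  return best;
}

// Hypothetical sample data: appending 5 keeps all values unique,
// appending 10 creates a duplicate and therefore a mode.
const allUnique = modeOrNull([10, 20, 30, 5]);
const withDuplicate = modeOrNull([10, 20, 30, 10]);
```

`allUnique` is `null` and `withDuplicate` is `10`, which mirrors the error-versus-real-result distinction the fix is about.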
Run patch-expected-values.ts against the existing formula-compat-gsheets.csv
to populate expectedValue fields for 490 of 512 test cases. The remaining
22 are volatile, GSheets-only, or explicitly null in the GSheets export.

Adds script/compat-report.ts which evaluates all formulas through
HyperFormula and prints a per-category compatibility breakdown, overall
match percentage, and a list of functions with no GSheets reference value —
without requiring the full Jest test suite.

Registers the script as npm run gsheets:compat-report.

Add a comprehensive section explaining the full compat test pipeline:
pipeline diagram, file reference table, step-by-step run instructions,
how to update the GSheets reference CSV, and how to promote functions
to MUST_MATCH_FUNCTIONS.

@svallory

Opened against wrong repo by mistake.

@svallory svallory closed this Feb 23, 2026