test_visvalingam_whyatt_scales_subquadratic timing assertion fails on slow CI runners #2890

@brendancol

Describe the bug

test_visvalingam_whyatt_scales_subquadratic in xrspatial/tests/test_polygonize.py is a wall-clock benchmark. It asserts that each doubling of the input size increases the runtime of _visvalingam_whyatt by less than a hardcoded factor of 3.0 (line 1230). On the current CI runners the ratio lands around 3.4-3.5x, so the assertion fails in the run (ubuntu-latest, 3.14) fast-lane job, which is a required check.

The margin is thin enough that a rerun does not reliably clear it. In a batch of five parallel proximity PRs (June 2026) the test failed on four. gh run rerun --failed cleared it on #2864 and #2881, but on #2875 it failed three runs in a row (3.47x, 3.43x, 3.46x). The ubuntu fail-fast also cancels the macos and windows jobs, which hides otherwise-green matrix entries.

None of those PRs touch polygonize. The test blocks unrelated work.

Expected behavior

The benchmark should check sub-quadratic scaling without failing on slow or loaded runners. A red run should mean an actual O(n^2) regression, not runner contention.

Reproduction

Run on a loaded runner (or locally under load):

pytest xrspatial/tests/test_polygonize.py::TestPolygonize::test_visvalingam_whyatt_scales_subquadratic

Observed failure:

AssertionError: Time ratio for 2000 -> 4000 is 3.46x, expected < 3x (O(n^2) would give ~4x)
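
To put the observed ratio in context: if runtime behaves like n^k between two sizes, a per-doubling ratio r corresponds to a local exponent k = log2(r). The snippet below only illustrates that arithmetic. It is not code from the test, and implied_exponent is a hypothetical helper name.

```python
import math


def implied_exponent(ratio: float) -> float:
    """Local scaling exponent k implied by a per-doubling time ratio.

    Assumes runtime ~ n**k between n and 2n, so ratio = 2**k.
    """
    return math.log2(ratio)


# Ratios taken from the issue text: the 3.0 threshold, the observed
# 3.46x failure, and the ~4x an O(n^2) implementation would give.
for ratio in (3.0, 3.46, 4.0):
    print(f"{ratio:.2f}x per doubling -> exponent {implied_exponent(ratio):.2f}")
```

Under that assumption the 3.0 threshold corresponds to an exponent of about 1.58, the observed 3.46x to about 1.79, and O(n^2) to exactly 2.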

Suggested fix

The 3.0x threshold leaves no margin: current runners land at 3.4-3.5x, and a rerun only sometimes drops below the limit. Options, roughly in order of preference:

  • Fit log(time) vs log(n) and assert the slope is below a threshold (e.g. < 1.5) instead of checking per-step ratios. With several sizes in the fit, a single noisy timing moves the slope much less than it moves one ratio.
  • Take the median of several repeated measurements per size rather than one timed loop.
  • Loosen the threshold and add a comment explaining the margin.
  • Mark the test with a rerun plugin so one slow runner does not gate merge.
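
The first two options can be combined. The following is a minimal sketch, not the existing test: median_runtime and loglog_slope are hypothetical helper names, and the repeat count is a placeholder. It uses only the standard library and fits the slope by ordinary least squares.

```python
import math
import statistics
import time


def median_runtime(func, arg, repeats=5):
    """Median wall-clock time of func(arg) over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(n).

    A slope near 1 means roughly linear scaling; near 2 means quadratic.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Sketch of use inside the test (make_input and SLOPE_LIMIT are
# assumptions for illustration, not names from the repository):
#
#   sizes = [1000, 2000, 4000, 8000]
#   times = [median_runtime(_visvalingam_whyatt, make_input(n)) for n in sizes]
#   assert loglog_slope(sizes, times) < SLOPE_LIMIT
```

The slope limit would need calibrating against timings measured on the CI runners before it replaces the ratio check, since the fit still depends on wall-clock time.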

Additional context

The test was added in #2539 to guard against the original O(n^2) min-area scan. The scaling check is worth keeping; the wall-clock assertion is the fragile part. Refs: xrspatial/tests/test_polygonize.py:1190-1233. Surfaced while merging proximity PRs #2864, #2868, #2875, #2881.

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working), infrastructure (CI, benchmarks, and tooling), tests (Test coverage and parity)
