feat(rewrite): address assertion rewriting blind spots with systematic tests by RonnyPfannschmidt · Pull Request #14448 · pytest-dev/pytest

RonnyPfannschmidt · 2026-05-08T16:15:39Z

Summary

Note: This is an AI-assisted experiment (Cursor + Claude Opus 4) that executed a plan after assessing initial details about assertion rewriting blind spots. More work is needed to make this clean and production-ready — this PR has known weaknesses and is meant as a starting point for discussion, not a final implementation.

This PR builds on top of the walrus operator double-evaluation fix (#14445) and addresses additional blind spots in the assertion rewriter:

Add a systematic test infrastructure for assertion rewriting coverage (testing/test_assertrewrite_coverage.py)
Implement visit_Subscript for container[key] introspection (shows container and key separately in failure messages)
Implement visit_IfExp for ternary expression introspection (shows condition value while preserving short-circuit semantics)
Flatten method call display to show where result = obj.method(args) instead of nesting the bound-method intermediate

Before/After

Subscript (new)

# Before: assert 1 == 99
# After:  assert 1 == 99
#          +  where 1 = {'a': 1, 'b': 2}['a']

IfExp (new)

# Before: assert 0 == 99
# After:  assert 0 == 99
#          +  where 0 = (... if True else ...)

Method calls (improved)

# Before: assert 42 == 100
#          +  where 42 = compute()
#          +    where compute = Obj().compute
# After:  assert 42 == 100
#          +  where 42 = Obj().compute()

Known weaknesses / areas for improvement

The visit_IfExp only introspects the condition, not the branch values (to preserve short-circuit) — a more sophisticated implementation could emit an if/else block to get branch values too
Container literal introspection (list, dict, set literals) is still unaddressed (xfail)
Callable variables still show <function ...> repr instead of variable name (xfail)
The method call flattening may interact unexpectedly with deeply chained attribute access patterns not yet covered
No changelog fragment yet — this needs proper evaluation before merging
Includes the Walrus expression duplicate evaluation failures with rewrite #14445 walrus fix commits (those should be merged first or this PR rebased after)

Test plan

75 tests pass in testing/test_assertrewrite_coverage.py (61 pass + 14 edge cases)
3 remaining xfail for lower-priority blind spots (container literals, callable variable repr)
All 132 existing testing/test_assertrewrite.py tests pass (no regressions)
All 162 testing/test_assertion.py tests pass (no regressions)
Single-evaluation guarantees verified for all new visitors
Short-circuit semantics preserved for IfExp
Pre-commit hooks pass (ruff, mypy, codespell)

Made with Cursor

Fixes pytest-dev#14445 - assertion rewriting evaluated NamedExpr (:=) expressions multiple times, causing side effects to fire repeatedly. The root cause was the `variables_overwrite` mechanism which stored and re-evaluated NamedExpr AST nodes in subsequent assertions, in `_call_reprcompare`'s results tuple, and in explanation formatting. The fix: - visit_NamedExpr: reference the target variable in explanations instead of re-evaluating the full expression - visit_Compare: assign left-side NamedExpr to a temp before right-side hoisting; freeze left_res when a comparator walrus targets the same name; replace NamedExpr entries in `results` with target variables - visit_BoolOp: capture short-circuit condition in a stable temp for the explanation path; remove walrus target rename logic - visit_Call: remove variables_overwrite substitution (walrus now properly assigns to user variables in its natural evaluation position) - Remove variables_overwrite, scope tracking, Sentinel class Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

Add tests for two remaining walrus double-evaluation scenarios: - Bare NamedExpr as BoolOp operand evaluated twice via condition check - Same walrus target in chained comparison evaluated multiple times Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

Use the already-assigned res_var to build the short-circuit condition instead of the raw visitor result, preventing bare NamedExpr operands from being evaluated a second time when checking truthiness. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

In a chained comparison like `(x := f()) < (x := g()) < (x := h())`, each NamedExpr comparator is now assigned to a temp variable so it evaluates exactly once. Previously the raw NamedExpr node would be reused as left_res in the next iteration, causing double evaluation. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

Create testing/test_assertrewrite_coverage.py with reusable helpers for verifying assertion rewriting behavior across all expression types: - get_failure_message: compile rewritten source and extract failure text - assert_introspects: verify failure messages contain expected intermediates - assert_single_evaluation: verify no double-evaluation of side effects - assert_passes_when_true: verify rewritten asserts don't false-positive - assert_semantically_equivalent: verify rewrite preserves pass/fail semantics Includes smoke tests validating the helpers themselves. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Add parametrized test classes that document current assertion rewriting behavior for each AST expression type: - Compare, BoolOp, UnaryOp, BinOp: verified working with correct output - Call: works but shows <function ...> repr for local/variable callables - Attribute, Name, Walrus: verified working - Subscript, IfExp, ContainerLiteral: marked xfail (blind spots) - MethodCall: shows noisy bound-method intermediate (xfail) - Comprehension, FString: semantics preserved, result shown in compare The xfail tests document expected improvements and will be unmarked as each blind spot is addressed in subsequent commits. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Add TestSingleEvaluation class that systematically verifies no expression is evaluated multiple times during assertion rewriting. Covers: - Calls in compare, boolean, unary, binop contexts - Attribute/property access - Subscript (dict __getitem__) - Walrus operator in compare, boolean, and chained compare - Method calls - Nested calls (inner + outer counted separately) - Multiple comparators in chained comparisons - IfExp conditions - Comprehension generators All tests pass, confirming the pytest-dev#14445 fix holds across all expression types. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Implement a dedicated visitor for ast.Subscript in AssertionRewriter that decomposes container[key] expressions into separate container and key introspection. This produces failure messages like: assert 1 == 99 + where 1 = {'a': 1, 'b': 2}['a'] Previously, subscript expressions hit generic_visit and only showed the final value without decomposition into container and key. Slices (a[1:3]) still fall back to generic_visit since decomposing start/stop/step is rarely useful in assertion messages. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Implement a dedicated visitor for ast.IfExp that introspects the condition value while preserving short-circuit semantics. Produces failure messages like: assert 0 == 99 + where 0 = (... if True else ...) The condition is rewritten for introspection (showing its evaluated value), but branches are kept as-is to preserve Python's short-circuit behavior — only the selected branch is evaluated. Previously, IfExp hit generic_visit showing only the final result without any insight into which branch was taken or why. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Refactor visit_Call to detect obj.method() patterns (where call.func is an ast.Attribute) and produce a flat explanation format: assert 42 == 100 + where 42 = Obj().compute() Instead of the previous nested format: assert 42 == 100 + where 42 = compute() + where compute = Obj().compute The bound-method intermediate line was noisy and unhelpful — users want to see what object the method was called on and what it returned, not the method object itself. Regular function calls (non-attribute) are unchanged. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Add TestEdgeCases class combining the new visitors (Subscript, IfExp, method call) with existing ones to verify correct behavior in complex scenarios: - Subscript with variable keys, call keys, and nested subscripts - Method calls with arguments, chained calls, and global objects - IfExp with call conditions - Walrus operator in subscript keys - Single-evaluation guarantees for all new visitors - Custom assert messages still work with new decomposition - Complex assertions combining multiple visitor types Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

Copilot

Pull request overview

This PR extends pytest’s assertion rewriting to reduce “blind spots” and regressions (notably around walrus operator single-evaluation), while introducing a new systematic test suite to track introspection/semantics coverage across expression types.

Changes:

Add targeted regression tests for #14445 (walrus operator double-evaluation) in testing/test_assertrewrite.py.
Introduce a new systematic assertion-rewriting coverage test module (testing/test_assertrewrite_coverage.py) with helpers for introspection, semantic equivalence, and single-evaluation checks.
Update the assertion rewriter to improve introspection for subscripts (container[key]), if-expressions (a if cond else b), and to flatten method-call explanations.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
testing/test_assertrewrite.py	Updates expectations for walrus/boolop output and adds #14445 regression tests.
testing/test_assertrewrite_coverage.py	New structured coverage suite for assertion rewriting behaviors (introspection/semantics/single-eval).
src/_pytest/assertion/rewrite.py	Implements new visitors (Subscript/IfExp), flattens method call formatting, and refines walrus-related evaluation behavior.
changelog/14445.bugfix.rst	Adds changelog entry documenting the walrus double-evaluation fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    # Run without rewriting
+    plain_code = compile(src, "<test-plain>", "exec")
+    plain_ns: dict[str, object] = {}
+    if extra_ns is not None:
+        plain_ns.update(extra_ns)
+    exec(plain_code, plain_ns)
+    plain_func = cast(Callable[[], None], plain_ns["check"])
+    plain_raised = False
+    try:
+        plain_func()
+    except AssertionError:
+        plain_raised = True
+
+    # Run with rewriting
+    mod = _rewrite_source(src)
+    rewritten_code = compile(mod, "<test-rewritten>", "exec")
+    rewritten_ns: dict[str, object] = {}
+    if extra_ns is not None:
+        rewritten_ns.update(extra_ns)
+    exec(rewritten_code, rewritten_ns)


Update raises_group.py::test_assert_matches to expect the new flat method call format (where False = RaisesExc(TypeError).matches(...)) instead of the old nested format with a separate bound-method line. Also address Copilot review: use copy.deepcopy in assert_semantically_equivalent to isolate mutable state between the plain and rewritten execution runs. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>

RonnyPfannschmidt and others added 12 commits May 8, 2026 10:37

Add changelog fragment for pytest-dev#14445

9f371c7

Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>

Copilot AI review requested due to automatic review settings May 8, 2026 16:15

psf-chronographer Bot added the bot:chronographer:provided (automation) changelog entry is part of PR label May 8, 2026

Copilot started reviewing on behalf of RonnyPfannschmidt May 8, 2026 16:16 View session

Copilot AI reviewed May 8, 2026

View reviewed changes

RonnyPfannschmidt mentioned this pull request May 9, 2026

Fix walrus double eval #14446

Draft

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(rewrite): address assertion rewriting blind spots with systematic tests#14448

feat(rewrite): address assertion rewriting blind spots with systematic tests#14448
RonnyPfannschmidt wants to merge 13 commits intopytest-dev:mainfrom
RonnyPfannschmidt:ronny/assert-rewrite-blind-spots

RonnyPfannschmidt commented May 8, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

RonnyPfannschmidt commented May 8, 2026

Summary

Before/After

Subscript (new)

IfExp (new)

Method calls (improved)

Known weaknesses / areas for improvement

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants