feat(rewrite): address assertion rewriting blind spots with systematic tests#14448
Open
RonnyPfannschmidt wants to merge 13 commits intopytest-dev:mainfrom
Open
feat(rewrite): address assertion rewriting blind spots with systematic tests#14448RonnyPfannschmidt wants to merge 13 commits intopytest-dev:mainfrom
RonnyPfannschmidt wants to merge 13 commits intopytest-dev:mainfrom
Conversation
Fixes pytest-dev#14445 - assertion rewriting evaluated NamedExpr (:=) expressions multiple times, causing side effects to fire repeatedly. The root cause was the `variables_overwrite` mechanism which stored and re-evaluated NamedExpr AST nodes in subsequent assertions, in `_call_reprcompare`'s results tuple, and in explanation formatting. The fix: - visit_NamedExpr: reference the target variable in explanations instead of re-evaluating the full expression - visit_Compare: assign left-side NamedExpr to a temp before right-side hoisting; freeze left_res when a comparator walrus targets the same name; replace NamedExpr entries in `results` with target variables - visit_BoolOp: capture short-circuit condition in a stable temp for the explanation path; remove walrus target rename logic - visit_Call: remove variables_overwrite substitution (walrus now properly assigns to user variables in its natural evaluation position) - Remove variables_overwrite, scope tracking, Sentinel class Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>
Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>
Add tests for two remaining walrus double-evaluation scenarios: - Bare NamedExpr as BoolOp operand evaluated twice via condition check - Same walrus target in chained comparison evaluated multiple times Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>
Use the already-assigned res_var to build the short-circuit condition instead of the raw visitor result, preventing bare NamedExpr operands from being evaluated a second time when checking truthiness. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>
In a chained comparison like `(x := f()) < (x := g()) < (x := h())`, each NamedExpr comparator is now assigned to a temp variable so it evaluates exactly once. Previously the raw NamedExpr node would be reused as left_res in the next iteration, causing double evaluation. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Sonnet 4 <claude@anthropic.com>
Create testing/test_assertrewrite_coverage.py with reusable helpers for verifying assertion rewriting behavior across all expression types: - get_failure_message: compile rewritten source and extract failure text - assert_introspects: verify failure messages contain expected intermediates - assert_single_evaluation: verify no double-evaluation of side effects - assert_passes_when_true: verify rewritten asserts don't false-positive - assert_semantically_equivalent: verify rewrite preserves pass/fail semantics Includes smoke tests validating the helpers themselves. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Add parametrized test classes that document current assertion rewriting behavior for each AST expression type: - Compare, BoolOp, UnaryOp, BinOp: verified working with correct output - Call: works but shows <function ...> repr for local/variable callables - Attribute, Name, Walrus: verified working - Subscript, IfExp, ContainerLiteral: marked xfail (blind spots) - MethodCall: shows noisy bound-method intermediate (xfail) - Comprehension, FString: semantics preserved, result shown in compare The xfail tests document expected improvements and will be unmarked as each blind spot is addressed in subsequent commits. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Add TestSingleEvaluation class that systematically verifies no expression is evaluated multiple times during assertion rewriting. Covers: - Calls in compare, boolean, unary, binop contexts - Attribute/property access - Subscript (dict __getitem__) - Walrus operator in compare, boolean, and chained compare - Method calls - Nested calls (inner + outer counted separately) - Multiple comparators in chained comparisons - IfExp conditions - Comprehension generators All tests pass, confirming the pytest-dev#14445 fix holds across all expression types. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Implement a dedicated visitor for ast.Subscript in AssertionRewriter
that decomposes container[key] expressions into separate container and
key introspection. This produces failure messages like:
assert 1 == 99
+ where 1 = {'a': 1, 'b': 2}['a']
Previously, subscript expressions hit generic_visit and only showed
the final value without decomposition into container and key.
Slices (a[1:3]) still fall back to generic_visit since decomposing
start/stop/step is rarely useful in assertion messages.
Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Implement a dedicated visitor for ast.IfExp that introspects the
condition value while preserving short-circuit semantics. Produces
failure messages like:
assert 0 == 99
+ where 0 = (... if True else ...)
The condition is rewritten for introspection (showing its evaluated
value), but branches are kept as-is to preserve Python's short-circuit
behavior — only the selected branch is evaluated.
Previously, IfExp hit generic_visit showing only the final result
without any insight into which branch was taken or why.
Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Refactor visit_Call to detect obj.method() patterns (where call.func is
an ast.Attribute) and produce a flat explanation format:
assert 42 == 100
+ where 42 = Obj().compute()
Instead of the previous nested format:
assert 42 == 100
+ where 42 = compute()
+ where compute = Obj().compute
The bound-method intermediate line was noisy and unhelpful — users want
to see what object the method was called on and what it returned, not
the method object itself.
Regular function calls (non-attribute) are unchanged.
Co-authored-by: Cursor AI <ai@cursor.sh>
Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
Add TestEdgeCases class combining the new visitors (Subscript, IfExp, method call) with existing ones to verify correct behavior in complex scenarios: - Subscript with variable keys, call keys, and nested subscripts - Method calls with arguments, chained calls, and global objects - IfExp with call conditions - Walrus operator in subscript keys - Single-evaluation guarantees for all new visitors - Custom assert messages still work with new decomposition - Complex assertions combining multiple visitor types Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR extends pytest’s assertion rewriting to reduce “blind spots” and regressions (notably around walrus operator single-evaluation), while introducing a new systematic test suite to track introspection/semantics coverage across expression types.
Changes:
- Add targeted regression tests for #14445 (walrus operator double-evaluation) in
testing/test_assertrewrite.py. - Introduce a new systematic assertion-rewriting coverage test module (
testing/test_assertrewrite_coverage.py) with helpers for introspection, semantic equivalence, and single-evaluation checks. - Update the assertion rewriter to improve introspection for subscripts (
container[key]), if-expressions (a if cond else b), and to flatten method-call explanations.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| testing/test_assertrewrite.py | Updates expectations for walrus/boolop output and adds #14445 regression tests. |
| testing/test_assertrewrite_coverage.py | New structured coverage suite for assertion rewriting behaviors (introspection/semantics/single-eval). |
| src/_pytest/assertion/rewrite.py | Implements new visitors (Subscript/IfExp), flattens method call formatting, and refines walrus-related evaluation behavior. |
| changelog/14445.bugfix.rst | Adds changelog entry documenting the walrus double-evaluation fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+188
to
+207
| # Run without rewriting | ||
| plain_code = compile(src, "<test-plain>", "exec") | ||
| plain_ns: dict[str, object] = {} | ||
| if extra_ns is not None: | ||
| plain_ns.update(extra_ns) | ||
| exec(plain_code, plain_ns) | ||
| plain_func = cast(Callable[[], None], plain_ns["check"]) | ||
| plain_raised = False | ||
| try: | ||
| plain_func() | ||
| except AssertionError: | ||
| plain_raised = True | ||
|
|
||
| # Run with rewriting | ||
| mod = _rewrite_source(src) | ||
| rewritten_code = compile(mod, "<test-rewritten>", "exec") | ||
| rewritten_ns: dict[str, object] = {} | ||
| if extra_ns is not None: | ||
| rewritten_ns.update(extra_ns) | ||
| exec(rewritten_code, rewritten_ns) |
Update raises_group.py::test_assert_matches to expect the new flat method call format (where False = RaisesExc(TypeError).matches(...)) instead of the old nested format with a separate bound-method line. Also address Copilot review: use copy.deepcopy in assert_semantically_equivalent to isolate mutable state between the plain and rewritten execution runs. Co-authored-by: Cursor AI <ai@cursor.sh> Co-authored-by: Anthropic Claude Opus 4 <claude@anthropic.com>
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Note: This is an AI-assisted experiment (Cursor + Claude Opus 4) that executed a plan after assessing initial details about assertion rewriting blind spots. More work is needed to make this clean and production-ready — this PR has known weaknesses and is meant as a starting point for discussion, not a final implementation.
This PR builds on top of the walrus operator double-evaluation fix (#14445) and addresses additional blind spots in the assertion rewriter:
testing/test_assertrewrite_coverage.py)visit_Subscriptforcontainer[key]introspection (shows container and key separately in failure messages)visit_IfExpfor ternary expression introspection (shows condition value while preserving short-circuit semantics)where result = obj.method(args)instead of nesting the bound-method intermediateBefore/After
Subscript (new)
IfExp (new)
Method calls (improved)
Known weaknesses / areas for improvement
visit_IfExponly introspects the condition, not the branch values (to preserve short-circuit) — a more sophisticated implementation could emit an if/else block to get branch values too<function ...>repr instead of variable name (xfail)Test plan
testing/test_assertrewrite_coverage.py(61 pass + 14 edge cases)testing/test_assertrewrite.pytests pass (no regressions)testing/test_assertion.pytests pass (no regressions)Made with Cursor