Skip to content

Pass irep_idt by value (C++ Core Guidelines F.16) + CI checker#9014

Open
tautschnig wants to merge 24 commits into
diffblue:developfrom
tautschnig:f16-checker
Open

Pass irep_idt by value (C++ Core Guidelines F.16) + CI checker#9014
tautschnig wants to merge 24 commits into
diffblue:developfrom
tautschnig:f16-checker

Conversation

@tautschnig

@tautschnig tautschnig commented May 26, 2026

Copy link
Copy Markdown
Collaborator

irep_idt is a dstringt, which holds a single unsigned table index. Passing it by const irep_idt & was historically defensible because irep_idt could be aliased to std::string via USE_STD_STRING -- but that compatibility was removed in 4d935bf14f ("Remove support for irep_idt-as-std::string"). With dstringt as the only target, by-value aligns with [C++ Core Guidelines F.16].

Why this is worth doing (correctness, not speed)

@kroening asked whether this yields a speed-up. Honestly: no measurable one, and that's fine -- the motivation is correctness and style, not performance.

  • A latent dangling-reference bug, fixed. The F.16 conversion exposed an existing dangling-reference chain in require_goto_statements (a jbmc unit-test helper) that happened to work by accident on develop: a helper returned a reference into the entry-point instructions vector, which a caller bound to a const & whose lifetime no longer covered the use once the initialising parameter became by-value. This is fixed here (folded into 36dd78bc33), and is the strongest concrete argument for the change. The same class of latent issue is what gcc-14's -Wdangling-reference flags (see 5ef06bd40a).
  • Core-Guideline alignment / clarity. const irep_idt & advertises an indirection that no longer exists.

Microbenchmark

Passing irep_idt to a noinline function ~3×10⁸ times, comparing the two calling conventions on otherwise-identical work (return a == b;):

by-value: ~443 ms
by-ref:   ~443 ms   (difference < 0.3%, within run-to-run noise)

So at the call-convention level the change is performance-neutral; the cost of the by-reference indirection is lost in the noise (and is dominated by anything the callee actually does, e.g. a symbol_table lookup). The case rests on the correctness fix above and on Core-Guideline alignment.

What's in this PR

  • A CI check (scripts/check_pass_by_value.cpp + scripts/run_pass_by_value_check.sh + .github/workflows/syntax-checks.yaml, modelled on the existing check-irep-copies job) that flags any new parmVarDecl of const T & where T is in a configurable cheap-types list (dstringt for now). Matching is on the fully-qualified record name (getQualifiedNameAsString()), so it won't over-match unrelated types sharing a simple name. --fix rewrites findings in place.
  • Checker self-tests under regression/cbmc-pass-by-value/ (positive, by-value, non-cheap-by-reference, a namespaced-dstringt negative that locks in the qualified-name behaviour, and a --fix rewrite test), run from the same CI job.
  • Runner robustness: the scan now also covers unit/ and jbmc/unit/; a liveness canary (scripts/pass_by_value_canary.cpp) asserts the freshly built tool finds a known violation before scanning (so a broken toolchain can't pass silently against the empty baseline); and per-TU parse errors are counted and surfaced rather than discarded. (Note: a handful of generated/amalgamated parser TUs under unit/ don't parse standalone under clang and are reported as parse errors; they contribute no findings, which the surfaced count makes visible.)
  • The 2,049 existing violations cleaned up, applied module-by-module via the same tool with --fix and formatted with git-clang-format-15 so unrelated lines stay untouched.

Stats

527 files, +2,996/−2,694. Findings: 2,049 → 0.

Testing

unit (551 cases), java-unit (110 cases), and ctest -L CORE (96 tests) all pass. git-clang-format-15 clean. The checker self-test corpus passes, and the checker reports 0 findings across src/, jbmc/src/, unit/ and jbmc/unit/.

Notes for reviewers

  • The first six commits (checker + tool refinements) are independently useful; happy to split if preferred.
  • Twelve manual fixes were folded into the matching per-module commits: eleven const irep_idt & member declarations that would have become dangling references after their initialising parameter switched to by-value, and the require_goto_statements helper described above.
  • std::function<…(const irep_idt &…)…> typedefs were updated to std::function<…(irep_idt …)…> in lockstep with the lambdas/callables bound to them.
  • The -Wdangling-reference workaround (5ef06bd40a) binds results to values rather than using a pragma or -Wno-dangling-reference; the commit message now documents why (source-local, compiler-agnostic, copies are cheap refcount bumps).

Limitations / follow-ups

  • The matcher does not catch anonymous parameters in non-instantiated typedefs of std::function (Clang doesn't synthesise a ParmVarDecl for them). None exist on develop; a FunctionProtoTypeLoc walker would close the gap if any reappear.
  • Extending CHEAP_TYPES to nested STL types (e.g. goto_programt::const_targett) requires the fully-qualified, libstdc++-specific record name, not just the simple name -- documented in the source. A fuller positive/negative fixture corpus per category is a reasonable follow-up.

ABI

Source-compatible for callers; out-of-tree consumers of libcbmc need a recompile (the mangled names of f(irep_idt) and f(const irep_idt &) differ).

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (checker self-tests under regression/cbmc-pass-by-value/).
  • My commit message includes data points confirming performance improvements (if claimed). (No perf improvement is claimed; the microbenchmark above shows the change is performance-neutral.)
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig self-assigned this May 26, 2026
Copilot AI review requested due to automatic review settings May 26, 2026 20:06

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

@codecov

codecov Bot commented May 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.85714% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.64%. Comparing base (4033e01) to head (486b27f).

Files with missing lines Patch % Lines
src/goto-instrument/contracts/cfg_info.h 62.50% 3 Missing ⚠️
jbmc/src/java_bytecode/remove_exceptions.cpp 71.42% 2 Missing ⚠️
src/goto-checker/properties.cpp 33.33% 2 Missing ⚠️
...strument/contracts/dynamic-frames/dfcc_library.cpp 60.00% 2 Missing ⚠️
...c/java_bytecode/java_single_path_symex_checker.cpp 0.00% 1 Missing ⚠️
jbmc/src/java_bytecode/lazy_goto_functions_map.h 80.00% 1 Missing ⚠️
src/analyses/flow_insensitive_analysis.cpp 0.00% 1 Missing ⚠️
src/analyses/goto_rw.cpp 0.00% 1 Missing ⚠️
src/analyses/goto_rw.h 0.00% 1 Missing ⚠️
src/cpp/cpp_scope.cpp 66.66% 1 Missing ⚠️
... and 5 more
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #9014   +/-   ##
========================================
  Coverage    80.63%   80.64%           
========================================
  Files         1713     1713           
  Lines       189403   189478   +75     
  Branches        73       73           
========================================
+ Hits        152732   152796   +64     
- Misses       36671    36682   +11     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@kroening

Copy link
Copy Markdown
Collaborator

Does this actually yield a speed-up?

@tautschnig tautschnig force-pushed the f16-checker branch 5 times, most recently from 61d5136 to 8d5bf63 Compare May 29, 2026 19:29
@tautschnig

Copy link
Copy Markdown
Collaborator Author

Does this actually yield a speed-up?

Now confirmed by measurement (and this is now part of the changes): there is no speed-up, so we have several options:

  1. Completely discard this.
  2. Pick up fabeccb, which selectively moves to pass-by-value where there seems to be a danger of dangling references.
  3. Add the checker and CI job to move towards core-guidelines alignment eventually (even if there is no performance difference).
  4. Pick up all the proposed changes of this PR.

tautschnig and others added 9 commits June 12, 2026 11:36
After the F.16 cleanup, several functions now take irep_idt by value
instead of by const reference. gcc-14's -Wdangling-reference heuristic
then warns at call sites where the result is bound to a const
reference, because it can't see that those functions return a
reference into long-lived storage (a static array, the symbol table,
or a parent expression in the namespace) regardless of the by-value
arguments.

Bind the result to a value (or, where appropriate, restructure the
expression to avoid a named reference binding altogether):

  - get_sort_and_join_table_entry: returns into a static table.
    saj_tablet is 11 irep_idts (44 bytes); the copy is cheap.
  - get_mode_from_identifier: returns into a symbol or to ID_unknown.
  - get_field_init_expr / get_field_init_type: return into the
    shadow_memory state. The intermediate binding in
    get_field_init_type is removed entirely by chaining.
  - dfcc_utilst::get_function_symbol: returns into the symbol table.
  - get_contract: returns into the symbol table; the range-based
    `for(auto &target : get_contract(...).c_assigns())` is split
    into a named contract value to keep the iterator's referent alive
    (and silence the heuristic).
  - to_history_expr(...).expression(): returns into the parent expr.
  - require_goto_statements::require_declaration_of_name and
    get_unique_non_null_expression_assigned_to_symbol (jbmc unit
    helpers): return into the entry-point instructions vector.

The value-binding form was chosen over the alternatives -- a localised
`#pragma GCC diagnostic ignored "-Wdangling-reference"`, a project-wide
`-Wno-dangling-reference`, or a `[[gnu::no_dangling]]` attribute on the
returning functions -- to keep the change source-local and free of
compiler-specific machinery. The cost is negligible: irept-based types
(symbolt, exprt, code_with_contract_typet, ...) share their internal
storage by reference count, so each "copy" introduced here is a small,
fixed handful of refcount bumps rather than a deep copy.

There is no semantic change.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Adds scripts/check_pass_by_value.cpp, a Clang LibTooling AST-matcher
program that flags function parameters of cheap-to-copy types declared
as 'const T &'. This is a violation of C++ Core Guidelines F.16
("for in parameters, pass cheaply-copied types by value and others by
reference to const").

The set of cheap types is configured in CHEAP_TYPES at the top of the
file. It currently lists dstringt (the canonical name for irep_idt,
which holds a single unsigned table index, so passing it by value is
strictly cheaper than passing by const reference).

Skips:
  * parameters of copy/move constructors and copy/move assignment
    operators (the language requires their signature)
  * parameters of function-template instantiations
  * member functions of class-template specialisations
  * parameters whose source-as-written type is a (substituted)
    template type parameter or otherwise dependent

Build with the same clang-18 flags used by check_irep_moves.cpp; the
resulting scripts/check-pass-by-value binary is .gitignored.

Add --fix mode to the F.16 pass-by-value checker

With --fix, the tool runs as a clang::tooling::RefactoringTool and
emits a Replacement for each finding that rewrites the parameter's
type from `const T &` to `T `, preserving the author's spelling
(`irep_idt` vs. `dstringt`).  The replacement range spans from the
parameter declaration's BeginLoc (start of `const`) to the parameter
name's Location, which both covers the `const` keyword and absorbs
any whitespace between `&` and the name.

Limitations: only ParmVarDecls are touched.  Function-pointer
typedefs, std::function template arguments, and friend declarations
that mention `const irep_idt &` are not rewritten and remain in the
baseline as a follow-up.

Improve --fix robustness: --path-filter, manual save, own-param check

Three fixes uncovered while applying the rewriter to src/util:

* Add --path-filter so we can run on the full TU set (needed to ensure
  every header gets visited) but only modify files whose path contains
  the given substring.  Headers are sometimes only included from .cpps
  outside their owning module.

* Apply replacements manually rather than via runAndSave: the latter
  bails out without saving when any TU has a parse error, which we hit
  for optional solver back-ends with missing third-party headers.

* Skip ParmVarDecls that are not in their enclosing FunctionDecl's
  getParamDecl() list.  These show up for parameter names inside
  function-type template arguments such as
  `std::function<void(const irep_idt &k)>`, and rewriting them would
  desynchronise the in-class declaration from its out-of-class
  definition.

F.16 checker: warn on non-fixable parameter sites

The earlier --fix mode was deliberately strict about which parameters
to rewrite, returning early on parameters that are not directly owned
by their enclosing FunctionDecl (e.g. parameter names inside a
function-type template argument like
\`std::function<void(const irep_idt &k)>\`).  That early-return also
suppressed the warning, which means CI would not catch regressions
where someone added a new such site.

Split the logic so:

* The "is this fixable in place" decision (parameter is in FD's own
  getParamDecl() list) only suppresses the Replacement, not the
  warning.
* All other guards (system header, copy/move ctor or assignment,
  function-template instantiation, member of class-template
  specialisation) continue to suppress the warning entirely.
* The dependent-type filter now walks the type-sugar chain rather
  than only the top-level type, so e.g. \`const value_type &\` --
  where value_type is a typedef inside a class template instantiated
  with a cheap key type -- is still recognised as template-derived
  and skipped.

Also documents one remaining limitation: anonymous parameters in
non-instantiated typedefs of std::function (e.g.
\`using cb = std::function<void(const irep_idt &)>\` at namespace
scope) don't produce a ParmVarDecl in Clang's AST and so are not
caught.  A future TypeLoc-based walker could close this gap.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Records the 2049 currently-known sites where parameters of cheap-to-copy
types (dstringt / irep_idt) are passed by const reference.  These are
all real F.16 violations to be fixed incrementally; the baseline
mechanism allows CI to gate against introducing new ones without
requiring the wholesale fix in a single change.

Regenerate after a clean-up commit with:

    scripts/run_pass_by_value_check.sh build --regenerate-baseline

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Adds scripts/run_pass_by_value_check.sh, which:

* (re)builds scripts/check-pass-by-value if needed (mirrors the build
  recipe used for scripts/check-irep-moves on origin/irept-copies),
* runs the tool over src/ and jbmc/src/ in parallel via xargs, with
  each worker writing to its own per-pid temp file to avoid stdout
  interleaving across workers,
* normalises absolute paths in the output to repo-relative paths so
  that the baseline is portable,
* diffs the result against scripts/pass_by_value_baseline.txt:
    * new entries cause an exit code of 1 with a clear error message,
    * removed entries are reported as a notice (no failure) with the
      regenerate command,
* supports --regenerate-baseline as the second positional argument.

Wall-clock cost on a 36-core box is ~55 s.

CI: gate PRs on F.16 pass-by-value regressions

Adds a check-pass-by-value job to .github/workflows/syntax-checks.yaml
that builds scripts/check-pass-by-value and runs the runner script on
every pull request to develop.  PRs introducing new F.16 violations
(parameters of cheap-to-copy types passed by const reference) will
fail this job; PRs that reduce the number of violations will pass and
emit a notice inviting the contributor to regenerate the baseline.

Mirrors the existing check-irep-copies job in shape and runtime budget.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Document, reproducibly, the performance question raised on the F.16
cleanup: does passing irep_idt by value rather than by const reference
yield a speed-up?

The benchmark calls two otherwise-identical noinline functions -- one
taking irep_idt by value, one by const reference -- ~3e8 times over an
array of real irep_idts and times each loop. noinline forces the calling
convention to be honoured (by value passes the 4-byte index in a
register; by const reference passes a pointer the callee dereferences)
rather than inlining the difference away.

Result (ubuntu-24.04, g++ -O2, 5 runs, stable):

  by-value: ~443 ms
  by-ref:   ~443 ms     (difference < 0.3 %, within run-to-run noise)

So the change is performance-neutral at the calling-convention level; the
justification is correctness (it exposed and fixed a latent
dangling-reference bug) and C++ Core Guidelines F.16 alignment, not a
measurable speed-up. The file header records the methodology and the
build/run command.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
tautschnig and others added 15 commits June 12, 2026 11:36
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Manual fix folded in: jbmc/unit/java-testing-utils/require_goto_statements.cpp
helpers `get_unique_non_null_expression_assigned_to_symbol` and
`try_get_unique_symbol_assigned_to_symbol` previously returned
`const exprt &` and `const symbol_exprt *` respectively, both pointing
into a temporary `pointer_assignment_locationt` returned by value from
`find_pointer_assignments`. On `develop` this was already UB but
happened to point into memory that survived long enough for the
caller `get_ultimate_source_symbol` to read the identifier. After the
F.16 cleanup, the caller's `const auto &expr =` was changed to
`const auto expr =` (correctly forcing a copy of the dangling
reference), but the returned `const symbol_exprt *` then pointed
into a function-local variable, dangling immediately on return and
causing a SIGSEGV under java-unit on the
`generic_bases_test.cpp:48` scenario in the coverage CI runner.
Switch the helper to return `std::optional<irep_idt>` (the only
piece of data the caller uses) and have
`get_unique_non_null_expression_assigned_to_symbol` return `exprt`
by value, eliminating the chain of dangling references.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
After the cleanup commits, scripts/check-pass-by-value reports zero
violations across src/, jbmc/src/, unit/ and jbmc/unit/.  The baseline
is regenerated to an empty file, so the CI check effectively becomes
"fail on any new pass-by-value violation of a cheap-to-copy type".

Also: the runner no longer treats an empty tool output as an error.
That state was previously unreachable (the baseline always had
~2,000 entries) and now is the steady state.

The baseline mechanism itself is retained for forward compatibility:
when CHEAP_TYPES is extended (for example to add
goto_programt::const_targett), the new findings can be temporarily
baselined while the corresponding cleanup PR is staged.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants