feat(migration): Add options inventory script#123
Conversation
|
|
||
| if args.db_name: | ||
| from django.conf import settings as _s | ||
| _s.DATABASES['default']['NAME'] = args.db_name |
There was a problem hiding this comment.
Module-level side effects prevent safe import
Low Severity
Argument parsing (parse_args()), os.chdir(), sys.path manipulation, and Django configure() all execute at module level rather than inside a function guarded by if __name__ == '__main__':. Importing this module triggers directory changes, environment mutations, and full Django initialization as side effects. The sibling script option_inventory.py correctly places all setup inside main().
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 18da02c. Configure here.
| n_runtime = len(runtime_keys) + len(active_unknown_db) | ||
| total = len(registrations) + n_runtime |
There was a problem hiding this comment.
Bug: The 'runtime-only' filter in the HTML report shows a combined count but only filters for one category of options, leading to a mismatch between the displayed count and the results.
Severity: LOW
Suggested Fix
Modify the filter logic to correctly display all items that contribute to the 'runtime-only' count. One approach is to have the 'runtime-only' button set a status that the JavaScript filter can use to show both RUNTIME_ONLY and DB_ONLY items. For example, the filter could check for both statuses when the 'runtime-only' filter is active.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L959-L960
Potential issue: In the generated HTML report, the 'runtime-only' statistic combines the
count of options found only at runtime (`runtime_keys`) and options found only in the
database (`active_unknown_db`). The variable `n_runtime` stores this sum. However,
clicking the corresponding UI pill or filter button executes a JavaScript function
`setStatus('RUNTIME_ONLY')`. The filtering logic in `updateVisible` performs an exact
match for `data-status="RUNTIME_ONLY"`, which excludes items with
`data-status="DB_ONLY"`. This results in the number of displayed items being less than
the count shown on the filter button, creating a confusing user experience.
Also affects:
scripts/migration/option_inventory.py:1044~1044scripts/migration/option_inventory.py:1063~1063
| first_line = _git_log_first(repo_root, search, None) | ||
| if first_line: | ||
| break | ||
|
|
There was a problem hiding this comment.
Bug: The script can fetch the wrong commit when an option key is mentioned in an older commit, leading to inaccurate historical data in reports.
Severity: MEDIUM
Suggested Fix
To prevent matching unrelated historical mentions, the whole-repository fallback search should be made more specific. Instead of just searching for "key", the search should be scoped to find commits that actually modify a registration file or contain the register("key") pattern. This could involve post-processing the git log results to verify that the found commit actually contains a valid registration change.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L1209-L1212
Potential issue: The `fetch_first_commit` function uses a broad, whole-repository
fallback search for an option's key (e.g., `"key"`). This search uses `git log --reverse
-S`, which finds the oldest commit in the repository's history where the key string
appears. If the key was mentioned in an unrelated file (like documentation or a comment)
before it was formally registered, the script will identify this older, incorrect commit
as the origin. This results in displaying inaccurate metadata (author, date, SHA, commit
message) in the generated HTML and plan reports, potentially misleading engineers about
the option's history.
| local_vals: dict[str, ast.expr] = {} | ||
| for stmt in ast.walk(func_node): | ||
| if isinstance(stmt, ast.Assign) and len(stmt.targets) == 1: | ||
| t = stmt.targets[0] | ||
| if isinstance(t, ast.Name): | ||
| local_vals[t.id] = stmt.value |
There was a problem hiding this comment.
Bug: The use of ast.walk() to find variable assignments doesn't respect scope, allowing nested function assignments to shadow outer ones, leading to incorrect dynamic option analysis.
Severity: LOW
Suggested Fix
Modify the AST traversal to be scope-aware. Instead of ast.walk(), iterate only through the direct children of the function's body to avoid descending into nested function definitions.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L306-L311
Potential issue: The script uses `ast.walk(func_node)` to populate a `local_vals`
dictionary with variable assignments. This traversal includes nested functions, causing
assignments in inner scopes to overwrite those from outer scopes. If a variable used in
`options.get()` is shadowed in a nested function, the script may fail to extract the
correct f-string prefix. This could lead to an option being incorrectly classified as
`SAFE_CANDIDATE` when it should be `NEVER_DELETE`, potentially marking an active option
as safe for deletion.
| f'jsonPayload.event="option.seen" AND timestamp>="{start_ts}" AND timestamp<"{end_ts}"', | ||
| f"--project={PROJECT}", | ||
| "--format=json", | ||
| "--limit=10000", |
There was a problem hiding this comment.
Bug: The script uses a hardcoded --limit=10000 for GCP log queries, which likely exceeds the API's page size limit, and lacks pagination, risking silent data truncation.
Severity: HIGH
Suggested Fix
Implement pagination for the gcloud logging read command. Reduce the --limit to the documented maximum (e.g., 1000) and use the next_page_token from the API response to fetch all results for each time window. Add a warning if the number of results in a page matches the limit, indicating potential truncation.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/collect-seen-options.py#L62
Potential issue: The script `collect-seen-options.py` queries Google Cloud logs with a
hardcoded `--limit=10000`. The Google Cloud Logging API's documented maximum page size
is 1000. The script does not implement pagination or check if the number of returned
entries equals the limit, which would indicate truncation. If a time window contains
more log entries than the effective limit, any options seen in the truncated logs will
be missed. This could cause an active option to be incorrectly classified as safe to
delete, leading to its removal and potential production failures.
| window 719/720 2026-05-26T23:27:28Z → 2026-05-26T23:37:28Z (+0 new, 783 total) | ||
| window 720/720 2026-05-26T23:37:28Z → 2026-05-26T23:47:28Z (+0 new, 783 total) | ||
|
|
||
| === 783 unique options written to /Users/josh/dev/sentry-options/scripts/migration/seen.txt === |
There was a problem hiding this comment.
Committed gcloud run log artifact
Low Severity
The PR adds a 722-line collect-seen-options-out.txt that is stderr captured from a local make collect-seen-options run (720 window progress lines plus a machine-specific path). Workflow docs only treat seen.txt as the committed output; this log is ephemeral tee output and should not live in the repo.
Reviewed by Cursor Bugbot for commit 013021a. Configure here.
| 'author': author, 'date': date, 'message': message, | ||
| 'diff_snippet': _extract_relevant_hunk(diff.stdout, key), | ||
| 'github_url': f'{base}/commit/{sha}' if base else None, | ||
| } | ||
|
|
||
|
|
||
| def fetch_investigate_commits( | ||
| registrations: list[dict], |
There was a problem hiding this comment.
Bug: The subprocess.run call in fetch_first_commit lacks exception handling, which can crash the script if the git show command times out or fails.
Severity: MEDIUM
Suggested Fix
Wrap the subprocess.run call in fetch_first_commit within a try/except block to catch potential exceptions like subprocess.TimeoutExpired and FileNotFoundError. Return None in the except block, similar to the pattern used for _git_log_first.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L1228-L1235
Potential issue: The `subprocess.run` call on line 1221 within the `fetch_first_commit`
function is not wrapped in a `try/except` block. This can lead to unhandled
`subprocess.TimeoutExpired` or `FileNotFoundError` exceptions. Since this function is
executed in a `ThreadPoolExecutor`, any such exception will be propagated by `pool.map`
and crash the entire script. This behavior is inconsistent with other subprocess calls
in the file, which are properly guarded. The script will crash when run with `--html` or
`--plan` if a `git show` command takes longer than 30 seconds or if `git` is not
available.
Also affects:
scripts/migration/option_inventory.py:1251~1252
| <span class="option-key">{esc(key)}</span> | ||
| {db_html}{hints_html}<span class="found-tag {found_cls}">{esc(found)}</span> | ||
| </summary><div class="option-body"> | ||
| <div class="option-location">{loc}</div> | ||
| <pre class="decl">{esc(reg["declaration"])}</pre> | ||
| {usages_html} | ||
| {commit_html} | ||
| </div></details></div>''') |
There was a problem hiding this comment.
Bug: The 'runtime-only' stat pill's count includes active_unknown_db items, but the corresponding filter incorrectly hides them, creating a mismatch between the displayed count and the filtered results.
Severity: LOW
Suggested Fix
To fix the discrepancy, update the JavaScript filter logic. When the currentStatus is 'RUNTIME_ONLY', the filter should show elements where el.dataset.status is either 'RUNTIME_ONLY' or 'DB_ONLY'. This will ensure the filtered view matches the count displayed in the stat pill.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L959-L966
Potential issue: In the generated HTML report, the 'runtime-only' stat pill displays a
count, `n_runtime`, which is calculated as the sum of `runtime_keys` and
`active_unknown_db`. However, clicking this pill triggers a JavaScript filter
`setStatus('RUNTIME_ONLY')`. This filter only shows elements with
`data-status="RUNTIME_ONLY"`. The cards generated from `active_unknown_db` are assigned
`data-status="DB_ONLY"` and are therefore hidden. This creates a misleading user
interface where the number in the stat pill does not match the number of items shown
after filtering, potentially confusing developers who rely on this tool for option
cleanup.
this removes 93 unused options identified via tracking read options via logging in #115610 (which this PR also reverts as I'm done collecting data) cross referenced with static analysis in getsentry/sentry-options#123 the details are in getsentry/sentry-options#123 - `scripts/migration/safe.txt` is essentially the set intersection of statically unreferenced options and unseen/unread options over 120 hours of GCP production logs. the unseen list was determined by subtracting options.seen unique events from all known options (options.all()) and I observed no newly seen options beyond the ~100 hour mark (`scripts/migration/collect-seen-options-out.txt`). getsentry options that were unused are removed here too: getsentry/getsentry#20449
| if key in seen: | ||
| skipped_seen.append(key) | ||
| else: | ||
| safe.append(key) |
There was a problem hiding this comment.
Deletion list ignores DB values
High Severity
safe-to-delete-options.py only excludes keys in seen.txt, not options with has_db_value in inventory.csv. Docs say DB-backed options are NEVER_DELETE, but classification never uses active_db_keys, so delete-options.py can still target keys with live DB values.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 8ea61a4. Configure here.
|
|
||
| # Bounded by an enumerated static dict (e.g. killswitch registry). | ||
| if attr.bounded_keys: | ||
| continue # assume this name draws from the registry |
There was a problem hiding this comment.
Killswitch registry skips unresolved
Medium Severity
For options.get(variable), if the file defines any ALL_*_OPTIONS registry, every unresolved variable is treated as registry-backed and the file may be omitted from truly_uncertain, pushing unrelated options toward SAFE_CANDIDATE.
Reviewed by Cursor Bugbot for commit 8ea61a4. Configure here.
|
|
||
| PROJECT = "internal-sentry" | ||
| INTERVAL = 600 # seconds per window | ||
| LOOKBACK = 432000 # 120 hours |
There was a problem hiding this comment.
Lookback window mismatch
Medium Severity
collect-seen-options.py uses a 120-hour LOOKBACK, while UNUSED-OPTIONS.md describes a 24-hour GCP log walk. Operators following the doc get different coverage than the script implements.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 8ea61a4. Configure here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 6 total unresolved issues (including 5 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5848d8c. Configure here.
| for entry in entries: | ||
| key = entry.get("jsonPayload", {}).get("option_key") | ||
| if key and key not in seen: | ||
| seen[key] = window_label |
There was a problem hiding this comment.
Log query cap drops option keys
Medium Severity
Each GCP window uses --limit=10000 with no pagination. Busy 10-minute intervals can return more than 10,000 option.seen entries, so some keys never enter seen.txt and may be wrongly treated as unused in the deletion pipeline.
Reviewed by Cursor Bugbot for commit 5848d8c. Configure here.
| return None | ||
|
|
||
| diff = subprocess.run( | ||
| ['git', '-C', repo_root, 'show', '--unified=4', '--no-color', sha, '--', source_short], | ||
| capture_output=True, text=True, timeout=30, env=_GIT_ENV, | ||
| ) | ||
| base = github_urls.get(repo_root) |
There was a problem hiding this comment.
Bug: The subprocess.run call for git show in fetch_first_commit is not wrapped in a try/except block, which can crash the script on timeout or other errors.
Severity: MEDIUM
Suggested Fix
Wrap the subprocess.run call in fetch_first_commit within a try...except Exception block. On exception, log the error and return None to prevent the entire script from crashing, similar to the error handling in the _git_log_first helper function.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/option_inventory.py#L1279-L1285
Potential issue: The `subprocess.run` call executing `git show` within the
`fetch_first_commit` function lacks exception handling. This function is executed in a
worker thread via `ThreadPoolExecutor.map`. A failure in the subprocess, such as a
`subprocess.TimeoutExpired` on a large repository, will raise an exception in the worker
thread. This exception is then re-raised in the main thread, causing the entire script
to crash and lose all previously completed work. This is likely to occur when using the
`--plan` or `--html` flags.
| result = subprocess.run( | ||
| ['uv', 'run', str(here / 'safe-to-delete-options.py')], | ||
| capture_output=True, text=True, cwd=here, | ||
| ) | ||
| keys = [line.split('\t')[0] for line in result.stdout.splitlines() if line] |
There was a problem hiding this comment.
Bug: The script does not check the return code of the safe-to-delete-options.py subprocess, leading to silent failures being misinterpreted as a successful run with no results.
Severity: MEDIUM
Suggested Fix
After the subprocess.run call, add a check for the result.returncode. If the return code is non-zero, raise an exception or print the contents of result.stderr and exit with an error code to make the failure visible.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/migration/delete-options.py#L120-L124
Potential issue: When `delete-options.py` is run interactively (without piped stdin), it
executes `safe-to-delete-options.py` as a subprocess to get a list of keys. However, the
script does not check the `returncode` of the subprocess. If the subprocess fails (e.g.,
because the required `inventory.csv` file is missing), it will exit with a non-zero
code, but the parent script will not detect the failure. It will interpret the empty
`stdout` as "no keys to delete" and exit successfully, masking the underlying error.


Searches the getsentry and sentry library for option usages.