Conversation
There was a problem hiding this comment.
Pull request overview
Reduces log noise around planner pressure streaming by avoiding duplicate stack traces across layers and by treating transient connection failures differently from other planner API errors.
Changes:
- Introduces
PlannerConnectionErrorto represent transient streaming connection failures (e.g., disconnects/timeouts). - Updates
PlannerClient.stream_pressure()to maprequestsconnection/timeouts toPlannerConnectionErrorand stop logging exceptions at the client layer. - Updates
PressureReconcilerto handlePlannerConnectionErrorwith a warning + fallback behavior (no stack trace), and adds unit tests for the new exception mapping.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| github-runner-manager/src/github_runner_manager/planner_client.py | Adds PlannerConnectionError and adjusts exception handling in stream_pressure() to avoid redundant exception logging. |
| github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py | Handles PlannerConnectionError separately from PlannerApiError to reduce noisy stack traces while applying fallback behavior. |
| github-runner-manager/tests/unit/test_planner_client.py | Adds unit tests validating error mapping from transient requests failures to PlannerConnectionError and other failures to PlannerApiError. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Outdated
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/planner_client.py
Outdated
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/planner_client.py
Outdated
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR reduces noisy/duplicated exception logging around planner pressure streaming by moving detailed logging responsibility to the caller, and introduces a distinct transient connection error path so connection drop/timeout scenarios don’t emit stack traces. It also improves shutdown responsiveness by replacing time.sleep(5) backoff with Event.wait(5).
Changes:
- Add
PlannerConnectionErrorand raise it for transientrequests.ConnectionError/requests.Timeoutfailures while streaming pressure. - Update
PressureReconcilerto handle transient planner connection issues with a warning + fallback behavior, and use_stop.wait(5)for backoff. - Add unit tests validating
PlannerClient.stream_pressureexception mapping for transient vs non-transient request failures.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
github-runner-manager/tests/unit/test_planner_client.py |
Adds tests for new exception behavior (PlannerConnectionError vs PlannerApiError). |
github-runner-manager/src/github_runner_manager/planner_client.py |
Introduces PlannerConnectionError and refines exception handling to avoid duplicate stack trace logging. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py |
Adds specific handling for transient planner disconnects and improves graceful shutdown/backoff behavior. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/planner_client.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This PR reduces planner pressure streaming log noise by separating transient connection failures from other planner API errors, avoiding duplicate stack traces, and improving shutdown responsiveness by using an interruptible wait instead of sleep.
Changes:
- Introduced
PlannerConnectionErrorto distinguish transient streaming disconnects/timeouts from other planner client failures. - Updated
PressureReconcilerto log transient stream failures as warnings (no stack trace) while preserving stack traces for non-transient planner errors, and replacedtime.sleep(5)with_stop.wait(5)for faster shutdown. - Added/updated unit tests for new exception behavior and bumped the project version + changelog.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
github-runner-manager/src/github_runner_manager/planner_client.py |
Adds PlannerConnectionError and reclassifies requests exceptions to reduce stack-trace noise for transient failures. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py |
Handles PlannerConnectionError separately with warning logs and uses _stop.wait(5) to avoid delaying shutdown. |
github-runner-manager/tests/unit/test_planner_client.py |
Adds unit coverage for the new exception mapping behavior in the planner client. |
github-runner-manager/tests/unit/manager/test_pressure_reconciler.py |
Updates reconciler tests to cover both transient and non-transient planner stream failures and to match the new wait-based backoff. |
github-runner-manager/pyproject.toml / docs/changelog.md |
Version bump and user-facing changelog entry describing the logging change. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Reduces log noise around planner pressure streaming by centralizing stack-trace logging, distinguishing transient connection issues from other planner failures, and improving shutdown responsiveness in the pressure reconciler.
Changes:
- Introduced
PlannerConnectionErrorto classify transient stream disconnects/timeouts and avoid stack traces for those cases. - Updated
PressureReconcilerto log transient stream interruptions as warnings (no stack trace), keep stack traces for non-transient planner errors, and use_stop.wait(5)instead oftime.sleep(5). - Added/updated unit tests for the new error classification paths; bumped project version and documented the change.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| github-runner-manager/src/github_runner_manager/planner_client.py | Adds PlannerConnectionError and adjusts exception wrapping in stream_pressure. |
| github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py | Changes logging behavior for transient vs non-transient errors; replaces sleep with stop-aware wait. |
| github-runner-manager/tests/unit/test_planner_client.py | Adds coverage for transient/request/mid-stream failures and expected wrapped exceptions. |
| github-runner-manager/tests/unit/manager/test_pressure_reconciler.py | Updates tests to cover both PlannerApiError and PlannerConnectionError fallback behavior; adapts shutdown/backoff mocking to Event.wait. |
| github-runner-manager/pyproject.toml | Bumps version from 0.15.0 to 0.15.1. |
| docs/changelog.md | Adds a user-facing changelog entry describing the reduced log noise. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This PR reduces log noise around planner pressure streaming by separating transient connection failures from other planner errors, ensuring stack traces are not duplicated across layers, and improving shutdown responsiveness in the pressure reconciler.
Changes:
- Introduced
PlannerConnectionErrorto classify transient planner streaming failures (disconnects/timeouts) separately from other planner API errors. - Removed planner-client-side exception logging to avoid duplicate stack traces; shifted behavior to structured handling/logging in
PressureReconciler. - Replaced fixed
sleepbackoff with_stop.wait()to allow graceful shutdown without unnecessary delays.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
github-runner-manager/src/github_runner_manager/planner_client.py |
Adds PlannerConnectionError and wraps transient vs non-transient requests failures without logging stack traces in the client. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py |
Handles PlannerConnectionError with warning-only logs and uses _stop.wait(5) for interruptible backoff. |
github-runner-manager/tests/unit/test_planner_client.py |
Adds unit coverage for transient/non-transient request failures, including mid-stream failures. |
github-runner-manager/tests/unit/manager/test_pressure_reconciler.py |
Expands coverage for fallback behavior and verifies shutdown prevents fallback scaling during error handling. |
github-runner-manager/pyproject.toml |
Bumps application version to 0.15.1. |
docs/changelog.md |
Documents the user-visible logging behavior change. |
There was a problem hiding this comment.
Pull request overview
This PR reduces planner pressure-stream log noise by classifying transient connection/stream interruptions separately from other planner API failures, and improves shutdown responsiveness by replacing fixed sleeps with interruptible waits.
Changes:
- Introduced
PlannerConnectionErrorfor transientrequests/stream failures and updatedPlannerClient.stream_pressureto wrap errors without logging stack traces. - Updated
PressureReconcilerto log transient planner disconnects as warnings (no stack trace), keep error-level stack traces for non-transient failures, and use_stop.wait(5)for backoff. - Added/updated unit tests for the new error classification and shutdown behavior; bumped patch version and added a changelog entry.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
github-runner-manager/src/github_runner_manager/planner_client.py |
Adds PlannerConnectionError and refines exception wrapping in stream_pressure to avoid duplicate stack traces. |
github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py |
Differentiates logging/backoff behavior for transient vs non-transient planner failures; uses interruptible wait for graceful shutdown. |
github-runner-manager/tests/unit/test_planner_client.py |
Adds coverage for transient request failures and mid-stream failures mapping to the correct wrapper exception type. |
github-runner-manager/tests/unit/manager/test_pressure_reconciler.py |
Extends fallback/shutdown tests to cover both PlannerApiError and PlannerConnectionError paths and verifies no scale-up after stop is requested. |
github-runner-manager/pyproject.toml |
Bumps version to 0.15.1. |
docs/changelog.md |
Documents the user-facing logging behavior change. |
There was a problem hiding this comment.
Pull request overview
This PR reduces log noise around planner pressure streaming by introducing a distinct connection-related exception type and adjusting the pressure reconciler’s logging/backoff behavior, while also improving shutdown responsiveness.
Changes:
- Introduced
PlannerConnectionErrorto distinguish transient stream/connection failures from other planner API errors. - Updated
PressureReconcilerto log transient planner disconnects as warnings (no stack traces), keep stack traces for non-transient errors, and replacetime.sleep(5)with_stop.wait(5)for faster shutdown. - Added/expanded unit tests for transient vs non-transient failures (including mid-stream failures), and bumped version + changelog.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| github-runner-manager/src/github_runner_manager/planner_client.py | Adds PlannerConnectionError and classifies transient request/stream failures separately from other RequestExceptions. |
| github-runner-manager/src/github_runner_manager/manager/pressure_reconciler.py | Uses warning-level logging (no stack trace) for transient stream interruptions, retains exception logging for non-transient errors, and uses _stop.wait(5) for backoff/shutdown. |
| github-runner-manager/tests/unit/test_planner_client.py | Adds coverage for transient request failures and mid-stream errors mapping to the correct wrapper exception type. |
| github-runner-manager/tests/unit/manager/test_pressure_reconciler.py | Extends tests to cover both planner error types and validates no scaling after stop is requested during error handling. |
| github-runner-manager/pyproject.toml | Bumps project version to 0.15.1. |
| docs/changelog.md | Documents the user-visible logging behavior change. |
| if self._stop.is_set(): | ||
| return | ||
| self._handle_create_runners(fallback) | ||
| self._stop.wait(5) |
There was a problem hiding this comment.
small improvement to teardown - use _stop.wait instead of time.sleep and don't create runners if stop is set
TICS Quality Gate✔️ Passedgithub-runner-operatorSee the results in the TICS Viewer The following files have been checked for this project
|
Overview
Avoid repeating logging the same exception stack trace in the planner client and pressure reconciler and avoid logging an exception stack trace for connection issues.
Also add small optimisation for graceful shutdown (use _stop.wait(5) instead of time.sleep(5) )
Rationale
Currently, the logs are quite noisy due to redundant information. Furthermore, long-lived http connections in unstable environments can be flaky, having exception stack traces doesnt bring a lot of information there.
time.sleep could delay shutdown for some seconds unnecessarily
Checklist
urgent,trivial,complex).github-runner-manager/pyproject.toml.