-
Notifications
You must be signed in to change notification settings - Fork 26
Description
Enhancement Proposal
Summary
Running the manager directly (tox/pytest, CI) exposes several assumptions baked into the app about users, paths, and Python environment. These cause permission and import failures outside the charm. The app should run cleanly as the invoking user with sane defaults, while the charm applies privileged policies.
Observed problems
Writes reconcile.id to the invoking user’s home (e.g., /home/ubuntu/reconcile.id), failing when run under a different effective user.
Attempts to create/use /var/log/reactive_runner and chown it, failing without root.
Reactive subprocesses use system python and can’t import github_runner_manager despite --python-path; sudo loses tox virtualenv context.
Config file in pytest tempdir not readable when spawning under another user.
Requiring the runner-manager user in app-level tests adds brittle permission and group juggling.
Repro (CI/integration)
Start CLI with tox venv python and reactive mode, dispatch a workflow that spawns a reactive runner.
Errors seen:
PermissionError: [Errno 13] Permission denied: '/var/log/reactive_runner'
PermissionError: [Errno 1] Operation not permitted (chown/chmod/chown during setup)
ModuleNotFoundError: No module named 'github_runner_manager' from reactive subprocess
Error: Invalid value for '--config-file': '/tmp/.../config.yaml': Permission denied
PermissionError: [Errno 13] Permission denied: '/home/ubuntu/reconcile.id'
Root cause
App hardcodes system-level paths (/var/log/reactive_runner) and writes state (reconcile.id) to $HOME.
Assumes privilege to chown/chgrp and to run/react as a dedicated system user.
Reactive subprocess resolves python via PATH/system python rather than the parent interpreter.
Using sudo drops tox venv/PYTHONPATH context unless carefully preserved.
Goal
Make the application runnable as a regular user with no sudo or user switching, with defaults that “just work” in tests/CI and local dev. Keep charm-specific behavior (runner-manager user, /var/log) at the charm layer.
Proposal
State directory (locks, ids)
Add optional --state-dir and env GITHUB_RUNNER_MANAGER_STATE_DIR.
Default (when not set): follow XDG base dirs
XDG_RUNTIME_DIR (if set) → TMPDIR → XDG_STATE_HOME/github-runner-manager → ~/.local/state/github-runner-manager
Write reconcile.id (and any future lock/ledger files) under state dir.
Charm passes a managed state dir; tests rely on the default.
Logging defaults and overrides
Metrics: default to stderr; keep existing METRICS_LOG_PATH support. Charm sets /var/log/github-runner-metrics.log.
Reactive runner logs: default to a user-local directory under state dir (e.g., $STATE_DIR/reactive_runner). Add optional --reactive-log-dir (and env) for charm to set /var/log/reactive_runner.
App must not chown/chgrp unless running as root; otherwise create dirs with current user permissions only.
Python interpreter and import path
Reactive subprocesses should default to using sys.executable from the parent process (not /usr/bin/python3).
Keep existing --python-path as an optional sys.path augment; do not rely on sudo -E or PATH hacks.
This preserves a single source of truth (tox -e . editable install) and works in CI.
User and privilege model
- App: run as the effective user without assuming runner-manager or root. Do not attempt user creation or group modification.
- Charm: continue to:
- Create runner-manager system user and home
- Manage /var/log permissions and ownership
- Run the service under runner-manager via systemd
- Pass explicit --state-dir and log dir paths
- SSH key location
- Keep SSH keys under the effective user’s ~/.ssh by default (current behavior aligns with “run as invoking user”).
- Charm continues to run as runner-manager, so keys are written under ~runner-manager/.ssh there.
Backward compatibility
- If --state-dir is not provided, use new XDG default silently (no breaking CLI change).
- Keep current metrics/log flags working; new defaults only affect direct, unflagged runs.
- Charm paths and behavior remain unchanged; charm passes flags and owns permissions.
Acceptance criteria
- Running manager via tox in CI spawns reactive runners without:
- PermissionError on /var/log/reactive_runner
- reconcile.id writing outside a writable state dir
- ModuleNotFoundError in reactive subprocess
- No sudo required in app-level tests; a single tox editable install suffices.
- Charm continues to work unchanged by passing its preferred paths and running as runner-manager.
Tasks
- Add --state-dir CLI option and GITHUB_RUNNER_MANAGER_STATE_DIR env support; pick XDG defaults when unset.
- Write reconcile.id into state dir.
- Default reactive logs into $STATE_DIR/reactive_runner; add optional --reactive-log-dir override.
- Use sys.executable for reactive subprocesses; maintain --python-path behavior.
- Guard chown/chgrp/log-dir permissions: only attempt when running as root.
- Update docs: standalone vs charm deployment paths and flags.
- Add tests: unit for state dir resolution, integration for reactive subprocess import using sys.executable.
Why this approach
- Separates app (unprivileged, portable) from charm (privileged, policy). This mirrors best practice and removes CI/test fragility while preserving production behavior under the charm.