Scripts for the Mac mini training/build pipeline and the local monitoring that watches it. This is the source of truth for Mini runtime scripts only. Canonical Mini control-plane parity now lives in scripts/automation/sync-codex-mini.sh.
| Script | Schedule | Purpose |
|---|---|---|
| `mini-prepare-automation-root.sh` | On demand | Creates/updates clean automation clones under `~/SaneApps-automation` |
| `mini-install-nightly-agent.sh` | On demand | Installs/updates the nightly LaunchAgent |
| `mini-install-training-agents.sh` | On demand | Installs/updates weekly + challenger training LaunchAgents |
| `mini-memory-guard.sh` | 5:40 AM daily | Mini hygiene + safe reboot gate (only when idle and needed) |
| `mini-install-memory-guard.sh` | On demand | Installs/updates memory guard LaunchAgent |
| `install-training-daily-check-agent.sh` | On demand (local Mac) | Installs/updates the daily local alert for Mini training results |
| `bootstrap-build-server.sh` | On demand | Proves headless signing, keychain unlock, and ASC auth before App Store work |
| `mini-gui-run.sh` | Manual / wrapper | Runs a shell command inside the Mini's logged-in GUI Terminal session |
| `mini-license-test.sh` | Manual deep probe | Runs the SaneBar end-to-end license lifecycle on the Mini |
| `mini-train.sh` | Manual / wrapper | MLX LoRA fine-tuning pipeline (sweeps, validation, reporting) |
| `mini-train-all.sh` | 1 AM Sunday | Weekly production training for SaneAI |
| `mini-train-challengers.sh` | 1 AM daily | Daily challenger training for SaneAI |
| `mini-nightly.sh` | 8:45 AM daily | Nightly builds + tests for all SaneApps repos |
| `training-daily-check.py` | 9:15 AM daily (local Mac) | Pulls the latest Mini training state, writes a local summary, and raises a macOS notification |
# Deploy all mini scripts to the build server
bash scripts/mini/deploy.sh
# Refreshes agents even if automation-root prep warns, but exits nonzero if prep failed
# If the local machine does not have a `mini` ssh alias or default key, override both explicitly
MINI_HOST=sj@Stephans-Mac-mini.local \
MINI_SSH_OPTS='-i ~/.ssh/id_ed25519_codex_loopback' \
bash scripts/mini/deploy.sh
# Sync the active Codex automation + skill profile to Mini
bash scripts/automation/sync-codex-mini.sh mini --no-restart
# Legacy compatibility wrapper (prints guidance or routes to the canonical path)
bash scripts/mini/sync-claude-config.sh --dry-run
# Or deploy a single script
scp scripts/mini/mini-train.sh mini:~/SaneApps/infra/scripts/

Legacy note:
- `scripts/mini/sync-claude-config.sh` is a deprecation wrapper, not a separate sync system.
- Canonical Mini control-plane parity is `scripts/automation/sync-codex-mini.sh`.
- `deploy.sh` manages Mini runtime scripts only and should not be used to recreate a second config-sync lane.
Default root behavior:
- Mini training runners and mini LaunchAgent installers now auto-prefer `~/SaneApps-automation` when that clone exists.
- Explicit `SANE_ROOT=...` still wins if you set it yourself.
- Outputs still write to `~/SaneApps/outputs` unless `SANE_OUTPUT_DIR` is overridden.
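
The precedence described above can be sketched roughly as follows. This is an illustration only, not the code in the real scripts: the function names `resolve_sane_root` and `resolve_output_dir` are hypothetical, and the fallback to `~/SaneApps` when no automation clone exists is an assumption.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical, not taken from the real scripts.

resolve_sane_root() {
    if [ -n "${SANE_ROOT:-}" ]; then
        # An explicit SANE_ROOT always wins.
        printf '%s\n' "$SANE_ROOT"
    elif [ -d "$HOME/SaneApps-automation" ]; then
        # Otherwise prefer the clean automation clone when it exists.
        printf '%s\n' "$HOME/SaneApps-automation"
    else
        # Assumed fallback: the regular checkout.
        printf '%s\n' "$HOME/SaneApps"
    fi
}

resolve_output_dir() {
    # Outputs stay under ~/SaneApps/outputs unless SANE_OUTPUT_DIR is set.
    printf '%s\n' "${SANE_OUTPUT_DIR:-$HOME/SaneApps/outputs}"
}

echo "root:    $(resolve_sane_root)"
echo "outputs: $(resolve_output_dir)"
```

Running it with `SANE_ROOT=/some/path` set should print that path unchanged, which matches the "explicit still wins" rule.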
Before any headless App Store release from the mini, run:
bash ~/SaneApps/infra/SaneProcess/scripts/mini/bootstrap-build-server.sh

What it proves:
- the login keychain can be unlocked in a headless shell
- the signing keys have the right partition-list access for `codesign` and Xcode
- App Store Connect JWT auth works
- iOS signing is probe-tested when an Apple Development or Distribution identity is installed
If this script fails, stop and fix the machine first. Do not push through with raw xcodebuild.
If App Store signing works in the Mini GUI session but fails in plain ssh shells with errSecInternalComponent, use:
ssh mini '~/SaneApps/infra/SaneProcess/scripts/mini/mini-gui-run.sh \
--title "SaneSales archive" \
--log-file /tmp/sanesales-archive.log \
--close-window \
  -- "cd ~/SaneApps/apps/SaneSales && xcodebuild archive ..."'

What it does:
- opens a real Terminal window in the logged-in Mini GUI session
- runs the command there
- tees output to the requested log file
- waits for completion
- closes its own Terminal window by default
Use this for App Store archive/export/upload recovery on the Mini. Do not leave throwaway Terminal windows open.
LaunchAgent (1 AM daily)
→ mini-train-challengers.sh SaneAI
→ mini-prepare-automation-root.sh (fail fast if clean automation root cannot be refreshed)
→ mini-train.sh SaneAI --challenger
→ runs against clean automation root (`~/SaneApps-automation`)
→ nightly SmolLM3-only challenger lane on the 8 GB Mini
→ skips Sundays so weekly SaneAI owns that window
→ no artificial runtime cap; hard stop at 8:30 AM
→ stall guard only fires when both logs and process CPU stop moving
→ evaluates the latest saved checkpoint when the hard stop interrupts a sweep
→ default sweep target comes from the challenger YAML (currently `50` iters for SmolLM3)
→ challenger report + comparison report
LaunchAgent (1 AM Sunday)
→ mini-train-all.sh
→ mini-prepare-automation-root.sh (fail fast if clean automation root cannot be refreshed)
→ merge_training_data.py (if exists, forced to read from clean automation root)
→ mini-train.sh SaneAI
→ runs against clean automation root (`~/SaneApps-automation`)
→ git fetch + honest repo-state report
→ sed (per-sweep LR + warmup config)
→ mlx_lm lora --train (default weekly target now comes from YAML, currently `100` iters)
→ Python validation with workflow-first scoring (commentary x4, broader workflow packs x2, guardrails x2, core x1)
→ primary gate requires commentary workflow suite to clear its threshold
→ archives a timestamped report + appends metrics history TSV
→ Summary report → ~/SaneApps/outputs/training_report_SaneAI.md
LaunchAgent (8:45 AM daily)
→ mini-nightly.sh
→ runs against clean automation root (`~/SaneApps-automation`)
→ git fetch + truthful dirty/behind report for all repos
→ xcodebuild (build + test each app)
→ System health (disk, memory, uptime)
→ Report → ~/SaneApps/outputs/nightly_report.md
LaunchAgent (9:15 AM daily on local Mac)
→ training-daily-check.py --host mini
→ pulls latest Mini metrics, readiness, and active alert files over SSH
→ writes local summary report
→ raises a macOS notification when training is stale, blocked, or failing
LaunchAgent (5:40 AM)
→ mini-memory-guard.sh
→ health snapshot + stale-process cleanup
→ optional reboot only in safe window and only when mini is idle
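
For the challenger lane in the schedule above, the Sunday skip and the hard-stop check could look something like the sketch below. It is a guess at the shape of the logic, not the contents of `mini-train-challengers.sh`: the function names are hypothetical, and treating `08:30` as the default for `TRAIN_HARD_STOP_TIME` is an assumption based on the documented stop time.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Convert HH:MM to minutes since midnight.
to_minutes() {
    _h=${1%%:*}
    _m=${1##*:}
    # Drop one leading zero so "08" is not read as an invalid octal number.
    _h=${_h#0}
    _m=${_m#0}
    echo $(( _h * 60 + _m ))
}

# $1 is the weekday as printed by `date +%u` (1 = Monday ... 7 = Sunday).
is_sunday() {
    [ "$1" = "7" ]
}

# Succeeds when the current time ($1) is at or past the hard stop ($2).
past_hard_stop() {
    [ "$(to_minutes "$1")" -ge "$(to_minutes "$2")" ]
}

today=$(date +%u)
now=$(date +%H:%M)
stop=${TRAIN_HARD_STOP_TIME:-08:30}

if is_sunday "$today"; then
    echo "Sunday: weekly lane owns this window, skipping challenger run"
elif past_hard_stop "$now" "$stop"; then
    echo "past hard stop $stop, not starting a sweep"
else
    echo "ok to train until $stop"
fi
```

Comparing minutes since midnight is enough here because the lane starts at 1 AM and stops the same morning; a window that crossed midnight would need date-aware handling.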
- Bash 3.2: mini runs macOS default bash. No `+=()` arrays, no `<<<` herestrings. Use file-based alternatives.
- 8 GB RAM: training uses ~3.7 GB peak. One sweep at a time.
- Lock files: Mini training now uses one shared `mkdir`-based MLX lock with 8-hour stale detection so production and challenger lanes cannot overlap on the 8 GB GPU.
- Logs: LaunchAgent stderr appends (never truncates). `mini-train-all.sh` rotates at 1 MB.
- Isolation enabled: deploy refreshes `~/SaneApps-automation`, launch agents point `SANE_ROOT` there, and each scheduled training lane now re-runs `mini-prepare-automation-root.sh` before training so stale dirty clones fail fast instead of silently training on drifted state.
- Managed overlays only: automation-root prep is allowed to reset hydrated training overlays (`train.jsonl`, eval packs, challenger configs, generated fixtures) before syncing. Any other dirt still fails the prep step.
- Training data hydration: `mini-prepare-automation-root.sh` copies local-only `train.jsonl`/`valid.jsonl` datasets for SaneSync, SaneClip, SaneAI, and SaneVideo into the clean clones before training.
- Dataset regression guard: `mini-train.sh` now fails before spending GPU time if the current train/valid counts shrink too far versus the latest successful run for that lane.
- Current bakeoff mode: the daily challenger agent is pinned to `smollm3-3b` on `SaneAI` because `llama32-3b` reproducibly OOMs on the 8 GB Mini, runs until `08:30`, and skips Sundays so the weekly `SaneAI` run gets the full window.
- Production Mini baseline: `lora_config_mini.yaml` now points at `smollm3-3b` as the scheduled production model on the 8 GB Mini; `llama32-3b` remains a manual off-Mini experiment until it is requalified.
- Unsafe-model preflight: `mini-train.sh` now blocks `mlx-community/Llama-3.2-3B-Instruct-4bit` before launch on the 8 GB Mini unless `ALLOW_UNSAFE_TRAINING=true` is set, so the weekly lane fails cleanly with a report/alert instead of crashing Python on Metal OOM.
- Clean-start training: `mini-train.sh` now drains stale `mlx_lm`/`evaluate_model.py` processes before each run and purges inactive memory so one crashed/manual lane does not poison the next scheduled lane.
- Progress tracking: every training run now archives a timestamped report under `outputs/history/<App>/` and appends a TSV metrics row so week-over-week comparisons survive report overwrites.
- Interrupted run recovery: `mini-train.sh` now evaluates the latest saved checkpoint when the hard stop interrupts a sweep, so overnight runs still produce scored signal instead of defaulting to `0%`.
- Realistic sweep sizing: `mini-train.sh` now takes its default sweep length from the config file instead of hardcoded `1000`/`2000` defaults, and rescales warmup alongside decay steps so shortened overnight sweeps do not spend most of their life in warmup.
- Workflow focus: nightly `SaneAI` training keeps the unified SaneSync/SaneClip corpus but now weights SaneVideo workflow data so the shared model learns the broader commentary/repurposing surface.
- Workflow-first scoring: training and nightly reports now treat `commentary_workflow` as the primary gate and weight it above legacy action JSON accuracy, while still scoring the broader SaneVideo workflow packs and schema guardrails. Hybrid suites are diagnostic only and should not be used for promotion.
- 8 GB stable baseline: `SaneAI` production + challenger configs should use `val_batches: 1` on the Mini. `val_batches: 10` is no longer stable with the workflow-expanded corpus and reproducibly trips Metal OOM.
- 8 GB sequence ceiling: the audited merged corpus peaks at `1665` tokens on the SmolLM3 tokenizer and `1580` on the cached Llama tokenizer, so the Mini configs now use `max_seq_length: 1664` instead of carrying wasted `1792`/`2048` headroom.
- Checkpoint cadence: the Mini configs save every `25` steps, with current default sweep targets of `50` iterations for the nightly SmolLM challenger lane and `100` iterations for the weekly SmolLM production lane.
- 8 GB eval baseline: keep `EVAL_MAX_TOKENS=128` on the Mini and clear the MLX Metal cache between eval cases. The strict workflow JSON eval cases now request `max_tokens: 256` individually, but the Mini still caps them via `EVAL_MAX_TOKENS_CAP` so long JSON is less likely to be truncated without globally widening every suite.
- SaneVideo fixtures: `mini-prepare-automation-root.sh` hydrates ignored `Tests/Assets` media in the clean clone when `ffmpeg` is available on the Mini.
- Bad training is a hard failure: `mini-train.sh` now fails the sweep if the train log shows `nan` loss or `Trained Tokens 0`, and emits a training alert instead of treating that as success.
- Cleanup hygiene: `mini-memory-guard.sh` now prunes training artifacts under both `~/SaneApps` and `~/SaneApps-automation`, rotates challenger/weekly/guard logs, and trims the training alert history log.
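
As an illustration of the shared `mkdir`-based lock with 8-hour stale detection mentioned in the list above, here is a minimal sketch. It is not the lock code from `mini-train.sh`: the lock path, the `started_at` timestamp file, and the function names are all assumptions chosen for the example, and it avoids `+=()` arrays and herestrings so it stays within the Bash 3.2 limits.

```shell
#!/bin/bash
# Sketch only: lock path, timestamp file, and function names are hypothetical.

LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/mlx-training.lock}"
STALE_AFTER_SEC=$((8 * 60 * 60))   # 8 hours

acquire_lock() {
    # mkdir is atomic: only one caller can create the directory.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        date +%s > "$LOCK_DIR/started_at"
        return 0
    fi

    # Lock already exists. A missing timestamp is treated as "just started"
    # so a lane that is still writing its timestamp is never evicted.
    _now=$(date +%s)
    _started=$(cat "$LOCK_DIR/started_at" 2>/dev/null || echo "$_now")

    if [ $((_now - _started)) -gt "$STALE_AFTER_SEC" ]; then
        # Holder is older than 8 hours: assume it crashed and take over.
        rm -rf "$LOCK_DIR"
        if mkdir "$LOCK_DIR" 2>/dev/null; then
            date +%s > "$LOCK_DIR/started_at"
            return 0
        fi
    fi
    return 1
}

release_lock() {
    rm -rf "$LOCK_DIR"
}

if acquire_lock; then
    echo "lock acquired; a training lane would run here"
    release_lock
else
    echo "another lane holds the MLX lock; exiting"
fi
```

A real script would normally also release the lock from a `trap` on exit so an interrupted run does not have to wait for the stale timeout.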
Only use this path on the Mini:
- Deploy from `scripts/mini/` in `SaneProcess`.
- Train against `SANE_ROOT=~/SaneApps-automation`.
- Write reports and alerts under `~/SaneApps/outputs`.
- Do not run scheduled training against the human repo at `~/SaneApps`.
Use this to prove the runtime, wrapper, automation-root prep, reporting, and alert plumbing after any training change:
ssh mini '
TRAIN_SWEEP_ITERS=2 \
TRAIN_HARD_STOP_TIME=23:59 \
TRAIN_POLL_INTERVAL_SEC=5 \
TRAIN_STALL_TIMEOUT_MIN=15 \
CHALLENGER_SELECTION_MODE=alternate \
CHALLENGER_ROTATION_ORDER=smollm3-3b \
EVAL_SUITES=commentary_workflow,core \
EVAL_MAX_CASES=6 \
EVAL_MAX_TOKENS=128 \
TRAIN_ALERT_NOTIFY=false \
SANE_ROOT=$HOME/SaneApps-automation \
SANE_OUTPUT_DIR=$HOME/SaneApps/outputs/automation-smoke/manual \
/bin/bash $HOME/SaneApps/infra/SaneProcess/scripts/mini/mini-train-challengers.sh SaneAI
'

Smoke must prove all of this:
- the automation root refresh runs cleanly before training
- a new sweep directory is created
- the report is archived under `outputs/history/`
- no `nan` loss appears
- no `Trained Tokens 0` appears
- no current failure alert is left behind
- the post-train eval completes quickly because it is capped to a small smoke suite
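
The `nan` loss and `Trained Tokens 0` checks in the list above can be approximated by scanning a training log by hand. The sketch below is only that: the function name `train_log_is_bad` is hypothetical, and the patterns assume log lines that contain the word `loss` followed by `nan`, or the literal text `Trained Tokens 0`. The actual detection in `mini-train.sh` may differ.

```shell
#!/bin/bash
# Sketch only: function name and log-line patterns are assumptions.

# Succeeds (exit 0) when the log at $1 shows a bad training run.
train_log_is_bad() {
    # "loss nan", "loss: NaN", and similar.
    if grep -Eiq 'loss[^0-9a-z]*nan' "$1"; then
        return 0
    fi
    # "Trained Tokens 0" but not "Trained Tokens 4000" or "0.5".
    if grep -Eq 'Trained Tokens 0([^0-9.]|$)' "$1"; then
        return 0
    fi
    return 1
}

# Usage: pass a log path as the first argument.
if [ -n "${1:-}" ]; then
    if train_log_is_bad "$1"; then
        echo "bad training signal found in $1"
    else
        echo "no nan loss or zero-token lines in $1"
    fi
fi
```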
Use this after smoke passes:
ssh mini '
TRAIN_SWEEP_ITERS=25 \
MAX_TRAIN_RUNTIME_MIN=30 \
TRAIN_HARD_STOP_TIME=23:59 \
TRAIN_POLL_INTERVAL_SEC=15 \
CHALLENGER_SELECTION_MODE=alternate \
CHALLENGER_ROTATION_ORDER=smollm3-3b \
EVAL_MAX_TOKENS=128 \
SANE_ROOT=$HOME/SaneApps-automation \
SANE_OUTPUT_DIR=$HOME/SaneApps/outputs/automation-e2e \
/bin/bash $HOME/SaneApps/infra/SaneProcess/scripts/mini/mini-train-challengers.sh SaneAI
'

Bounded e2e is only considered healthy if:
- the process stays alive past the first validation
- the report records the real exit reason
- alerts are written for failures
- the next nightly report surfaces active training alerts
~/Library/LaunchAgents/com.saneapps.training-challengers.plist → mini-train-challengers.sh (1 AM daily)
~/Library/LaunchAgents/com.saneapps.training-weekly.plist → mini-train-all.sh (1 AM Sunday)
~/Library/LaunchAgents/com.saneapps.nightly.plist → mini-nightly.sh (8:45 AM)
~/Library/LaunchAgents/com.saneapps.memory-guard.plist → mini-memory-guard.sh (5:40 AM)
~/Library/LaunchAgents/com.saneapps.training-daily-check.plist → training-daily-check.py (9:15 AM)
~/SaneApps/outputs/training_report_SaneAI.md # Training results + validation
~/SaneApps/outputs/nightly_report.md # Build + test results
~/SaneApps/outputs/training.stderr.log # Training stderr (rotated at 1MB)
~/SaneApps/outputs/training.stdout.log # Training stdout (appended)
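
The 1 MB stderr rotation noted above could be implemented along these lines. This is a sketch under assumptions, not the rotation code in `mini-train-all.sh`: the function name `rotate_if_large` and the single `.1` backup suffix are invented for the example.

```shell
#!/bin/bash
# Sketch only: function name and ".1" backup naming are assumptions.

# $1: log path, $2: size limit in bytes (defaults to 1 MB).
rotate_if_large() {
    _log=$1
    _max=${2:-1048576}
    [ -f "$_log" ] || return 0

    # macOS `wc -c` pads its output with spaces, so strip them.
    _size=$(wc -c < "$_log" | tr -d ' ')

    if [ "$_size" -gt "$_max" ]; then
        # Keep one previous generation and start a fresh, empty log.
        mv "$_log" "$_log.1"
        : > "$_log"
    fi
}

rotate_if_large "${SANE_OUTPUT_DIR:-$HOME/SaneApps/outputs}/training.stderr.log"
```

Keeping only one rotated generation bounds disk use at roughly twice the limit per log; a longer history would need numbered or dated suffixes.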