Monitor — sparring notes & build plan.

What we've argued out so far, and what to build first.

session 2 · 2026-04-14 · sparring mode ON · "make it so" not yet said

This is the state of the framework after sparring. 14 signals parsed from Pierre, integrated into personas / pipeline invariants / revised build sequence. Three pushbacks still on the table. First sprint proposed — one block only.

🎧

Listen on the drive

~7 minute read · Edge TTS Ryan (en-GB)

download mp3 (2.7 MB)

The 14 signals I heard

Tailscale check first. Safety before talk. Safe → can work.
Skills in prime: 50 words each, not one-liners. 20 skills × 50w ≈ 1k tokens. Affordable.
Disk/memory per-process. Dockers and containers tracked individually, not just global.
Personas as ownership layer. Emoji-tagged. Know who to "blame" or consult when something breaks.
Pipeline invariants = outcomes, not schedules. Groceries isn't "plan by Sunday", it's "need → bought → eaten."
Health checks sparse. Daily or event-driven. Not every 20 seconds. Two users. Wouldn't notice 3 days down.
Session close via "done" / "shut down" / "go to sleep". Do the work first, then clear.
Log every turn raw + classified. Librarian runs background classification.
Auditor persona for close. Performance review: errors, assumptions, gaps, proposals.
Invariants not absolute. Human judgement adjusts. "It'll be what it'll be."
Time/date hook. Sessions have no clue. Plus: prediction vs actual for calibration.
Narration ratio is variable. Not a fixed rule. Context-dependent.
GitHub push + rollback. System-level undo. "Seed of it."
Drift as persona. Little cop 👮 emoji.

Personas — starter set (5, not 20)

Don't proliferate on day one. Grow the company as use cases arrive. Every skill/hook declares its persona. Status bar shows active ones.

emoji	persona	owns
🕴️	Session Boss	Session invariant, prime, budget, mode
📚	Librarian	Per-turn logging, classification, brain writes
🧑‍⚖️	Auditor	Session close review, rating, repeated-mistake detection
👮	Cop	Drift detection (AI + user), enforcement
💾	Archivist	GitHub push, commits, rollback

Later additions as use cases demand

💰	Accountant	Resource tracking when pipelines grow
⏰	Timekeeper	If predictions become a thing
🔬	Workbench Boss	When data-workbench pipeline comes in
🛒	Grocer	When grocery pipeline comes in
📧	Postmaster	When email pipeline comes in

Pipeline invariants, reframed as outcome chains

Your correction: invariants are outcomes, not schedules. Each link is checked at runtime. If it breaks, ntfy the pipeline's persona.

🛒 Groceries

need identified→ captured→ buy list→ dispatched→ food arrives→ consumed

🔬 Research

question→ query formed→ answered→ learnings extracted→ stored in brain

📧 Email

email arrives→ triaged→ digest produced→ delivered→ decisions executed

🔬 Workbench

data→ cleaned→ analyzed→ reported→ acted on

Three pushbacks on looseness

1. "Log raw + classified every turn" is expensive at scale

Classification = extra LLM call. Per turn × 20 turns × 10 sessions/day = 200 extra calls daily. At Haiku cost tolerable (~$0.50/day); at Opus, ugly.

My call: raw always (cheap), classification async in batches at session close by the Librarian. Same output, no inline cost.

2. "GitHub push all the time" — define all-the-time

Per-turn push = 200/day. GitHub rate limits will bite. Options: per-commit-push (clean history but noisy); per-session-close-push (clean, batched); per-N-commits (middle).

My call: commit per turn (local) → push at session close. Turn-level rollback via git reset. GitHub stays clean. Override flag available.

3. "Invariants are what they'll be" — careful here

Fine at pipeline level (grocery flow can evolve). NOT fine at session level. If session purpose can "drift" into whatever, the anchor for drift detection disappears.

My call: session invariants hard (purpose stated, mode locked). Pipeline invariants soft (outcomes can be renegotiated as the company grows).

Revised build sequence — four blocks

BLOCK 1The spine nothing else works without this

Tailscale pre-flight check (10 min) — hook verifies tunnel before any brain operation. Fails loud.
Session start hook — declares 🕴️ Session Boss. Loads prime. Inits budget tracker. Opens HANDOFF.md for append. Verifies brain DB reachable.
Stop hook — runs after every turn. 📚 Librarian writes turn row to brain (raw only, classification deferred). Appends summary to HANDOFF.md. Updates budget. Flags write failure.

BLOCK 2The compounding without this, sessions are isolated

Auditor skill /done — invoked on session close. Fires classification of turns in batch. Pierre rates + AI rates. 🧑‍⚖️ Auditor finds repeated mistakes in brain DB. Proposes changes. Writes final handoff.
👮 Cop drift detector — background check per turn. AI drift + user drift logged separately as drift_events. 3+ events triggers checkpoint.

BLOCK 3The safety without this, mistakes are permanent

💾 Archivist — commit per turn, push at session close. /undo skill for turn-level rollback.

BLOCK 4The cosmetics nice, not blocking

Time hook — wall clock in statusLine. Claude says "I predict X minutes", logged, compared next time. Calibration loop.
Status bar control board — budget, turn count, active personas, brain write status, time.

First sprint — Block 1 only

Three files. All bash. All short. All read/write brain via existing ssh + docker exec pattern.

~/.claude/hooks/session-start.sh
~/.claude/hooks/stop.sh
~/.claude/hooks/pre-brain-write.sh

Budget: ~1 day of focused work.

Rationale: you can't build Auditor without turn logs existing. Can't build drift detection without session state. Can't build rollback without commits. The spine is the only thing with no dependency — everything else depends on it.

The question back to you

Build Block 1 first, or revise block membership? Still sparring. Say "make it so" to exit sparring and commit to Block 1.