- docs/v1-vs-v2/: full v1→v2 regression analysis (SUMMARY + 21 per-module docs + ACTION-ITEMS rollup with decisions + timezone recreation spec). - container/agent-runner/scripts/sdk-signal-probe.ts: empirical harness used to characterise Claude Agent SDK event/hook/stderr timing for the stuck-detection design in item 9. - src/channels/chat-sdk-bridge.ts: document the conversations Map staleness in a code comment; fix deferred to when dynamic group registration lands (ACTION-ITEMS item 17). No runtime behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
10 KiB
v1 → v2 Deep Dive: Aggregate Summary
Per-file deep-dives were produced for every file in src/v1/ and container/agent-runner/src/v1/. This document aggregates findings across all 21 modules.
Per-file docs
| Topic | File | v1 source(s) |
|---|---|---|
| Configuration | config.md | src/v1/config.ts |
| Environment helpers | env.md | src/v1/env.ts |
| Types | types.md | src/v1/types.ts |
| Logger | logger.md | src/v1/logger.ts |
| Timezone | timezone.md | src/v1/timezone.ts |
| Database layer | db.md | src/v1/db.ts |
| Container runner | container-runner.md | src/v1/container-runner.ts |
| Container runtime + mounts | container-runtime.md | src/v1/container-runtime.ts, mount-security.ts |
| Group folder | group-folder.md | src/v1/group-folder.ts |
| Group queue | group-queue.md | src/v1/group-queue.ts |
| Host index | index-host.md | src/v1/index.ts |
| IPC (host + container) | ipc.md | src/v1/ipc.ts, container/.../v1/ipc-mcp-stdio.ts |
| Remote control | remote-control.md | src/v1/remote-control.ts |
| Router | router.md | src/v1/router.ts + index.ts routing |
| Sender allowlist | sender-allowlist.md | src/v1/sender-allowlist.ts |
| Session cleanup | session-cleanup.md | src/v1/session-cleanup.ts |
| Task scheduler | task-scheduler.md | src/v1/task-scheduler.ts |
| Channels | channels.md | src/v1/channels/* |
| Agent-runner entry | container-index.md | container/.../v1/index.ts |
| Agent-runner MCP tools | container-mcp-tools.md | container/.../v1/mcp-tools.ts |
| Formatting test (orphan) | formatting-test.md | src/v1/formatting.test.ts |
The big shift
v2 rewrote the fundamental transport between host and container. The one-line version:
v1 = IPC files + stdin/stdout + in-memory GroupQueue + polling message loop. v2 = two SQLite DBs per session + event-driven routing + 60s host sweep.
Everything else flows from that. Removing IPC forced a rewrite of the router, the container-runner, the agent-runner entry, and the MCP-tool bridge. The 60s sweep absorbed the task scheduler, session cleanup, and pending-message recovery. The entity model (users/roles/messaging_groups) replaced the flat sender allowlist and chat-level config. Provider abstraction + Chat SDK bridge replaced hardcoded Claude SDK + per-channel adapters.
Net LOC: v1 (~7.4k host + monolithic container-runner) → v2 (~5.5k host, split modules). Fewer lines, cleaner boundaries, more coverage.
What's kept (identical or near-identical)
timezone.ts— byte-identicalgroup-folder.ts— byte-identical validation; v2 addsgroup-init.tsfor filesystem scaffoldcontainer-runtime.ts— nearly identical (only logger import swapped)mount-security.ts— same structure, one field removed (see regressions)config.ts/env.ts— same structure, same.envsurface; several constants now dead codelogger.ts— same levels/colors/routing, but API shape changed (message-first instead of data-first)- MCP
send_messagetool — kept + enhanced with named destinations
What's new in v2
- Two-DB session model (
inbound.db+outbound.db) with even/odd seq parity, journal_mode=DELETE for cross-mount visibility - Entity model —
users,user_roles(owner/admin/scoped),agent_group_members,messaging_groups,messaging_group_agents,user_dms(cold-DM cache) - Host sweep (60s) — absorbs scheduler, cleanup, pending-message recovery, recurrence firing, stale detection, orphan cleanup
- Chat SDK bridge — unifies Discord/Slack/Teams/other adapters through
@anthropic-ai/chat - Provider abstraction — default Claude + opt-in OpenCode etc. via
providersbranch - OneCLI integration — credential gateway + approval flow (
src/onecli-approvals.ts) - 16 new MCP tools — scheduling (6), interactive (2), self-mod (3), agent mgmt (1), message manipulation (3), plus enhanced
send_message - Heartbeat file mtime — replaces IPC liveness
- Session persistence — session ID survives container restarts
- Dual-rate polling — 1000ms idle / 500ms active inside container
- Idle stream termination — 20s timeout prevents zombie queries
- Processing ACK — reverse channel (outbound → inbound) for idempotence
- Migration system — 9 numbered migrations vs v1's ad-hoc ALTERs
- Webhook server (new for HTTP-based channels)
- Container typing indicator refresh via delivery
What's removed (deliberately)
- IPC transport (files, stdin/stdout JSON, MCP-over-stdio bridge) — replaced by DB polling
GroupQueuein-memory state machine — serialization viamessages_in.status- Output markers (
---NANOCLAW_OUTPUT_START/END---) — results land inmessages_out - State persistence (
router_state,lastAgentTimestampmap) — each message is independent - Per-exit container log files — only logger.debug to host log
- Flat sender allowlist (JSON config) — replaced by role-based access +
unknown_sender_policy - Remote control subsystem (
/remote-controlcommand → spawned CLI) - IPC watcher (dynamic group-add while running)
task_runsaudit table — no task execution log- Cron/interval task types as first-class entities — tasks are
messages_inrows withkind='task'+recurrence - Stdin protocol for agent input — container reads from inbound.db
Regressions worth fixing (ranked)
HIGH priority
-
Trigger-rule matching in
pickAgent(src/router.ts:198TODO). Without this, a messaging group wired to multiple agents fires ALL of them on every message. Schema (messaging_group_agents.trigger_rules) is ready; the check is ~10 lines. Likely broken-by-default for multi-agent setups. -
nonMainReadOnlymount isolation removed (src/mount-security.ts). Non-main/shared agent groups can now mount read-write on any path the allowlist permits. v1 enforced read-only-for-non-main regardless of allowlist. Security regression for multi-tenant setups. Restore: add field + restoreisMainparam flow. -
Pending-message recovery on startup (
src/v1/index.ts:465-473). v1 explicitly scanned for unprocessed messages on restart. v2 relies on the sweep to notice. Likely works in practice, but worth a test: kill container mid-message, restart host, verify redelivery within ≤5s.
MEDIUM priority
-
response_scopeenforcement (messaging_group_agents.response_scopestored but unused). Values'all' | 'triggered' | 'allowlisted'are saved but nothing reads them. -
request_approvalflow for unknown senders (src/router.ts:295TODO).unknown_sender_policy='request_approval'is scaffolded but doesn't actually produce an approval card. -
Per-group container timeout. v1's
containerConfig.timeoutoverride is gone; all groups shareIDLE_TIMEOUT. Slow-but-healthy agents get killed with fast agents' timeout. -
Container streaming output. v1's marker-based pre-completion delivery is gone. v2 must wait for outbound.db poll. Latency-sensitive UX regresses.
-
Per-exit container logs. v1 wrote timestamped per-exit log files with full I/O + mounts + stderr. v2 only has logger.debug. Zero-cost on success, high-value on crash. Restore at least for non-zero exit.
-
Explicit container kill on stale detection. v2's sweep marks messages for retry but doesn't stop the stale container. Only
cleanupOrphans()at startup removes them. AddstopContainer()when heartbeat stale AND processing stuck. -
Host-level retry with backoff on agent error. v1 had MAX_RETRIES=5 + exp. backoff on
processGroupMessagesfailure. v2 only retries on stale-heartbeat. Explicit agent-error retry could close the gap.
LOW priority
- Process ID in logger output — lost multi-process debugging info
- Task dedup via unique
(kind, series_id)index — v2 can have two pending rows with same series; best-effort via atomic status update - Silent-drop mode for noisy senders — v1's
mode:'drop'had a use case; orthogonal to privilege - Remote control — decide: restore as opt-in skill or document as removed
- Dead config constants (
POLL_INTERVAL,SCHEDULER_POLL_INTERVAL,IPC_POLL_INTERVAL) — delete fromsrc/config.ts - Configurable retention thresholds (
STALE_THRESHOLD_MS,MAX_TRIES) — move from constants toconfig.ts - Dynamic group-add (IPC watcher equivalent) — probably not worth; document that restart is required
Things kept as test-only regression risk
The orphan src/v1/formatting.test.ts asserted behaviors that aren't fully exercised in v2:
- Timezone-aware formatted timestamps — v1 emitted locale strings ("Jan 1, 2024, 1:30 PM"); v2 emits UTC HH:MM
<context timezone="..."/>header — gonereply_to="<id>"attribute — v2 only stores sender name + truncated preview- Trigger-pattern unit tests — no direct equivalent (logic moved to DB but isn't tested at the router level)
- Internal tag stripping tests — no isolated tests in agent-runner
These are specs worth porting to v2 tests once trigger matching is implemented.
Files entirely gone in v2
src/v1/ipc.ts+src/v1/ipc-auth.test.ts— IPC is deadcontainer/.../v1/ipc-mcp-stdio.ts— MCP-over-stdio bridge deadsrc/v1/group-queue.ts— serialization via DBsrc/v1/session-cleanup.ts— merged intohost-sweep.tssrc/v1/task-scheduler.ts— merged intohost-sweep.ts+ system actions indelivery.tssrc/v1/remote-control.ts— feature removedsrc/v1/sender-allowlist.ts— entity model supersedes
Net architectural assessment
v2 is strictly simpler, more consistent, and more robust in its happy path. The remaining TODOs (trigger matching, response_scope, request_approval) reflect scaffolding that was checked in ahead of the feature — none are deep design issues. The one actual regression is nonMainReadOnly mount isolation; it was a defense-in-depth feature and deserves to come back. The removal of per-exit container logs and streaming output markers are judgment calls that traded observability for simplicity — both can be restored cheaply if needed.
No file in v1 contains a behavior that v2 is architecturally unable to express. The outstanding work is feature-completion, not architecture.