Decouple container restart from config updates: config CLI ops now only
write to the DB, and restart is a separate `ncl groups restart` command with
--rebuild and --message flags. Add an on_wake column to messages_in so wake
messages are picked up only by a fresh container's first poll, which prevents
dying containers from stealing them during the SIGTERM grace window.
killContainer accepts an onExit callback for race-free respawn. An
agent-called restart auto-scopes to the calling session.
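A minimal sketch of the on_wake filtering described above, not the actual
query. Only the on_wake column name comes from this commit; the row shape,
selectClaimable, and the isFirstPoll flag are hypothetical names used for
illustration.

```typescript
// Hypothetical row shape; only `on_wake` is named in the commit message.
interface InboundRow {
  id: number;
  content: string;
  on_wake: 0 | 1;
}

// Returns the rows a poll may claim. Wake rows are visible only to the
// first poll of a freshly started container, so a container that is still
// draining during the SIGTERM grace window never sees them.
function selectClaimable(rows: InboundRow[], isFirstPoll: boolean): InboundRow[] {
  return rows.filter((row) => row.on_wake === 0 || isFirstPoll);
}

const pending: InboundRow[] = [
  { id: 1, content: "regular message", on_wake: 0 },
  { id: 2, content: "restart notice", on_wake: 1 },
];

// A dying container is past its first poll, so it only sees row 1.
const dyingContainerPoll = selectClaimable(pending, false);
// The replacement container's first poll sees both rows.
const freshContainerPoll = selectClaimable(pending, true);
```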
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 14 tests covering key routing and dispatch flows that previously had
zero direct coverage:
dispatchResultText:
- bare text produces no outbound (scratchpad only)
- unknown destination dropped, valid destination sent
- multiple <message> blocks each produce correct outbound
- internal tags stripped from scratchpad
originAttr / from= metadata:
- chat/task/webhook/system messages include from= when destination matches
- fallback to raw unknown:channel:platform when no match
- from= omitted when routing is null
resolveDestinationThread:
- null thread_id when no prior inbound from destination
- most recent thread_id wins with multiple inbound messages
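The two resolveDestinationThread cases above can be sketched in memory as
follows. This is an illustration under assumptions, not the real function:
the actual implementation queries messages_in, and the field names and the
timestamp ordering key here are hypothetical.

```typescript
// Hypothetical in-memory stand-in for messages_in rows.
interface InboundMessage {
  channel: string;
  platform: string;
  thread_id: string | null;
  timestamp: number; // assumed ordering key
}

// Picks the thread_id of the most recent inbound message from the target
// channel+platform, or null when that destination has never sent anything.
function resolveThreadFor(
  messages: InboundMessage[],
  channel: string,
  platform: string,
): string | null {
  let latest: InboundMessage | null = null;
  for (const m of messages) {
    if (m.channel !== channel || m.platform !== platform) continue;
    if (latest === null || m.timestamp > latest.timestamp) latest = m;
  }
  return latest === null ? null : latest.thread_id;
}

const inbound: InboundMessage[] = [
  { channel: "general", platform: "slack", thread_id: "t1", timestamp: 1 },
  { channel: "general", platform: "slack", thread_id: "t2", timestamp: 2 },
  { channel: "general", platform: "discord", thread_id: "d9", timestamp: 3 },
];
```

With this data, the slack/general destination resolves to "t2", and a
destination with no prior inbound resolves to null.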
Also fix merge issue: restore getAllDestinations import removed by our PR
but still needed by #2327's compaction reminder. Fix stale destinations
test assertion from #2328 ("no special wrapping needed" → "Every response
must be wrapped").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The poll loop had a bare-text routing fallback in dispatchResultText: when
the agent produced text without <message to="..."> wrapping, it would auto-
route to the session's originating channel (via a frozen RoutingContext) or
to the single configured destination. This caused three problems:
1. Routing drift: RoutingContext was extracted once from the initial batch
and never refreshed. When the initial batch was a null-routed cron task
and a real chat arrived mid-query, replies were silently dropped to
scratchpad because the frozen routing had all-null fields.
2. Cross-channel thread bleed: sendToDestination applied a single
routing.threadId to every outbound message regardless of destination.
In agent-shared sessions (multiple channels sharing one session), one
channel's thread ID was stamped onto messages to a different channel.
3. Inconsistent formatting: task, webhook, and system messages had no
origin metadata in their formatted output, so the agent couldn't tell
which destination they came from — even when the underlying messages_in
rows carried routing fields.
Changes:
- Remove the bare-text routing fallbacks in dispatchResultText (both the
routing-based and single-destination shortcuts). All agent output must
be wrapped in <message to="name">...</message>. Bare text is scratchpad.
- Update buildDestinationsSection() to require explicit wrapping for all
groups, including single-destination. No more "no special wrapping
needed" shortcut.
- Resolve thread_id per-destination via resolveDestinationThread(), which
queries messages_in for the most recent message matching the target
channel+platform. Falls back to null (top-level channel message) when
no prior inbound exists for that destination.
- Extract originAttr() helper in formatter.ts and apply it to all message
types. Tasks now render as <task from="dest" time="...">, webhooks as
<webhook from="dest" source="..." event="...">, system responses as
<system_response from="dest" ...>. The agent always sees where a
message originated.
- Add a PreCompact shell hook (compact-instructions.ts) that outputs
custom compaction instructions, telling the compactor to preserve
recent message XML structure and routing metadata in the summary.
Wired via settings.json in the .claude-shared scaffold, with a
migration path (ensurePreCompactHook) for existing groups.
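A rough sketch of the explicit-wrapping rule described in the changes above,
assuming a regex-based parser. splitAgentOutput and its result shape are
hypothetical; the real dispatchResultText may parse differently and also
strips internal tags, which is omitted here.

```typescript
interface Outbound {
  to: string;
  body: string;
}

interface DispatchResult {
  outbound: Outbound[]; // blocks addressed to a known destination
  dropped: string[]; // destination names that were not recognised
  scratchpad: string; // everything outside <message> blocks
}

// Splits agent output into routed messages and scratchpad text. Bare text
// is never routed: it only ever ends up in `scratchpad`.
function splitAgentOutput(text: string, knownDestinations: Set<string>): DispatchResult {
  const blockPattern = /<message to="([^"]+)">([\s\S]*?)<\/message>/g;
  const outbound: Outbound[] = [];
  const dropped: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = blockPattern.exec(text)) !== null) {
    const to = match[1] ?? "";
    const body = (match[2] ?? "").trim();
    if (knownDestinations.has(to)) {
      outbound.push({ to, body });
    } else {
      dropped.push(to);
    }
  }
  const scratchpad = text.replace(blockPattern, "").trim();
  return { outbound, dropped, scratchpad };
}

const known = new Set<string>(["alice"]);
const result = splitAgentOutput(
  'thinking...\n<message to="alice">hi</message>\n<message to="ghost">boo</message>',
  known,
);
const bareOnly = splitAgentOutput("just notes", known);
```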
Relation to open PRs:
- #2277 (mergeRouting) becomes unnecessary — the routing fallback it
patches no longer exists. Can be closed.
- #2327 (post-compaction destination reminder) is complementary — it
handles the post-compaction push, this handles pre-compaction
instructions. Both can merge independently.
- #2328 (default routing instruction) is complementary — it adds "reply
to the from= destination" guidance to the multi-destination section.
Compatible with the unified instruction format here.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A warm container picks up every pending messages_in row on each poll tick
and calls markProcessing → agent.query → markCompleted. Before this change,
that included trigger=0 rows (context stored under
ignored_message_policy='accumulate'), so the agent woke and could respond
to messages the wiring had explicitly opted out of engaging with. That
defeated accumulate's "store as context, don't engage" contract.
Gate the main poll loop with `messages.some(m => m.trigger === 1)`, which
mirrors the host-side countDueMessages (already gated). If the batch has
no wake-eligible row, sleep and leave the rows pending. They ride along
via the same getPendingMessages query the next time a real trigger=1
row lands, which is the intended accumulate behavior.
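The gate can be illustrated with a small standalone predicate. The
`messages.some(m => m.trigger === 1)` expression is from this commit;
PendingMessage and shouldWake are hypothetical names for the sketch.

```typescript
// Hypothetical row shape; only `trigger` matters for the gate.
interface PendingMessage {
  id: number;
  trigger: 0 | 1;
}

// True when at least one row in the batch is wake-eligible.
function shouldWake(messages: PendingMessage[]): boolean {
  return messages.some((m) => m.trigger === 1);
}

const accumulateOnly: PendingMessage[] = [
  { id: 1, trigger: 0 },
  { id: 2, trigger: 0 },
];
const mixed: PendingMessage[] = [
  { id: 1, trigger: 0 },
  { id: 3, trigger: 1 },
];
```

An accumulate-only batch fails the gate and stays pending; a mixed batch
passes, and its trigger=0 rows are delivered alongside the trigger=1 row.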
The concurrent active-turn poll (line ~290) is unchanged on purpose —
once the agent has engaged, pushing in accumulate rows mid-turn as
additional context is desired.
initTestSessionDb was missing the trigger and series_id columns on
messages_in, out of sync with the live migration. Added both so the new
tests (and any future trigger-aware tests) can run.
Four tests cover the data contract: trigger=0 rows are returned by
getPendingMessages (so they ride along), the gate predicate correctly
identifies accumulate-only batches, mixed batches pass the gate, and the
schema default of 1 applies when the column is omitted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Container side:
- agent-runner switches to Bun. Drops better-sqlite3 (native compile gone),
drops tsc build step in-image AND the tsc-on-every-session-wake in the
entrypoint — bun runs src/index.ts directly. bun:sqlite replaces
better-sqlite3; cross-mount DB invariants (journal_mode=DELETE, busy_timeout)
preserved. Named params converted from @name to $name because bun:sqlite
does not auto-strip the prefix the way better-sqlite3 does.
- Tests ported from vitest to bun:test (only describe/it/expect/beforeEach/afterEach
used, API-compatible). vitest.config.ts excludes container/agent-runner/.
- bun.lock replaces pnpm-lock.yaml + pnpm-workspace.yaml under
container/agent-runner/. Host pnpm workspace does NOT include this tree.
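As an illustration of the @name to $name conversion mentioned above, a
naive sketch of the two rewrites involved. Both helpers are hypothetical
and not part of the codebase; the SQL rewrite is a plain regex that would
also touch an @ inside a string literal, so it only suits simple statements.

```typescript
type SqlValue = string | number | bigint | boolean | null | Uint8Array;

// Rewrites `@name` placeholders in a statement to `$name`.
function rewriteSqlParams(sql: string): string {
  // In a replacement string, "$$" is a literal "$" and "$1" is the group.
  return sql.replace(/@([A-Za-z_][A-Za-z0-9_]*)/g, "$$$1");
}

// Produces a bindings object whose keys carry the `$` prefix explicitly,
// for drivers that match binding keys against the full placeholder.
function withDollarPrefix(params: Record<string, SqlValue>): Record<string, SqlValue> {
  const prefixed: Record<string, SqlValue> = {};
  for (const [name, value] of Object.entries(params)) {
    const bare = name.replace(/^[@$:]/, "");
    prefixed[`$${bare}`] = value;
  }
  return prefixed;
}

const rewritten = rewriteSqlParams(
  "SELECT * FROM messages_in WHERE id = @id AND status = @status",
);
const bindings = withDollarPrefix({ id: 7, status: "pending" });
```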
Dockerfile improvements (independent of Bun but bundled while touching the file):
- tini as PID 1 for correct SIGTERM propagation (prevents half-written
outbound.db on shutdown).
- Extracted entrypoint.sh — readable and diffable vs the old inline printf.
- BuildKit cache mounts for apt + bun install + pnpm install.
- --no-install-recommends on apt, pinned CLAUDE_CODE_VERSION, AGENT_BROWSER,
VERCEL, BUN_VERSION.
- CJK fonts (~200MB) behind ARG INSTALL_CJK_FONTS=false; build.sh reads from
.env; setup/container.ts reads the same .env so /setup and manual rebuild
stay in sync.
- PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in case any postinstall tries to pull a
redundant Chromium.
- /home/node 755 (was 777).
Host side:
- src/container-runner.ts dynamic spawn command collapses from
`pnpm exec tsc --outDir /tmp/dist … && node /tmp/dist/index.js` to
`exec bun run /app/src/index.ts` — cold start ~200-500ms faster per wake.
CI:
- oven-sh/setup-bun@v2 alongside Node/pnpm. Adds explicit container
typecheck (was documented in CLAUDE.md, not enforced) and `bun test` for
agent-runner tests.
Reshape AgentProvider so provider-specific assumptions stop leaking into
the generic layer. No change to what reaches sdkQuery() — same values,
different plumbing.
- QueryInput: opaque `continuation` replaces `sessionId` + `resumeAt`;
`systemContext.instructions` replaces ambiguous `systemPrompt`;
`mcpServers`, `env`, `additionalDirectories` move to `ProviderOptions`
at construction time.
- AgentProvider gains `isSessionInvalid(err)` and
`supportsNativeSlashCommands` so the poll-loop stops regex-matching
Claude error strings and gates passthrough slash commands per provider.
- ClaudeProvider owns `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and the
stale-session regex internally.
- ProviderEvent.activity kept and documented as the liveness signal
(fires on every SDK message so the idle timer stays honest during
long tool runs); init carries `continuation` instead of `sessionId`.
- poll-loop drops mcpServers/env/systemPrompt from its config; admin
user id now passed explicitly.
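A sketch of the reshaped surface, limited to the members named above. The
exact field types, the `prompt` field, ExampleProvider, its error regex,
and the nextInput helper are assumptions for illustration, not the real
definitions.

```typescript
// Per-query input: provider-specific resume state is one opaque token.
interface QueryInput {
  prompt: string; // assumed field, not named in this commit
  continuation?: string;
  systemContext?: { instructions?: string };
}

// Construction-time options, no longer passed on every query.
interface ProviderOptions {
  mcpServers?: Record<string, unknown>;
  env?: Record<string, string>;
  additionalDirectories?: string[];
}

interface AgentProvider {
  readonly supportsNativeSlashCommands: boolean;
  isSessionInvalid(err: unknown): boolean;
}

// Hypothetical provider that keeps its own error matching internal.
class ExampleProvider implements AgentProvider {
  readonly supportsNativeSlashCommands = false;
  readonly options: ProviderOptions;

  constructor(options: ProviderOptions) {
    this.options = options;
  }

  isSessionInvalid(err: unknown): boolean {
    return err instanceof Error && /session not found/i.test(err.message);
  }
}

// Generic-layer logic: drop the continuation when the provider reports the
// session as invalid, without inspecting provider error strings itself.
function nextInput(provider: AgentProvider, err: unknown, input: QueryInput): QueryInput {
  if (!provider.isSessionInvalid(err)) return input;
  return { prompt: input.prompt, systemContext: input.systemContext };
}

const provider = new ExampleProvider({ env: {} });
const original: QueryInput = { prompt: "hello", continuation: "opaque-token" };
const afterStale = nextInput(provider, new Error("Session not found: abc"), original);
const afterOther = nextInput(provider, new Error("rate limited"), original);
```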
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eliminates SQLite write contention across the host-container mount
boundary by splitting the single session.db into two files, each with
exactly one writer:
inbound.db — host writes (messages_in, delivered tracking)
outbound.db — container writes (messages_out, processing_ack)
Key changes:
- Host uses even seq numbers, container uses odd (collision-free)
- Container heartbeat via file touch instead of DB UPDATE
- Scheduling MCP tools now emit system actions via messages_out
(host applies them to inbound.db during delivery)
- Host sweep reads processing_ack + heartbeat file for stale detection
- OneCLI ensureAgent() call added (was missing from v2, caused
applyContainerConfig to reject unknown agent identifiers)
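The even/odd split can be sketched as below. nextSeq and the Writer type
are hypothetical; how the real code tracks the last seen sequence number
is not stated here, so it is passed in as a non-negative integer.

```typescript
type Writer = "host" | "container";

// Returns the smallest seq greater than `lastSeenSeq` with the writer's
// parity: even for the host, odd for the container. Because the two sides
// draw from disjoint sets, they cannot collide even when both allocate
// concurrently from the same starting point.
function nextSeq(lastSeenSeq: number, writer: Writer): number {
  const parity = writer === "host" ? 0 : 1;
  let candidate = lastSeenSeq + 1;
  if (candidate % 2 !== parity) candidate += 1;
  return candidate;
}

const hostFirst = nextSeq(0, "host");
const containerFirst = nextSeq(0, "container");
```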
Verified: tsc clean, 327 tests pass, real e2e through Docker works.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use DELETE journal mode for session DBs instead of WAL. WAL doesn't
sync reliably across Docker volume mounts (VirtioFS), causing dropped
writes and duplicate deliveries.
- Add 20s idle detection to end the query stream. The concurrent poll
tracks SDK activity via a new 'activity' provider event. When no SDK
events arrive for 20s and no messages are pending, the stream ends
and the poll loop continues.
- Add touchProcessing heartbeat so the host can distinguish active
agents from idle ones by checking status_changed recency.
- Catch query errors in the poll loop and write error responses to
messages_out instead of crashing the process.
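A minimal sketch of the idle check described above, with the clock passed
in so it can be exercised without timers. IdleTracker and its method names
are hypothetical; only the 20s threshold and the "no pending messages"
condition come from this commit.

```typescript
const IDLE_TIMEOUT_MS = 20000;

class IdleTracker {
  private lastActivityMs: number;

  constructor(nowMs: number) {
    this.lastActivityMs = nowMs;
  }

  // Call on every 'activity' provider event.
  recordActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // The stream ends only when the SDK has been quiet for the full timeout
  // and nothing is waiting to be pushed into the query.
  shouldEndStream(nowMs: number, pendingMessages: number): boolean {
    return pendingMessages === 0 && nowMs - this.lastActivityMs >= IDLE_TIMEOUT_MS;
  }
}

const tracker = new IdleTracker(0);
const tooEarly = tracker.shouldEndStream(19999, 0);
const idleNoPending = tracker.shouldEndStream(20000, 0);
const idleButPending = tracker.shouldEndStream(20000, 1);
```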
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AgentProvider abstraction with Claude and Mock implementations.
Poll loop reads messages_in, formats by kind, queries provider,
writes results to messages_out. Concurrent polling pushes follow-up
messages into active queries.
- providers/types.ts: AgentProvider, AgentQuery, ProviderEvent
- providers/claude.ts: wraps Agent SDK with MessageStream, hooks,
transcript archiving
- providers/mock.ts: canned responses with push() support
- providers/factory.ts: createProvider()
- formatter.ts: format by kind (chat/task/webhook/system), XML
escaping, routing extraction
- poll-loop.ts: poll → format → query → write, concurrent polling
- mcp-tools.ts: MCP server with send_message tool
- index-v2.ts: new entry point (config from env, enters poll loop)
- 11 new tests, all 288 tests pass
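To illustrate the XML escaping step in formatter.ts, a small sketch for
the chat kind. The tag name, the sender attribute, and both function names
are assumptions; the real formatter's output shape is not shown in this
commit.

```typescript
// Escapes the characters that would break XML text or attribute values.
// The ampersand is replaced first so later entities are not double-escaped.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical chat payload.
interface ChatMessage {
  sender: string;
  text: string;
}

// Renders one chat message as a single XML element.
function formatChat(m: ChatMessage): string {
  return `<message sender="${escapeXml(m.sender)}">${escapeXml(m.text)}</message>`;
}

const formatted = formatChat({ sender: 'A "B"', text: "1 < 2 & 3" });
```

Escaping user text this way keeps a chat message that contains angle
brackets from being read as markup by the agent.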
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>