Decouple container restart from config updates — config CLI ops now only
write to the DB; restart is a separate `ncl groups restart` command with
--rebuild and --message flags. Add on_wake column to messages_in so wake
messages are only picked up by a fresh container's first poll, preventing
dying containers from stealing them during the SIGTERM grace window.
killContainer accepts an onExit callback for race-free respawn. Agent-
called restart auto-scopes to the calling session.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Squash merge of PR #2267 by ddaniels.
When an agent group has more than one active session, A2A replies landed
in the newest session via findSessionByAgentGroup's ORDER BY created_at
DESC. The session that asked the question never saw the answer.
Adds origin-aware return-path routing with three layers:
1. Direct return-path: if the reply has in_reply_to, look up the
triggering inbound row's source_session_id and route there.
2. Peer-affinity fallback: find the most recent A2A inbound from this
peer and use its source_session_id.
3. Legacy fallback: newest active session (pre-migration compat).
Container-side: MCP send_message/send_file now thread the current
batch's in_reply_to through to outbound rows via current-batch.ts.
Also flips our A2A bug-documenting test (#2332) from asserting the
broken behavior to asserting the fixed behavior.
Co-Authored-By: Doug Daniels <ddaniels888@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 14 tests covering key routing and dispatch flows that previously had
zero direct coverage:
dispatchResultText:
- bare text produces no outbound (scratchpad only)
- unknown destination dropped, valid destination sent
- multiple <message> blocks each produce correct outbound
- internal tags stripped from scratchpad
originAttr / from= metadata:
- chat/task/webhook/system messages include from= when destination matches
- fallback to raw unknown:channel:platform when no match
- from= omitted when routing is null
resolveDestinationThread:
- null thread_id when no prior inbound from destination
- most recent thread_id wins with multiple inbound messages
Also fix merge issue: restore getAllDestinations import removed by our PR
but still needed by #2327's compaction reminder. Fix stale destinations
test assertion from #2328 ("no special wrapping needed" → "Every response
must be wrapped").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add integration test for per-destination thread_id resolution: seeds two
destinations with different thread IDs, verifies each outbound message
carries the correct thread_id (not a global one from the batch routing).
- Add log line in resolveDestinationThread catch block for debuggability.
- Remove stray "(ensurePreCompactHook is defined after the main function.)"
comment from group-init.ts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The poll loop had a bare-text routing fallback in dispatchResultText: when
the agent produced text without <message to="..."> wrapping, it would auto-
route to the session's originating channel (via a frozen RoutingContext) or
to the single configured destination. This caused three problems:
1. Routing drift: RoutingContext was extracted once from the initial batch
and never refreshed. When the initial batch was a null-routed cron task
and a real chat arrived mid-query, replies were silently dropped to
scratchpad because the frozen routing had all-null fields.
2. Cross-channel thread bleed: sendToDestination applied a single
routing.threadId to every outbound message regardless of destination.
In agent-shared sessions (multiple channels sharing one session), one
channel's thread ID was stamped onto messages to a different channel.
3. Inconsistent formatting: task, webhook, and system messages had no
origin metadata in their formatted output, so the agent couldn't tell
which destination they came from — even when the underlying messages_in
rows carried routing fields.
Changes:
- Remove the bare-text routing fallbacks in dispatchResultText (both the
routing-based and single-destination shortcuts). All agent output must
be wrapped in <message to="name">...</message>. Bare text is scratchpad.
- Update buildDestinationsSection() to require explicit wrapping for all
groups, including single-destination. No more "no special wrapping
needed" shortcut.
- Resolve thread_id per-destination via resolveDestinationThread(), which
queries messages_in for the most recent message matching the target
channel+platform. Falls back to null (top-level channel message) when
no prior inbound exists for that destination.
- Extract originAttr() helper in formatter.ts and apply it to all message
types. Tasks now render as <task from="dest" time="...">, webhooks as
<webhook from="dest" source="..." event="...">, system responses as
<system_response from="dest" ...>. The agent always sees where a
message originated.
- Add a PreCompact shell hook (compact-instructions.ts) that outputs
custom compaction instructions, telling the compactor to preserve
recent message XML structure and routing metadata in the summary.
Wired via settings.json in the .claude-shared scaffold, with a
migration path (ensurePreCompactHook) for existing groups.
Relation to open PRs:
- #2277 (mergeRouting) becomes unnecessary — the routing fallback it
patches no longer exists. Can be closed.
- #2327 (post-compaction destination reminder) is complementary — it
handles the post-compaction push, this handles pre-compaction
instructions. Both can merge independently.
- #2328 (default routing instruction) is complementary — it adds "reply
to the from= destination" guidance to the multi-destination section.
Compatible with the unified instruction format here.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closesqwibitai/nanoclaw#2325.
When the Claude Code SDK auto-compacts the conversation context, the
compaction summary tends to drop the agent's learned <message to="…">
wrapping discipline. The destinations table is still populated and the
system prompt still lists them, but the behavioral pattern degrades —
A2A sends and multi-channel routing silently revert to bare-text or
single-channel delivery for the rest of the session, until the next
/clear.
Three small changes wire a reminder back into the live query when this
fires:
- New `compacted` event on ProviderEvent. Distinct from `result` so it
doesn't mark the turn completed or get dispatched as a chat message
(which is also why "Context compacted (N tokens compacted)." stops
appearing as noise in user-facing chats — it was a side-effect of
reusing the result event path).
- ClaudeProvider yields `compacted` instead of `result` for the SDK's
compact_boundary system event.
- Poll-loop's event handler reacts by pushing a system-tagged reminder
back into the active query when there are >1 destinations. Single-
destination groups skip the push since they have a fallback that
works without wrapping.
Tests cover both branches (multi-destination → reminder fires;
single-destination → no reminder) using a CompactingProvider that
emits the new event mid-stream.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The follow-up poller filtered /clear out of every tick without acking
the row, and pushed every other slash command through plain
formatMessages() (XML wrapping). On a warm container the outer
while(true) loop never regains control, so:
- /clear sat pending in messages_in forever (no response at all)
- /compact, /cost, /context, /files, /remote-control arrived at the
SDK as XML-wrapped user text and were never dispatched as commands
Both modes are invisible to host monitoring: rows are either left
pending without a processing_ack claim, or marked completed normally;
heartbeat keeps firing inside the SDK event loop.
When the follow-up poller observes any slash command (admin or
passthrough — categorizeMessage decides), end the active query so the
current turn winds down cleanly and the outer loop wakes, re-fetches
the same pending set, and runs them through the canonical path
(/clear handler + formatMessagesWithCommands raw dispatch). Leave the
rows untouched so the outer-loop fetch sees the same set the poller
saw.
Cost: each slash command on a warm container forces close+reopen of
the SDK stream — a few seconds of subprocess startup. The Anthropic
prompt cache is server-side with a 5-min TTL keyed on prefix hash, so
stream lifecycle does not affect cache lifetime; close+reopen within
5 min still gets cache hits.
Also corrects the warm-stream rationale comment on processQuery, which
implied keeping the stream open preserved cache warmth — it doesn't.
Testing evidence — cache stays warm across stream close+reopen:
Turn 1 (warm session):
Usage: in=6 out=245 cache_create=92 cache_read=22996
Full cache hit (22996 tokens).
Turn 2 — /clear arrives:
Pending slash command — ending stream so outer loop can process
Clearing session (resetting continuation)
Usage: in=6 out=95 cache_create=9393 cache_read=13600
System prompt + tool defs (~13600 tokens) still hit cache;
conversation history is gone (continuation reset) so the new turn
writes fresh context.
Turn 3 — /cost arrives:
Pending slash command — ending stream so outer loop can process
Usage: in=0 out=0 cache_create=0 cache_read=0 wall=0.0s api=0.0s
/cost is a CLI built-in: dispatched locally by the SDK, no API
call. Pre-fix this would have arrived as XML-wrapped user text
and never dispatched — confirms the broader fix works.
Turn 4 (next chat after /cost):
Usage: in=6 out=142 cache_create=328 cache_read=22993
Full cache hit again (22993 tokens read, 328 written). Despite the
/cost-induced stream close+reopen, the server-side prompt cache
survived: the new sdkQuery() resumed the same continuation, the
request prefix matched the cached entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes on top of the follow-up pre-task-script work:
1. The void async IIFE inside the interval handler had no catch, so a
throw from the dynamic import or applyPreTaskScripts escaped as an
unhandled rejection — terminating the container. The initial-batch
path is wrapped by processQuery's outer try/catch; the follow-up
path needs its own. Now logs the error and lets the next tick retry.
2. Re-check `done` immediately before query.push. The flag can flip
true while applyPreTaskScripts is awaited (outer stream finishes
during the script execution); without the re-check we'd push into a
closed query. Claimed messages get released by the host's
processing-claim sweep — same recovery posture as the rest of the
poller.
Co-Authored-By: Michael Zazon <mzazon@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tasks arriving during an active query were pushed into the stream as
follow-ups without running their `script` gate — so a wakeAgent=false
pre-script that was supposed to suppress the tick silently leaked
through and woke the agent every time. Evidence: monitoring cron
firing every 10 min with [task-script] log lines never showing.
Run applyPreTaskScripts on the follow-up batch too: wakeAgent=false
tasks get marked completed and dropped; wakeAgent=true tasks have
scriptOutput enriched exactly like the initial-batch path. Added a
pollInFlight guard to serialize async runs and avoid overlapping
script executions when the interval fires while one is still going.
Wrapped in a MODULE-HOOK:scheduling-pre-task-followup marker block
to match the existing initial-batch hook convention.
Before, every provider stored its opaque continuation id under the
single outbound.db key `sdk_session_id`. Flipping a session's
agent_provider (e.g. Codex → Claude) meant the new provider read the
old provider's id at wake, handed it to its own SDK, and got a
"No conversation found" error that cost the user one sacrificed
message before the stale-session recovery path cleared the id.
This reshapes session_state so continuations are keyed
`continuation:<provider>` instead. Consequences:
- Per-provider continuations coexist. Flipping Claude → Codex → Claude
resumes the Claude thread exactly where it left off, with the
intervening Codex thread also still on file.
- No provider ever reads another provider's id. Switching costs no
sacrificed message and emits no transient error.
- Legacy installs are migrated forward on first startup:
migrateLegacyContinuation() adopts any pre-existing `sdk_session_id`
row into the current provider's slot (best guess — it was whichever
provider ran last), then deletes the legacy row unconditionally so
it can't poison a future provider's read.
runPollLoop now takes providerName alongside the provider instance,
and threads it through processQuery to setContinuation on init.
Tests: 9 new tests covering set/get isolation across providers,
clear-specificity, legacy-adoption, legacy-always-deleted,
prefer-existing-slot-over-legacy, and idempotency of a second
migration call.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The concurrent poll in processQuery filtered out messages with
mismatched thread_ids, causing a deadlock when the initial batch
(e.g. a host-generated welcome trigger with null thread_id) completed
but follow-ups arrived with a different thread_id (e.g. a Discord DM).
The query stayed open waiting for matching-thread pushes that never
came, blocking the poll loop indefinitely.
Thread routing is the router's concern — per-thread sessions already
isolate threads into separate containers; shared sessions intentionally
merge everything. Removed the filter.
Also fixed processing_ack: a result event (with or without text) means
the turn is done, but markCompleted only ran when event.text was truthy.
When the agent responded via MCP send_message (empty result text), the
initial batch stayed in 'processing' for the query's lifetime, creating
false stuck signals in the host sweep. Now marks completed on any result
event.
Belt-and-suspenders: init-first-agent welcome trigger now sets threadId
to the DM platform_id instead of null.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the per-group agent-runner-src copy model with a single shared
read-only mount. Source and skills are now RO + shared; personality,
config, working files, and Claude state stay RW + per-group.
Key changes:
- Mount container/agent-runner/src/ RO at /app/src (all groups share one copy)
- Mount container/skills/ RO at /app/skills; per-group skill selection via
symlinks in .claude-shared/skills/ based on container.json "skills" field
- Mount container.json as nested RO bind on top of RW group dir
- Move all NANOCLAW_* env vars to container.json (runner reads at startup)
- New runner config.ts module replaces process.env reads
- Move command gate (filtered/admin) from container to host router
- Dockerfile: remove source COPY, split CLI installs (claude-code last),
move agent-runner deps above CLIs for better layer caching
- Add writeOutboundDirect for router denial responses
- Design doc at docs/shared-src.md
Not included (follow-up): DB migration to drop agent_provider columns,
cleanup of orphaned agent-runner-src directories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related bugs that surfaced together when a Discord response exceeded
2000 chars:
1. **Session id lost on mid-turn container exit.** `runPollLoop` was
calling `setStoredSessionId` only after `processQuery` returned. If
the container died between the SDK's `init` event (where session_id
arrives) and the stream completing, the id was never persisted. The
next wake called `getStoredSessionId()` → undefined and started a
fresh Claude session, dropping all prior context. Fix: persist
immediately in the `init` branch inside `processQuery`. The existing
post-query store becomes a harmless no-op.
2. **Silent truncation past adapter limits.** `chat-sdk-bridge.deliver`
handed full text straight to `adapter.postMessage`. Discord's adapter
hard-truncates at 2000 chars; Telegram's at 4096. Responses longer
than that were cut off without any signal to the user or host. Fix:
add `maxTextLength` to `ChatSdkBridgeConfig` and a `splitForLimit`
helper that breaks on paragraph → line → hard-char boundaries, then
posts chunks sequentially. Files ride on the first chunk; the
returned id is the first chunk's so edits and reactions still target
the reply head.
Channel adapter files (Discord, Telegram, …) live on the `channels`
branch — a companion PR wires `maxTextLength: 1900` for Discord and
`4000` for Telegram so the splitter actually engages in those installs.
Without wiring, behavior is unchanged.
When the unknown-channel approval flow completes, seed a /welcome task
into the newly-wired session so the agent greets the new user on first
contact. The replayed /start (Telegram's default first-message) is filtered
by the agent-runner's command-command filter, so without an explicit
onboarding trigger the first interaction produced nothing.
Pin the destination by its local_name from agent_destinations to avoid the
agent picking the wrong named destination (previously it greeted the owner,
whose DM is in CLAUDE.md).
Also guard dispatchResultText against echoing trailing status lines when
the agent has already sent messages explicitly via send_message. Otherwise
a task-triggered flow that calls send_message then emits "welcome message
sent" produces a duplicate chat to the recipient.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A warm container picks up every pending messages_in row on each poll tick
and calls markProcessing → agent.query → markCompleted. Before this, that
included trigger=0 rows (ignored_message_policy='accumulate' context),
causing the agent to wake and potentially respond to messages the wiring
had explicitly opted out of engaging on — defeating accumulate's "store
as context, don't engage" contract.
Gate the main poll loop with `messages.some(m => m.trigger === 1)` —
mirrors host-side countDueMessages which is already gated. If the batch
has no wake-eligible row, sleep and leave them pending. They ride along
via the same getPendingMessages query the next time a real trigger=1
lands, which is the intended accumulate behavior.
The concurrent active-turn poll (line ~290) is unchanged on purpose —
once the agent has engaged, pushing in accumulate rows mid-turn as
additional context is desired.
initTestSessionDb was missing the trigger and series_id columns on
messages_in, out of sync with the live migration. Added both so the new
tests (and any future trigger-aware tests) can run.
Four tests cover the data contract: trigger=0 rows are returned by
getPendingMessages (so they ride along), the gate predicate correctly
identifies accumulate-only batches, mixed batches pass the gate, and the
schema default of 1 applies when the column is omitted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the two overlapping old mechanisms (30-min setTimeout kill in
container-runner, 10-min heartbeat STALE_THRESHOLD reset in host-sweep)
with message-scoped stuck detection anchored to the processing_ack claim
age + an absolute 30-min ceiling that extends for long-declared Bash
tools.
Old model problems:
- IDLE_TIMEOUT setTimeout fired on plain wall-clock time; slow-but-alive
agents got killed at 30min regardless of activity
- 10-min STALE_THRESHOLD in the sweep was unreliable — the heartbeat is
only touched on SDK events, so legitimate silent tool work (sleep 30,
long WebFetch, npm install) looked identical to a hung container
- Two overlapping sources of truth for "when to let go of a container"
New model:
- Host sweep is the single source of truth.
- Container exposes a new `container_state` single-row table in outbound.db
(schema added; container writes, host reads). PreToolUse hook writes
current_tool + tool_declared_timeout_ms (read from Bash's tool_input);
PostToolUse / PostToolUseFailure clear it.
- Sweep decides with a pure helper `decideStuckAction`:
* absolute ceiling — kill if heartbeat age > max(30min, bash_timeout)
* per-claim stuck — kill if any processing_ack row has claim_age >
max(60s, bash_timeout) AND heartbeat hasn't been touched since claim
* otherwise ok
Kill paths reset leftover processing rows with exponential backoff,
reusing the existing retry machinery.
Tool blocklist expanded:
- AskUserQuestion (SDK placeholder; we have mcp__nanoclaw__ask_user_question)
- EnterPlanMode, ExitPlanMode, EnterWorktree, ExitWorktree (Claude Code UI
affordances; would hang in headless containers)
PreToolUse hook is also defense-in-depth: if a disallowed tool name slips
through, it returns `{ decision: 'block' }` so the agent sees a clear
error instead of appearing stuck.
Removed:
- container-runner.ts: IDLE_TIMEOUT setTimeout, resetIdle callback on
activeContainers entry, resetContainerIdleTimer export.
- delivery.ts: the resetContainerIdleTimer call on successful delivery.
- poll-loop.ts: IDLE_END_MS + its setInterval. Keeping the query open is
cheaper than close+reopen (no cold prompt cache). Liveness is now a
host-side concern.
- host-sweep.ts: 10-min STALE_THRESHOLD_MS + getStuckProcessingIds in the
stale-detection path (still exported for kill reset).
Tests:
- src/host-sweep.test.ts — 9 tests for decideStuckAction covering: fresh
heartbeat, absolute ceiling, absent heartbeat, Bash-timeout extension
(both ceiling and per-claim), claim age below tolerance, heartbeat
touched after claim, unparseable timestamps.
Ref: docs/v1-vs-v2/ACTION-ITEMS.md items 9, 6a, 10.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The agent needs to perceive times in the user's timezone, not UTC.
Dropping this in the v1→v2 port produced a class of bugs where the agent
would schedule tasks for the wrong hour, suggest dinner at midnight, etc.
This restores v1 parity.
Container side:
- New container/agent-runner/src/timezone.ts mirrors src/timezone.ts with
isValidTimezone / resolveTimezone / formatLocalTime, plus:
* TIMEZONE constant resolved at load from process.env.TZ (host sets this
from src/container-runner.ts:254)
* parseZonedToUtc(input, tz) — treats a naive ISO as wall-clock time in
`tz`, returns the corresponding UTC Date. Strings with Z or offset
are passed through.
- formatter.ts:
* formatMessages() now prepends <context timezone="IANA"/>\n — matches
v1 src/v1/router.ts:20-22
* formatSingleChat uses formatLocalTime(ts, TIMEZONE) instead of a
home-rolled HH:MM 24h formatter → outputs like "Jun 15, 2026, 8:00 AM"
* reply_to="<id>" attribute + <quoted_message from="X">Y</quoted_message>
element — matches v1 format exactly; old <reply-to/> shape is gone
* stripInternalTags() exported for the dispatch path to reuse
- poll-loop.ts uses the exported stripInternalTags() instead of inline regex.
- mcp-tools/scheduling.ts:
* schedule_task/update_task descriptions now explicitly document that
processAfter accepts either UTC or naive local time (interpreted in
the user's TZ from the context header)
* handlers normalize through parseZonedToUtc() and store a UTC ISO
Host side:
- src/modules/scheduling/recurrence.ts passes { tz: TIMEZONE } to
CronExpressionParser.parse. Without this, "0 9 * * *" fires at 09:00
UTC instead of 09:00 user-local — this was the v1 behavior
(src/v1/task-scheduler.ts:20-49).
Tests:
- container/agent-runner/src/timezone.test.ts — mirror of src/timezone.test.ts
+ new parseZonedToUtc cases
- container/agent-runner/src/formatter.test.ts — context header, reply_to,
quoted_message, XML escaping, stripInternalTags (ported from v1
formatting.test.ts)
- src/modules/scheduling/recurrence.test.ts — cron TZ respected, completed
rows only cloned when recurrence is set
Ref: docs/v1-vs-v2/ACTION-ITEMS.md item 18 + timezone-formatting-v1-recreation.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Outbox extraction (delivery.ts → session-manager.ts)
File I/O for outbound attachments now lives in session-manager.ts alongside
the symmetric inbound extractAttachmentFiles. delivery.ts no longer touches
the filesystem — it hands buffers to the adapter and calls clearOutbox on
success.
- New `readOutboxFiles(agentGroupId, sessionId, messageId, filenames)` and
`clearOutbox(agentGroupId, sessionId, messageId)` in session-manager.ts.
- deliverMessage in delivery.ts loses ~35 lines of fs/path code and its
`fs`/`path` imports.
## Dead-code sweep
TypeScript's --noUnusedLocals surfaced several cruft imports. Fixed:
- src/container-runner.ts: drop unused `markContainerIdle` import; drop
unused `session` parameter from `buildContainerArgs` signature.
- src/delivery.ts: drop unused `getSession`, `writeSessionMessage`,
`wakeContainer` imports.
- src/host-sweep.ts: drop unused `updateSession`, `outboundDbPath` imports.
- container/agent-runner/src/poll-loop.ts: drop unused `config`,
`processingIds` params from `processQuery`.
- Test files: drop unused imports in channel-registry.test, db-v2.test,
host-core.test.
Skipped: `conversations` state in chat-sdk-bridge.ts (never read but
tangled with public `updateConversations` method; cleaning it risks a
merge conflict with the channels branch at the next sync).
## Validation
- `pnpm run build` clean
- `pnpm test` — 137 host tests pass
- `bun test` in container/agent-runner — 17 tests pass
- Service boots (`NanoClaw running`, `OneCLI approval handler started`)
and shuts down cleanly on SIGTERM
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the scheduling surface — 5 delivery actions (schedule_task,
cancel_task, pause_task, resume_task, update_task), handleRecurrence,
applyPreTaskScripts, and task DB helpers — out of core and into
src/modules/scheduling/ (host) and container/agent-runner/src/scheduling/
(container).
First PR to fill the MODULE-HOOK markers introduced in PR #2:
- src/host-sweep.ts MODULE-HOOK:scheduling-recurrence now dynamically
imports handleRecurrence from the module each sweep tick.
- container/agent-runner/src/poll-loop.ts MODULE-HOOK:scheduling-pre-task
dynamically imports applyPreTaskScripts before the provider call.
When the marker block is empty (scheduling uninstalled), `keep`
falls back to `normalMessages` so non-task messages still flow.
The 5 task cases are removed from delivery.ts's handleSystemAction
switch — the registry now routes them. Task DB helpers moved out of
src/db/session-db.ts (which kept `nextEvenSeq` as a named export so
the module can uphold the host-writes-even-seq invariant). Test suite
split to match: scheduling-specific tests live in the module.
No migration — tasks are messages_in rows with kind='task'.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Additive change — existing code paths still run via inline fallbacks.
Prepares core for per-module extractions in PR #3 onward.
Four registries added with empty defaults:
- delivery action handlers (delivery.ts)
- router inbound gate (router.ts)
- response dispatcher (index.ts)
- MCP tool self-registration (container/agent-runner/src/mcp-tools/server.ts)
Default modules moved to src/modules/ for signaling:
- src/modules/typing/ (extracted from delivery.ts)
- src/modules/mount-security/ (moved from src/mount-security.ts)
Both are imported directly by core — no hook, no registry. Removal
requires editing core imports.
Migrator now keys applied rows by name (uniqueness) so module
migrations can pick arbitrary version numbers. Stored version column
is auto-assigned as an applied-order sequence.
sqlite_master guards added around core calls into module-owned tables
(user_roles, agent_destinations, pending_questions). No-ops today;
load-bearing after the owning modules are extracted.
MODULE-HOOK markers placed at scheduling's two skill-edit sites
(host-sweep.ts recurrence call, poll-loop.ts pre-task gate). PR #4
replaces the marked blocks when scheduling moves to its module.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scheduled tasks can now carry a bash script that runs inside the container
before the agent is invoked. The script prints `{wakeAgent, data?}` on its
last stdout line; if `wakeAgent: false` (or the script errors) the task
row is marked completed and the agent is never queried, saving API calls
on no-op checks. On wake, the script's `data` is injected into the task
prompt. Semantics mirror V1: 30s bash timeout, 1MB buffer, last-line JSON,
error == skip.
Also blocks the Claude SDK's built-in scheduling tools (CronCreate,
CronDelete, CronList, ScheduleWakeup) via `disallowedTools` so tasks
actually flow through `mcp__nanoclaw__schedule_task` and get the script
gate. CLAUDE.md gains a soft pointer explaining why `schedule_task` is
the right path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- chat-sdk-bridge: forward thread.id to the router for DMs so sub-thread
context survives into delivery. Previously hardcoded to null, which
collapsed every reply to the DM top level.
- router: when a DM (is_group=0) is wired as `shared`, don't auto-escalate
to per-thread — keep one session for the whole DM and let thread_id
flow through to the adapter.
- agent-runner poll-loop: defer follow-up messages whose thread_id
differs from the active turn's routing. Mixing threads into one
streaming turn sent every reply to the first thread because routing
is captured at turn start.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the agent-group-centric "main group" concept with user-level
privileges and adds the cold-DM infrastructure needed for proactive
outbound messaging (pairing, approvals, welcome flows).
Privilege model
- New tables: users, user_roles (owner global-only; admin global or
scoped to an agent_group), agent_group_members (explicit non-
privileged access; admin/owner imply membership), user_dms (cold-DM
resolution cache).
- Removed agent_groups.is_admin, messaging_groups.admin_user_id. Replaced
with messaging_groups.unknown_sender_policy (strict | request_approval
| public) for per-chat unknown-sender gating.
- src/access.ts: canAccessAgentGroup, pickApprover, pickApprovalDelivery.
- src/router.ts: access gate on every inbound, honoring
unknown_sender_policy for unknown senders.
- src/channels/telegram.ts: pairing interceptor upserts the paired user
and promotes them to owner if hasAnyOwner() is false (first-pair-wins).
Cold DM infrastructure
- ChannelAdapter.openDM?(handle) — optional method. Chat-SDK-bridge wires
it to chat.openDM() for resolution-required channels (Discord, Slack,
Teams, Webex, gChat); direct-addressable channels (Telegram, WhatsApp,
iMessage, Matrix, Resend) fall through to the handle directly.
- src/user-dm.ts: ensureUserDm(userId) — resolves + caches via user_dms.
Approval routing
- onecli-approvals + delivery use pickApprover + pickApprovalDelivery:
scoped admins → global admins → owners (dedup), first reachable via
ensureUserDm, same-channel-kind tie-break. Approvals land in the
approver's DM, not the origin chat.
Delivery fixes
- delivery.ts ACL rejection now throws instead of returning undefined —
the outer loop previously marked rejected messages as delivered.
- Implicit-origin allow: session.messaging_group_id === target skips the
destination check.
- createMessagingGroupAgent auto-creates the companion agent_destinations
row (normalized local_name from the messaging group's name, collision-
broken within the agent's namespace).
Container
- container-runner.ts: /workspace/global always read-only; drops
NANOCLAW_IS_ADMIN; adds NANOCLAW_ADMIN_USER_IDS (owners + global admins
+ scoped admins for this agent group). Agent-runner poll-loop gates
slash commands against that set.
New skill: /init-first-agent
- Walks the operator through standing up the first agent for a channel:
channel pick → identity lookup (reads each channel SKILL.md's
## Channel Info > how-to-find-id) → DM platform_id resolution (direct-
addressable, cold-DM via "user DMs bot first + sqlite lookup", or
Telegram pair-code fallback) → run scripts/init-first-agent.ts →
verify via tail of nanoclaw.log.
- scripts/init-first-agent.ts: parameterized helper that upserts the
user + grants owner (if none), creates dm-with-<display-name> agent
group + initGroupFilesystem, reuses/creates the DM messaging_group,
wires it (auto-creates destination), resolves the session, and writes
a kind:'chat' / sender:'system' welcome message into inbound.db. Host
sweep wakes the container and the agent DMs the operator via the
normal delivery path.
/manage-channels rewrite
- Drops --is-main / --jid / main-vs-non-main isolation references.
- First-channel flow delegates to /init-first-agent.
- Explains createMessagingGroupAgent auto-creates destinations.
- Adds a privileged-users show section.
setup/
- register.ts: drop --is-main, --jid, --local-name, --trigger
requiresTrigger defaults; call initGroupFilesystem; normalize to
v2 schema (no is_admin, no admin_user_id, sets unknown_sender_policy
'strict'); let createMessagingGroupAgent handle the destination row.
- pair-telegram.ts: emit PAIRED_USER_ID (namespaced "telegram:<id>")
instead of ADMIN_USER_ID; update header comment.
- register.test.ts deleted — was v1-only, tested a registered_groups
table that no longer exists.
Docs
- v2-architecture-diagram.{md,html}: ER diagram updated to drop
is_admin/admin_user_id, add unknown_sender_policy, and include
users/user_roles/agent_group_members/user_dms.
- v2-architecture-draft.md: approval-routing paragraph rewritten for
pickApprover/pickApprovalDelivery/ensureUserDm; SQL schema block
updated; admin-verification paragraph references
NANOCLAW_ADMIN_USER_IDS.
- v2-setup-wiring.md: entity-model sketch rewritten.
- v2-checklist.md: marked privilege refactor / container filtering /
approval routing / unknown-sender gating done; removed obsolete
admin_user_id and main-vs-non-main items.
Scripts
- scripts/init-first-agent.ts (new) replaces scripts/welcome-owner-dm.ts
(removed; welcome-owner was a Discord-specific one-off).
- test-v2-host.ts, test-v2-channel-e2e.ts, seed-discord.ts: drop
is_admin + admin_user_id, use unknown_sender_policy.
Tests
- src/access.test.ts (new): 14 tests for canAccessAgentGroup, role
helpers, pickApprover, ensureUserDm, pickApprovalDelivery.
- src/db/db-v2.test.ts: adds 3 tests for the auto-created
agent_destinations row (normalized name, no duplicates, collision
break within an agent group).
- host-core.test.ts, channel-registry.test.ts: updated fixtures to
use unknown_sender_policy: 'public' where the test exercises routing
rather than the access gate.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reshape AgentProvider so provider-specific assumptions stop leaking into
the generic layer. No change to what reaches sdkQuery() — same values,
different plumbing.
- QueryInput: opaque `continuation` replaces `sessionId` + `resumeAt`;
`systemContext.instructions` replaces ambiguous `systemPrompt`;
`mcpServers`, `env`, `additionalDirectories` move to `ProviderOptions`
at construction time.
- AgentProvider gains `isSessionInvalid(err)` and
`supportsNativeSlashCommands` so the poll-loop stops regex-matching
Claude error strings and gates passthrough slash commands per provider.
- ClaudeProvider owns `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and the
stale-session regex internally.
- ProviderEvent.activity kept and documented as the liveness signal
(fires on every SDK message so the idle timer stays honest during
long tool runs); init carries `continuation` instead of `sessionId`.
- poll-loop drops mcpServers/env/systemPrompt from its config; admin
user id now passed explicitly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an agent has one configured destination (e.g. Discord) but
receives a message from a different channel (e.g. Slack), the
single-destination shortcut was routing replies to the destination
instead of the originating channel. Now uses the inbound message's
routing context (channel_type, platform_id) when available, falling
back to the destination table only when routing context is absent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three features built on top of @onecli-sh/sdk 0.3.1, landed together because
they share wiring surfaces (session DB schema, delivery dispatcher, Chat SDK
bridge, channel adapter contract).
## OneCLI manual-approval handler
* `src/onecli-approvals.ts` — long-polls OneCLI via the SDK's
`configureManualApproval`; on each request, delivers an `ask_question` card
to the admin agent group's first messaging group, persists a
`pending_approvals` row, and waits on an in-memory Promise resolved by the
admin's button click or an expiry timer. Expired cards are edited to
"Expired (...)" and a startup sweep flushes any rows left over from a
previous process.
* Short 11-byte approval id (`oa-<8 base36>`) instead of the SDK's UUID so the
Telegram 64-byte `callback_data` limit is respected; the OneCLI UUID stays
in the persisted payload for audit.
* Migration 003 consolidated: `pending_approvals` now has the OneCLI-aware
columns from the start (`agent_group_id`, `channel_type`, `platform_id`,
`platform_message_id`, `expires_at`, `status`), `session_id` relaxed to
nullable so cross-session approvals fit.
* `handleQuestionResponse` in `src/index.ts` now routes OneCLI approvals
through `resolveOneCLIApproval` before falling back to the
session-bound approval path.
## Credential collection from chat
New `trigger_credential_collection` MCP tool — the agent researches a
third-party API, calls the tool with `{name, hostPattern, headerName,
valueFormat, description}`, and blocks until the host reports saved, rejected,
or failed. The credential value never enters the agent's context: the user
submits it into a Chat SDK Modal on the host side, the host writes it to
OneCLI via a thin facade (`src/onecli-secrets.ts` — shells out to
`onecli secrets create`, shape mirrors the SDK we expect upstream), and only
the status string flows back to the container via a system message.
* `src/credentials.ts` — host-side handler: delivers the card to the
conversation's own channel (not the admin channel — credential collection
is a user-facing flow, distinct from admin approval), persists a
`pending_credentials` row, drives the submit → `createSecret` → notify
pipeline. Falls back gracefully when the channel doesn't support modals.
* `src/db/credentials.ts` + migration 005: `pending_credentials` table.
* `src/channels/chat-sdk-bridge.ts`: renders a `credential_request` card,
handles the `nccr:` action prefix by opening a Modal with a TextInput,
registers an `onModalSubmit` handler for the `nccm:` callback prefix.
* `container/agent-runner/src/mcp-tools/credentials.ts`: the blocking MCP
tool, mirroring the `ask_user_question` polling pattern.
* `container/agent-runner/src/db/messages-in.ts`: `findCredentialResponse`
helper to pick up the system message the host writes back.
## Threaded adapter routing
The destination layer previously didn't carry thread context, so agent replies
to Discord always landed in the root channel regardless of which thread the
inbound came from.
* `ChannelAdapter.supportsThreads: boolean` — declared by every channel skill
at `createChatSdkBridge`. Threaded: Discord, Slack, Teams, Google Chat,
Linear, GitHub, Webex. Non-threaded: Telegram, WhatsApp Cloud, Matrix,
Resend, iMessage.
* `src/router.ts`: non-threaded adapters strip `threadId` at ingest (threads
collapse to channel-level sessions). Threaded adapters override the
wiring's `session_mode` to `'per-thread'` so each thread = a session
(except `agent-shared`, which is preserved as a cross-channel intent the
adapter can't know about).
* `session_routing` table in `inbound.db` — single-row default reply routing
written by the host on every container wake from
`session.messaging_group_id` + `session.thread_id`. Forward-compat
`CREATE TABLE IF NOT EXISTS` handles older session DBs lazily.
* `container/agent-runner/src/db/session-routing.ts` — container-side reader.
* `send_message` / `send_file` / `ask_user_question` / `send_card` /
scheduling tools all default their routing (channel, platform, **and**
thread) from the session when no explicit `to` is given. Explicit `to`
uses the destination's channel with `thread_id = null` (cross-destination
sends start a new conversation elsewhere).
* `poll-loop.ts::sendToDestination` (the final-text single-destination
shortcut) now inherits `thread_id` from `RoutingContext` too — this was
the root cause of Discord replies landing in the root channel even after
`send_message` was wired correctly.
## Related cleanups
* `src/container-runner.ts`: OneCLI agent identifier switched from the lossy
folder-derived string to `agent_group.id`, making `getAgentGroup(externalId)`
a trivial reverse lookup for per-agent scoping.
* `wakeContainer` race fix via an in-flight promise map — concurrent wakes
during the async buildContainerArgs / OneCLI `applyContainerConfig` window
no longer double-spawn containers against the same session directory.
* `src/db/db-v2.test.ts`: dropped the brittle `expect(row.v).toBe(N)` schema
version assertion — it had to be bumped on every migration addition.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The v2 poll loop held the session ID in a local variable, so every
container restart started a fresh SDK session even though the .jsonl
transcript was still sitting in the shared .claude mount. Store it in
outbound.db (container-owned, already per channel/thread), seed the
loop on startup, clear on /clear, and recover from stale-session
errors the same way v1 did.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an agent has exactly one configured destination, wrapping output in
<message to="..."> blocks is unnecessary. Plain text goes to the sole
destination automatically. This preserves the simple "just reply" flow
for the common case of one user on one channel.
Applies in three places:
- System prompt addendum: single-destination case gets a simplified
explanation ("your messages are delivered to X, just write directly").
Multi-destination case keeps the <message to="..."> syntax docs.
- Main output parser: if zero <message> blocks are found and there is
exactly one destination, the entire cleaned text (with <internal>
stripped) is sent to that destination.
- send_message / send_file MCP tools: `to` parameter is now optional.
With one destination, omitted defaults to it. With multiple, omitting
returns an error listing the options.
Multi-destination behavior is unchanged — explicit <message to="..."> is
still required, and untagged text is still scratchpad.
groups/global/CLAUDE.md updated to describe both cases.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces implicit routing context (NANOCLAW_PLATFORM_ID env vars) with
per-agent named destination maps. Agents reference channels and peer
agents by local names; the host re-validates every outbound route against
a new agent_destinations table that is both the routing map and the ACL.
Model changes:
- New migration 004 adds agent_destinations (agent_group_id, local_name,
target_type, target_id). Backfills from existing messaging_group_agents.
- Host writes /workspace/.nanoclaw-destinations.json before every container
wake so admin changes take effect on next start.
- Container loads map at startup, appends system-prompt addendum listing
available destinations and the <message to="name">…</message> syntax.
- Agent main output is parsed for <message to="..."> blocks; each block
becomes a messages_out row with routing resolved via the local map.
Untagged text and <internal>…</internal> are scratchpad (logged only).
- send_message MCP tool now takes `to` (destination name) instead of raw
routing fields. send_to_agent deleted (redundant — agents are just
destinations). send_file/edit_message/add_reaction route via map too.
- Inbound formatter adds from="name" attribute via reverse-lookup so the
agent sees a consistent namespace in both directions.
Permission enforcement:
- Host checks hasDestination() before every channel delivery AND every
agent-to-agent route. Unauthorized messages dropped and logged.
- routeAgentMessage simplified: ~15 lines, no JSON parse, content copied
verbatim (target formatter resolves the sender via its own local map).
- create_agent is admin-only, checked at both the container (tool not
registered for non-admins) and the host (re-check on receive). Inserts
bidirectional destination rows so parent↔child comms work immediately.
Includes path-traversal guard on folder name.
Self-modification cleanup:
- add_mcp_server now requires admin approval (previously had none).
- install_packages validates package names on BOTH sides (container tool
+ host receiver) with strict regex. Max 20 packages per request.
- All three self-mod tools are fire-and-forget: write request, return
immediately with "submitted" message. Admin approval triggers a chat
notification to the requesting agent — no tool-call polling, no 5-min
holds. On rebuild/mcp_server approval, the container is killed so the
next wake picks up new config/image.
- Approval delivery extracted into requestApproval() helper (the one
place where three call sites were literally identical).
Also folded in the phase-1 dynamic import cleanup (create_agent no longer
does `await import('./db/agent-groups.js')`) and removes NANOCLAW_PLATFORM_ID
/ CHANNEL_TYPE / THREAD_ID env-var routing entirely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eliminates SQLite write contention across the host-container mount
boundary by splitting the single session.db into two files, each with
exactly one writer:
inbound.db — host writes (messages_in, delivered tracking)
outbound.db — container writes (messages_out, processing_ack)
Key changes:
- Host uses even seq numbers, container uses odd (collision-free)
- Container heartbeat via file touch instead of DB UPDATE
- Scheduling MCP tools now emit system actions via messages_out
(host applies them to inbound.db during delivery)
- Host sweep reads processing_ack + heartbeat file for stale detection
- OneCLI ensureAgent() call added (was missing from v2, caused
applyContainerConfig to reject unknown agent identifiers)
Verified: tsc clean, 327 tests pass, real e2e through Docker works.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace in-memory Chat SDK state with SqliteStateAdapter — thread
subscriptions now persist across restarts
- Add migration 002 for chat_sdk_kv, subscriptions, locks, lists tables
- Handle /clear in agent-runner (reset sessionId) — SDK has
supportsNonInteractive:false for this command
- Pass /compact, /context, /cost, /files through to SDK as admin commands
- Skip admin commands in follow-up poll so they start fresh queries
- Emit compact_boundary events as user-visible feedback messages
- Pass NANOCLAW_ADMIN_USER_ID and NANOCLAW_ASSISTANT_NAME to containers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
End-to-end ask_user_question flow:
- Agent MCP tool writes question card to messages_out
- Host delivery creates pending_questions row, delivers as Discord Card with buttons
- Local webhook server receives Gateway INTERACTION_CREATE events
- Acknowledges interaction + updates card to show selected answer
- Routes response back to session DB as system message
- MCP tool poll picks up response and returns to agent
Key fixes:
- Poll loop now skips system messages (reserved for MCP tool responses)
- Gateway listener uses webhookUrl forwarding mode for interaction support
- Button custom_id encodes questionId + option text for self-contained routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Host sweep: fix DELETE journal mode, busy_timeout, seq in recurrence INSERT
- Outbound files: delivery reads from outbox dir, passes buffers to adapter,
cleans up after delivery. Chat SDK bridge sends files via postMessage.
- Inbound attachments: formatter includes attachment info in prompts
- Commands: categorize /commands as admin, filtered, or passthrough.
Admin commands check sender against NANOCLAW_ADMIN_USER_ID.
Filtered commands silently dropped. Passthrough sent raw to agent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use DELETE journal mode for session DBs instead of WAL. WAL doesn't
sync reliably across Docker volume mounts (VirtioFS), causing dropped
writes and duplicate deliveries.
- Add 20s idle detection to end the query stream. The concurrent poll
tracks SDK activity via a new 'activity' provider event. When no SDK
events arrive for 20s and no messages are pending, the stream ends
and the poll loop continues.
- Add touchProcessing heartbeat so the host can distinguish active
agents from idle ones by checking status_changed recency.
- Catch query errors in the poll loop and write error responses to
messages_out instead of crashing the process.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AgentProvider abstraction with Claude and Mock implementations.
Poll loop reads messages_in, formats by kind, queries provider,
writes results to messages_out. Concurrent polling pushes follow-up
messages into active queries.
- providers/types.ts: AgentProvider, AgentQuery, ProviderEvent
- providers/claude.ts: wraps Agent SDK with MessageStream, hooks,
transcript archiving
- providers/mock.ts: canned responses with push() support
- providers/factory.ts: createProvider()
- formatter.ts: format by kind (chat/task/webhook/system), XML
escaping, routing extraction
- poll-loop.ts: poll → format → query → write, concurrent polling
- mcp-tools.ts: MCP server with send_message tool
- index-v2.ts: new entry point (config from env, enters poll loop)
- 11 new tests, all 288 tests pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>