The per-session destination map was being written as a sidecar JSON file
(/workspace/.nanoclaw-destinations.json) — inconsistent with the rest of
v2, where all host↔container IO goes through inbound.db / outbound.db.
Move it into a `destinations` table in INBOUND_SCHEMA. The host writes
it before every container wake AND on demand (e.g. after create_agent)
so the creator sees the new child destination mid-session without a
restart. The container queries the table live on every lookup — no
cache, no staleness window.
- src/db/schema.ts: add `destinations` table to INBOUND_SCHEMA.
- src/session-manager.ts: writeDestinationsFile → writeDestinations,
writes via DELETE + INSERT inside a transaction.
- src/delivery.ts: create_agent handler calls writeDestinations on the
creator's session after inserting the new destination rows.
- container/agent-runner/src/destinations.ts: queries inbound.db
directly in every findByName/getAllDestinations/findByRouting call.
No more cache. No setDestinationsForTest (obsolete). No fs import.
- container/agent-runner/src/index.ts and mcp-tools/index.ts: remove
loadDestinations() calls — no longer needed.
- Test helper initTestSessionDb creates the destinations table.
Integration test inserts a row directly instead of mocking the cache.
No backwards compatibility: sessions predating the schema update must
be recreated. This is fine on the v2 branch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces implicit routing context (NANOCLAW_PLATFORM_ID env vars) with
per-agent named destination maps. Agents reference channels and peer
agents by local names; the host re-validates every outbound route against
a new agent_destinations table that is both the routing map and the ACL.
Model changes:
- New migration 004 adds agent_destinations (agent_group_id, local_name,
target_type, target_id). Backfills from existing messaging_group_agents.
- Host writes /workspace/.nanoclaw-destinations.json before every container
wake so admin changes take effect on next start.
- Container loads map at startup, appends system-prompt addendum listing
available destinations and the <message to="name">…</message> syntax.
- Agent main output is parsed for <message to="..."> blocks; each block
becomes a messages_out row with routing resolved via the local map.
Untagged text and <internal>…</internal> are scratchpad (logged only).
- send_message MCP tool now takes `to` (destination name) instead of raw
routing fields. send_to_agent deleted (redundant — agents are just
destinations). send_file/edit_message/add_reaction route via map too.
- Inbound formatter adds from="name" attribute via reverse-lookup so the
agent sees a consistent namespace in both directions.
Permission enforcement:
- Host checks hasDestination() before every channel delivery AND every
agent-to-agent route. Unauthorized messages dropped and logged.
- routeAgentMessage simplified: ~15 lines, no JSON parse, content copied
verbatim (target formatter resolves the sender via its own local map).
- create_agent is admin-only, checked at both the container (tool not
registered for non-admins) and the host (re-check on receive). Inserts
bidirectional destination rows so parent↔child comms work immediately.
Includes path-traversal guard on folder name.
Self-modification cleanup:
- add_mcp_server now requires admin approval (previously had none).
- install_packages validates package names on BOTH sides (container tool
+ host receiver) with strict regex. Max 20 packages per request.
- All three self-mod tools are fire-and-forget: write request, return
immediately with "submitted" message. Admin approval triggers a chat
notification to the requesting agent — no tool-call polling, no 5-min
holds. On rebuild/mcp_server approval, the container is killed so the
next wake picks up new config/image.
- Approval delivery extracted into requestApproval() helper (the one
place where three call sites were literally identical).
Also folded in the phase-1 dynamic import cleanup (create_agent no longer
does `await import('./db/agent-groups.js')`) and removes NANOCLAW_PLATFORM_ID
/ CHANNEL_TYPE / THREAD_ID env-var routing entirely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Channel adapters prefix platform IDs with their channel type
(e.g. "telegram:123"). Normalize in register.ts so the DB always
stores the canonical format. Removes fallback lookup from router.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move prefix handling from register.ts to router.ts. Users register with
raw platform IDs (what they naturally have), adapters send prefixed IDs
(their internal format). Router now tries stripping the channel type
prefix when the exact lookup fails, matching either format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use platformId directly as thread ID in deliver() and setTyping()
instead of calling encodeThreadId with Discord-shaped args — platformId
is already in the adapter's encoded format (e.g. "telegram:6037840640")
- Add triggerTyping() in delivery.ts, call from router on message route
- Enable Telegram channel in barrel
- Verified E2E: Telegram message in → agent → typing indicator → response
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add three-level isolation model (shared session, same agent, separate agent)
with agent-shared session mode for cross-channel shared sessions
- Create /manage-channels skill for wiring channels to agent groups
- Refactor all 12 v2 channel skills: lean SKILL.md + VERIFY.md + REMOVE.md
with structured Channel Info section for platform-specific metadata
- Create /add-discord-v2 skill (was missing)
- Add step 5a to setup SKILL.md invoking /manage-channels after channel install
- Update setup/verify.ts to check all 12 channel token types
- Add docs/v2-isolation-model.md explaining the isolation model
- Update v2-checklist.md and v2-setup-wiring.md to reflect completed work
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Discord was directly imported in src/index.ts before the barrel wiring.
Moving to the barrel without uncommenting it broke Discord.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Import channel barrel from src/index.ts so channel skills that
uncomment lines in src/channels/index.ts actually execute
- Rewrite setup/register.ts to create v2 entities (agent_groups,
messaging_groups, messaging_group_agents) in data/v2.db instead
of v1's store/messages.db
- Fix setup/verify.ts to check v2 central DB for registered groups
- Add prominent "MESSAGE DROPPED" warnings in router when no agent
groups are wired, with actionable guidance
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eliminates SQLite write contention across the host-container mount
boundary by splitting the single session.db into two files, each with
exactly one writer:
inbound.db — host writes (messages_in, delivered tracking)
outbound.db — container writes (messages_out, processing_ack)
Key changes:
- Host uses even seq numbers, container uses odd (collision-free)
- Container heartbeat via file touch instead of DB UPDATE
- Scheduling MCP tools now emit system actions via messages_out
(host applies them to inbound.db during delivery)
- Host sweep reads processing_ack + heartbeat file for stale detection
- OneCLI ensureAgent() call added (was missing from v2, caused
applyContainerConfig to reject unknown agent identifiers)
Verified: tsc clean, 327 tests pass, real e2e through Docker works.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace `as never` cast with proper polyfill for channelIdFromThreadId.
Narrow GatewayAdapter cast to only the gateway code path in bridge.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move all v1 files (index, router, container-runner, db, ipc, types,
logger, channels/registry, and all utilities) to src/v1/ as a
fully self-contained archive with no shared dependencies
- Rename v2 files to remove -v2 suffix (index-v2.ts → index.ts, etc.)
- Update all imports across v2 source, tests, and setup files
- Migrate shared utilities (config, env, container-runtime, mount-security,
timezone, group-folder) from pino logger to v2 log module
- Migrate setup/ files from logger to log with argument order swap
- Container agent-runner: move v1 entry to v1/, rename v2 to index.ts
- Update setup skill to offer all 13 v2 channels
- Install all Chat SDK adapter packages
- dist/index.js now runs v2; dist/v1/index.js runs v1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace in-memory Chat SDK state with SqliteStateAdapter — thread
subscriptions now persist across restarts
- Add migration 002 for chat_sdk_kv, subscriptions, locks, lists tables
- Handle /clear in agent-runner (reset sessionId) — SDK has
supportsNonInteractive:false for this command
- Pass /compact, /context, /cost, /files through to SDK as admin commands
- Skip admin commands in follow-up poll so they start fresh queries
- Emit compact_boundary events as user-visible feedback messages
- Pass NANOCLAW_ADMIN_USER_ID and NANOCLAW_ASSISTANT_NAME to containers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
End-to-end ask_user_question flow:
- Agent MCP tool writes question card to messages_out
- Host delivery creates pending_questions row, delivers as Discord Card with buttons
- Local webhook server receives Gateway INTERACTION_CREATE events
- Acknowledges interaction + updates card to show selected answer
- Routes response back to session DB as system message
- MCP tool poll picks up response and returns to agent
Key fixes:
- Poll loop now skips system messages (reserved for MCP tool responses)
- Gateway listener uses webhookUrl forwarding mode for interaction support
- Button custom_id encodes questionId + option text for self-contained routing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Host sweep: fix DELETE journal mode, busy_timeout, seq in recurrence INSERT
- Outbound files: delivery reads from outbox dir, passes buffers to adapter,
cleans up after delivery. Chat SDK bridge sends files via postMessage.
- Inbound attachments: formatter includes attachment info in prompts
- Commands: categorize /commands as admin, filtered, or passthrough.
Admin commands check sender against NANOCLAW_ADMIN_USER_ID.
Filtered commands silently dropped. Passthrough sent raw to agent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use DELETE journal mode for session DBs instead of WAL. WAL doesn't
sync reliably across Docker volume mounts (VirtioFS), causing dropped
writes and duplicate deliveries.
- Add 20s idle detection to end the query stream. The concurrent poll
tracks SDK activity via a new 'activity' provider event. When no SDK
events arrive for 20s and no messages are pending, the stream ends
and the poll loop continues.
- Add touchProcessing heartbeat so the host can distinguish active
agents from idle ones by checking status_changed recency.
- Catch query errors in the poll loop and write error responses to
messages_out instead of crashing the process.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ChannelAdapter interface with setup/deliver/teardown/setTyping lifecycle.
Self-registration pattern via channel-registry. Host wiring in index-v2
bridges inbound messages to routeInbound and outbound delivery to adapters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Override entrypoint to compile and run index-v2.js (no stdin)
- Add better-sqlite3 + @types to agent-runner dependencies
- Exclude test files from agent-runner tsconfig (Docker build)
- Add real e2e test script (host → container → Claude → session DB)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Host orchestrator connecting channel events to session DBs and
delivering responses back through channel adapters.
- session-manager.ts: session folder/DB lifecycle, message writing
- container-runner-v2.ts: Docker spawn with session + agent group
mounts, OneCLI, idle timeout, agent-runner recompilation
- router-v2.ts: inbound routing (channel → messaging group → agent
group → session → messages_in → wake container)
- delivery.ts: two-tier polling (1s active, 60s sweep) for
messages_out, channel adapter delivery
- host-sweep.ts: stale detection with backoff, recurrence, wake
containers for due messages
- index-v2.ts: thin entry point wiring everything together
- scripts/test-v2-agent.ts: real Claude provider integration test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add the v2 data layer: typed interfaces, central DB with migration
runner, per-entity CRUD, and agent-runner session DB operations.
- src/log.ts: concise message-first logging API
- src/types-v2.ts: AgentGroup, MessagingGroup, Session, MessageIn/Out
- src/db/: connection (WAL), migration runner, 001-initial schema,
CRUD for agent_groups, messaging_groups, sessions, pending_questions
- container/agent-runner/src/db/: session DB connection, messages_in
reads + status transitions, messages_out writes
- 31 new tests, all 277 tests pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Main group had no mount for the global memory directory
(/workspace/global), so it could only reach it through the read-only
project root. This meant the main agent couldn't write to global
memory despite groups/main/CLAUDE.md instructing it to do so.
Add a writable mount at /workspace/global for the isMain branch,
matching the read-only mount that non-main groups already have.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Session files (JSONLs, debug logs, todos, telemetry, group logs) accumulate
unboundedly — especially from daily cron tasks. This adds a cleanup script
that prunes old artifacts while protecting active sessions (read from DB),
and wires it into the main process on a 24h interval.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Mount store/ separately as read-write so the main agent can access
the SQLite database directly.
- Add requiresTrigger parameter to the register_group MCP tool
(host IPC already supported it, but the tool never exposed it).
Defaults to false (no trigger).
- Update group registration instructions to ask user about trigger.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add generic reply context fields to NewMessage (reply_to_message_id,
reply_to_message_content, reply_to_sender_name) so any channel can
pass quoted message context to the agent.
- Add thread_id and reply_to_* fields to NewMessage interface
- Add DB migration for reply context columns on messages table
- Update storeMessage/getMessagesSince/getNewMessages to persist and
retrieve reply fields
- Render reply context as <quoted_message> XML in formatMessages
- Add DB and formatting tests
Co-Authored-By: Alfred-the-buttler <leon.alfred.bot@gmail.com>
Co-Authored-By: moktamd <moktamd@users.noreply.github.com>
Co-Authored-By: gurixs-carson <gurixs-carson@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The original regex didn't match the actual error ("No conversation
found with session ID: ..."). Added `no conversation found` pattern.
Removed the inline retry — clearing the session and returning 'error'
lets the existing group-queue.ts backoff loop retry with a fresh
session naturally. Simpler, no duplicate error paths.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Claude Code exits with code 1 during a session resume because the
session transcript file no longer exists (ENOENT on .jsonl), clear the
stale session from SQLite and retry once with a fresh session.
Detection is targeted: only triggers on ENOENT referencing a .jsonl
file or explicit "session not found" errors. Transient failures
(network, API) fall through to the normal backoff retry path.
Also removes unrelated ollama files that were mixed in during rebase.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Claude Code exits with code 1 during a session resume, the group's
session ID is now cleared from the database and the query is retried with
a fresh session. This prevents the infinite retry loop that occurred when
a stale/corrupt session ID was stored in SQLite.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix container-runner bug: stopContainer() returns void but was
passed to exec() as a command string. Replace with direct call
and try/catch.
- Mock container-runtime in tests so they don't need Docker running.
- Increase claw-skill test timeout to handle slower python startup.
- Clear .env.example (telegram token was added by mistake).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When lastAgentTimestamp was missing (new group, corrupted state, or
startup recovery), the empty-string fallback caused getMessagesSince to
return up to 200 messages — the entire group history. This sent a
massive prompt to the container agent instead of just recent messages.
Fix: recover the cursor from the last bot reply timestamp in the DB
(proof of what we already processed), and cap all prompt queries to a
configurable MAX_MESSAGES_PER_PROMPT (default 10). Covers all three
call sites: processGroupMessages, the piping path, and
recoverPendingMessages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>