nanoclaw

Author	SHA1	Message	Date
gavrielc	6d6584d120	fix(test-infra): openInboundDb honors in-memory test DB openInboundDb() always opened /workspace/inbound.db which doesn't exist in CI. In test mode, return a thin wrapper over the in-memory singleton that delegates prepare/exec but no-ops close(), so callers' try/finally cleanup doesn't destroy the shared DB mid-test. One flag (_testMode), no monkey-patching, no saved-close bookkeeping. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-05 16:02:10 +03:00
Mike Nolet	ceb0b9cf5f	fix(test-infra): openInboundDb honors in-memory test DB initTestSessionDb() creates an in-memory inbound singleton, but openInboundDb() always opened the hardcoded /workspace/inbound.db path. Every test that exercised getPendingMessages — directly, or via test fixtures that load data through it (e.g. poll-loop.test.ts:29 loads formatter test rows via getPendingMessages) — failed with SQLITE_CANTOPEN under `bun test` outside a real container. Baseline on main: 34 pass, 25 fail across 6 files. After this fix: 59 pass, 0 fail. In test mode, openInboundDb returns the in-memory singleton. The singleton's .close() is no-op'd in initTestSessionDb so caller try/finally cleanup doesn't tear down the shared DB; closeSessionDb invokes the saved original close to do the real teardown. Production behavior is unchanged — _inboundIsTest only flips inside initTestSessionDb, which is never called outside the test runner. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 08:45:23 +02:00
Claw	ccfdf2dd75	fix(agent-runner): open inbound.db fresh per messages_in read Cached singleton can return stale rows on virtiofs/NFS mounts, causing follow-up messages to silently never be polled. Add openInboundDb() with mmap_size=0 and switch the three messages_in readers to it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 15:14:04 -04:00
exe.dev user	8a12fa61ac	refactor: shared source — replace per-group agent-runner copies with single RO mount Replace the per-group agent-runner-src copy model with a single shared read-only mount. Source and skills are now RO + shared; personality, config, working files, and Claude state stay RW + per-group. Key changes: - Mount container/agent-runner/src/ RO at /app/src (all groups share one copy) - Mount container/skills/ RO at /app/skills; per-group skill selection via symlinks in .claude-shared/skills/ based on container.json "skills" field - Mount container.json as nested RO bind on top of RW group dir - Move all NANOCLAW_* env vars to container.json (runner reads at startup) - New runner config.ts module replaces process.env reads - Move command gate (filtered/admin) from container to host router - Dockerfile: remove source COPY, split CLI installs (claude-code last), move agent-runner deps above CLIs for better layer caching - Add writeOutboundDirect for router denial responses - Design doc at docs/shared-src.md Not included (follow-up): DB migration to drop agent_provider columns, cleanup of orphaned agent-runner-src directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-22 12:58:43 +03:00
gavrielc	31f2da9585	fix(container): gate poll loop on trigger=1 to honor accumulate contract A warm container picks up every pending messages_in row on each poll tick and calls markProcessing → agent.query → markCompleted. Before this, that included trigger=0 rows (ignored_message_policy='accumulate' context), causing the agent to wake and potentially respond to messages the wiring had explicitly opted out of engaging on — defeating accumulate's "store as context, don't engage" contract. Gate the main poll loop with `messages.some(m => m.trigger === 1)` — mirrors host-side countDueMessages which is already gated. If the batch has no wake-eligible row, sleep and leave them pending. They ride along via the same getPendingMessages query the next time a real trigger=1 lands, which is the intended accumulate behavior. The concurrent active-turn poll (line ~290) is unchanged on purpose — once the agent has engaged, pushing in accumulate rows mid-turn as additional context is desired. initTestSessionDb was missing the trigger and series_id columns on messages_in, out of sync with the live migration. Added both so the new tests (and any future trigger-aware tests) can run. Four tests cover the data contract: trigger=0 rows are returned by getPendingMessages (so they ride along), the gate predicate correctly identifies accumulate-only batches, mixed batches pass the gate, and the schema default of 1 applies when the column is omitted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 11:23:47 +03:00
gavrielc	6a815190c0	feat(lifecycle): stuck detection + heartbeat lifecycle + SDK tool blocklist Replaces the two overlapping old mechanisms (30-min setTimeout kill in container-runner, 10-min heartbeat STALE_THRESHOLD reset in host-sweep) with message-scoped stuck detection anchored to the processing_ack claim age + an absolute 30-min ceiling that extends for long-declared Bash tools. Old model problems: - IDLE_TIMEOUT setTimeout fired on plain wall-clock time; slow-but-alive agents got killed at 30min regardless of activity - 10-min STALE_THRESHOLD in the sweep was unreliable — the heartbeat is only touched on SDK events, so legitimate silent tool work (sleep 30, long WebFetch, npm install) looked identical to a hung container - Two overlapping sources of truth for "when to let go of a container" New model: - Host sweep is the single source of truth. - Container exposes a new `container_state` single-row table in outbound.db (schema added; container writes, host reads). PreToolUse hook writes current_tool + tool_declared_timeout_ms (read from Bash's tool_input); PostToolUse / PostToolUseFailure clear it. - Sweep decides with a pure helper `decideStuckAction`: * absolute ceiling — kill if heartbeat age > max(30min, bash_timeout) * per-claim stuck — kill if any processing_ack row has claim_age > max(60s, bash_timeout) AND heartbeat hasn't been touched since claim * otherwise ok Kill paths reset leftover processing rows with exponential backoff, reusing the existing retry machinery. Tool blocklist expanded: - AskUserQuestion (SDK placeholder; we have mcp__nanoclaw__ask_user_question) - EnterPlanMode, ExitPlanMode, EnterWorktree, ExitWorktree (Claude Code UI affordances; would hang in headless containers) PreToolUse hook is also defense-in-depth: if a disallowed tool name slips through, it returns `{ decision: 'block' }` so the agent sees a clear error instead of appearing stuck. Removed: - container-runner.ts: IDLE_TIMEOUT setTimeout, resetIdle callback on activeContainers entry, resetContainerIdleTimer export. - delivery.ts: the resetContainerIdleTimer call on successful delivery. - poll-loop.ts: IDLE_END_MS + its setInterval. Keeping the query open is cheaper than close+reopen (no cold prompt cache). Liveness is now a host-side concern. - host-sweep.ts: 10-min STALE_THRESHOLD_MS + getStuckProcessingIds in the stale-detection path (still exported for kill reset). Tests: - src/host-sweep.test.ts — 9 tests for decideStuckAction covering: fresh heartbeat, absolute ceiling, absent heartbeat, Bash-timeout extension (both ceiling and per-claim), claim age below tolerance, heartbeat touched after claim, unparseable timestamps. Ref: docs/v1-vs-v2/ACTION-ITEMS.md items 9, 6a, 10. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:16:57 +03:00
gavrielc	c5d0ef8b4f	feat(v2): migrate container runtime to Bun, improve image build surface Container side: - agent-runner switches to Bun. Drops better-sqlite3 (native compile gone), drops tsc build step in-image AND the tsc-on-every-session-wake in the entrypoint — bun runs src/index.ts directly. bun:sqlite replaces better-sqlite3; cross-mount DB invariants (journal_mode=DELETE, busy_timeout) preserved. Named params converted from @name to $name because bun:sqlite does not auto-strip the prefix the way better-sqlite3 does. - Tests ported from vitest to bun:test (only describe/it/expect/before/afterEach used, API-compatible). vitest.config.ts excludes container/agent-runner/. - bun.lock replaces pnpm-lock.yaml + pnpm-workspace.yaml under container/agent-runner/. Host pnpm workspace does NOT include this tree. Dockerfile improvements (independent of Bun but bundled while touching the file): - tini as PID 1 for correct SIGTERM propagation (prevents half-written outbound.db on shutdown). - Extracted entrypoint.sh — readable and diffable vs the old inline printf. - BuildKit cache mounts for apt + bun install + pnpm install. - --no-install-recommends on apt, pinned CLAUDE_CODE_VERSION, AGENT_BROWSER, VERCEL, BUN_VERSION. - CJK fonts (~200MB) behind ARG INSTALL_CJK_FONTS=false; build.sh reads from .env; setup/container.ts reads the same .env so /setup and manual rebuild stay in sync. - PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in case any postinstall tries to pull a redundant Chromium. - /home/node 755 (was 777). Host side: - src/container-runner.ts dynamic spawn command collapses from `pnpm exec tsc --outDir /tmp/dist … && node /tmp/dist/index.js` to `exec bun run /app/src/index.ts` — cold start ~200-500ms faster per wake. CI: - oven-sh/setup-bun@v2 alongside Node/pnpm. Adds explicit container typecheck (was documented in CLAUDE.md, not enforced) and `bun test` for agent-runner tests.	2026-04-17 11:38:01 +03:00
gavrielc	9dda75bb21	docs(v2): cross-mount invariants + diagrams; inline a2a routing - session-manager.ts: shrink the cross-mount invariant header from 31 lines to 12, keeping each invariant's cause and consequence inline. - agent-runner/db/connection.ts: parallel cross-mount comment for the container-side reader (inbound.db must be journal_mode=DELETE). - agent-runner/db/messages-out.ts: document that even/odd seq parity is load-bearing — seq is the agent-facing message ID returned by send_message and consumed by edit_message / add_reaction, looked up across both tables. - v2-checklist.md: record the cross-mount invariants and seq parity under Core Architecture so future "simplifications" don't regress them. - scripts/sanity-live-poll.ts: empirical validation harness for the three cross-mount invariants — flips each one and observes silent message loss / corruption. - delivery.ts: inline routeAgentMessage at its single callsite (-17 net lines). The wrapper added more boilerplate than it factored. - docs/v2-architecture-diagram.{md,html}: rendered Mermaid diagrams of the v2 system, message flow, named destinations, entity model, and the two-DB split. - channels/adapter.ts, chat-sdk-bridge.ts, credentials.ts, db/sessions.ts, db/db-v2.test.ts: prettier format pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 00:21:12 +03:00
gavrielc	062b0cb6bf	fix(agent-runner): add updated_at column to session_state on older DBs session_state was added after the initial v2 schema with a lazy `CREATE TABLE IF NOT EXISTS` in getOutboundDb(), so older session outbound.db files have a session_state table from before updated_at existed. The lazy create is a no-op when the table already exists, leaving the column missing and causing: Error: table session_state has no column named updated_at on every `INSERT OR REPLACE INTO session_state` call. Follow up the CREATE IF NOT EXISTS with a PRAGMA table_info check and ALTER TABLE ADD COLUMN when updated_at is missing. Cheap on every open, only runs DDL once per DB. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 17:18:34 +03:00
gavrielc	b59216c299	fix(v2): persist SDK session ID across container restarts The v2 poll loop held the session ID in a local variable, so every container restart started a fresh SDK session even though the .jsonl transcript was still sitting in the shared .claude mount. Store it in outbound.db (container-owned, already per channel/thread), seed the loop on startup, clear on /clear, and recover from stale-session errors the same way v1 did. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 01:17:42 +03:00
gavrielc	b591d7ce96	refactor: move destinations from JSON file into inbound.db The per-session destination map was being written as a sidecar JSON file (/workspace/.nanoclaw-destinations.json) — inconsistent with the rest of v2, where all host↔container IO goes through inbound.db / outbound.db. Move it into a `destinations` table in INBOUND_SCHEMA. The host writes it before every container wake AND on demand (e.g. after create_agent) so the creator sees the new child destination mid-session without a restart. The container queries the table live on every lookup — no cache, no staleness window. - src/db/schema.ts: add `destinations` table to INBOUND_SCHEMA. - src/session-manager.ts: writeDestinationsFile → writeDestinations, writes via DELETE + INSERT inside a transaction. - src/delivery.ts: create_agent handler calls writeDestinations on the creator's session after inserting the new destination rows. - container/agent-runner/src/destinations.ts: queries inbound.db directly in every findByName/getAllDestinations/findByRouting call. No more cache. No setDestinationsForTest (obsolete). No fs import. - container/agent-runner/src/index.ts and mcp-tools/index.ts: remove loadDestinations() calls — no longer needed. - Test helper initTestSessionDb creates the destinations table. Integration test inserts a row directly instead of mocking the cache. No backwards compatibility: sessions predating the schema update must be recreated. This is fine on the v2 branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 16:45:53 +03:00
gavrielc	d8fbd3b239	feat: agent-to-agent communication, dynamic agent creation, self-modification tools Agent-to-agent: host routes messages with channel_type='agent' to target agent's inbound.db, enriches with sender info, wakes target container. Bidirectional routing works via inherited routing context. Dynamic agents: create_agent MCP tool + system action handler creates agent groups, folders, and optional CLAUDE.md on the fly. Self-modification: install_packages (apt/npm, requires admin approval), add_mcp_server (no approval), request_rebuild (builds per-agent-group Docker image with approved packages). Approval flow reuses interactive card infrastructure with pending_approvals table. Also includes fixes from prior session: attachment download, reply context extraction, message editing (platform message ID tracking), delivery retry limits, and card update on button click. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 01:11:06 +03:00
gavrielc	82cb363f84	v2: split session DB into inbound/outbound for write isolation Eliminates SQLite write contention across the host-container mount boundary by splitting the single session.db into two files, each with exactly one writer: inbound.db — host writes (messages_in, delivered tracking) outbound.db — container writes (messages_out, processing_ack) Key changes: - Host uses even seq numbers, container uses odd (collision-free) - Container heartbeat via file touch instead of DB UPDATE - Scheduling MCP tools now emit system actions via messages_out (host applies them to inbound.db during delivery) - Host sweep reads processing_ack + heartbeat file for stale detection - OneCLI ensureAgent() call added (was missing from v2, caused applyContainerConfig to reject unknown agent identifiers) Verified: tsc clean, 327 tests pass, real e2e through Docker works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 12:17:31 +03:00
gavrielc	afbc20a6c4	v2 phase 4+5: Discord via Chat SDK, expanded MCP tools, message seq IDs - Chat SDK bridge + Discord adapter (gateway listener, message routing) - MCP tools refactored into modular structure: core (send_message, send_file, edit_message, add_reaction), scheduling (schedule/list/cancel/pause/resume tasks), interactive (ask_user_question, send_card), agents (send_to_agent) - Message seq IDs: shared integer sequence across messages_in/out so agents see small numeric IDs instead of platform snowflakes - busy_timeout=5000 for session DB (poll loop + MCP server concurrent access) - Always copy agent-runner source to fix stale cache when non-index files change - Seed script for Discord testing, e2e test script Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 02:53:39 +03:00
gavrielc	6f2a7314d0	v2: fix agent-runner lifecycle and session DB reliability - Use DELETE journal mode for session DBs instead of WAL. WAL doesn't sync reliably across Docker volume mounts (VirtioFS), causing dropped writes and duplicate deliveries. - Add 20s idle detection to end the query stream. The concurrent poll tracks SDK activity via a new 'activity' provider event. When no SDK events arrive for 20s and no messages are pending, the stream ends and the poll loop continues. - Add touchProcessing heartbeat so the host can distinguish active agents from idle ones by checking status_changed recency. - Catch query errors in the poll loop and write error responses to messages_out instead of crashing the process. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 01:34:59 +03:00
gavrielc	3f0451b7b0	v2 phase 1: foundation — types, DB layer, logging Add the v2 data layer: typed interfaces, central DB with migration runner, per-entity CRUD, and agent-runner session DB operations. - src/log.ts: concise message-first logging API - src/types-v2.ts: AgentGroup, MessagingGroup, Session, MessageIn/Out - src/db/: connection (WAL), migration runner, 001-initial schema, CRUD for agent_groups, messaging_groups, sessions, pending_questions - container/agent-runner/src/db/: session DB connection, messages_in reads + status transitions, messages_out writes - 31 new tests, all 277 tests pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 23:34:09 +03:00

16 Commits