nanoclaw

Author	SHA1	Message	Date
gavrielc	3af6e70c05	test(agent-runner): add dispatch, origin metadata, and thread resolution tests Add 14 tests covering key routing and dispatch flows that previously had zero direct coverage: dispatchResultText: - bare text produces no outbound (scratchpad only) - unknown destination dropped, valid destination sent - multiple <message> blocks each produce correct outbound - internal tags stripped from scratchpad originAttr / from= metadata: - chat/task/webhook/system messages include from= when destination matches - fallback to raw unknown:channel:platform when no match - from= omitted when routing is null resolveDestinationThread: - null thread_id when no prior inbound from destination - most recent thread_id wins with multiple inbound messages Also fix merge issue: restore getAllDestinations import removed by our PR but still needed by #2327's compaction reminder. Fix stale destinations test assertion from #2328 ("no special wrapping needed" → "Every response must be wrapped"). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-08 00:23:03 +03:00
gavrielc	9db39b291d	fix(agent-runner): require explicit destination addressing, fix per-destination threading The poll loop had a bare-text routing fallback in dispatchResultText: when the agent produced text without <message to="..."> wrapping, it would auto- route to the session's originating channel (via a frozen RoutingContext) or to the single configured destination. This caused three problems: 1. Routing drift: RoutingContext was extracted once from the initial batch and never refreshed. When the initial batch was a null-routed cron task and a real chat arrived mid-query, replies were silently dropped to scratchpad because the frozen routing had all-null fields. 2. Cross-channel thread bleed: sendToDestination applied a single routing.threadId to every outbound message regardless of destination. In agent-shared sessions (multiple channels sharing one session), one channel's thread ID was stamped onto messages to a different channel. 3. Inconsistent formatting: task, webhook, and system messages had no origin metadata in their formatted output, so the agent couldn't tell which destination they came from — even when the underlying messages_in rows carried routing fields. Changes: - Remove the bare-text routing fallbacks in dispatchResultText (both the routing-based and single-destination shortcuts). All agent output must be wrapped in <message to="name">...</message>. Bare text is scratchpad. - Update buildDestinationsSection() to require explicit wrapping for all groups, including single-destination. No more "no special wrapping needed" shortcut. - Resolve thread_id per-destination via resolveDestinationThread(), which queries messages_in for the most recent message matching the target channel+platform. Falls back to null (top-level channel message) when no prior inbound exists for that destination. - Extract originAttr() helper in formatter.ts and apply it to all message types. Tasks now render as <task from="dest" time="...">, webhooks as <webhook from="dest" source="..." event="...">, system responses as <system_response from="dest" ...>. The agent always sees where a message originated. - Add a PreCompact shell hook (compact-instructions.ts) that outputs custom compaction instructions, telling the compactor to preserve recent message XML structure and routing metadata in the summary. Wired via settings.json in the .claude-shared scaffold, with a migration path (ensurePreCompactHook) for existing groups. Relation to open PRs: - #2277 (mergeRouting) becomes unnecessary — the routing fallback it patches no longer exists. Can be closed. - #2327 (post-compaction destination reminder) is complementary — it handles the post-compaction push, this handles pre-compaction instructions. Both can merge independently. - #2328 (default routing instruction) is complementary — it adds "reply to the from= destination" guidance to the multi-destination section. Compatible with the unified instruction format here. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-07 19:47:46 +03:00
gavrielc	31f2da9585	fix(container): gate poll loop on trigger=1 to honor accumulate contract A warm container picks up every pending messages_in row on each poll tick and calls markProcessing → agent.query → markCompleted. Before this, that included trigger=0 rows (ignored_message_policy='accumulate' context), causing the agent to wake and potentially respond to messages the wiring had explicitly opted out of engaging on — defeating accumulate's "store as context, don't engage" contract. Gate the main poll loop with `messages.some(m => m.trigger === 1)` — mirrors host-side countDueMessages which is already gated. If the batch has no wake-eligible row, sleep and leave them pending. They ride along via the same getPendingMessages query the next time a real trigger=1 lands, which is the intended accumulate behavior. The concurrent active-turn poll (line ~290) is unchanged on purpose — once the agent has engaged, pushing in accumulate rows mid-turn as additional context is desired. initTestSessionDb was missing the trigger and series_id columns on messages_in, out of sync with the live migration. Added both so the new tests (and any future trigger-aware tests) can run. Four tests cover the data contract: trigger=0 rows are returned by getPendingMessages (so they ride along), the gate predicate correctly identifies accumulate-only batches, mixed batches pass the gate, and the schema default of 1 applies when the column is omitted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 11:23:47 +03:00
gavrielc	c5d0ef8b4f	feat(v2): migrate container runtime to Bun, improve image build surface Container side: - agent-runner switches to Bun. Drops better-sqlite3 (native compile gone), drops tsc build step in-image AND the tsc-on-every-session-wake in the entrypoint — bun runs src/index.ts directly. bun:sqlite replaces better-sqlite3; cross-mount DB invariants (journal_mode=DELETE, busy_timeout) preserved. Named params converted from @name to $name because bun:sqlite does not auto-strip the prefix the way better-sqlite3 does. - Tests ported from vitest to bun:test (only describe/it/expect/before/afterEach used, API-compatible). vitest.config.ts excludes container/agent-runner/. - bun.lock replaces pnpm-lock.yaml + pnpm-workspace.yaml under container/agent-runner/. Host pnpm workspace does NOT include this tree. Dockerfile improvements (independent of Bun but bundled while touching the file): - tini as PID 1 for correct SIGTERM propagation (prevents half-written outbound.db on shutdown). - Extracted entrypoint.sh — readable and diffable vs the old inline printf. - BuildKit cache mounts for apt + bun install + pnpm install. - --no-install-recommends on apt, pinned CLAUDE_CODE_VERSION, AGENT_BROWSER, VERCEL, BUN_VERSION. - CJK fonts (~200MB) behind ARG INSTALL_CJK_FONTS=false; build.sh reads from .env; setup/container.ts reads the same .env so /setup and manual rebuild stay in sync. - PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in case any postinstall tries to pull a redundant Chromium. - /home/node 755 (was 777). Host side: - src/container-runner.ts dynamic spawn command collapses from `pnpm exec tsc --outDir /tmp/dist … && node /tmp/dist/index.js` to `exec bun run /app/src/index.ts` — cold start ~200-500ms faster per wake. CI: - oven-sh/setup-bun@v2 alongside Node/pnpm. Adds explicit container typecheck (was documented in CLAUDE.md, not enforced) and `bun test` for agent-runner tests.	2026-04-17 11:38:01 +03:00
gavrielc	b63dd186df	refactor(agent-runner): decouple provider interface from Claude specifics Reshape AgentProvider so provider-specific assumptions stop leaking into the generic layer. No change to what reaches sdkQuery() — same values, different plumbing. - QueryInput: opaque `continuation` replaces `sessionId` + `resumeAt`; `systemContext.instructions` replaces ambiguous `systemPrompt`; `mcpServers`, `env`, `additionalDirectories` move to `ProviderOptions` at construction time. - AgentProvider gains `isSessionInvalid(err)` and `supportsNativeSlashCommands` so the poll-loop stops regex-matching Claude error strings and gates passthrough slash commands per provider. - ClaudeProvider owns `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and the stale-session regex internally. - ProviderEvent.activity kept and documented as the liveness signal (fires on every SDK message so the idle timer stays honest during long tool runs); init carries `continuation` instead of `sessionId`. - poll-loop drops mcpServers/env/systemPrompt from its config; admin user id now passed explicitly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 10:25:29 +03:00
gavrielc	82cb363f84	v2: split session DB into inbound/outbound for write isolation Eliminates SQLite write contention across the host-container mount boundary by splitting the single session.db into two files, each with exactly one writer: inbound.db — host writes (messages_in, delivered tracking) outbound.db — container writes (messages_out, processing_ack) Key changes: - Host uses even seq numbers, container uses odd (collision-free) - Container heartbeat via file touch instead of DB UPDATE - Scheduling MCP tools now emit system actions via messages_out (host applies them to inbound.db during delivery) - Host sweep reads processing_ack + heartbeat file for stale detection - OneCLI ensureAgent() call added (was missing from v2, caused applyContainerConfig to reject unknown agent identifiers) Verified: tsc clean, 327 tests pass, real e2e through Docker works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 12:17:31 +03:00
gavrielc	6f2a7314d0	v2: fix agent-runner lifecycle and session DB reliability - Use DELETE journal mode for session DBs instead of WAL. WAL doesn't sync reliably across Docker volume mounts (VirtioFS), causing dropped writes and duplicate deliveries. - Add 20s idle detection to end the query stream. The concurrent poll tracks SDK activity via a new 'activity' provider event. When no SDK events arrive for 20s and no messages are pending, the stream ends and the poll loop continues. - Add touchProcessing heartbeat so the host can distinguish active agents from idle ones by checking status_changed recency. - Catch query errors in the poll loop and write error responses to messages_out instead of crashing the process. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 01:34:59 +03:00
gavrielc	5a0098edc9	v2 phase 2: agent-runner — provider interface, poll loop, formatter AgentProvider abstraction with Claude and Mock implementations. Poll loop reads messages_in, formats by kind, queries provider, writes results to messages_out. Concurrent polling pushes follow-up messages into active queries. - providers/types.ts: AgentProvider, AgentQuery, ProviderEvent - providers/claude.ts: wraps Agent SDK with MessageStream, hooks, transcript archiving - providers/mock.ts: canned responses with push() support - providers/factory.ts: createProvider() - formatter.ts: format by kind (chat/task/webhook/system), XML escaping, routing extraction - poll-loop.ts: poll → format → query → write, concurrent polling - mcp-tools.ts: MCP server with send_message tool - index-v2.ts: new entry point (config from env, enters poll loop) - 11 new tests, all 288 tests pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 23:36:55 +03:00

8 Commits