Commit Graph

8 Commits

Author SHA1 Message Date
glifocat
12719be6e1 feat(poll-loop): inject destination reminder after SDK auto-compaction
Closes qwibitai/nanoclaw#2325.

When the Claude Code SDK auto-compacts the conversation context, the
compaction summary tends to drop the agent's learned <message to="…">
wrapping discipline. The destinations table is still populated and the
system prompt still lists them, but the behavioral pattern degrades —
A2A sends and multi-channel routing silently revert to bare-text or
single-channel delivery for the rest of the session, until the next
/clear.

Three small changes wire a reminder back into the live query when this
fires:

- New `compacted` event on ProviderEvent. Distinct from `result` so it
  doesn't mark the turn completed or get dispatched as a chat message
  (which is also why "Context compacted (N tokens compacted)." stops
  appearing as noise in user-facing chats — it was a side-effect of
  reusing the result event path).
- ClaudeProvider yields `compacted` instead of `result` for the SDK's
  compact_boundary system event.
- Poll-loop's event handler reacts by pushing a system-tagged reminder
  back into the active query when there are >1 destinations. Single-
  destination groups skip the push since they have a fallback that
  works without wrapping.

Tests cover both branches (multi-destination → reminder fires;
single-destination → no reminder) using a CompactingProvider that
emits the new event mid-stream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:11:25 +02:00
Adam
81ef193e69 refactor(session-state): key continuations per provider to survive provider switches
Before, every provider stored its opaque continuation id under the
single outbound.db key `sdk_session_id`. Flipping a session's
agent_provider (e.g. Codex → Claude) meant the new provider read the
old provider's id at wake, handed it to its own SDK, and got a
"No conversation found" error that cost the user one sacrificed
message before the stale-session recovery path cleared the id.

This reshapes session_state so continuations are keyed
`continuation:<provider>` instead. Consequences:

- Per-provider continuations coexist. Flipping Claude → Codex → Claude
  resumes the Claude thread exactly where it left off, with the
  intervening Codex thread also still on file.
- No provider ever reads another provider's id. Switching costs no
  sacrificed message and emits no transient error.
- Legacy installs are migrated forward on first startup:
  migrateLegacyContinuation() adopts any pre-existing `sdk_session_id`
  row into the current provider's slot (best guess — it was whichever
  provider ran last), then deletes the legacy row unconditionally so
  it can't poison a future provider's read.

runPollLoop now takes providerName alongside the provider instance,
and threads it through processQuery to setContinuation on init.

Tests: 9 new tests covering set/get isolation across providers,
clear-specificity, legacy-adoption, legacy-always-deleted,
prefer-existing-slot-over-legacy, and idempotency of a second
migration call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:34:28 +10:00
gavrielc
c5d0ef8b4f feat(v2): migrate container runtime to Bun, improve image build surface
Container side:
- agent-runner switches to Bun. Drops better-sqlite3 (native compile gone),
  drops tsc build step in-image AND the tsc-on-every-session-wake in the
  entrypoint — bun runs src/index.ts directly. bun:sqlite replaces
  better-sqlite3; cross-mount DB invariants (journal_mode=DELETE, busy_timeout)
  preserved. Named params converted from @name to $name because bun:sqlite
  does not auto-strip the prefix the way better-sqlite3 does.
- Tests ported from vitest to bun:test (only describe/it/expect/before/afterEach
  used, API-compatible). vitest.config.ts excludes container/agent-runner/.
- bun.lock replaces pnpm-lock.yaml + pnpm-workspace.yaml under
  container/agent-runner/. Host pnpm workspace does NOT include this tree.

Dockerfile improvements (independent of Bun but bundled while touching the file):
- tini as PID 1 for correct SIGTERM propagation (prevents half-written
  outbound.db on shutdown).
- Extracted entrypoint.sh — readable and diffable vs the old inline printf.
- BuildKit cache mounts for apt + bun install + pnpm install.
- --no-install-recommends on apt, pinned CLAUDE_CODE_VERSION, AGENT_BROWSER,
  VERCEL, BUN_VERSION.
- CJK fonts (~200MB) behind ARG INSTALL_CJK_FONTS=false; build.sh reads from
  .env; setup/container.ts reads the same .env so /setup and manual rebuild
  stay in sync.
- PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in case any postinstall tries to pull a
  redundant Chromium.
- /home/node 755 (was 777).

Host side:
- src/container-runner.ts dynamic spawn command collapses from
  `pnpm exec tsc --outDir /tmp/dist … && node /tmp/dist/index.js` to
  `exec bun run /app/src/index.ts` — cold start ~200-500ms faster per wake.

CI:
- oven-sh/setup-bun@v2 alongside Node/pnpm. Adds explicit container
  typecheck (was documented in CLAUDE.md, not enforced) and `bun test` for
  agent-runner tests.
2026-04-17 11:38:01 +03:00
gavrielc
b63dd186df refactor(agent-runner): decouple provider interface from Claude specifics
Reshape AgentProvider so provider-specific assumptions stop leaking into
the generic layer. No change to what reaches sdkQuery() — same values,
different plumbing.

- QueryInput: opaque `continuation` replaces `sessionId` + `resumeAt`;
  `systemContext.instructions` replaces ambiguous `systemPrompt`;
  `mcpServers`, `env`, `additionalDirectories` move to `ProviderOptions`
  at construction time.
- AgentProvider gains `isSessionInvalid(err)` and
  `supportsNativeSlashCommands` so the poll-loop stops regex-matching
  Claude error strings and gates passthrough slash commands per provider.
- ClaudeProvider owns `CLAUDE_CODE_AUTO_COMPACT_WINDOW` and the
  stale-session regex internally.
- ProviderEvent.activity kept and documented as the liveness signal
  (fires on every SDK message so the idle timer stays honest during
  long tool runs); init carries `continuation` instead of `sessionId`.
- poll-loop drops mcpServers/env/systemPrompt from its config; admin
  user id now passed explicitly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:25:29 +03:00
gavrielc
b591d7ce96 refactor: move destinations from JSON file into inbound.db
The per-session destination map was being written as a sidecar JSON file
(/workspace/.nanoclaw-destinations.json) — inconsistent with the rest of
v2, where all host↔container IO goes through inbound.db / outbound.db.

Move it into a `destinations` table in INBOUND_SCHEMA. The host writes
it before every container wake AND on demand (e.g. after create_agent)
so the creator sees the new child destination mid-session without a
restart. The container queries the table live on every lookup — no
cache, no staleness window.

- src/db/schema.ts: add `destinations` table to INBOUND_SCHEMA.
- src/session-manager.ts: writeDestinationsFile → writeDestinations,
  writes via DELETE + INSERT inside a transaction.
- src/delivery.ts: create_agent handler calls writeDestinations on the
  creator's session after inserting the new destination rows.
- container/agent-runner/src/destinations.ts: queries inbound.db
  directly in every findByName/getAllDestinations/findByRouting call.
  No more cache. No setDestinationsForTest (obsolete). No fs import.
- container/agent-runner/src/index.ts and mcp-tools/index.ts: remove
  loadDestinations() calls — no longer needed.
- Test helper initTestSessionDb creates the destinations table.
  Integration test inserts a row directly instead of mocking the cache.

No backwards compatibility: sessions predating the schema update must
be recreated. This is fine on the v2 branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:45:53 +03:00
gavrielc
e83ffbc103 feat: named destinations + permission enforcement + fire-and-forget self-mod
Replaces implicit routing context (NANOCLAW_PLATFORM_ID env vars) with
per-agent named destination maps. Agents reference channels and peer
agents by local names; the host re-validates every outbound route against
a new agent_destinations table that is both the routing map and the ACL.

Model changes:
- New migration 004 adds agent_destinations (agent_group_id, local_name,
  target_type, target_id). Backfills from existing messaging_group_agents.
- Host writes /workspace/.nanoclaw-destinations.json before every container
  wake so admin changes take effect on next start.
- Container loads map at startup, appends system-prompt addendum listing
  available destinations and the <message to="name">…</message> syntax.
- Agent main output is parsed for <message to="..."> blocks; each block
  becomes a messages_out row with routing resolved via the local map.
  Untagged text and <internal>…</internal> are scratchpad (logged only).
- send_message MCP tool now takes `to` (destination name) instead of raw
  routing fields. send_to_agent deleted (redundant — agents are just
  destinations). send_file/edit_message/add_reaction route via map too.
- Inbound formatter adds from="name" attribute via reverse-lookup so the
  agent sees a consistent namespace in both directions.

Permission enforcement:
- Host checks hasDestination() before every channel delivery AND every
  agent-to-agent route. Unauthorized messages dropped and logged.
- routeAgentMessage simplified: ~15 lines, no JSON parse, content copied
  verbatim (target formatter resolves the sender via its own local map).
- create_agent is admin-only, checked at both the container (tool not
  registered for non-admins) and the host (re-check on receive). Inserts
  bidirectional destination rows so parent↔child comms work immediately.
  Includes path-traversal guard on folder name.

Self-modification cleanup:
- add_mcp_server now requires admin approval (previously had none).
- install_packages validates package names on BOTH sides (container tool
  + host receiver) with strict regex. Max 20 packages per request.
- All three self-mod tools are fire-and-forget: write request, return
  immediately with "submitted" message. Admin approval triggers a chat
  notification to the requesting agent — no tool-call polling, no 5-min
  holds. On rebuild/mcp_server approval, the container is killed so the
  next wake picks up new config/image.
- Approval delivery extracted into requestApproval() helper (the one
  place where three call sites were literally identical).

Also folded in the phase-1 dynamic import cleanup (create_agent no longer
does `await import('./db/agent-groups.js')`) and removes NANOCLAW_PLATFORM_ID
/ CHANNEL_TYPE / THREAD_ID env-var routing entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:31:37 +03:00
gavrielc
82cb363f84 v2: split session DB into inbound/outbound for write isolation
Eliminates SQLite write contention across the host-container mount
boundary by splitting the single session.db into two files, each with
exactly one writer:

  inbound.db  — host writes (messages_in, delivered tracking)
  outbound.db — container writes (messages_out, processing_ack)

Key changes:
- Host uses even seq numbers, container uses odd (collision-free)
- Container heartbeat via file touch instead of DB UPDATE
- Scheduling MCP tools now emit system actions via messages_out
  (host applies them to inbound.db during delivery)
- Host sweep reads processing_ack + heartbeat file for stale detection
- OneCLI ensureAgent() call added (was missing from v2, caused
  applyContainerConfig to reject unknown agent identifiers)

Verified: tsc clean, 327 tests pass, real e2e through Docker works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:17:31 +03:00
gavrielc
18d0b6e53f v2: add agent-runner integration tests
Poll loop end-to-end with mock provider: message pickup, batch
processing, concurrent polling for late arrivals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:40:00 +03:00