nanoclaw

Author	SHA1	Message	Date
gavrielc	f351e46008	refactor(approvals): persist title+options on channel/sender approval tables getAskQuestionRender used to hardcode the card title and option labels for pending_channel_approvals and pending_sender_approvals in the DB-access layer, duplicating wording that already lived in the approval modules. That caused a visible drift between the initial card title — picked per event in channel-approval.ts ("📣 Bot mentioned in new chat" vs. "💬 New direct message") — and the post-click render, which always showed the constant "📣 Channel registration". Mirror the pattern already used by pending_approvals: add title / options_json columns on both pending_*_approvals tables via migration 013, have the approval modules write them at creation time, and let getAskQuestionRender just SELECT. - Migration 013 ALTERs the two tables to add title + options_json. - PendingChannelApproval / PendingSenderApproval types and their create functions grow the two fields. - channel-approval.ts / sender-approval.ts normalize options once and pass both title and options_json into the insert. - getAskQuestionRender drops the hardcoded render objects and reads the stored values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:54:47 +03:00
Gabi Simons	a8eb82d529	Merge branch 'main' into main	2026-04-23 18:24:24 +03:00
gavrielc	dd5bc85b02	refactor(skill/atomic-chat-tool): ship MCP file in skill folder, revert src edits The initial /add-atomic-chat-tool merge added src edits directly to main. That conflicts with the utility-skill pattern used elsewhere (e.g. /claw): the skill folder should ship the file and SKILL.md should instruct copy + idempotent edits at install time, not a git merge that carries src diffs. - Move container/agent-runner/src/atomic-chat-mcp-stdio.ts → .claude/skills/add-atomic-chat-tool/atomic-chat-mcp-stdio.ts - Revert the atomic_chat mcpServers entry in agent-runner index.ts - Revert mcp__atomic_chat__* from TOOL_ALLOWLIST in providers/claude.ts - Revert ATOMIC_CHAT_* env forwarding and [ATOMIC] log elevation in src/container-runner.ts - Empty .env.example back out - Rewrite SKILL.md: copy the shipped file, then apply deterministic Edits (index.ts, providers/claude.ts, container-runner.ts, .env.example) with exact before/after snippets the installer agent can match. Main is now back to its pre-PR state for the tool; /add-atomic-chat-tool re-applies everything at install time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:29:10 +03:00
Misha Skvortsov	3a9b98f1a4	feat: add Atomic Chat MCP tool skill Exposes local Atomic Chat models (OpenAI-compatible API at 127.0.0.1:1337/v1) as tools to the container agent. Adds atomic_chat_list_models and atomic_chat_generate alongside the existing Ollama skill. Rebased on current main: - MCP server registered in agent-runner index.ts using bun (no tsc step in-image), sibling path to index.ts, env: {} with ATOMIC_CHAT_* forwarded when set. - allowedTools entry moved to providers/claude.ts TOOL_ALLOWLIST. - SKILL.md: drop obsolete per-group copy step (single RO mount supersedes it); use pnpm build. Made-with: Cursor Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:18:34 +03:00
exe.dev user	40f5683c36	fix(approvals): show correct post-click labels on channel/sender cards getAskQuestionRender only checked pending_questions and pending_approvals, missing the channel and sender approval tables. Approval button clicks showed the raw value ("approve") instead of the selectedLabel ("✅ Wired"). Extend the lookup to also check pending_channel_approvals and pending_sender_approvals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:45 +00:00
exe.dev user	15f30682d7	fix(approvals): show human-readable names in approval cards Channel and sender approval cards showed raw platform IDs (e.g. discord:1475578393738219540:...) instead of readable context. Extract sender name from the event content for channel approvals, and use the channel type name for sender approvals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:34 +00:00
exe.dev user	d121cd1cd6	fix(router): pass isGroup from adapter through to messaging group creation The router hardcoded is_group=0 when auto-creating messaging groups, causing channel mentions to be misclassified as DMs. The Chat SDK bridge knows which handler fired (onDirectMessage vs onNewMention) so thread the signal through InboundMessage → InboundEvent → router. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:23 +00:00
exe.dev user	61ca43d193	fix(discord): resolve user ID from DM interactions for approval clicks Discord puts the clicking user at interaction.member.user for guild interactions but interaction.user for DM interactions. The Gateway handler only checked interaction.member, so DM button clicks resolved to an empty user ID and were silently rejected as unauthorized. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:12 +00:00
Lazer Cohen	2383bde80f	fix(container): scope orphan reaper by install label so peers don't kill each other Two installs on the same host could trash each other's containers: the reaper used `docker ps --filter name=nanoclaw-`, a substring match that picked up every install's containers. A crash-looping peer (e.g. a legacy v1 plist respawning ~6k times) would call cleanupOrphans on every boot and kill the healthy install's session containers within seconds of spawn. - Stamp `--label nanoclaw-install=<slug>` onto every spawned container. - cleanupOrphans filters by that label; healthy peers are left alone. - Setup preflight enumerates `com.nanoclaw*` launchd plists / nanoclaw user systemd units, probes state/runs, and unloads any that are crash-looping (state != running AND runs > 10) before installing this install's service. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 12:12:30 +03:00
gavrielc	5f1b3e5cad	style: apply prettier formatting to install-slug additions	2026-04-23 10:10:48 +03:00
gavrielc	7a9401ddf2	feat(setup): per-checkout service name and docker image tag Two NanoClaw installs on the same host used to fight over the shared `com.nanoclaw` launchd label / `nanoclaw.service` systemd unit and the `nanoclaw-agent:latest` docker tag — the second install silently rewrote the service pointer and rebuilt the image out from under the first. Introduces a deterministic per-checkout slug (sha1(projectRoot)[:8]) and namespaces everything off it: - Service: `com.nanoclaw-v2-<slug>` / `nanoclaw-v2-<slug>.service` - Image: `nanoclaw-agent-v2-<slug>:latest` (base), `nanoclaw-agent-v2-<slug>:<agentGroupId>` (per-group) New shared helpers: src/install-slug.ts (host) + setup/lib/install-slug.sh (bash). Both compute the same slug so verify/probe/add-*.sh/build.sh/container-runner all agree. Any v1 `com.nanoclaw` service left on the host stays untouched and can coexist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 10:10:09 +03:00
gavrielc	3b8240a91b	refactor(self-mod): drop request_rebuild — approvals now bundle rebuild+restart install_packages and add_mcp_server already did the right thing on approve (install auto-rebuilt+killed, add_mcp_server just killed), so request_rebuild was redundant plumbing agents sometimes called after an install — wasting an admin approval round-trip. Delete it end-to-end: - container/agent-runner/src/mcp-tools/self-mod.ts: remove requestRebuild tool + registration; update install_packages description. - src/modules/self-mod/{request,apply,index}.ts: drop handleRequestRebuild + applyRequestRebuild + registrations; rewrite the rebuild-failed notify to point admins at retrying install_packages instead. - src/modules/{approvals,self-mod}/{agent,project}.md and skill/self- customize/SKILL.md: scrub agent-facing references; clarify that add_mcp_server needs no rebuild (bun runs TS directly). - docs/{module-contract,architecture-diagram,checklist,db-central,shared- source,v1-vs-v2/*}.md, CLAUDE.md, pending-approvals migration comment, approvals/index.ts docstring, REFACTOR.md: trailing references. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:28:36 +03:00
gavrielc	e64bdb3016	refactor(claude-md): split shared base into module fragments, inject name at runtime Move every agent-specific instruction out of the shared container/CLAUDE.md so the base is genuinely universal. Persona/identity now comes from the system-prompt addendum (buildSystemPromptAddendum now takes assistantName and prepends "# You are {name}"). Per-module instructions live alongside each MCP tool source: container/agent-runner/src/mcp-tools/core.instructions.md container/agent-runner/src/mcp-tools/scheduling.instructions.md container/agent-runner/src/mcp-tools/self-mod.instructions.md composeGroupClaudeMd() scans that directory and emits `module-<name>.md` fragments as symlinks to /app/src/mcp-tools/<name>.instructions.md (valid via the existing RO source mount). Skill fragments renamed to `skill-<name>.md` for naming consistency with `module-` and `mcp-`. Mount tightening so composer-managed files can't be clobbered by agent writes: nested RO mounts for /workspace/agent/CLAUDE.md and /workspace/agent/.claude-fragments/. CLAUDE.local.md (per-group memory) stays RW as the only writable CLAUDE.md-family file. .gitignore: ignore CLAUDE.local.md, .claude-shared.md, .claude-fragments/ everywhere, and simplify groups/ rules to ignore the whole tree (per- installation state, not tracked). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:14:51 +03:00
gavrielc	95e74d8383	docs(onecli): expand secrets section; correct stale admin-roles refs Document the selective-mode gotcha for auto-created OneCLI agents (no secrets injected by default) with the CLI commands to inspect and fix it. Note that approval policies are not configurable via the SDK or `onecli@1.3.0` CLI — web UI only. Replace stale `NANOCLAW_ADMIN_USER_IDS` / `src/access.ts` references across CLAUDE.md, docs/architecture.md, docs/checklist.md, and docs/module-contract.md. Admin gating now runs host-side in src/command-gate.ts against `user_roles`; approver picks live in src/modules/approvals/primitive.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 16:46:17 +03:00
gavrielc	3db66c0ced	fix: forward ONECLI_API_KEY to OneCLI SDK for authenticated container config Ports the v1 fix from PR #1777 (originally `8b5b581` by @johnnyfish). Cherry-pick did not apply cleanly because v2 reformatted the surrounding code and split OneCLI usage into two sites — manual port was needed. v2-specific adaptations: - Also forward apiKey at the second OneCLI call site in src/modules/approvals/onecli-approvals.ts (v2 split the approvals module out of container-runner). - Skipped the companion test-mock commit (`38163bc`) — it patches src/container-runner.test.ts, which no longer exists in v2 (tests consolidated into host-core.test.ts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-Authored-By: johnnyfish <jonathanfishner11@gmail.com>	2026-04-22 15:16:59 +03:00
gavrielc	8e1c8f8f61	style: apply prettier formatting to touched files Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:57:09 +03:00
gavrielc	c8fc1da719	refactor(claude-md): compose per-group CLAUDE.md from shared base + fragments Replace the per-group "written once at init, owned by the group" CLAUDE.md with a host-regenerated entry point that imports: - a shared base (`container/CLAUDE.md` mounted RO at `/app/CLAUDE.md`) - optional per-skill fragments (skills that ship `instructions.md`) - optional per-MCP-server fragments (inline `instructions` field in `container.json`) - per-group agent memory (`CLAUDE.local.md`, auto-loaded by Claude Code) Principle: RW = per-group memory, RO = shared content. Source/skills/base are shared; personality, config, working files, and Claude state stay per-group. Key changes: - New `src/claude-md-compose.ts` — per-spawn composition + `migrateGroupsToClaudeLocal()` one-time cutover. - New `container/CLAUDE.md` — shared base, seeded verbatim from the former `groups/global/CLAUDE.md`. - `src/container-runner.ts` — swap `/workspace/global` mount for RO `/app/CLAUDE.md`; call `composeGroupClaudeMd()` after `initGroupFilesystem()`. - `src/group-init.ts` — drop `.claude-global.md` symlink + initial `CLAUDE.md` write; seed `CLAUDE.local.md` from `opts.instructions`. - `src/index.ts` — call `migrateGroupsToClaudeLocal()` at startup. - `src/container-config.ts` — add optional `instructions` field to `McpServerConfig` (inline per-MCP guidance fragment). - `container/Dockerfile` — drop dead `/workspace/global` mkdir. - Remove obsolete `scripts/migrate-group-claude-md.ts`. Migration (runs once at host startup, idempotent): - Delete `.claude-global.md` symlinks in each group. - Rename each `groups/<folder>/CLAUDE.md` → `CLAUDE.local.md` (preserves existing per-group content as memory). - Delete `groups/global/` directory. Design docs: `docs/claude-md-composition.md` and `docs/shared-source.md` (the latter is the sibling design discussion this refactor builds on). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:58:43 +03:00
exe.dev user	8a12fa61ac	refactor: shared source — replace per-group agent-runner copies with single RO mount Replace the per-group agent-runner-src copy model with a single shared read-only mount. Source and skills are now RO + shared; personality, config, working files, and Claude state stay RW + per-group. Key changes: - Mount container/agent-runner/src/ RO at /app/src (all groups share one copy) - Mount container/skills/ RO at /app/skills; per-group skill selection via symlinks in .claude-shared/skills/ based on container.json "skills" field - Mount container.json as nested RO bind on top of RW group dir - Move all NANOCLAW_* env vars to container.json (runner reads at startup) - New runner config.ts module replaces process.env reads - Move command gate (filtered/admin) from container to host router - Dockerfile: remove source COPY, split CLI installs (claude-code last), move agent-runner deps above CLIs for better layer caching - Add writeOutboundDirect for router denial responses - Design doc at docs/shared-src.md Not included (follow-up): DB migration to drop agent_provider columns, cleanup of orphaned agent-runner-src directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-22 12:58:43 +03:00
gavrielc	1858ef35f0	Merge pull request #1908 from qwibitai/setup-auto feat(setup): scripted branded setup flow (nanoclaw.sh)	2026-04-22 03:06:30 +03:00
gavrielc	416fe01855	refactor(setup): drop CLI-bonus wiring from init-first-agent init-first-agent used to double-wire the CLI channel to every new DM agent as a convenience for `pnpm run chat`, gated by --no-cli-bonus. With the /new-setup-2 flow gone and a dedicated scratch CLI agent created earlier in setup:auto, that bonus just stomps on CLI routing the user already set up. Remove the CLI_CHANNEL/CLI_PLATFORM_ID constants, ensureCliMessagingGroup, the --no-cli-bonus flag, and the cli-bonus wiring block. Pass the paired user's identity through to the welcome delivery so the sender resolver sees the real owner (e.g. telegram:<id>) instead of cli:local. Extend the CLI channel's admin-transport payload to accept optional sender/senderId overrides — falls back to the old cli/cli:local defaults when omitted, so existing callers are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:13:22 +03:00
Dave Kim	91c668e0cc	fix: persist SDK session_id on init + split long messages before adapter truncation Two related bugs that surfaced together when a Discord response exceeded 2000 chars: 1. Session id lost on mid-turn container exit. `runPollLoop` was calling `setStoredSessionId` only after `processQuery` returned. If the container died between the SDK's `init` event (where session_id arrives) and the stream completing, the id was never persisted. The next wake called `getStoredSessionId()` → undefined and started a fresh Claude session, dropping all prior context. Fix: persist immediately in the `init` branch inside `processQuery`. The existing post-query store becomes a harmless no-op. 2. Silent truncation past adapter limits. `chat-sdk-bridge.deliver` handed full text straight to `adapter.postMessage`. Discord's adapter hard-truncates at 2000 chars; Telegram's at 4096. Responses longer than that were cut off without any signal to the user or host. Fix: add `maxTextLength` to `ChatSdkBridgeConfig` and a `splitForLimit` helper that breaks on paragraph → line → hard-char boundaries, then posts chunks sequentially. Files ride on the first chunk; the returned id is the first chunk's so edits and reactions still target the reply head. Channel adapter files (Discord, Telegram, …) live on the `channels` branch — a companion PR wires `maxTextLength: 1900` for Discord and `4000` for Telegram so the splitter actually engages in those installs. Without wiring, behavior is unchanged.	2026-04-21 13:04:57 +00:00
gavrielc	01ffce6f74	Revert "fix(permissions): welcome new approved channels via /welcome, route to them" This reverts commit `9776dd4f32`.	2026-04-21 15:20:06 +03:00
Koshkoshinsk	9776dd4f32	fix(permissions): welcome new approved channels via /welcome, route to them When the unknown-channel approval flow completes, seed a /welcome task into the newly-wired session so the agent greets the new user on first contact. The replayed /start (Telegram's default first-message) is filtered by the agent-runner's command-command filter, so without an explicit onboarding trigger the first interaction produced nothing. Pin the destination by its local_name from agent_destinations to avoid the agent picking the wrong named destination (previously it greeted the owner, whose DM is in CLAUDE.md). Also guard dispatchResultText against echoing trailing status lines when the agent has already sent messages explicitly via send_message. Otherwise a task-triggered flow that calls send_message then emits "welcome message sent" produces a duplicate chat to the recipient. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 11:40:12 +00:00
gavrielc	d8d61d3695	fix: Teams user-id prefix + defer cli:local owner grant parseUserId now falls back to user.kind when the id prefix isn't a registered adapter — Teams uses `29:` rather than `teams:`, so the literal prefix wouldn't resolve the channel adapter for cold DMs. init-cli-agent no longer claims the first-owner slot on `cli:local`. The CLI identity is scratch; owner promotion belongs to init-first-agent once the real channel user is wired.	2026-04-21 10:16:13 +03:00
gavrielc	0f6a1ba1ed	style: apply prettier formatting to touched files Pre-commit hook reflowed imports on files changed in the previous commit. Unrelated format drift on other files intentionally left unstaged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 23:31:42 +03:00
gavrielc	6c26c0413a	feat(router,cli): replyTo override + CLI admin-transport flows - InboundEvent gains an optional replyTo; router stamps the row's address fields from it when set, so replies can route to a different channel than the one the inbound came in on. - ChannelSetup adds onInboundEvent for admin-transport adapters that build the full event themselves. - CLI wire format accepts {text, to, reply_to}. Routed messages go through onInboundEvent and do not evict an active chat client. - init-first-agent hands the DM welcome to the running service via data/cli.sock — synchronous wake, no sweep wait. Fails loudly if the service is down; no silent fallback. - Split the CLI scratch-agent bootstrap into scripts/init-cli-agent.ts; init-first-agent is DM-only. Agents cannot set replyTo: it lives only on the inbound/router seam and is consumed once when writing messages_in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 23:30:47 +03:00
gavrielc	719f97e483	feat(permissions): unknown-channel registration flow with owner approval When the router sees a mention or DM on a messaging group that isn't wired to any agent, it now escalates to an owner for approval instead of silently dropping. Mirrors the existing unknown-sender approval pattern (ACTION-ITEMS item 22). Schema (migration 012): - `messaging_groups.denied_at TEXT NULL` — timestamp set on deny so future mentions stop escalating. ALTER TABLE ADD COLUMN, FK-safe (unlike the rebuild that bit migration 011). - `pending_channel_approvals` — PK on `messaging_group_id` gives free in-flight dedup. One card per channel, no spam on rapid retries. Router: - New hook `setChannelRequestGate(mg, event) => Promise<void>`, invoked from the no-wirings branch when the message was addressed to the bot (isMention=true). Hook is fire-and-forget. - Checks `mg.denied_at` before escalating — denied channels drop silently and do not re-prompt. - The two "no-wirings" branches (fresh auto-create and existing mg with no agents) are consolidated into one escalation path that calls the gate once. Without the module, behavior is log + record (no regression). Permissions module: - `channel-approval.ts::requestChannelApproval` — MVP picker: target agent is `getAllAgentGroups()[0]`, card names it explicitly ("Wire it to <Andy>?"). Approver via existing `pickApprover` + `pickApprovalDelivery` primitives. - Response handler: same click-auth pattern as sender-approval (clicker must be the designated approver OR have admin privilege over the target agent group). - Approve defaults per the feature spec: engage_mode = 'mention-sticky' for groups, 'pattern' + '.' for DMs sender_scope = 'known' ignored_message_policy = 'accumulate' session_mode = 'shared' DM vs group inferred from the original event's threadId (non-null → group) because the auto-created mg has a placeholder is_group=0 until the adapter fills it in. - Triggering sender is auto-added to agent_group_members so sender_scope= 'known' doesn't bounce the replayed message into a sender-approval cascade. - Deny: stamps messaging_groups.denied_at, clears pending row. - Failure modes — no owner, no agent groups, no reachable DM — log and drop without creating a pending row, letting a future attempt try again (same as sender-approval). 9 new integration tests cover every branch: mention triggers card, DM triggers card, dedup, approve creates correct wiring + admits sender + replays, approve-on-DM uses pattern/'.' defaults, deny sets denied_at and future mentions drop silently, unauthorized clicker rejected, no-owner drops, no-agent-groups drops. 168 tests pass (was 159; +9). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 14:34:00 +03:00
gavrielc	a4061a0012	refactor(channels,router): move all policy to router; bridge is transport Follow-up to `b159722`. That shrank the bridge's shouldEngage to a flood gate + coarse sticky-subscribe signal. This completes the move — policy lives exclusively in the router, the bridge is transport-only, and the conversations map + ChannelSetup.conversations + ChannelAdapter.updateConversations are all gone. Key shifts: 1. Subscribe moves from bridge to router. Bridge used to call `thread.subscribe()` from its onNewMention / onDirectMessage handlers based on a coarse "any mention-sticky wiring exists on this channel" check. That forced the decision before the router could apply per-wiring engage logic, and it relied on the conversations map being current (staleness risk). ChannelAdapter gains `subscribe?(platformId, threadId)`. The Chat SDK bridge implements it via SqliteStateAdapter.subscribe(threadId) (idempotent — a repeat call on an already-subscribed thread is a no-op). The router's fan-out loop calls it once per message when the first mention-sticky wiring actually engages. Precise, not coarse. 2. Short-circuit the drop path with one combined query. New `getMessagingGroupWithAgentCount(channelType, platformId)` does the messaging_groups lookup AND counts wirings in a single SELECT, using the existing UNIQUE(channel_type, platform_id) index on messaging_groups and UNIQUE(messaging_group_id, agent_group_id) on messaging_group_agents for the JOIN. No new indexes needed. routeInbound now short-circuits: - No messaging_groups row AND not addressed (no mention/DM) → return silently. One DB read, nothing written. This is the Discord-bot-in-a-big-guild case; we no longer auto-create rows for every plain message in every channel the bot can see. - Messaging group exists but no wirings AND not addressed → return silently. One DB read. - Otherwise fall through to sender resolution + fan-out as before. Behavioral change: plain chatter on unwired channels no longer gets dropped_messages audit rows, which used to bloat the table. Audit still fires on addressed-to-bot drops where the admin cares ("someone @-mentioned us but nobody's wired"). 3. Bridge is now purely transport. Deleted entirely: ConversationConfig, ChannelSetup.conversations, ChannelAdapter.updateConversations?, bridge's `conversations` map, buildConversationMap, shouldEngage, EngageSource, engageDecision, bridge.updateConversations method, src/index.ts buildConversationConfigs. Four handlers reduce to "resolve channel id, build InboundMessage with isMention, call onInbound". Net ~130 LOC deleted from the bridge. Collateral: the conversations-map staleness problem is gone. The upcoming channel-registration feature doesn't need any map-refresh plumbing — when an approval creates a new wiring, the next message hits the DB fresh and just works. Bridge tests prune to the narrow platform-adjacent surface (openDM delegation, subscribe presence). Host-core test that asserted the old "auto-create on every unknown message" behavior updates to reflect the new escalation-gated semantics: plain messages on unknown channels don't auto-create, mentions do. 159 tests pass (was 172 — net -13, almost entirely from bridge-engage-mode tests that covered logic now owned by the router and exercised through host-core.test.ts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 13:55:49 +03:00
gavrielc	b15972284b	refactor(channels): shrink bridge shouldEngage to flood gate + subscribe signal Before this change the bridge and the router both owned engage_mode policy. Bridge's shouldEngage had a full switch over mention / mention-sticky / pattern + source-based rules + engage_pattern regex test + ignored_message_policy accumulate fallback. Router's evaluateEngage had the same switch against the same fields. Two parallel logic paths with subtle vocabulary differences (bridge: "which SDK handler fired"; router: "what isMention says"). Every time we touched one we had to reason about the other — the Telegram hasMention bug and the "pattern mode silently drops in group chats" bug were both drift between the two. Refactor to one place. Router keeps all per-wiring policy — engage mode, pattern regex, sender scope, ignored-message policy — unchanged. Bridge drops to a coarse flood gate + subscribe signal: - forward: does this channel have ANY wiring? Forward if yes. Unknown channels still forward for subscribed/mention/dm (they may be newly auto-created, or will trigger the coming channel-registration flow). Unknown channels DROP for new-message so we don't flood from every unsubscribed thread the bot happens to sit in. - stickySubscribe: any mention-sticky wiring on the channel AND the source is mention or dm. Coarse union — subscribe is idempotent and one call serves every sticky wiring. The `text` param on shouldEngage is gone (pattern regex lives in the router now). Four bridge handler sites simplify accordingly. messageToInbound still carries the SDK-confirmed isMention flag through to the router unchanged. Behavioral delta: pure-mention-wired channels (no pattern, no accumulate) will now see every plain group message reach the router before being dropped there, where before the bridge dropped at the transport boundary. Extra DB lookup per dropped message in this specific case; acceptable for the cleaner seam and can be optimized back at the bridge if it ever matters in practice. Bridge tests prune the 10 engage_mode-specific cases that covered logic now owned by evaluateEngage in the router (host-core.test.ts covers it end-to-end through routeInbound). Bridge tests keep only what's bridge-specific: the flood gate and the stickySubscribe coarse union. 172 tests pass (was 182 — net -10 redundant bridge tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 13:32:08 +03:00
gavrielc	68058cbc4a	fix(permissions): authorize unknown-sender approval clicks The approval click handler trusted row.approver_user_id as the actor regardless of who actually clicked the card. A random user who received the forwarded card could click Approve and get the stranger admitted to the agent group — their click was simply not checked. Separately, payload.userId arrives as the raw platform userId from Chat SDK onAction (e.g. "6037840640"), not the namespaced form ("telegram:6037840640") that matches users(id). Without namespacing, users-table lookups miss. Namespace the clicker id with payload.channelType, then authorize: the clicker must be either the designated approver OR have owner / admin privilege over the agent group (hasAdminPrivilege covers owner, global admin, scoped admin). Unauthorized clicks return true (claim the response so the registry doesn't log it as unclaimed) but take no action — the pending row stays in place so a legitimate approver can still act on it. Existing tests passed a pre-namespaced userId directly, masking the first bug. Fixed the fixtures to match production plumbing and added two tests: one asserts a random bystander's click is rejected (row stays pending, no member added), the other asserts a global admin can approve even when they weren't the designated approver. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 12:16:35 +03:00
gavrielc	f74df3b0d3	fix(router): trust SDK isMention signal; drop broken hasMention regex The router's mention / mention-sticky engage check was regex-matching @<agent_group.name> (e.g. @Andy) against message text. Platforms don't work that way — users address bots via the bot's platform username (@nanoclaw_v2_refactr_1_bot on Telegram, user-id mentions on Slack / Discord). The regex matched only coincidentally and never on Telegram, so mention-mode wirings silently never fired there. Two parallel mention detectors existed: the Chat SDK's onNewMention, which correctly resolves the bot's platform identity, and the router's hasMention text regex, which ignored the SDK verdict and invented its own heuristic. The router's detector was wrong in principle — the agent group's display name is a NanoClaw-side nickname, not a platform address. Thread the SDK signal through: InboundMessage gains an optional `isMention` field, the bridge sets it from each handler (onNewMention → true, onDirectMessage → true, onSubscribedMessage → message.isMention, onNewMessage(/./) → false), src/index.ts forwards it into InboundEvent, and evaluateEngage now checks `isMention === true` for mention modes. hasMention deleted entirely — there is only one source of truth for "did the user mention this bot": the platform / SDK. Agent-name-in-text matching for disambiguating multiple agents wired to one chat is a separate feature; users can express it today with engage_mode='pattern' + the agent's name as the regex. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 12:16:20 +03:00
gavrielc	0105de0257	fix(host-sweep): skip ceiling check when heartbeat file is absent decideStuckAction treated a missing heartbeat file as heartbeatAge = Infinity, which always exceeded the 30-minute ceiling. Result: every freshly-spawned container got killed within seconds of spawn on the first sweep pass because it hadn't produced an SDK event yet (heartbeat is only touched on SDK events inside processQuery, not on boot). Skip the ceiling branch when heartbeatMtimeMs === 0. Containers that genuinely never wrote a heartbeat because they died are caught by the separate "container process not running" cleanup path. Containers that boot, claim a message, but hang at the gate are caught by the claim-stuck check below — which correctly fires regardless of heartbeat presence once claimAge exceeds tolerance. Updates the "absent heartbeat → kill-ceiling" test (which was encoding the bug) and adds a companion that the claim-stuck path still fires for absent-heartbeat containers with aged claims. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 12:15:52 +03:00
gavrielc	c38e5b11a8	fix(channels): wire accumulate mode through the bridge The router + session DB were already fully plumbed for ignored_message_policy='accumulate' — fan-out in routeInbound calls deliverToAgent(wake=false) for non-engaging agents on accumulate wirings, writeSessionMessage writes trigger=0, countDueMessages filters trigger=1, container formatter includes all messages regardless of trigger. But the Chat SDK bridge dropped non-engaging messages before the router ever saw them, so accumulate was dead on arrival for every adapter that goes through the bridge. Expose ignored_message_policy on ConversationConfig, project it in buildConversationConfigs, and widen shouldEngage's "forward" decision to "engage OR accumulate" with the union taken across all wirings on a conversation. stickySubscribe stays gated on a real engage — subscribing a thread we'd only silently accumulate on would misrepresent the bot's presence. shouldEngage return shape is now { forward, stickySubscribe } — engage was an internal concept the caller never needed, and conflating it with forward was the source of this bug. 7 new tests cover: non-engaging messages forwarding under accumulate, mixed drop/accumulate wirings taking the union, accumulate not triggering sticky subscribe, unknown-conversation drop precedence over accumulate, and drop policy preserving existing behavior on engaging messages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 11:18:43 +03:00
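The `{ forward, stickySubscribe }` union described above can be sketched roughly as follows. This is a simplified illustration, not the bridge's code: the `Wiring` shape, the `Source` names, and the per-mode rules in `wiringEngages` are assumptions pieced together from the commit messages in this log, and unknown-conversation handling is left out.

```typescript
type EngageMode = 'pattern' | 'mention' | 'mention-sticky';
type IgnoredMessagePolicy = 'drop' | 'accumulate';
// Hypothetical source names; the real EngageSource type may differ.
type Source = 'subscribed' | 'mention' | 'dm' | 'new-message';

interface Wiring {
  engageMode: EngageMode;
  engagePattern?: string;
  ignoredMessagePolicy: IgnoredMessagePolicy;
}

// Assumed per-wiring rule: pattern wirings test the text (an invalid regex
// fails open); mention modes ignore 'new-message', and only mention-sticky
// keeps firing on subscribed threads.
function wiringEngages(w: Wiring, source: Source, text: string): boolean {
  if (w.engageMode === 'pattern') {
    try {
      return new RegExp(w.engagePattern ?? '.').test(text);
    } catch {
      return true;
    }
  }
  if (source === 'new-message') return false;
  if (source === 'subscribed') return w.engageMode === 'mention-sticky';
  return true; // 'mention' or 'dm'
}

function shouldEngage(
  wirings: Wiring[],
  source: Source,
  text: string,
): { forward: boolean; stickySubscribe: boolean } {
  let forward = false;
  let stickySubscribe = false;
  for (const w of wirings) {
    const engages = wiringEngages(w, source, text);
    // Forward when any wiring engages OR wants the message as context.
    if (engages || w.ignoredMessagePolicy === 'accumulate') forward = true;
    // Subscribe only on a real engage by a mention-sticky wiring.
    if (engages && w.engageMode === 'mention-sticky') stickySubscribe = true;
  }
  return { forward, stickySubscribe };
}
```

Keeping `forward` and `stickySubscribe` as separate outputs is what lets an accumulate-only wiring receive a message without the bot subscribing to the thread.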
gavrielc	ce25e1e97c	style(channels): prettier line-wrap in chat-sdk-bridge.test.ts Post-commit reformat picked up by format:fix hook on the previous commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 11:12:40 +03:00
gavrielc	52c6223292	fix(channels): register onNewMessage(/./) to fix pattern mode in group chats Chat SDK dispatch (per handling-events.mdx) is exclusive and prioritized: subscribed → onSubscribedMessage; unsubscribed + mention → onNewMention; unsubscribed + pattern match → onNewMessage. We never registered the third, so engage_mode='pattern' silently dropped every message in unsubscribed group threads — the SDK simply never surfaced them anywhere. Register chat.onNewMessage(/./, …) and route it through shouldEngage with a new 'new-message' source. Unknown-conversation policy drops for this source (would otherwise flood from every unwired channel the bot can see). mention / mention-sticky wirings ignore 'new-message' — they require an explicit @mention to start a conversation. Pattern wirings evaluate normally. Extracted shouldEngage from a closure to an exported function with an EngageSource type so it's unit-testable. Added 17 tests covering every source × engage-mode combination, unknown-conversation behavior, invalid regex fail-open, and multi-wiring union. Accumulate (ignored_message_policy='accumulate') is still not plumbed — the bridge drops non-engaging messages entirely instead of forwarding them as context-only. That requires a trigger: 0 | 1 field on InboundMessage → router → writeSessionMessage (schema already has the column). Separate change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 11:11:56 +03:00
gavrielc	57e0cda9e5	Revert "fix(channels): pre-subscribe group threads for pattern / accumulate wirings" This reverts commit `73b20880ff`.	2026-04-20 10:35:33 +03:00
gavrielc	73b20880ff	fix(channels): pre-subscribe group threads for pattern / accumulate wirings The engage modes shipped in #1869 included `pattern` (regex match any message) and the `accumulate` ignored-message policy, but neither could fire in group chats because Chat SDK only surfaces: - DMs (onDirectMessage) - @mentions in unsubscribed threads (onNewMention) - every message in subscribed threads (onSubscribedMessage) A bot sitting in a Discord/Slack channel hears nothing from a plain message unless the thread is already subscribed. So `pattern '.'` on a group wiring → silent. `pattern /urgent/i` → silent. `mention + accumulate` → the non-mention messages that should be stored as context were never received, so nothing to accumulate. Fix: call `chat.subscribe(platformId)` at setup time for every wiring whose `engageMode === 'pattern'` or `ignoredMessagePolicy === 'accumulate'`. Failures logged + swallowed per-conversation so one un-subscribable channel doesn't crash startup. ## Knock-on: SDK stops firing onNewMention once subscribed Per SDK types:1468, `onNewMention` only fires in unsubscribed threads. Once we pre-subscribe a channel for a pattern wiring, subsequent mentions arrive as `onSubscribedMessage` with `message.isMention === true`. Before: a `mention` wiring coexisting with a `pattern` wiring in the same channel would silently stop firing after pre-subscribe. After: `shouldEngage` accepts the `isMention` flag independently from `source`, so the `mention` mode matches on (dm OR mention-new OR subscribed-with-isMention). Source shape changed `'subscribed' | 'mention' | 'dm'` → `'subscribed' | 'mention-new' | 'dm'` to make the "unsubscribed-mention event" distinction explicit. ## New fields - `ConversationConfig.ignoredMessagePolicy` — projected from the messaging_group_agents row so the bridge knows which wirings need pre-subscription. buildConversationConfigs in src/index.ts populates it. Tests: host 153/153, container 46/46. No new tests yet — the subscribe call path needs a Chat mock, deferred. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 10:34:15 +03:00
gavrielc	fca3d8de70	fix(migrations): drop 011 table-rebuild; keep only pending_sender_approvals The original 011 also rebuilt `messaging_groups` to flip the `unknown_sender_policy` column DEFAULT from "strict" to "request_approval". On live DBs the DROP TABLE step fails SQLite's foreign-key integrity check because `sessions`, `user_dms`, and `pending_sender_approvals` all reference `messaging_groups(id)`. `PRAGMA foreign_keys=OFF` / `defer_foreign_keys` can't be toggled inside the implicit migration transaction, so the rebuild can't be made to apply cleanly. The default-flip was cosmetic anyway: every `createMessagingGroup` callsite passes `unknown_sender_policy` explicitly. Router auto-create was already updated to hardcode "request_approval" (router.ts:151), and setup / seed scripts pick per context. Changes: - Migration 011 now only creates the `pending_sender_approvals` table + index. The rebuild block is gone. - Reference `SCHEMA` in src/db/schema.ts updated to reflect what the DB actually has: DEFAULT 'strict' (from migration 001), with a note about the effective policy applied at insert sites. Discovered on v2 post-merge during live restart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 10:08:35 +03:00
gavrielc	9882c94530	fix(channels): use Chat SDK ChatMessage.text, not .content The engage-mode gating added in #1869 read `message.content` from the Chat SDK's ChatMessage in all three inbound handlers (onSubscribedMessage, onNewMention, onDirectMessage). ChatMessage exposes the user-visible string as `.text` — `.content` exists on the underlying nested structure but isn't the plain-text field. Result: `shouldEngage` always saw an empty string, pattern gating never matched, non-wildcard regex wirings silently dropped every inbound. Fix: use `message.text` in all three gates. Discovered during live smoke-test on v2 post-merge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 10:08:35 +03:00
gavrielc	622a370815	feat(permissions): unknown-sender request_approval flow + flipped default policy When an unknown sender writes into a wired messaging group, surface the situation to an admin instead of silently dropping. Flow: 1. Router → access gate → handleUnknownSender (policy='request_approval') 2. Fire-and-forget requestSenderApproval: pickApprover + pickApprovalDelivery pick a reachable admin DM; deliver an Approve / Deny card; insert a pending_sender_approvals row carrying the original InboundEvent JSON. 3. In-flight dedup: UNIQUE(messaging_group_id, sender_identity) — a retry from the same stranger while pending is silently dropped, not re-carded. 4. Admin clicks → Chat SDK bridge → onAction → host response-registry. The new handleSenderApprovalResponse in the permissions module claims responses whose questionId matches a pending_sender_approvals row. 5. approve: addMember(stranger, agent_group) + replay the stored event via routeInbound — the second attempt clears the gate because the user is now known. 6. deny: delete the pending row. No denial persistence (ACTION-ITEMS item 5 decision) — a future attempt triggers a fresh card. Schema: - Migration 011 adds pending_sender_approvals (id, mg_id, agent_group_id, sender_identity, sender_name, original_message JSON, approver_user_id, created_at, UNIQUE(mg_id, sender_identity)). - Also flips messaging_groups.unknown_sender_policy default from 'strict' to 'request_approval' (rebuild-table). Existing rows unchanged — only the default applied to new rows flips. - Router auto-create for unknown platform/chat drops the hardcoded 'strict' override; schema default applies. - src/db/schema.ts reference updated to match. Why default-flip: users wire their DM during setup and don't discover that 'strict' means "silent drop of everyone not in user_roles/members". The approval flow is the safe default — the admin sees the stranger, explicitly decides. 'public' stays opt-in for truly open channels. 
Failure modes (row NOT created so a future attempt can try again): - No eligible approver configured (fresh install before first owner). - No reachable DM for any approver. - Delivery adapter missing. Tests (src/modules/permissions/sender-approval.test.ts, 4 cases): - First unknown message → card delivered + row created - Retry while pending → dedup'd (1 card, 1 row) - Approve → member added + message replayed + container woken - Deny → row cleared + no member added Closes: ACTION-ITEMS item 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:36:11 +03:00
gavrielc	16b9499532	feat(routing): engage modes + sender scope + accumulate/drop + per-agent fan-out Replaces the opaque trigger_rules JSON + response_scope enum on messaging_group_agents with four explicit orthogonal columns: engage_mode 'pattern' | 'mention' | 'mention-sticky' engage_pattern regex source; required when mode='pattern'; '.' is the "always" sentinel sender_scope 'all' | 'known' ignored_message_policy 'drop' | 'accumulate' Inbound routing becomes a fan-out — every wired agent is evaluated independently. A match gets its own session + container wake. A miss with accumulate keeps the message as context-only (trigger=0) in that agent's session, so when the agent does eventually engage it sees the prior chatter. ## Schema - Migration 010 (`engage-modes`): adds the 4 new columns, backfills from trigger_rules.pattern + requiresTrigger + response_scope, drops the legacy columns. - messages_in gains `trigger INTEGER NOT NULL DEFAULT 1` (session DB schema + `migrateMessagesInTable` forward-compat). - countDueMessages gates waking on `trigger = 1`. ## Routing - `pickAgent` (returns one) → loop over all wired agents. Per agent: evaluate engage_mode; run access gate + sender-scope gate; on full match → resolveSession + writeSessionMessage(trigger=1) + wake. On miss with accumulate → writeSessionMessage(trigger=0), no wake. On miss with drop → skip. - New `findSessionForAgent(agentGroupId, mgId, threadId)` scopes session lookup by agent so fan-out doesn't cross sessions. - `messageIdForAgent` namespaces inbound message ids by agent_group_id so PRIMARY KEY doesn't collide across per-agent session DBs. ## Adapter layer - `ConversationConfig` replaces `triggerPattern` + `requiresTrigger` with `engageMode` + `engagePattern`.
- Chat SDK bridge stores `Map<platformId, ConversationConfig[]>` (multi- agent per conversation) and applies union gating pre-onInbound: * onSubscribedMessage: engage if any wiring keeps firing in subscribed state (mention-sticky or pattern) * onNewMention: engage on mention; only subscribes the thread if at least one wiring is `mention-sticky` * onDirectMessage: engage per mode; sticky follows same rule - Bridge no longer unconditionally calls `thread.subscribe()`. ## Sender scope - Permissions module registers a second hook `setSenderScopeGate` that runs per-wiring after the existing access gate. `sender_scope='known'` requires canAccessAgentGroup(); `'all'` is a no-op. Not installed → no-op everywhere (default allow). ## Container side - Host passes `NANOCLAW_MAX_MESSAGES_PER_PROMPT` (reuses existing MAX_MESSAGES_PER_PROMPT config; was dead code from v1). - `getPendingMessages` queries `ORDER BY seq DESC LIMIT N`, reverses to chronological order for the prompt — accumulated context rides along with trigger rows up to the cap. - `MessageInRow` gains `trigger: number` so the container can tell them apart in downstream code (container still processes both; only the host uses `trigger=0` for don't-wake). ## Defaults (per ACTION-ITEMS item 1 decision) - DM (is_group=0): `engage_mode='pattern'`, `engage_pattern='.'` (always) - Threaded group: `engage_mode='mention-sticky'` (seed-discord) - Non-threaded group / CLI: pattern '.' in bootstrap scripts ## Tests - src/host-core.test.ts: 3 new cases — fan-out (2 agents, 2 sessions, 2 wakes), accumulate (trigger=0 + no wake), drop (no session created). - Existing 10 host-core tests still pass. - Migration 010 runs on an empty DB in 0-row path — verified. Closes: ACTION-ITEMS items 1, 4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:30:04 +03:00
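The container-side windowing described above (newest N rows by `seq`, then reversed to chronological order) can be illustrated in memory. This is a sketch under assumptions: the real `getPendingMessages` runs a SQL query, and `selectPromptWindow` and `hasDueMessages` are hypothetical names standing in for it and for the host-side `trigger = 1` gate.

```typescript
// Minimal row shape; the real MessageInRow has more columns.
interface MessageInRow {
  seq: number;
  content: string;
  trigger: number; // 1 = wakes the agent, 0 = context-only (accumulated)
}

// In-memory equivalent of `ORDER BY seq DESC LIMIT N`, then reversed so the
// prompt reads oldest to newest.
function selectPromptWindow(rows: MessageInRow[], maxMessages: number): MessageInRow[] {
  const newestFirst = [...rows].sort((a, b) => b.seq - a.seq).slice(0, maxMessages);
  return newestFirst.reverse();
}

// Host-side gate: only trigger rows count as due work.
function hasDueMessages(rows: MessageInRow[]): boolean {
  return rows.some((r) => r.trigger === 1);
}
```

Because the cap is applied to the newest rows first, accumulated context rides along only as far as the limit allows, and the oldest context is the first to fall out of the prompt.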
gavrielc	6a815190c0	feat(lifecycle): stuck detection + heartbeat lifecycle + SDK tool blocklist Replaces the two overlapping old mechanisms (30-min setTimeout kill in container-runner, 10-min heartbeat STALE_THRESHOLD reset in host-sweep) with message-scoped stuck detection anchored to the processing_ack claim age + an absolute 30-min ceiling that extends for long-declared Bash tools. Old model problems: - IDLE_TIMEOUT setTimeout fired on plain wall-clock time; slow-but-alive agents got killed at 30min regardless of activity - 10-min STALE_THRESHOLD in the sweep was unreliable — the heartbeat is only touched on SDK events, so legitimate silent tool work (sleep 30, long WebFetch, npm install) looked identical to a hung container - Two overlapping sources of truth for "when to let go of a container" New model: - Host sweep is the single source of truth. - Container exposes a new `container_state` single-row table in outbound.db (schema added; container writes, host reads). PreToolUse hook writes current_tool + tool_declared_timeout_ms (read from Bash's tool_input); PostToolUse / PostToolUseFailure clear it. - Sweep decides with a pure helper `decideStuckAction`: * absolute ceiling — kill if heartbeat age > max(30min, bash_timeout) * per-claim stuck — kill if any processing_ack row has claim_age > max(60s, bash_timeout) AND heartbeat hasn't been touched since claim * otherwise ok Kill paths reset leftover processing rows with exponential backoff, reusing the existing retry machinery. Tool blocklist expanded: - AskUserQuestion (SDK placeholder; we have mcp__nanoclaw__ask_user_question) - EnterPlanMode, ExitPlanMode, EnterWorktree, ExitWorktree (Claude Code UI affordances; would hang in headless containers) PreToolUse hook is also defense-in-depth: if a disallowed tool name slips through, it returns `{ decision: 'block' }` so the agent sees a clear error instead of appearing stuck. 
Removed: - container-runner.ts: IDLE_TIMEOUT setTimeout, resetIdle callback on activeContainers entry, resetContainerIdleTimer export. - delivery.ts: the resetContainerIdleTimer call on successful delivery. - poll-loop.ts: IDLE_END_MS + its setInterval. Keeping the query open is cheaper than close+reopen (no cold prompt cache). Liveness is now a host-side concern. - host-sweep.ts: 10-min STALE_THRESHOLD_MS + getStuckProcessingIds in the stale-detection path (still exported for kill reset). Tests: - src/host-sweep.test.ts — 9 tests for decideStuckAction covering: fresh heartbeat, absolute ceiling, absent heartbeat, Bash-timeout extension (both ceiling and per-claim), claim age below tolerance, heartbeat touched after claim, unparseable timestamps. Ref: docs/v1-vs-v2/ACTION-ITEMS.md items 9, 6a, 10. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:16:57 +03:00
gavrielc	dcfa12ea06	feat(timezone): recreate v1 TZ-aware formatting + scheduling behavior The agent needs to perceive times in the user's timezone, not UTC. Dropping this in the v1→v2 port produced a class of bugs where the agent would schedule tasks for the wrong hour, suggest dinner at midnight, etc. This restores v1 parity. Container side: - New container/agent-runner/src/timezone.ts mirrors src/timezone.ts with isValidTimezone / resolveTimezone / formatLocalTime, plus: * TIMEZONE constant resolved at load from process.env.TZ (host sets this from src/container-runner.ts:254) * parseZonedToUtc(input, tz) — treats a naive ISO as wall-clock time in `tz`, returns the corresponding UTC Date. Strings with Z or offset are passed through. - formatter.ts: * formatMessages() now prepends <context timezone="IANA"/>\n — matches v1 src/v1/router.ts:20-22 * formatSingleChat uses formatLocalTime(ts, TIMEZONE) instead of a home-rolled HH:MM 24h formatter → outputs like "Jun 15, 2026, 8:00 AM" * reply_to="<id>" attribute + <quoted_message from="X">Y</quoted_message> element — matches v1 format exactly; old <reply-to/> shape is gone * stripInternalTags() exported for the dispatch path to reuse - poll-loop.ts uses the exported stripInternalTags() instead of inline regex. - mcp-tools/scheduling.ts: * schedule_task/update_task descriptions now explicitly document that processAfter accepts either UTC or naive local time (interpreted in the user's TZ from the context header) * handlers normalize through parseZonedToUtc() and store a UTC ISO Host side: - src/modules/scheduling/recurrence.ts passes { tz: TIMEZONE } to CronExpressionParser.parse. Without this, "0 9 * * *" fires at 09:00 UTC instead of 09:00 user-local — this was the v1 behavior (src/v1/task-scheduler.ts:20-49). 
Tests: - container/agent-runner/src/timezone.test.ts — mirror of src/timezone.test.ts + new parseZonedToUtc cases - container/agent-runner/src/formatter.test.ts — context header, reply_to, quoted_message, XML escaping, stripInternalTags (ported from v1 formatting.test.ts) - src/modules/scheduling/recurrence.test.ts — cron TZ respected, completed rows only cloned when recurrence is set Ref: docs/v1-vs-v2/ACTION-ITEMS.md item 18 + timezone-formatting-v1-recreation.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:09:14 +03:00
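One way `parseZonedToUtc(input, tz)` could behave as described (a naive ISO string is read as wall-clock time in `tz`; strings carrying `Z` or an offset pass through) is sketched below using only `Intl.DateTimeFormat`. This is an assumption-laden illustration, not the code in `timezone.ts`: the helper `tzOffsetMs`, the accepted input pattern, and the handling near DST transitions are all choices made for this example.

```typescript
// Offset of `tz` from UTC at `instant`, in ms (positive east of UTC).
// Hypothetical helper, not part of the described module.
function tzOffsetMs(instant: Date, tz: string): number {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone: tz,
    hour12: false,
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', second: '2-digit',
  }).formatToParts(instant);
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? '0');
  // Some engines print midnight as hour 24, hence the modulo.
  const wallAsUtc = Date.UTC(
    get('year'), get('month') - 1, get('day'),
    get('hour') % 24, get('minute'), get('second'),
  );
  return wallAsUtc - instant.getTime();
}

function parseZonedToUtc(input: string, tz: string): Date {
  // Strings that already carry Z or a numeric offset pass through unchanged.
  if (/(Z|[+-]\d{2}:?\d{2})$/i.test(input)) return new Date(input);

  const m = /^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(?::(\d{2}))?$/.exec(input);
  if (!m) throw new Error(`Unsupported timestamp: ${input}`);

  // Read the wall-clock fields as if they were UTC, then correct by the
  // zone's offset. A second pass settles most cases near a DST change.
  const guess = Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +(m[6] ?? '0'));
  let utc = guess - tzOffsetMs(new Date(guess), tz);
  const refined = tzOffsetMs(new Date(utc), tz);
  utc = guess - refined;
  return new Date(utc);
}
```

For example, `parseZonedToUtc('2026-06-15T08:00:00', 'America/New_York')` resolves to 12:00 UTC, since New York is on UTC-4 in June.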
gavrielc	0283391e0a	chore(config): remove dead POLL_INTERVAL / SCHEDULER_POLL_INTERVAL / IPC_POLL_INTERVAL These three constants were carried over from v1's polling + IPC architecture and have zero callers in the v2 runtime: - POLL_INTERVAL (2000ms) — v1 message loop; replaced by event-driven delivery + delivery.ts's ACTIVE_POLL_MS (hardcoded 1000ms) - SCHEDULER_POLL_INTERVAL (60000ms) — v1 task scheduler; replaced by host-sweep.ts's SWEEP_INTERVAL_MS (hardcoded 60_000) - IPC_POLL_INTERVAL (1000ms) — v1 file-based IPC; meaningless in v2's session-DB architecture Grep confirms no imports in src/, container/, or tests. Docs/SPEC.md updated to match. Ref: docs/v1-vs-v2/ACTION-ITEMS.md item 15. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:01:47 +03:00
gavrielc	47950671fa	docs: add v1→v2 action-items analysis + SDK signal probe tool - docs/v1-vs-v2/: full v1→v2 regression analysis (SUMMARY + 21 per-module docs + ACTION-ITEMS rollup with decisions + timezone recreation spec). - container/agent-runner/scripts/sdk-signal-probe.ts: empirical harness used to characterise Claude Agent SDK event/hook/stderr timing for the stuck-detection design in item 9. - src/channels/chat-sdk-bridge.ts: document the conversations Map staleness in a code comment; fix deferred to when dynamic group registration lands (ACTION-ITEMS item 17). No runtime behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 01:00:04 +03:00
gavrielc	131fc99700	feat(channels): add CLI channel — talk to your agent from the terminal First default channel that ships with main. Unix-socket adapter + thin client; plugs into the running daemon rather than spawning its own host. ## src/channels/cli.ts - ChannelAdapter with channelType='cli', platformId='local'. - setup() unlinks any stale socket, listens on $DATA_DIR/cli.sock (mode 0600 so only the local user can connect). - On client connect: reads newline-delimited JSON ({"text": "..."}) and calls config.onInbound('local', null, {id, kind:'chat', content, ts}). - deliver() writes {"text": <body>} back to the connected socket; silently no-ops when no client is attached (outbound row still persists). - Single-client policy: a second connection supersedes the first with a [superseded] notice. - teardown() closes the client, closes the server, removes the socket file. ## scripts/chat.ts + pnpm run chat One-shot client: - pnpm run chat <message...> - Connects to the socket, writes one JSON line with the message. - Reads replies; exits 2s after the first reply lands (hard timeout 120s). - ENOENT/ECONNREFUSED prints a hint to start the daemon. ## scripts/init-first-agent.ts - Fix stale imports after earlier module extractions (permissions + agent-to-agent moved their DB helpers into modules/). - After wiring the DM channel, also create cli/local messaging_group (unknown_sender_policy='public' — local socket perms handle auth) and wire it to the same agent. User can `pnpm run chat` immediately. ## package.json - Add "chat": "tsx scripts/chat.ts" script. ## Validation - pnpm run build clean. - pnpm test — 137 host tests pass. - bun test in container/agent-runner — 17 pass. - Service boot logs: "CLI channel listening" + "Channel adapter started channel=cli type=cli". Clean SIGTERM shutdown; socket file removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:51:04 +03:00
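The newline-delimited JSON framing used on the CLI socket can be sketched independently of the socket itself. This is an illustration only: `LineDecoder` and `parseInbound` are hypothetical names, and the actual adapter in `src/channels/cli.ts` may buffer and validate differently.

```typescript
// Accumulates raw chunks and yields complete lines; a partial trailing line
// stays buffered until its newline arrives.
class LineDecoder {
  private buffer = '';

  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop() ?? '';
    return lines.filter((line) => line.trim() !== '');
  }
}

// Extracts the "text" field from one {"text": "..."} line, or returns null
// when the line is not valid JSON or lacks a string text field.
function parseInbound(line: string): string | null {
  try {
    const parsed: unknown = JSON.parse(line);
    if (typeof parsed === 'object' && parsed !== null) {
      const text = (parsed as { text?: unknown }).text;
      if (typeof text === 'string') return text;
    }
  } catch {
    // Malformed JSON is ignored rather than closing the connection.
  }
  return null;
}
```

Splitting on newlines before parsing matters because a socket read can end in the middle of a JSON object; feeding each chunk through `push` and parsing only the returned lines avoids acting on partial messages.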
gavrielc	7169c25e70	refactor: relocate outbox I/O to session-manager + dead-code sweep ## Outbox extraction (delivery.ts → session-manager.ts) File I/O for outbound attachments now lives in session-manager.ts alongside the symmetric inbound extractAttachmentFiles. delivery.ts no longer touches the filesystem — it hands buffers to the adapter and calls clearOutbox on success. - New `readOutboxFiles(agentGroupId, sessionId, messageId, filenames)` and `clearOutbox(agentGroupId, sessionId, messageId)` in session-manager.ts. - deliverMessage in delivery.ts loses ~35 lines of fs/path code and its `fs`/`path` imports. ## Dead-code sweep TypeScript's --noUnusedLocals surfaced several cruft imports. Fixed: - src/container-runner.ts: drop unused `markContainerIdle` import; drop unused `session` parameter from `buildContainerArgs` signature. - src/delivery.ts: drop unused `getSession`, `writeSessionMessage`, `wakeContainer` imports. - src/host-sweep.ts: drop unused `updateSession`, `outboundDbPath` imports. - container/agent-runner/src/poll-loop.ts: drop unused `config`, `processingIds` params from `processQuery`. - Test files: drop unused imports in channel-registry.test, db-v2.test, host-core.test. Skipped: `conversations` state in chat-sdk-bridge.ts (never read but tangled with public `updateConversations` method; cleaning it risks a merge conflict with the channels branch at the next sync). ## Validation - `pnpm run build` clean - `pnpm test` — 137 host tests pass - `bun test` in container/agent-runner — 17 tests pass - Service boots (`NanoClaw running`, `OneCLI approval handler started`) and shuts down cleanly on SIGTERM Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:34:08 +03:00
gavrielc	95fdec335a	refactor(modules): re-tier approvals as default; extract self-mod as optional Promotes approvals to the default tier with a public API (requestApproval + registerApprovalHandler) that other modules consume. Self-modification (install_packages / request_rebuild / add_mcp_server) moves into a new optional module that registers delivery actions + matching approval handlers via the new API. ## Approvals (default tier) - Adds `src/modules/approvals/primitive.ts` exporting `requestApproval`, `registerApprovalHandler`, `notifyAgent`. Absorbs `pickApprover` / `pickApprovalDelivery` / `channelTypeOf` from the deleted `src/access.ts`. - Rewrites `response-handler.ts` to dispatch to registered approval handlers on approve (action-keyed Map). Reject path is centralized. - Drops the three self-mod-specific delivery-action registrations from `approvals/index.ts`; they belong to self-mod now. - `onecli-approvals.ts` now imports picks from the primitive instead of `src/access.ts`. ## Self-mod (optional tier) - New `src/modules/self-mod/` with request handlers (validate input + call requestApproval) and apply handlers (orchestration on approve). - `apply.ts` owns updateContainerConfig + buildAgentGroupImage + killContainer calls. Self-mod depends on approvals (via registerApprovalHandler + requestApproval + notifyAgent) and on core (container-runner, container-config). - Registers 3 delivery actions + 3 approval handlers at import time. ## Other changes - `src/access.ts` and `src/access.test.ts` deleted. Tests split across `src/modules/approvals/picks.test.ts` (approver selection) and `src/modules/permissions/permissions.test.ts` (access + roles + DM). - `src/modules/index.ts` barrel: approvals loads before self-mod so registerApprovalHandler is bound when self-mod registers at import time. 
## Validation - `pnpm run build` clean - `pnpm test` — 137 host tests pass - `bun test` in container/agent-runner — 17 tests pass - Service starts; boot log shows `OneCLI approval handler started`, `NanoClaw running`; clean SIGTERM shutdown Resolves the transitional tier violation flagged in PR #5 where core imported from the permissions optional module via `src/access.ts`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 19:41:26 +03:00
gavrielc	46b19dcf9c	refactor(modules): extract agent-to-agent as registry-based module Last extraction of Phase 3. Moves inter-agent messaging + create_agent + destination projection into src/modules/agent-to-agent/. Core retains: - `channel_type === 'agent'` dispatch in delivery.ts, guarded by hasTable('agent_destinations') + dynamic import into module. - Channel-permission ACL in delivery.ts, guarded by hasTable, with inlined SQL (no module import from core). - writeDestinations call in container-runner.ts, guarded by hasTable + dynamic import into module. - createMessagingGroupAgent's destination side effect in db/messaging-groups.ts, guarded by hasTable. This is a documented transitional tier violation (core imports from optional module), analogous to src/access.ts. Migration `004-agent-destinations.ts` renamed to `module-agent-to-agent- destinations.ts` preserving `name: 'agent-destinations'` so existing DBs don't re-run it. delivery.ts: 600 → 449 lines. handleSystemAction's last switch case gone (just registry + default log-and-drop). notifyAgent helper removed (only create_agent used it). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 19:00:10 +03:00
gavrielc	32bcc2c5ae	refactor(permissions): preserve pre-PR behavior in three spots PR #5 review flagged three behavior changes that shouldn't have slipped in. This commit reverts each to match the pre-refactor behavior exactly. 1. User upsert ordering. Split the router hook into two setters: setSenderResolver (runs before agent resolution) and setAccessGate (runs after). Restores the pre-PR sequence where the users row is upserted even if the message is dropped by wiring or trigger rules. 2. dropped_messages audit. Moved src/modules/permissions/db/dropped-messages.ts back to src/db/dropped-messages.ts. The table is core audit infra, not permissions-specific. Router re-writes rows for no_agent_wired and no_trigger_match; the access gate writes rows for policy refusals. 3. Permissionless container fallback. Dropped. poll-loop restores the original deny-all check when NANOCLAW_ADMIN_USER_IDS is empty. Module contract doc updated with the two-hook shape. Validation: host build clean, 137/137 host tests, 17/17 container tests, typecheck clean, service boots to "NanoClaw running" with permissions module registering both hooks and clean SIGTERM shutdown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 18:00:10 +03:00

352 Commits