nanoclaw

Author	SHA1	Message	Date
gavrielc	d2151ae848	Merge branch 'main' into fix/session-manager-attachment-extensions	2026-04-30 10:39:50 +03:00
gavrielc	6e5e568da1	sanitize agent sent file names to prevent path traversal	2026-04-30 10:33:46 +03:00
gavrielc	2a3be9ec7f	extract attachment-naming, harden mimeType guard, add tests Move the MIME/type-to-extension maps and derivation helpers out of session-manager.ts into a dedicated attachment-naming module — keeps session-manager focused on session lifecycle and gives the helpers a natural home for unit tests alongside the existing attachment-safety module. Two small fixes alongside the extraction: - extForMime now guards `typeof mime !== 'string'` before .split, so a buggy bridge passing `mimeType: { ... }` (object) no longer crashes the inbound write loop. - deriveAttachmentName computes Date.now() once per call instead of twice, and tightens the explicit-name check to a string-and-truthy guard so non-string values fall through to derivation. Adds attachment-naming.test.ts with 11 cases covering MIME normalization (case + parameters), Telegram type fallback, the non-string defensive guard, and the bare-timestamp fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 09:41:24 +03:00
robbyczgw-cla	b9d302524e	fix(session-manager): derive attachment extension from mimeType and att.type When a channel bridge passes an attachment without an explicit `name`, extractAttachmentFiles fell back to `attachment-<ts>` with no extension. Agents could not tell whether the file was a JPEG, PDF, or audio clip, and tools keyed on extension (image viewers, exiftool, etc.) misbehaved. Two cases are now covered: 1. Channels that set `mimeType` but no `name` (Discord/Slack documents, Telegram document uploads). A small MIME-to-extension table covers the common content types: image/*, audio/*, video/*, pdf, zip, txt, json. Unknown MIMEs fall back to the unsuffixed name. 2. Channels that set `att.type` but no `mimeType` (Telegram photos, stickers, voice, animations). The chat-sdk bridge sets a coarse media-class (`photo` / `sticker` / `voice` / `video` / `animation`) which is reliable enough to derive a canonical extension. Telegram GIFs are MP4 under the hood. The existing isSafeAttachmentName security guard is preserved: the derived name still passes through it before disk I/O. The new lookup tables emit static values from internal maps and cannot construct a path-traversal payload; attacker-controlled att.name continues to flow through the same validator.	2026-04-29 15:01:09 +00:00
gavrielc	3c620bc8d0	Merge branch 'fix/credential-failure-ux' of https://github.com/qwibitai/nanoclaw into fix/credential-failure-ux	2026-04-29 17:52:17 +03:00
gavrielc	d5b48e4742	fix(credentials): address review feedback - wakeContainer now never throws — returns Promise<boolean>, catches internally. Closes the regression risk for the 5 awaited callers in agent-to-agent, interactive, and approvals/response-handler that the previous version left unwrapped. Router uses the boolean to stop the typing indicator on transient failure; host-sweep just awaits. - Tighten AUTH_REQUIRED_RE: anchor to start-of-string with the specific `·` (U+00B7) separator the CLI uses, so an agent that quotes the banner mid-sentence in a normal reply doesn't trip the classifier. - Log a one-line note from writeAuthRequiredMessage so substitutions are visible when debugging "user got the credentials message but I don't see why." - Add unit tests for ClaudeProvider.isAuthRequired covering both banner variants, trailing content, mid-sentence quoting, leading-prose quoting, alternate separators, and unrelated text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 17:51:32 +03:00
gavrielc	1dd8fabde9	Merge branch 'main' into fix/credential-failure-ux	2026-04-29 17:42:25 +03:00
gavrielc	5f34e26240	fix(credentials): translate auth errors and require OneCLI for spawn Two related fixes for the case where credentials aren't usable: 1. Replace Claude Code's "Not logged in / Invalid API key · Please run /login" output with a host-aware message. The user can't run /login from chat, so the raw text is unhelpful. Provider gains an optional isAuthRequired() classifier; the poll-loop substitutes the message on both result-text and error paths. 2. Treat OneCLI gateway failure as a transient hard error instead of spawning a credential-less container. The catch in container-runner now propagates; router and host-sweep wrap wakeContainer to log and leave the inbound row pending so the next 60s sweep tick retries. Router also stops the typing indicator on failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 17:02:15 +03:00
gavrielc	336e01d2a1	fix circuit-breaker off-by-one, ENOENT, and reset-on-throw + tests - getDelay indexed by attempt (1-based) into a 0-indexed array, so the leading 0 was unreachable and every "after a crash" delay was shifted up one slot. Use attempt - 1 so the documented schedule (0s → 0s → 10s → 30s → 2min → 5min → 15min cap) actually holds. - enforceStartupBackoff runs before initDb (which creates DATA_DIR), so on a fresh checkout fs.writeFileSync hit ENOENT. write() now mkdirSync's DATA_DIR first. - shutdown() didn't run resetCircuitBreaker if teardownChannelAdapters threw, so a graceful exit with a teardown error would be counted as a crash on the next start. Wrap teardown in try/finally. - Adds src/circuit-breaker.test.ts: state transitions, full schedule (parameterized), reset-window expiry, malformed file, and the fresh-install path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 22:51:11 +03:00
Daniel Milliner	2bf296b04a	add startup circuit breaker and troubleshooting docs Backs off on rapid restarts to avoid exhausting Discord gateway identify limits and triggering Cloudflare IP bans. Resets on clean shutdown so only crashes accumulate the counter. Also adds a troubleshooting section to CLAUDE.md with the most useful diagnostic locations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-28 14:07:24 +00:00
gavrielc	7e37b13aab	Fix path traversal in attachment handling on channel-inbound path	2026-04-28 13:26:44 +03:00
gavrielc	6591062fbb	refactor: route custom Anthropic endpoint through OneCLI vault The original approach passed ANTHROPIC_AUTH_TOKEN into the container as an env var and disabled the proxy for the custom host (NO_PROXY) — which works, but bypasses OneCLI entirely for that credential. The container holds the raw secret, the gateway loses audit/rotation, and we lose the rest of the vault's protections for this cohort. OneCLI-native version: store the token as a generic secret with header injection (--header-name Authorization --value-format 'Bearer {value}' + host-pattern matching the base URL hostname). The container only needs ANTHROPIC_BASE_URL plus a placeholder ANTHROPIC_AUTH_TOKEN — the proxy rewrites the Authorization header on the wire. setup/lib/setup-config.ts — adds --anthropic-auth-token alongside the existing --anthropic-base-url. setup/auto.ts — runAuthStep short-circuits the auth-method prompt when both NANOCLAW_ANTHROPIC_BASE_URL and NANOCLAW_ANTHROPIC_AUTH_TOKEN are set: creates the OneCLI generic secret, writes ANTHROPIC_BASE_URL to .env (so the runtime reads it), and appends `import './claude.js';` to src/providers/index.ts (so the provider only registers when the user has configured a custom endpoint — no branching for everyone else). src/providers/claude.ts — drops ANTHROPIC_AUTH_TOKEN/NO_PROXY passthrough. Reads ANTHROPIC_BASE_URL from .env, sets a placeholder ANTHROPIC_AUTH_TOKEN in container env so the SDK includes an Authorization header for OneCLI to overwrite. src/providers/index.ts — removes the unconditional import; setup appends it on demand. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:34:31 +03:00
KeXin95	26fc3ff322	feat: pass ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN into agent containers Users with a custom Anthropic-compatible endpoint (ANTHROPIC_BASE_URL) were getting 401s because the OneCLI proxy injects ANTHROPIC_API_KEY=placeholder and forwards to api.anthropic.com, overriding the custom endpoint and key. Add a claude provider host config that reads ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC from .env and passes them into the container. Also sets NO_PROXY for the custom host so the OneCLI proxy doesn't intercept those requests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-27 00:32:16 +03:00
gavrielc	2825f657ca	Merge branch 'main' into fix/register-channel-wiring	2026-04-24 17:20:29 +03:00
gavrielc	f804ebf2e9	Merge branch 'main' into fix/session-state-per-provider-and-agent-route-files	2026-04-24 17:13:06 +03:00
grtwrn	fc375ca72b	fix(register): wire channels with correct engage fields, skip prefix for native IDs setup/register.ts had two bugs that prevented new channels from being registered via `/manage-channels`: 1. createMessagingGroupAgent was called with the legacy field names `trigger_rules` and `response_scope`. The SQL INSERT expects `engage_mode` / `engage_pattern` / `sender_scope` / `ignored_message_policy` (migration 010). Every register call failed with `RangeError: Missing named parameter "engage_mode"` after the agent and messaging group were partially created, leaving an orphaned pair. Now mirrors scripts/init-first-agent.ts:wireIfMissing: - Groups (is_group=1) default to engage_mode='mention' (bot only responds when addressed). - DMs (is_group=0) default to engage_mode='pattern' with '.' (respond to every message). - An explicit --trigger overrides the pattern regex. 2. The "normalize platform_id" block unconditionally prefixed "<channel>:" even for native IDs like WhatsApp JIDs ("120363408974444974@g.us"), iMessage emails ("user@example.com"), or Signal phones ("+15551234567") / Signal groups ("group:abc"). But the router (src/router.ts:158) looks up messaging_groups by the raw event.platformId from the adapter, which for these native adapters never has a prefix. So the prefixed row was never matched; the message was silently dropped with no "Message routed" log. Extracted scripts/init-first-agent.ts:namespacedPlatformId into src/platform-id.ts so both setup paths use the same heuristic (skip the prefix for IDs containing '@', starting with '+', or starting with 'group:'). Prevents future drift between the two paths. Tested by: re-running `setup/index.ts --step register` for a WhatsApp group JID, confirming the row is created with correct engage fields and matching platform_id, then sending a test message and observing "Message routed" with the right agent group. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 17:06:10 +03:00
glifocat	3d6837c411	chore(format): apply prettier to chat-sdk-bridge.ts Two long-line violations introduced in `d121cd1` (isGroup plumbing) exceed the printWidth limit. CI format:check fails on every PR opened against main until this is fixed; the fix is isolated here so no behavior change is mixed in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 12:12:05 +02:00
Adam	fd03b89333	fix(agent-route): reject unsafe attachment filenames to prevent path traversal Filenames in forwardAttachedFiles arrived from the source agent's messages_out content and were used directly in path.join on both source outbox read and target inbox write. A value like `../evil.sh` could escape `inbox/<a2a-id>/` on the target session (and similarly the source outbox on read), breaking session isolation — an adversarial or hallucinating sub-agent could overwrite files in a sibling session. Adds isSafeAttachmentName(name) — exported so it's unit-testable — which rejects empty, `.`, `..`, anything containing `/`, `\`, or NUL, and anything path.basename would strip. Guard runs before any I/O. Unsafe names are dropped with a warning log, same pattern as missing-source-file handling; a bad filename in one attachment doesn't kill the whole route's text delivery. Addresses Codex Review P1 on qwibitai/nanoclaw#1967. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 15:45:08 +10:00
Adam	672e228876	fix(agent-route): forward file attachments between agents Before: `send_file(to='parent')` from a sub-agent wrote the bytes to the sub-agent's own session outbox, but agent-to-agent routing copied only the content JSON — the target's inbound message referenced `files: ['x.png']` but the bytes lived in a session directory the target couldn't mount. Parent agents orchestrating sub-agents (e.g. Design Team delegating illustration work to an Illustrator sub-agent on Codex) received file-reference messages with nothing to forward. Fix: on route, if the source's content has `files`, copy each referenced file from `<source>/outbox/<src-msg-id>/` to `<target>/inbox/<a2a-msg-id>/`, and emit `attachments` (the existing formatter convention — see formatter.ts:223) with `localPath` relative to `/workspace/`. The target formatter already renders these as `[file: <name> — saved to /workspace/inbox/<a2a-id>/<name>]`, so the target agent sees the path and can call `send_file(path=…, to=…)` to forward onward. Convention matches what session-manager.ts:256 already does for base64-encoded channel-inbound attachments — same inbox layout, same content shape. Nothing on the formatter/agent side needed to change. ## Scope - `forwardAttachedFiles(source, target)` — pure-ish helper that copies files and returns the attachments array. - `forwardFileAttachments(msg, …)` — wraps the helper for the route path: parses content, copies files if present, merges into any existing `attachments`, re-serialises. - `routeAgentMessage` — uses the rewritten content when writing the target's inbound row. - Log line now includes `forwardedFileCount` for observability. Missing source files are skipped with a warning rather than killing the route — a bad filename in a batch shouldn't drop the accompanying text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 15:34:29 +10:00
exe.dev user	5845a5a980	fix(container-runner): honor agent_provider DB columns with session override resolveProviderContribution read only containerConfig.provider (from each group's container.json) and ignored both agent_groups.agent_provider and sessions.agent_provider. The provider-install skills (opencode, codex) and CLAUDE.md document those DB columns as the source of truth with session-overrides-group precedence, but the code never consulted them — so setting `agent_provider = 'codex'` on a group had no effect, and the only way to route to a non-default provider was to edit the per-group JSON directly. Discovered while wiring up Codex: DB update landed but the spawned container kept running Claude. Extract a pure `resolveProviderName(session, group, containerConfig)` with the documented precedence: sessions.agent_provider → agent_groups.agent_provider → container.json `provider` → 'claude' `resolveProviderContribution` now calls it. The container.json fallback stays so existing installs that only set provider in JSON keep working. Empty strings treated as unset to avoid footguns when a DB-backed form writes '' for "no override." Added unit tests covering precedence, null-fallthrough, empty-string fallthrough, and case normalization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:47:10 +00:00
gavrielc	c1d0395d11	Merge branch 'main' into main	2026-04-23 23:04:35 +03:00
gavrielc	f351e46008	refactor(approvals): persist title+options on channel/sender approval tables getAskQuestionRender used to hardcode the card title and option labels for pending_channel_approvals and pending_sender_approvals in the DB-access layer, duplicating wording that already lived in the approval modules. That caused a visible drift between the initial card title — picked per event in channel-approval.ts ("📣 Bot mentioned in new chat" vs. "💬 New direct message") — and the post-click render, which always showed the constant "📣 Channel registration". Mirror the pattern already used by pending_approvals: add title / options_json columns on both pending_*_approvals tables via migration 013, have the approval modules write them at creation time, and let getAskQuestionRender just SELECT. - Migration 013 ALTERs the two tables to add title + options_json. - PendingChannelApproval / PendingSenderApproval types and their create functions grow the two fields. - channel-approval.ts / sender-approval.ts normalize options once and pass both title and options_json into the insert. - getAskQuestionRender drops the hardcoded render objects and reads the stored values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:54:47 +03:00
gavrielc	ffd38f660a	Merge branch 'main' into fix/pending-rows-idempotent	2026-04-23 22:37:22 +03:00
exe.dev user	97868af5a7	fix(delivery): make pending_questions/approvals insert idempotent createPendingQuestion and createPendingApproval both run before the adapter delivery call. When delivery fails and the retry loop reinvokes deliverMessage with the same questionId/approvalId, the second attempt hit UNIQUE constraint on the pending_questions.question_id (or pending_approvals.approval_id) and threw — so the retry never reached the send step, and every subsequent retry failed the same way until max-attempts marked the message permanently failed. Switch both inserts to INSERT OR IGNORE. Return bool indicating whether a new row was actually inserted so delivery.ts can avoid logging "Pending question created" twice for the same card. Symptom that surfaced this: a send-layer ValidationError on one attempt followed by SqliteError on every subsequent attempt, with the user seeing neither the card nor a follow-up. Seen in conjunction with the Telegram 64-byte callback_data limit (fixed separately in #1942/chat-sdk-bridge), but the idempotency gap applies to any transient delivery failure — rate limits, network blips, adapter 5xx — and is worth fixing on its own. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 17:05:41 +00:00
exe.dev user	ff277c0d49	fix(chat-sdk-bridge): encode option index in callback_data for Telegram 64-byte cap ask_question cards failed to deliver on Telegram whenever any option had a non-trivial value (e.g. an ISO datetime, a URL, or a long token). Telegram limits inline-keyboard callback_data to 64 bytes, and the previous encoding embedded both the questionId and the full option value in each button's actionId plus a second copy as value, producing payloads well over the cap. The adapter threw ValidationError, delivery was marked permanently failed, and the agent sat waiting on an answer that never reached the user. Fix: - Button id is now `ncq:<questionId>:<index>` and button value is the stringified index. Callback payloads shrink from ~100 bytes to ~40 and fit Telegram's cap for any option list with <100 items. - Both callback-decode sites (Chat SDK `onAction` for Telegram/Slack/etc., and the Discord Gateway interaction handler) resolve the index back to the real option value via `getAskQuestionRender(questionId)` before dispatching to the host's onAction, so response handlers (pending_questions, pending_approvals) are unchanged and still receive the canonical value. - `resolveSelectedOption` helper has a backward-compat fallback: non-numeric tails are treated as literal values so any card delivered under the old encoding still resolves if the user clicks it after deploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:56:21 +00:00
Gabi Simons	a8eb82d529	Merge branch 'main' into main	2026-04-23 18:24:24 +03:00
exe.dev user	237876c2c6	chore(format): wrap session-manager import in container-runner Pre-commit prettier reformatted this in the working tree but didn't re-stage. Keeping it in a separate commit to avoid amending a prior commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:12:56 +00:00
exe.dev user	209061f54f	fix(sweep): wake before reset + idempotent retry for orphan claims When a container exits with an unresolved processing_ack claim, the sweep's crashed-container cleanup would reset the matching inbound message with tries++ and a future process_after. dueCount then dropped to 0, so the wake step never fired — and the next sweep tick found the same orphan claim, bumped tries again, and pushed process_after further out. The message reached MAX_TRIES and was marked failed without any container ever being spawned. Two changes: 1. Reorder sweep so the wake step runs before crashed-container cleanup. A fresh container clears orphan 'processing' rows on its own startup (container/agent-runner/src/db/connection.ts), so once we get it running the claim resolves itself. 2. Make resetStuckProcessingRows idempotent: if a message already has process_after set to a future time, skip the retry bump. The wake path will pick it up when the backoff elapses. Requires returning process_after from getMessageForRetry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:12:16 +00:00
exe.dev user	bee80b0072	fix(container): clear orphan heartbeat before spawn After a container exits, its .heartbeat file is left behind with the mtime of its last SDK activity. When the same session spawns a new container, the host sweep's ceiling check reads that stale mtime and kills the freshly-spawned container within seconds — before the new instance has had time to touch the file itself. The sweep already has a carve-out for "no heartbeat file" (treated as a fresh spawn, given grace), so simply removing the orphan at spawn time restores the intended semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:12:02 +00:00
gavrielc	dd5bc85b02	refactor(skill/atomic-chat-tool): ship MCP file in skill folder, revert src edits The initial /add-atomic-chat-tool merge added src edits directly to main. That conflicts with the utility-skill pattern used elsewhere (e.g. /claw): the skill folder should ship the file and SKILL.md should instruct copy + idempotent edits at install time, not a git merge that carries src diffs. - Move container/agent-runner/src/atomic-chat-mcp-stdio.ts → .claude/skills/add-atomic-chat-tool/atomic-chat-mcp-stdio.ts - Revert the atomic_chat mcpServers entry in agent-runner index.ts - Revert mcp__atomic_chat__* from TOOL_ALLOWLIST in providers/claude.ts - Revert ATOMIC_CHAT_* env forwarding and [ATOMIC] log elevation in src/container-runner.ts - Empty .env.example back out - Rewrite SKILL.md: copy the shipped file, then apply deterministic Edits (index.ts, providers/claude.ts, container-runner.ts, .env.example) with exact before/after snippets the installer agent can match. Main is now back to its pre-PR state for the tool; /add-atomic-chat-tool re-applies everything at install time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:29:10 +03:00
Misha Skvortsov	3a9b98f1a4	feat: add Atomic Chat MCP tool skill Exposes local Atomic Chat models (OpenAI-compatible API at 127.0.0.1:1337/v1) as tools to the container agent. Adds atomic_chat_list_models and atomic_chat_generate alongside the existing Ollama skill. Rebased on current main: - MCP server registered in agent-runner index.ts using bun (no tsc step in-image), sibling path to index.ts, env: {} with ATOMIC_CHAT_* forwarded when set. - allowedTools entry moved to providers/claude.ts TOOL_ALLOWLIST. - SKILL.md: drop obsolete per-group copy step (single RO mount supersedes it); use pnpm build. Made-with: Cursor Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:18:34 +03:00
exe.dev user	40f5683c36	fix(approvals): show correct post-click labels on channel/sender cards getAskQuestionRender only checked pending_questions and pending_approvals, missing the channel and sender approval tables. Approval button clicks showed the raw value ("approve") instead of the selectedLabel ("✅ Wired"). Extend the lookup to also check pending_channel_approvals and pending_sender_approvals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:45 +00:00
exe.dev user	15f30682d7	fix(approvals): show human-readable names in approval cards Channel and sender approval cards showed raw platform IDs (e.g. discord:1475578393738219540:...) instead of readable context. Extract sender name from the event content for channel approvals, and use the channel type name for sender approvals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:34 +00:00
exe.dev user	d121cd1cd6	fix(router): pass isGroup from adapter through to messaging group creation The router hardcoded is_group=0 when auto-creating messaging groups, causing channel mentions to be misclassified as DMs. The Chat SDK bridge knows which handler fired (onDirectMessage vs onNewMention) so thread the signal through InboundMessage → InboundEvent → router. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:23 +00:00
exe.dev user	61ca43d193	fix(discord): resolve user ID from DM interactions for approval clicks Discord puts the clicking user at interaction.member.user for guild interactions but interaction.user for DM interactions. The Gateway handler only checked interaction.member, so DM button clicks resolved to an empty user ID and were silently rejected as unauthorized. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-23 12:23:12 +00:00
Lazer Cohen	2383bde80f	fix(container): scope orphan reaper by install label so peers don't kill each other Two installs on the same host could trash each other's containers: the reaper used `docker ps --filter name=nanoclaw-`, a substring match that picked up every install's containers. A crash-looping peer (e.g. a legacy v1 plist respawning ~6k times) would call cleanupOrphans on every boot and kill the healthy install's session containers within seconds of spawn. - Stamp `--label nanoclaw-install=<slug>` onto every spawned container. - cleanupOrphans filters by that label; healthy peers are left alone. - Setup preflight enumerates `com.nanoclaw*` launchd plists / nanoclaw user systemd units, probes state/runs, and unloads any that are crash-looping (state != running AND runs > 10) before installing this install's service. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 12:12:30 +03:00
gavrielc	5f1b3e5cad	style: apply prettier formatting to install-slug additions	2026-04-23 10:10:48 +03:00
gavrielc	7a9401ddf2	feat(setup): per-checkout service name and docker image tag Two NanoClaw installs on the same host used to fight over the shared `com.nanoclaw` launchd label / `nanoclaw.service` systemd unit and the `nanoclaw-agent:latest` docker tag — the second install silently rewrote the service pointer and rebuilt the image out from under the first. Introduces a deterministic per-checkout slug (sha1(projectRoot)[:8]) and namespaces everything off it: - Service: `com.nanoclaw-v2-<slug>` / `nanoclaw-v2-<slug>.service` - Image: `nanoclaw-agent-v2-<slug>:latest` (base), `nanoclaw-agent-v2-<slug>:<agentGroupId>` (per-group) New shared helpers: src/install-slug.ts (host) + setup/lib/install-slug.sh (bash). Both compute the same slug so verify/probe/add-*.sh/build.sh/container-runner all agree. Any v1 `com.nanoclaw` service left on the host stays untouched and can coexist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 10:10:09 +03:00
gavrielc	3b8240a91b	refactor(self-mod): drop request_rebuild; approvals now bundle rebuild+restart install_packages and add_mcp_server already did the right thing on approve (install auto-rebuilt+killed, add_mcp_server just killed), so request_rebuild was redundant plumbing agents sometimes called after an install, wasting an admin approval round-trip. Delete it end-to-end: - container/agent-runner/src/mcp-tools/self-mod.ts: remove requestRebuild tool + registration; update install_packages description. - src/modules/self-mod/{request,apply,index}.ts: drop handleRequestRebuild + applyRequestRebuild + registrations; rewrite the rebuild-failed notify to point admins at retrying install_packages instead. - src/modules/{approvals,self-mod}/{agent,project}.md and skill/self-customize/SKILL.md: scrub agent-facing references; clarify that add_mcp_server needs no rebuild (bun runs TS directly). - docs/{module-contract,architecture-diagram,checklist,db-central,shared-source,v1-vs-v2/*}.md, CLAUDE.md, pending-approvals migration comment, approvals/index.ts docstring, REFACTOR.md: trailing references. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:28:36 +03:00
gavrielc	e64bdb3016	refactor(claude-md): split shared base into module fragments, inject name at runtime Move every agent-specific instruction out of the shared container/CLAUDE.md so the base is genuinely universal. Persona/identity now comes from the system-prompt addendum (buildSystemPromptAddendum now takes assistantName and prepends "# You are {name}"). Per-module instructions live alongside each MCP tool source: container/agent-runner/src/mcp-tools/core.instructions.md, container/agent-runner/src/mcp-tools/scheduling.instructions.md, container/agent-runner/src/mcp-tools/self-mod.instructions.md. composeGroupClaudeMd() scans that directory and emits `module-<name>.md` fragments as symlinks to /app/src/mcp-tools/<name>.instructions.md (valid via the existing RO source mount). Skill fragments renamed to `skill-<name>.md` for naming consistency with `module-` and `mcp-`. Mount tightening so composer-managed files can't be clobbered by agent writes: nested RO mounts for /workspace/agent/CLAUDE.md and /workspace/agent/.claude-fragments/. CLAUDE.local.md (per-group memory) stays RW as the only writable CLAUDE.md-family file. .gitignore: ignore CLAUDE.local.md, .claude-shared.md, .claude-fragments/ everywhere, and simplify groups/ rules to ignore the whole tree (per-installation state, not tracked). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 17:14:51 +03:00
gavrielc	95e74d8383	docs(onecli): expand secrets section; correct stale admin-roles refs Document the selective-mode gotcha for auto-created OneCLI agents (no secrets injected by default) with the CLI commands to inspect and fix it. Note that approval policies are not configurable via the SDK or `onecli@1.3.0` CLI — web UI only. Replace stale `NANOCLAW_ADMIN_USER_IDS` / `src/access.ts` references across CLAUDE.md, docs/architecture.md, docs/checklist.md, and docs/module-contract.md. Admin gating now runs host-side in src/command-gate.ts against `user_roles`; approver picks live in src/modules/approvals/primitive.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 16:46:17 +03:00
gavrielc	3db66c0ced	fix: forward ONECLI_API_KEY to OneCLI SDK for authenticated container config Ports the v1 fix from PR #1777 (originally `8b5b581` by @johnnyfish). Cherry-pick did not apply cleanly because v2 reformatted the surrounding code and split OneCLI usage into two sites — manual port was needed. v2-specific adaptations: - Also forward apiKey at the second OneCLI call site in src/modules/approvals/onecli-approvals.ts (v2 split the approvals module out of container-runner). - Skipped the companion test-mock commit (`38163bc`) — it patches src/container-runner.test.ts, which no longer exists in v2 (tests consolidated into host-core.test.ts). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-Authored-By: johnnyfish <jonathanfishner11@gmail.com>	2026-04-22 15:16:59 +03:00
gavrielc	8e1c8f8f61	style: apply prettier formatting to touched files Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:57:09 +03:00
gavrielc	c8fc1da719	refactor(claude-md): compose per-group CLAUDE.md from shared base + fragments Replace the per-group "written once at init, owned by the group" CLAUDE.md with a host-regenerated entry point that imports: - a shared base (`container/CLAUDE.md` mounted RO at `/app/CLAUDE.md`) - optional per-skill fragments (skills that ship `instructions.md`) - optional per-MCP-server fragments (inline `instructions` field in `container.json`) - per-group agent memory (`CLAUDE.local.md`, auto-loaded by Claude Code) Principle: RW = per-group memory, RO = shared content. Source/skills/base are shared; personality, config, working files, and Claude state stay per-group. Key changes: - New `src/claude-md-compose.ts` — per-spawn composition + `migrateGroupsToClaudeLocal()` one-time cutover. - New `container/CLAUDE.md` — shared base, seeded verbatim from the former `groups/global/CLAUDE.md`. - `src/container-runner.ts` — swap `/workspace/global` mount for RO `/app/CLAUDE.md`; call `composeGroupClaudeMd()` after `initGroupFilesystem()`. - `src/group-init.ts` — drop `.claude-global.md` symlink + initial `CLAUDE.md` write; seed `CLAUDE.local.md` from `opts.instructions`. - `src/index.ts` — call `migrateGroupsToClaudeLocal()` at startup. - `src/container-config.ts` — add optional `instructions` field to `McpServerConfig` (inline per-MCP guidance fragment). - `container/Dockerfile` — drop dead `/workspace/global` mkdir. - Remove obsolete `scripts/migrate-group-claude-md.ts`. Migration (runs once at host startup, idempotent): - Delete `.claude-global.md` symlinks in each group. - Rename each `groups/<folder>/CLAUDE.md` → `CLAUDE.local.md` (preserves existing per-group content as memory). - Delete `groups/global/` directory. Design docs: `docs/claude-md-composition.md` and `docs/shared-source.md` (the latter is the sibling design discussion this refactor builds on). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 12:58:43 +03:00
exe.dev user	8a12fa61ac	refactor: shared source — replace per-group agent-runner copies with single RO mount Replace the per-group agent-runner-src copy model with a single shared read-only mount. Source and skills are now RO + shared; personality, config, working files, and Claude state stay RW + per-group. Key changes: - Mount container/agent-runner/src/ RO at /app/src (all groups share one copy) - Mount container/skills/ RO at /app/skills; per-group skill selection via symlinks in .claude-shared/skills/ based on container.json "skills" field - Mount container.json as nested RO bind on top of RW group dir - Move all NANOCLAW_* env vars to container.json (runner reads at startup) - New runner config.ts module replaces process.env reads - Move command gate (filtered/admin) from container to host router - Dockerfile: remove source COPY, split CLI installs (claude-code last), move agent-runner deps above CLIs for better layer caching - Add writeOutboundDirect for router denial responses - Design doc at docs/shared-src.md Not included (follow-up): DB migration to drop agent_provider columns, cleanup of orphaned agent-runner-src directories. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-22 12:58:43 +03:00
gavrielc	1858ef35f0	Merge pull request #1908 from qwibitai/setup-auto feat(setup): scripted branded setup flow (nanoclaw.sh)	2026-04-22 03:06:30 +03:00
gavrielc	416fe01855	refactor(setup): drop CLI-bonus wiring from init-first-agent init-first-agent used to double-wire the CLI channel to every new DM agent as a convenience for `pnpm run chat`, gated by --no-cli-bonus. With the /new-setup-2 flow gone and a dedicated scratch CLI agent created earlier in setup:auto, that bonus just stomps on CLI routing the user already set up. Remove the CLI_CHANNEL/CLI_PLATFORM_ID constants, ensureCliMessagingGroup, the --no-cli-bonus flag, and the cli-bonus wiring block. Pass the paired user's identity through to the welcome delivery so the sender resolver sees the real owner (e.g. telegram:<id>) instead of cli:local. Extend the CLI channel's admin-transport payload to accept optional sender/senderId overrides — falls back to the old cli/cli:local defaults when omitted, so existing callers are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:13:22 +03:00
Dave Kim	91c668e0cc	fix: persist SDK session_id on init + split long messages before adapter truncation Two related bugs that surfaced together when a Discord response exceeded 2000 chars: 1. Session id lost on mid-turn container exit. `runPollLoop` was calling `setStoredSessionId` only after `processQuery` returned. If the container died between the SDK's `init` event (where session_id arrives) and the stream completing, the id was never persisted. The next wake called `getStoredSessionId()` → undefined and started a fresh Claude session, dropping all prior context. Fix: persist immediately in the `init` branch inside `processQuery`. The existing post-query store becomes a harmless no-op. 2. Silent truncation past adapter limits. `chat-sdk-bridge.deliver` handed full text straight to `adapter.postMessage`. Discord's adapter hard-truncates at 2000 chars; Telegram's at 4096. Responses longer than that were cut off without any signal to the user or host. Fix: add `maxTextLength` to `ChatSdkBridgeConfig` and a `splitForLimit` helper that breaks on paragraph → line → hard-char boundaries, then posts chunks sequentially. Files ride on the first chunk; the returned id is the first chunk's so edits and reactions still target the reply head. Channel adapter files (Discord, Telegram, …) live on the `channels` branch — a companion PR wires `maxTextLength: 1900` for Discord and `4000` for Telegram so the splitter actually engages in those installs. Without wiring, behavior is unchanged.	2026-04-21 13:04:57 +00:00
gavrielc	01ffce6f74	Revert "fix(permissions): welcome new approved channels via /welcome, route to them" This reverts commit `9776dd4f32`.	2026-04-21 15:20:06 +03:00
Koshkoshinsk	9776dd4f32	fix(permissions): welcome new approved channels via /welcome, route to them When the unknown-channel approval flow completes, seed a /welcome task into the newly-wired session so the agent greets the new user on first contact. The replayed /start (Telegram's default first-message) is filtered by the agent-runner's command filter, so without an explicit onboarding trigger the first interaction produced nothing. Pin the destination by its local_name from agent_destinations to avoid the agent picking the wrong named destination (previously it greeted the owner, whose DM is in CLAUDE.md). Also guard dispatchResultText against echoing trailing status lines when the agent has already sent messages explicitly via send_message. Otherwise a task-triggered flow that calls send_message then emits "welcome message sent" produces a duplicate chat to the recipient. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 11:40:12 +00:00

379 Commits