diff --git a/container/CLAUDE.md b/container/CLAUDE.md new file mode 100644 index 0000000..c4428ff --- /dev/null +++ b/container/CLAUDE.md @@ -0,0 +1,166 @@ +# Main + +You are Main, a personal assistant. You help with tasks, answer questions, and can schedule reminders. + +## What You Can Do + +- Answer questions and have conversations +- Search the web and fetch content from URLs +- **Browse the web** with `agent-browser` — open pages, click, fill forms, take screenshots, extract data (run `agent-browser open ` to start, then `agent-browser snapshot -i` to see interactive elements) +- Read and write files in your workspace +- Run bash commands in your sandbox +- Schedule tasks to run later or on a recurring basis +- Send messages back to the chat + +## Communication + +Be concise — every message costs the reader's attention. + +### Destinations + +Each turn, your system prompt lists the destinations available to you. If you only have one destination, just write your response directly — it goes there automatically. If you have multiple, wrap each message in a `...` block: + +``` +On my way home, 15 minutes +kick off the pipeline +``` + +Inbound messages are labeled with `from="name"` so you can tell which destination they came from and reply using that same name. + +### Mid-turn updates + +Use the `mcp__nanoclaw__send_message` tool to send a message mid-work (before your final output). If you have one destination, `to` is optional; with multiple, specify it. Pace your updates to the length of the work: + +- **Short work (a few seconds, ≤2 quick tool calls):** Don't narrate. Just do it and put the result in your final response. +- **Longer work (many tool calls, web searches, installs, sub-agents):** Send a short acknowledgment right away ("On it — checking the logs now") so the user knows you got the message. 
+- **Long-running work (many minutes, multi-step tasks):** Send periodic updates at natural milestones, and especially **before** slow operations like spinning up an explore sub-agent, downloading large files, or installing packages. + +**Never narrate micro-steps.** "I'm going to read the file now… okay, I'm reading it… now I'm parsing it…" is noise. Updates should mark meaningful transitions, not every tool call. + +**Outcomes, not play-by-play.** When the work is done, the final message should be about the result, not a transcript of what you did. + +### Internal thoughts + +Wrap reasoning in `...` tags to mark it as scratchpad — logged but not sent. With multiple destinations, any text outside of `` blocks is also treated as scratchpad. With a single destination, only explicit `` tags are scratchpad; the rest of your response is sent. + +``` +Compiled all three reports, ready to summarize. + +Here are the key findings from the research… +``` + +### Sub-agents and teammates + +When working as a sub-agent or teammate, only use `send_message` if instructed to by the main agent. + +## Your Workspace + +Files you create are saved in `/workspace/group/`. Use this for notes, research, or anything that should persist. + +## Memory + +The `conversations/` folder contains searchable history of past conversations. Use this to recall context from previous sessions. + +When you learn something important: +- Create files for structured data (e.g., `customers.md`, `preferences.md`) +- Split files larger than 500 lines into folders +- Keep an index in your memory for the files you create + +## Message Formatting + +Format messages based on the channel you're responding to. Check your group folder name: + +### Slack channels (folder starts with `slack_`) + +Use Slack mrkdwn syntax. Run `/slack-formatting` for the full reference. 
Key rules: +- `*bold*` (single asterisks) +- `_italic_` (underscores) +- `` for links (NOT `[text](url)`) +- `•` bullets (no numbered lists) +- `:emoji:` shortcodes +- `>` for block quotes +- No `##` headings — use `*Bold text*` instead + +### WhatsApp/Telegram channels (folder starts with `whatsapp_` or `telegram_`) + +- `*bold*` (single asterisks, NEVER **double**) +- `_italic_` (underscores) +- `•` bullet points +- ` ``` ` code blocks + +No `##` headings. No `[links](url)`. No `**double stars**`. + +### Discord channels (folder starts with `discord_`) + +Standard Markdown works: `**bold**`, `*italic*`, `[links](url)`, `# headings`. + +--- + +## Installing Packages & Tools + +Your container is ephemeral — anything installed via `apt-get` or `pnpm install -g` is lost on restart. To install packages that persist, use the self-modification tools: + +1. **`install_packages`** — request system (apt) or global npm packages. Requires admin approval. +2. **`request_rebuild`** — rebuild your container image so approved packages are baked in. Always call this after `install_packages` to apply the changes. + +Example flow: +``` +install_packages({ apt: ["ffmpeg"], npm: ["@xenova/transformers"], reason: "Audio transcription" }) +# → Admin gets an approval card → approves +request_rebuild({ reason: "Apply ffmpeg + transformers" }) +# → Admin approves → image rebuilt with the packages +``` + +**When to use this vs workspace pnpm install:** +- `pnpm install` in `/workspace/agent/` persists on disk (it's mounted) but isn't on the global PATH — use it for project-level dependencies +- `install_packages` is for system tools (ffmpeg, imagemagick) and global npm packages that need to be on PATH + +### MCP Servers + +Use **`add_mcp_server`** to add an MCP server to your configuration, then **`request_rebuild`** to apply. Browse available servers at https://mcp.so — it's a curated directory of high-quality MCP servers. 
Most Node.js servers run via `pnpm dlx`, e.g.: + +``` +add_mcp_server({ name: "memory", command: "pnpm", args: ["dlx", "@modelcontextprotocol/server-memory"] }) +request_rebuild({ reason: "Add memory MCP server" }) +``` + +## Task Scripts + +For any recurring task, use `schedule_task`. This is the scheduling path — tasks persist across sessions and restarts, and support the pre-task `script` hook described below. Other scheduling tools you might discover (e.g. `CronCreate`, `ScheduleWakeup`) are session-scoped SDK builtins and won't behave the way NanoClaw users expect, so stick with `schedule_task`. + +To inspect or change existing tasks, use `list_tasks` (returns one row per series with the stable id) and `update_task` / `cancel_task` / `pause_task` / `resume_task`. Prefer `update_task` over cancel + reschedule — it preserves the series id the user already knows. + +Frequent agent invocations — especially multiple times a day — consume API credits and can risk account restrictions. If a simple check can determine whether action is needed, add a `script` — it runs first, and the agent is only called when the check passes. This keeps invocations to a minimum. + +### How it works + +1. You provide a bash `script` alongside the `prompt` when scheduling +2. When the task fires, the script runs first (30-second timeout) +3. Script prints JSON to stdout: `{ "wakeAgent": true/false, "data": {...} }` +4. If `wakeAgent: false` — nothing happens, task waits for next run +5. 
If `wakeAgent: true` — you wake up and receive the script's data + prompt + +### Always test your script first + +Before scheduling, run the script in your sandbox to verify it works: + +```bash +bash -c 'node --input-type=module -e " + const r = await fetch(\"https://api.github.com/repos/owner/repo/pulls?state=open\"); + const prs = await r.json(); + console.log(JSON.stringify({ wakeAgent: prs.length > 0, data: prs.slice(0, 5) })); +"' +``` + +### When NOT to use scripts + +If a task requires your judgment every time (daily briefings, reminders, reports), skip the script — just use a regular prompt. + +### Frequent task guidance + +If a user wants tasks running more than ~2x daily and a script can't reduce agent wake-ups: + +- Explain that each wake-up uses API credits and risks rate limits +- Suggest restructuring with a script that checks the condition first +- If the user needs an LLM to evaluate data, suggest using an API key with direct Anthropic API calls inside the script +- Help the user find the minimum viable frequency diff --git a/container/Dockerfile b/container/Dockerfile index c110bd6..f492f1c 100644 --- a/container/Dockerfile +++ b/container/Dockerfile @@ -109,7 +109,7 @@ COPY entrypoint.sh /app/entrypoint.sh RUN chmod +x /app/entrypoint.sh # ---- Workspace + permissions ------------------------------------------------- -RUN mkdir -p /workspace/group /workspace/global /workspace/extra && \ +RUN mkdir -p /workspace/group /workspace/extra && \ chown -R node:node /workspace && \ chmod 755 /home/node diff --git a/container/agent-runner/src/index.ts b/container/agent-runner/src/index.ts index 5535417..9e68968 100644 --- a/container/agent-runner/src/index.ts +++ b/container/agent-runner/src/index.ts @@ -45,6 +45,11 @@ async function main(): Promise { log(`Starting v2 agent-runner (provider: ${providerName})`); + // Destinations addendum is the only runtime-generated context we inject. 
+ // Agent instructions are loaded by Claude Code from /workspace/agent/CLAUDE.md + // (host-composed at spawn, imports /app/CLAUDE.md and fragments) plus + // /workspace/agent/CLAUDE.local.md (agent memory) — no need to read them + // manually. const instructions = buildSystemPromptAddendum(); // Discover additional directories mounted at /workspace/extra/* diff --git a/docs/claude-md-composition.md b/docs/claude-md-composition.md new file mode 100644 index 0000000..b3ce08f --- /dev/null +++ b/docs/claude-md-composition.md @@ -0,0 +1,146 @@ +# CLAUDE.md Composition + +Compose agent instructions from a shared base, skill/tool fragments, and per-group memory — replacing the current per-group CLAUDE.md with a host-regenerated entry point. + +## Problem + +Today each agent group has a single RW `groups//CLAUDE.md`, written once at init and never updated. Consequences: + +- Upstream improvements to shared agent guidance don't propagate to existing groups +- No way to ship tool-specific guidance with the tool itself (e.g., an agent-browser usage fragment) +- Human-authored identity and agent-accumulated memory live in the same file with no separation +- The `.claude-global.md` symlink + `groups/global/CLAUDE.md` pattern handled the shared base but not per-module fragments + +## Design + +**Principle: RW = per-group memory, RO = shared content.** Same rule that governs the shared-source refactor, applied to agent instructions. 
+ +### Three tiers + +| Tier | File | Location | Mount | Editor | Change rate | +|---|---|---|---|---|---| +| **Shared base** | `CLAUDE.md` | `container/CLAUDE.md` | RO at `/app/CLAUDE.md` | Owner (via git) | Rare | +| **Module fragments** | `instructions.md` | Inside each module | RO via shared skills mount, or inline in `container.json` | Module author | Ships with module | +| **Per-group memory** | `CLAUDE.local.md` | `groups//` | RW at `/workspace/agent/` | Agent + owner | Continuous | +| **Composed entry** | `CLAUDE.md` | `groups//` | RW but host-regenerated | **Host, not human** | Every spawn | + +### Composition + +At every spawn, the host regenerates `groups//CLAUDE.md` as an import-only file: + +```markdown + +@./.claude-shared.md +@./.claude-fragments/welcome.md +@./.claude-fragments/agent-browser.md +@./.claude-fragments/.md +@./.claude-fragments/mcp-.md +``` + +Symlinks are created alongside, following the `.claude-global.md` pattern (dangling on host, valid in container via the RO mount): + +- `groups//.claude-shared.md` → `/app/CLAUDE.md` +- `groups//.claude-fragments/.md` → `/app/skills//instructions.md` (for each enabled skill that ships a fragment) + +Claude Code auto-loads `CLAUDE.local.md` from cwd without an import line — native behavior. Agent memory works natively; composition only wraps around it. + +### Module fragment contract + +**Skills.** A skill optionally ships an `instructions.md` at the top of its directory: + +``` +container/skills/welcome/ + SKILL.md — description + when-to-use (existing) + instructions.md — always-in-context guidance (optional, new) +``` + +When the skill is enabled for a group, the host imports `instructions.md` into the composed CLAUDE.md. `SKILL.md` semantics are unchanged — Claude Code still uses it for skill discovery and on-demand invocation. Most skills won't need an `instructions.md` (SKILL.md is sufficient for on-demand skills); it's only for guidance that should be in context at all times. 
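As a rough illustration of the skill half of this contract, the sketch below plans which fragment symlinks a group would get. It is not the host's actual code: `planSkillFragmentLinks`, `FragmentLink`, and the `shipsInstructions` callback are hypothetical names, and the link and target paths are assumed to take the skill name in the positions shown in the composition example above.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the real host code.
interface FragmentLink {
  linkName: string; // path relative to the group directory
  target: string; // container path; dangling on the host, valid via the RO mount
}

// Decide which fragment symlinks a group needs, given its enabled skills.
// `shipsInstructions` is injected so the planning logic stays free of file I/O;
// in practice it would check for instructions.md in the shared skills dir.
function planSkillFragmentLinks(
  enabledSkills: string[],
  shipsInstructions: (skill: string) => boolean,
): FragmentLink[] {
  return enabledSkills
    .filter(shipsInstructions) // skills without a fragment contribute nothing
    .sort() // stable ordering so the composed file does not churn between spawns
    .map((skill) => ({
      linkName: `.claude-fragments/${skill}.md`,
      target: `/app/skills/${skill}/instructions.md`,
    }));
}
```

Keeping the existence check behind a callback is one possible design choice; it lets the plan be unit-tested without a container or a real skills directory.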
+ +**MCP servers.** A `container.json` MCP server entry can contribute a fragment inline: + +```jsonc +{ + "mcpServers": { + "my-db": { + "command": "...", + "instructions": "Read-only access to the production DB. Never run UPDATE/DELETE without admin approval." + } + } +} +``` + +Host writes the inline content to `.claude-fragments/mcp-.md` at spawn and imports it. + +**Global CLIs baked into the image** (agent-browser, vercel, claude-code) have always-present guidance; it belongs in `container/CLAUDE.md`, not as a conditional fragment. Don't try to make universally-present tools dynamic. + +### Identity vs memory + +All per-group content — human-authored identity ("you are the research agent, be terse") and agent-accumulated memory (inventories, user preferences, learned patterns) — lives in a single `CLAUDE.local.md`. Both humans and agents can edit it. + +If the distinction becomes operationally important later (agents confused about what they were told vs. what they learned), split into `identity.md` (human-authored, imported into composed CLAUDE.md) + `CLAUDE.local.md` (agent memory only). Starting with one file. + +## Changes + +### `container/CLAUDE.md` (new) + +Write the shared base: general NanoClaw context, how to engage with users, output conventions, anything that should apply to every agent across every group. Seed from current `groups/global/CLAUDE.md`. + +### `container/skills//instructions.md` (optional, per skill) + +Add for any skill that warrants always-in-context guidance. Optional. + +### `container.json` schema + +Add optional `instructions` field (string) to each MCP server entry. + +### `container-runner.ts` spawn-time sync + +Extend the skill-symlink sync function (added in the shared-source refactor) to also compose CLAUDE.md. On every spawn: + +1. Sync `.claude-shared/skills/` symlinks from `container.json` skill selection. +2. Sync `.claude-shared.md` symlink → `/app/CLAUDE.md`. +3. 
For each enabled skill with an `instructions.md`, create `.claude-fragments/.md` symlink → `/app/skills//instructions.md`. +4. For each `container.json` MCP server with an `instructions` field, write the inline content to `.claude-fragments/mcp-.md`. +5. Write `groups//CLAUDE.md` atomically (temp + rename) with import lines in a deterministic order: shared base → skill fragments (alphabetical) → MCP fragments (alphabetical). +6. Remove stale symlinks and fragment files for modules no longer enabled. + +### `group-init.ts` + +- Stop writing an initial `groups//CLAUDE.md` at group creation — host regenerates at first spawn. +- Stop creating the `.claude-global.md` symlink — replaced by `.claude-shared.md` in the composition step. +- Optionally create an empty `groups//CLAUDE.local.md` at init as a clear affordance for humans and agents. + +### `groups/global/` + +Eliminate. The shared base moves to `container/CLAUDE.md`. Any deployment-specific overrides live in the owner's customized `container/CLAUDE.md` (same pattern as any other codebase customization). + +## Migration + +Breaking change, one-time cutover: + +- For every group, rename `groups//CLAUDE.md` → `groups//CLAUDE.local.md`. Preserves all existing per-group content as memory. +- Move content from `groups/global/CLAUDE.md` (beyond the default stub) into `container/CLAUDE.md`. Delete `groups/global/`. +- Delete stale `.claude-global.md` symlinks in each group dir — the spawn pass creates `.claude-shared.md` instead. +- First spawn after cutover regenerates `CLAUDE.md` with proper imports. + +## Interaction with shared-source refactor + +This refactor depends on the shared skills mount (`/app/skills/` RO) from the shared-source refactor landing first. It extends the spawn-time sync from "just skill symlinks" to "skill symlinks + CLAUDE.md composition" — both passes share the same helper. 
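The composition step of that shared helper could look roughly like the sketch below. It is a minimal illustration under assumptions, not the implementation: `composeClaudeMd` and `writeAtomically` are hypothetical names, the `mcp-` file naming is assumed to take the server name, and any generated-file header the host might emit is omitted.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch: build the import-only CLAUDE.md body in the order
// described above: shared base, then skill fragments, then MCP fragments.
function composeClaudeMd(skillFragments: string[], mcpServers: string[]): string {
  const lines = [
    "@./.claude-shared.md",
    // Copy before sorting so the caller's arrays are left untouched.
    ...[...skillFragments].sort().map((skill) => `@./.claude-fragments/${skill}.md`),
    ...[...mcpServers].sort().map((name) => `@./.claude-fragments/mcp-${name}.md`),
  ];
  return lines.join("\n") + "\n";
}

// Write via a temp file in the same directory, then rename over the target,
// so a reader never observes a half-written CLAUDE.md.
function writeAtomically(filePath: string, content: string): void {
  const tmpPath = path.join(
    path.dirname(filePath),
    `.${path.basename(filePath)}.${process.pid}.tmp`,
  );
  fs.writeFileSync(tmpPath, content);
  fs.renameSync(tmpPath, filePath);
}
```

The temp file is placed next to the target on purpose: a rename is only atomic when both paths are on the same filesystem.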
+ +After this refactor, the "Personality / instructions" row in the shared-source per-group customization table splits: + +| Resource | Location | Mechanism | +|----------|----------|-----------| +| Agent memory | `groups//CLAUDE.local.md` | RW at `/workspace/agent/`, auto-loaded by Claude Code | +| Composed entry | `groups//CLAUDE.md` | Host-regenerated at every spawn | + +## What triggers what + +| Change | Action | Scope | +|--------|--------|-------| +| Edit `container/CLAUDE.md` | Kill running containers (next spawn recomposes) | All groups | +| Add/edit a skill's `instructions.md` | Kill running containers | All groups with the skill enabled | +| Enable/disable a skill in `container.json` | Kill that group's containers | One group | +| Add MCP server with `instructions` field | Kill that group's containers | One group | +| Edit `CLAUDE.local.md` | Nothing — live via RW mount; Claude Code re-reads at next prompt | One group | +| Add a new agent group | Spawn writes `CLAUDE.md` fresh from the composition pass | One group | diff --git a/docs/shared-source.md b/docs/shared-source.md new file mode 100644 index 0000000..ab725ea --- /dev/null +++ b/docs/shared-source.md @@ -0,0 +1,270 @@ +# Shared Source + +Replace per-group agent-runner-src copies with a single shared read-only mount. + +## Problem + +Each agent group gets a full copy of `container/agent-runner/src/` at creation time. This copy is mounted RW at `/app/src` in the container. 
Consequences: + +- Bug fixes and features don't propagate to existing groups +- Owner edits to `container/agent-runner/src/` silently don't apply to existing groups +- No tooling to diff or detect drift between groups and upstream +- The RW mount lets agents write to their own runtime source without approval +- Cross-cutting changes (host + container) break down when container code is per-group +- Skills have the same copy-and-drift problem + +## Design + +**Principle: RW is per-group, RO is shared.** Every mount is either read-only and shared across all groups, or read-write and scoped to one group. Source and skills become RO + shared. Personality, config, working files, and Claude state stay RW + per-group. This makes drift impossible by construction — no group can diverge from shared code because no group has write access to it. + +### Shared source mount + +Mount `container/agent-runner/src/` into all containers at `/app/src` as **read-only**. + +``` +container/agent-runner/src/ → /app/src (RO, shared) +``` + +Source is never baked into the image. `/app/src/` exists only via this mount — running without it is an intentional startup failure (entrypoint `bun run /app/src/index.ts` → ENOENT). Source-only changes never trigger image rebuilds; edits to `.ts` files take effect on next container spawn. + +Image rebuilds are only needed for: +- Agent-runner npm dependency changes (`package.json` / `bun.lock`) +- System packages, runtime versions, global CLI version bumps +- Dockerfile/entrypoint changes + +### Shared skills mount + +Mount `container/skills/` into all containers at `/app/skills/` as **read-only**. + +Per-group skill selection via `container.json`: + +```jsonc +{ + "skills": ["welcome", "agent-browser", "self-customize"] + // or "skills": "all" (default) +} +``` + +At every spawn, the host syncs symlinks in the group's `.claude-shared/skills/` directory to match the selected set. 
For `"all"`, the set is recomputed from the shared skills dir on each spawn — newly-added upstream skills appear without intervention. Symlinks for skills no longer in the set are removed. + +Each symlink points to a container path: + +``` +.claude-shared/skills/welcome → /app/skills/welcome +.claude-shared/skills/agent-browser → /app/skills/agent-browser +``` + +Claude Code scans `/home/node/.claude/skills/`, follows the symlinks, loads the selected skills. Same dangling-symlink-on-host pattern as `.claude-global.md` — host tools don't resolve the target, the container mount makes it valid at read time. + +### Per-group customization surface + +What remains per-group (unchanged): + +| Resource | Location | Mechanism | +|----------|----------|-----------| +| Personality / instructions | `groups//CLAUDE.md` | Mount at `/workspace/agent` (RW, live) | +| MCP servers | `groups//container.json` | Env var at spawn | +| apt/npm packages | `groups//container.json` | Per-group image layer | +| Skill selection | `groups//container.json` | Symlinks at spawn | +| Additional mounts | `groups//container.json` | Validated bind mounts | +| Agent provider / model | `groups//container.json` | Read by runner at startup | +| Claude Code settings | `.claude-shared/settings.json` | Mount at `/home/node/.claude` (RW) | +| Working files | `groups//` | Mount at `/workspace/agent` (RW) | + +### Self-modification + +Existing config-level self-mod tools (`install_packages`, `add_mcp_server`, `request_rebuild`) mutate `container.json` and per-group images, not source. Unchanged — stays per-group. + +Source-level self-modification (not yet implemented) uses staging: edits happen against a copy of `container/agent-runner/src/`, reviewed and swapped in on approval. Owner can also edit source directly. + +## Environment variables + +Env is for things read by code we don't own: glibc, Node's http agent, CLIs we shell out to. Everything NanoClaw-specific moves out of env. 
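To make "moves out of env" a little more concrete, here is a hedged sketch of a runner deriving its settings from the text of `container.json` instead of `process.env`. The field names follow the `container-config.ts` changes listed later in this document; `RunnerConfig`, `parseRunnerConfig`, and `FALLBACK_MAX_MESSAGES` are hypothetical, the fallback value is a placeholder, and the file is assumed to be plain JSON without comments.

```typescript
// Hypothetical shape; field names follow the container-config.ts changes in this doc.
interface RunnerConfig {
  provider?: string;
  groupName?: string;
  assistantName?: string;
  maxMessagesPerPrompt: number;
  skills: string[] | "all";
}

// Placeholder only: the real code default is not stated in this doc.
const FALLBACK_MAX_MESSAGES = 10;

// Parse the raw container.json text and apply defaults for omitted fields.
// Assumes strict JSON; a file with comments would need a JSONC-aware parser.
function parseRunnerConfig(raw: string): RunnerConfig {
  const parsed = JSON.parse(raw) as Partial<RunnerConfig>;
  return {
    provider: parsed.provider,
    groupName: parsed.groupName,
    assistantName: parsed.assistantName,
    maxMessagesPerPrompt: parsed.maxMessagesPerPrompt ?? FALLBACK_MAX_MESSAGES,
    skills: parsed.skills ?? "all", // "all" is the documented default
  };
}
```

A runner would presumably call this once at startup with the contents of `/workspace/agent/container.json` and pass the result around, rather than re-reading the file per message.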
+ +**Stays in env (read by non-nanoclaw code):** + +| Var | Reader | +|---|---| +| `TZ` | glibc, child processes | +| `HTTPS_PROXY`, `NO_PROXY` | Node http agent, curl, git, etc. (OneCLI-injected) | +| `NODE_EXTRA_CA_CERTS` | Node at startup (OneCLI-injected) | + +**Moves to `container.json` (read by runner at startup):** + +| Var | Reason | +|---|---| +| `AGENT_PROVIDER` | Per-group config; runner reads before importing provider module | +| `NANOCLAW_AGENT_GROUP_NAME` | Per-group identity | +| `NANOCLAW_ASSISTANT_NAME` | Per-group identity | +| `NANOCLAW_MAX_MESSAGES_PER_PROMPT` | Config constant; per-group override possible | + +**Deleted (admin gating moves to router):** + +`NANOCLAW_ADMIN_USER_IDS` is removed entirely — not moved to a new location. The container no longer makes authorization decisions. See **Router command gate** below. + +**Hardcoded as conventions:** + +| Var | Convention | +|---|---| +| `SESSION_INBOUND_DB_PATH` | `/workspace/inbound.db` | +| `SESSION_OUTBOUND_DB_PATH` | `/workspace/outbound.db` | +| `SESSION_HEARTBEAT_PATH` | `/workspace/.heartbeat` | +| `NANOCLAW_AGENT_GROUP_ID` | Read from `/workspace/agent/container.json` at startup | + +### Runner startup order + +The runner can no longer assume DB paths or provider identity are handed to it in env. Revised startup: + +1. Set up logging. +2. Read `/workspace/agent/container.json` (mounted RW but read-only here). +3. Open `/workspace/inbound.db` and `/workspace/outbound.db` (fixed paths). +4. Read bootstrap tables from `inbound.db` (destinations). +5. Import the provider module selected by `container.json`. +6. Enter the poll loop. + +### Router command gate + +The host router gates slash commands before writing to `messages_in`. The container still handles whatever reaches it; it just stops making authorization decisions. + +1. **Filtered commands** (`/help`, `/login`, `/logout`, `/doctor`, `/config`, `/start`, `/remote-control`) → drop silently. Never reach the container. +2. 
**Admin commands** (`/clear`, `/compact`, `/context`, `/cost`, `/files`) → check sender against `user_roles` (owners + global admins + admins scoped to this agent group). + - Denied: write "Permission denied: `` requires admin access." directly to `messages_out` in the same thread. Do not write to `messages_in`. + - Allowed: pass through to container unchanged. +3. **Normal messages** → pass through unchanged. + +Admin commands that flow through continue to be handled the same way they are today: +- `/clear` — container's existing handler in `poll-loop.ts` resets session continuation and writes "Session cleared." +- `/compact`, `/context`, `/cost`, `/files` — container forwards them to Claude Code's native slash-command handler. + +Container receives only authorized messages. The runner has no admin concept, no `adminUserIds` field, no admin-gate branch — but it still recognizes `/clear` to reset session state. + +### Scope rules + +Each channel answers a single scope question: + +| Channel | Scope | What it holds | +|---|---|---| +| Env vars | Process | Things read by code we don't own (`TZ`, `HTTPS_PROXY`) | +| `container.json` | Per-group | Per-group config (MCP, packages, provider, model, skills, mounts) | +| `inbound.db` / `outbound.db` | Per-session | Messages, session state, and host-projected views of cross-group state (destinations) | +| Central DB (`data/v2.db`) | Cross-group | Users, roles, wiring, messaging groups, sessions | + +The runner reads from env (for external-convention vars), `container.json` (for its own group's config), and `inbound.db` (for messages + projected views). It never reads central DB directly — that's always host-projected through inbound.db first. + +After this change, the spawn-time `-e` flags shrink from ~10 to ~3-5 (TZ + OneCLI networking). No `NANOCLAW_*` env var survives. + +## Image layer strategy + +Single Dockerfile with aggressive layer ordering: stable layers first, frequently-bumped layers last. 
BuildKit's layer cache handles "upstream layers unchanged" rebuilds efficiently — a separate base image isn't justified. + +Two image tags exist at runtime: + +``` +nanoclaw-agent:latest — shared base (rebuild: dep/CLI bumps + Dockerfile changes) + └── nanoclaw-agent: — per-group apt/npm packages (rebuild: per-group via install_packages) +``` + +Layer order within the base: + +```dockerfile +FROM node:22-slim + +# System deps (apt) — rarely change +RUN apt-get install ... + +# Bun — pinned version, rarely changes +RUN ... bun + +# Agent-runner deps — cached independently of CLI versions +COPY agent-runner/package.json agent-runner/bun.lock /app/ +RUN cd /app && bun install --frozen-lockfile + +# Global CLIs — most stable first, most frequently bumped last +RUN pnpm install -g "vercel@${VERCEL_VERSION}" +RUN pnpm install -g "agent-browser@${AGENT_BROWSER_VERSION}" +RUN pnpm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}" +``` + +Bumping claude-code (the most common change) only rebuilds one layer. Agent-runner deps and other CLIs stay cached. + +Source is never baked into the image — always provided by the shared RO mount at runtime. + +### Agent-triggered version bumps + +Agents can request a claude-code version bump via a new self-mod tool (`bump_claude_code`). Same fire-and-forget pattern as `install_packages`: agent requests → owner approves → host rebuilds base image → kill all running containers. Unlike `install_packages` (per-group image), this rebuilds the shared base image and affects all groups. 
+ +## Changes + +### `group-init.ts` + +- Remove the `agent-runner-src` copy block (lines 109–117) +- Remove the `skills/` copy block (lines 100–107) +- Skill symlinks are no longer created at init — sync is spawn-owned (see `container-runner.ts`) + +### `container-runner.ts` `buildMounts()` + +- Remove per-group `agent-runner-src` mount (lines 206–209) +- Add shared RO mount: `container/agent-runner/src/` → `/app/src` +- Add shared RO mount: `container/skills/` → `/app/skills` +- Sync skill symlinks in `.claude-shared/skills/` at spawn: write desired set from `container.json` (`"all"` = every skill in the shared dir, recomputed per spawn), remove symlinks not in the set + +### `container-runner.ts` `buildContainerArgs()` + +- Remove `-e SESSION_INBOUND_DB_PATH`, `-e SESSION_OUTBOUND_DB_PATH`, `-e SESSION_HEARTBEAT_PATH` (hardcoded conventions now) +- Remove `-e AGENT_PROVIDER` (moves to `container.json`) +- Remove `-e NANOCLAW_ASSISTANT_NAME`, `-e NANOCLAW_AGENT_GROUP_ID`, `-e NANOCLAW_AGENT_GROUP_NAME` +- Remove `-e NANOCLAW_MAX_MESSAGES_PER_PROMPT` +- Remove the `user_roles` join + `-e NANOCLAW_ADMIN_USER_IDS` block (lines 269–287) entirely. Admin gating moves to the router — no admin data passed to the container. +- Keep: `-e TZ`, OneCLI-contributed env (`HTTPS_PROXY`, `NODE_EXTRA_CA_CERTS`, `NO_PROXY`) + +### `router.ts` (new command gate) + +- Classify inbound slash commands before writing to `messages_in`: filtered / admin / normal. +- Filtered (`/help`, `/login`, `/logout`, `/doctor`, `/config`, `/start`, `/remote-control`) → drop silently. +- Admin commands (`/clear`, `/compact`, `/context`, `/cost`, `/files`) from non-admins → write "Permission denied" directly to `messages_out`, skip `messages_in`. +- All authorized messages (admin commands from admins, and normal messages) → pass through unchanged to `messages_in`. Container handles them as today. 
+- The `ADMIN_COMMANDS` and `FILTERED_COMMANDS` lists move from `container/agent-runner/src/formatter.ts` to a host-side module. + +### `container/agent-runner/src/` (runner) + +- New `config.ts` module: loads `/workspace/agent/container.json` at startup, exposes a typed config singleton. All previous `process.env.NANOCLAW_*` reads go through this. +- `db/connection.ts`: use hardcoded paths `/workspace/inbound.db` and `/workspace/outbound.db`; drop `SESSION_*_DB_PATH` lookups. +- `formatter.ts`: remove `ADMIN_COMMANDS`, `FILTERED_COMMANDS`, and the `filtered` / admin-gate categorization. Keep enough to recognize `/clear` so `poll-loop.ts` can route it (e.g., a narrow `isClearCommand(msg)` helper). +- `poll-loop.ts`: remove `adminUserIds` field from config type and the admin-gate branch (lines 113–126). Keep the `/clear` handler (lines 128–142) — `/clear` still flows through from the router. +- Provider selection (`providers/index.ts` or equivalent): read provider from config singleton, not env. + +### `container-config.ts` + +- Add `skills` field to `ContainerConfig` (`string[] | "all"`, default `"all"`) +- Add fields: `provider`, `groupName`, `assistantName`, `maxMessagesPerPrompt` (optional, falls back to code default) + +### `.env` / `.env.example` + +- Remove any `NANOCLAW_*` entries that were documented as tunables. Update `.env.example` to list only TZ and OneCLI-related vars as valid overrides. + +### DB migration + +- Drop `agent_groups.agent_provider` column and `sessions.agent_provider` column. Source of truth becomes `container.json.provider`. +- One-time data migration reads existing values and writes them to each group's `container.json`. Sessions lose any per-session provider override — provider is a per-group property now. + +### Migration + +**This is a breaking change.** Host restart kills all running containers. No gradual rollout. Any code referencing dropped columns or removed env vars must be updated before the migration runs. 
+ +- Provider install skills (`/add-opencode`, `/add-ollama-tool`) now write to the shared `container/agent-runner/src/providers/` tree. The per-group `providers/` overlay pattern is removed. Any uncommitted provider overlays must be upstreamed before cutover. +- Delete existing `data/v2-sessions//agent-runner-src/` directories on first run after cutover. +- Existing `.claude-shared/skills/` directories get replaced with symlinks on next spawn. +- DB migration (see above) reads `agent_provider` columns and projects into `container.json`, then drops the columns. + +## What triggers what + +| Change | Action needed | Scope | +|--------|--------------|-------| +| Agent-runner `.ts` source | Kill running containers | All groups | +| Agent-runner npm deps | Rebuild `nanoclaw-agent` + kill all | All groups | +| System deps, Bun, Node | Rebuild `nanoclaw-agent` + kill all | All groups | +| Claude-code version bump | Rebuild `nanoclaw-agent` + kill all | All groups (agent-triggerable) | +| Skill content | Kill running containers | All groups | +| Per-group apt/npm packages | `buildAgentGroupImage()` + kill | One group | +| Per-group config (MCP, mounts, provider, model, skills) | Kill that group's containers | One group | +| CLAUDE.md, working files | Nothing (live via RW mount) | One group | diff --git a/scripts/migrate-group-claude-md.ts b/scripts/migrate-group-claude-md.ts deleted file mode 100644 index dd16faf..0000000 --- a/scripts/migrate-group-claude-md.ts +++ /dev/null @@ -1,113 +0,0 @@ -/** - * One-shot migration: wire each existing group up to global memory via - * an in-tree symlink + @-import. - * - * Claude Code's @-import only follows paths inside cwd, so a direct - * `@/workspace/global/CLAUDE.md` or `@../global/CLAUDE.md` silently does - * nothing (the import line is parsed but the target file is never - * loaded into context). The working approach: - * - * 1. 
Symlink `groups//.claude-global.md` → - * `/workspace/global/CLAUDE.md` (container path; dangling on host, - * valid inside the container via the /workspace/global mount). - * 2. Have the group's CLAUDE.md import the symlink: - * `@./.claude-global.md`. - * - * This script: - * - Creates the symlink if missing. - * - Replaces any existing broken `@/workspace/global/CLAUDE.md` or - * `@../global/CLAUDE.md` import line with the symlink form. - * - Prepends the symlink import if neither form is present. - * - Skips entirely if `groups/global/CLAUDE.md` doesn't exist. - * - * Idempotent — safe to re-run. - * - * Usage: pnpm exec tsx scripts/migrate-group-claude-md.ts - */ -import fs from 'fs'; -import path from 'path'; - -import { GROUPS_DIR } from '../src/config.js'; - -const GLOBAL_CLAUDE_MD = path.join(GROUPS_DIR, 'global', 'CLAUDE.md'); -const GLOBAL_MEMORY_CONTAINER_PATH = '/workspace/global/CLAUDE.md'; -const GLOBAL_MEMORY_LINK_NAME = '.claude-global.md'; -const IMPORT_LINE = `@./${GLOBAL_MEMORY_LINK_NAME}`; - -// Match any existing @-import that points at global/CLAUDE.md, whether -// via absolute path, relative path, or the new symlink form. 
-const EXISTING_IMPORT_REGEX = - /^@(?:\/workspace\/global\/CLAUDE\.md|\.\.\/global\/CLAUDE\.md|\.\/\.claude-global\.md)\s*$/m; - -if (!fs.existsSync(GLOBAL_CLAUDE_MD)) { - console.error(`No global CLAUDE.md at ${GLOBAL_CLAUDE_MD} — nothing to migrate.`); - process.exit(1); -} - -if (!fs.existsSync(GROUPS_DIR)) { - console.error(`No groups dir at ${GROUPS_DIR} — nothing to migrate.`); - process.exit(1); -} - -const entries = fs.readdirSync(GROUPS_DIR, { withFileTypes: true }); -let updated = 0; -let alreadyWired = 0; -let missingClaudeMd = 0; -let symlinksCreated = 0; - -for (const entry of entries) { - if (!entry.isDirectory()) continue; - if (entry.name === 'global') continue; - - const groupDir = path.join(GROUPS_DIR, entry.name); - - // Symlink (idempotent — skip if already present) - const linkPath = path.join(groupDir, GLOBAL_MEMORY_LINK_NAME); - let linkExists = false; - try { - fs.lstatSync(linkPath); - linkExists = true; - } catch { - /* missing */ - } - if (!linkExists) { - fs.symlinkSync(GLOBAL_MEMORY_CONTAINER_PATH, linkPath); - console.log(`[link] ${entry.name}: created ${GLOBAL_MEMORY_LINK_NAME}`); - symlinksCreated++; - } - - // CLAUDE.md import wiring - const claudeMd = path.join(groupDir, 'CLAUDE.md'); - if (!fs.existsSync(claudeMd)) { - console.log(`[skip] ${entry.name}: no CLAUDE.md`); - missingClaudeMd++; - continue; - } - - const body = fs.readFileSync(claudeMd, 'utf-8'); - const match = body.match(EXISTING_IMPORT_REGEX); - - if (match && match[0] === IMPORT_LINE) { - console.log(`[wired] ${entry.name}: already imports ${IMPORT_LINE}`); - alreadyWired++; - continue; - } - - let newBody: string; - if (match) { - // Replace the broken import with the working form - newBody = body.replace(EXISTING_IMPORT_REGEX, IMPORT_LINE); - console.log(`[fix] ${entry.name}: rewrote ${match[0]} → ${IMPORT_LINE}`); - } else { - // Prepend fresh - newBody = `${IMPORT_LINE}\n\n${body}`; - console.log(`[ok] ${entry.name}: prepended ${IMPORT_LINE}`); - } - - 
fs.writeFileSync(claudeMd, newBody); - updated++; -} - -console.log( - `\nDone. updated=${updated} alreadyWired=${alreadyWired} missingClaudeMd=${missingClaudeMd} symlinksCreated=${symlinksCreated}`, -); diff --git a/src/claude-md-compose.ts b/src/claude-md-compose.ts new file mode 100644 index 0000000..3cc74c1 --- /dev/null +++ b/src/claude-md-compose.ts @@ -0,0 +1,182 @@ +/** + * CLAUDE.md composition for agent groups. + * + * Replaces the per-group "written once at init, owned by the group" pattern + * with a host-regenerated entry point that imports: + * - a shared base (`container/CLAUDE.md` mounted RO at `/app/CLAUDE.md`) + * - optional per-skill fragments (skills that ship `instructions.md`) + * - optional per-MCP-server fragments (inline `instructions` field in + * `container.json`) + * - per-group agent memory (`CLAUDE.local.md`, auto-loaded by Claude Code) + * + * Runs on every spawn from `container-runner.buildMounts()`. Deterministic — + * same inputs produce the same CLAUDE.md, and stale fragments are pruned. + * + * See `docs/claude-md-composition.md` for the full design. + */ +import fs from 'fs'; +import path from 'path'; + +import { GROUPS_DIR } from './config.js'; +import { readContainerConfig } from './container-config.js'; +import { log } from './log.js'; +import type { AgentGroup } from './types.js'; + +// Symlink targets are container paths — dangling on host (hence the readlink +// dance instead of existsSync), valid inside the container via RO mounts. +const SHARED_CLAUDE_MD_CONTAINER_PATH = '/app/CLAUDE.md'; +const SHARED_SKILLS_CONTAINER_BASE = '/app/skills'; + +const COMPOSED_HEADER = ''; + +/** + * Regenerate `groups//CLAUDE.md` from the shared base, enabled skill + * fragments, and MCP server fragments declared in `container.json`. Creates + * an empty `CLAUDE.local.md` if missing. 
+ */ +export function composeGroupClaudeMd(group: AgentGroup): void { + const groupDir = path.resolve(GROUPS_DIR, group.folder); + if (!fs.existsSync(groupDir)) { + fs.mkdirSync(groupDir, { recursive: true }); + } + + const sharedLink = path.join(groupDir, '.claude-shared.md'); + syncSymlink(sharedLink, SHARED_CLAUDE_MD_CONTAINER_PATH); + + const fragmentsDir = path.join(groupDir, '.claude-fragments'); + if (!fs.existsSync(fragmentsDir)) { + fs.mkdirSync(fragmentsDir, { recursive: true }); + } + + // Desired fragment set. + const config = readContainerConfig(group.folder); + const desired = new Map(); + + // Skill fragments — every skill that ships an `instructions.md`. + // TODO (shared-source refactor): respect `container.json` skill selection. + const skillsHostDir = path.join(process.cwd(), 'container', 'skills'); + if (fs.existsSync(skillsHostDir)) { + for (const skillName of fs.readdirSync(skillsHostDir)) { + const hostFragment = path.join(skillsHostDir, skillName, 'instructions.md'); + if (fs.existsSync(hostFragment)) { + desired.set(`${skillName}.md`, { + type: 'symlink', + content: `${SHARED_SKILLS_CONTAINER_BASE}/${skillName}/instructions.md`, + }); + } + } + } + + // MCP server fragments — inline instructions from container.json. + for (const [name, mcp] of Object.entries(config.mcpServers)) { + if (mcp.instructions) { + desired.set(`mcp-${name}.md`, { + type: 'inline', + content: mcp.instructions, + }); + } + } + + // Reconcile: drop stale, write desired. + for (const existing of fs.readdirSync(fragmentsDir)) { + if (!desired.has(existing)) { + fs.unlinkSync(path.join(fragmentsDir, existing)); + } + } + for (const [name, frag] of desired) { + const fragPath = path.join(fragmentsDir, name); + if (frag.type === 'symlink') { + syncSymlink(fragPath, frag.content); + } else { + writeAtomic(fragPath, frag.content); + } + } + + // Composed entry — imports only. 
+ const imports = ['@./.claude-shared.md']; + for (const name of [...desired.keys()].sort()) { + imports.push(`@./.claude-fragments/${name}`); + } + const body = [COMPOSED_HEADER, ...imports, ''].join('\n'); + writeAtomic(path.join(groupDir, 'CLAUDE.md'), body); + + const localFile = path.join(groupDir, 'CLAUDE.local.md'); + if (!fs.existsSync(localFile)) { + fs.writeFileSync(localFile, ''); + } +} + +/** + * One-time cutover from the `groups/global/CLAUDE.md` + `.claude-global.md` + * pattern. Idempotent — safe to run on every host startup. + * + * For each group dir: + * - remove `.claude-global.md` symlink if present + * - rename `CLAUDE.md` → `CLAUDE.local.md` (only if `CLAUDE.local.md` + * doesn't already exist — preserves pre-cutover content as per-group + * memory; after the first spawn regenerates `CLAUDE.md`, this branch + * is skipped because `CLAUDE.local.md` now exists) + * + * Globally: + * - delete `groups/global/` (content already in `container/CLAUDE.md`) + */ +export function migrateGroupsToClaudeLocal(): void { + if (!fs.existsSync(GROUPS_DIR)) return; + + const actions: string[] = []; + + for (const entry of fs.readdirSync(GROUPS_DIR, { withFileTypes: true })) { + if (!entry.isDirectory()) continue; + if (entry.name === 'global') continue; + + const groupDir = path.join(GROUPS_DIR, entry.name); + + const oldGlobalLink = path.join(groupDir, '.claude-global.md'); + try { + fs.lstatSync(oldGlobalLink); + fs.unlinkSync(oldGlobalLink); + actions.push(`${entry.name}/.claude-global.md removed`); + } catch { + /* already gone */ + } + + const claudeMd = path.join(groupDir, 'CLAUDE.md'); + const claudeLocal = path.join(groupDir, 'CLAUDE.local.md'); + if (fs.existsSync(claudeMd) && !fs.existsSync(claudeLocal)) { + fs.renameSync(claudeMd, claudeLocal); + actions.push(`${entry.name}/CLAUDE.md → CLAUDE.local.md`); + } + } + + const globalDir = path.join(GROUPS_DIR, 'global'); + if (fs.existsSync(globalDir)) { + fs.rmSync(globalDir, { recursive: true, force: 
true }); + actions.push('groups/global/ removed'); + } + + if (actions.length > 0) { + log.info('Migrated groups to CLAUDE.local.md model', { actions }); + } +} + +function syncSymlink(linkPath: string, target: string): void { + let currentTarget: string | null = null; + try { + currentTarget = fs.readlinkSync(linkPath); + } catch { + /* missing */ + } + if (currentTarget === target) return; + try { + fs.unlinkSync(linkPath); + } catch { + /* missing */ + } + fs.symlinkSync(target, linkPath); +} + +function writeAtomic(filePath: string, content: string): void { + const tmp = `${filePath}.tmp-${process.pid}`; + fs.writeFileSync(tmp, content); + fs.renameSync(tmp, filePath); +} diff --git a/src/container-config.ts b/src/container-config.ts index 90c24e9..d972842 100644 --- a/src/container-config.ts +++ b/src/container-config.ts @@ -18,6 +18,10 @@ export interface McpServerConfig { command: string; args?: string[]; env?: Record; + // Optional always-in-context guidance. When set, the host writes the + // content to `.claude-fragments/mcp-.md` at spawn and imports it + // into the composed CLAUDE.md. 
+ instructions?: string; } export interface AdditionalMountConfig { diff --git a/src/container-runner.ts b/src/container-runner.ts index 32499bc..6f7f1d1 100644 --- a/src/container-runner.ts +++ b/src/container-runner.ts @@ -12,6 +12,7 @@ import { OneCLI } from '@onecli-sh/sdk'; import { CONTAINER_IMAGE, DATA_DIR, GROUPS_DIR, ONECLI_URL, TIMEZONE } from './config.js'; import { readContainerConfig, writeContainerConfig } from './container-config.js'; import { CONTAINER_RUNTIME_BIN, hostGatewayArgs, readonlyMountArgs, stopContainer } from './container-runtime.js'; +import { composeGroupClaudeMd } from './claude-md-compose.js'; import { getAgentGroup } from './db/agent-groups.js'; import { getDb, hasTable } from './db/connection.js'; import { initGroupFilesystem } from './group-init.js'; @@ -195,6 +196,10 @@ function buildMounts( const claudeDir = path.join(DATA_DIR, 'v2-sessions', agentGroup.id, '.claude-shared'); syncSkillSymlinks(claudeDir, containerConfig); + // Compose CLAUDE.md fresh every spawn from the shared base, enabled skill + // fragments, and MCP server instructions. See `claude-md-compose.ts`. + composeGroupClaudeMd(agentGroup); + const mounts: VolumeMount[] = []; const sessDir = sessionDir(agentGroup.id, session.id); const groupDir = path.resolve(GROUPS_DIR, agentGroup.folder); @@ -218,6 +223,13 @@ function buildMounts( mounts.push({ hostPath: globalDir, containerPath: '/workspace/global', readonly: true }); } + // Shared CLAUDE.md — read-only, imported by the composed entry point via + // the `.claude-shared.md` symlink inside the group dir. 
+ const sharedClaudeMd = path.join(process.cwd(), 'container', 'CLAUDE.md'); + if (fs.existsSync(sharedClaudeMd)) { + mounts.push({ hostPath: sharedClaudeMd, containerPath: '/app/CLAUDE.md', readonly: true }); + } + // Per-group .claude-shared at /home/node/.claude (Claude state, settings, // skill symlinks) mounts.push({ hostPath: claudeDir, containerPath: '/home/node/.claude', readonly: false }); diff --git a/src/group-init.ts b/src/group-init.ts index 211ef1f..437d10f 100644 --- a/src/group-init.ts +++ b/src/group-init.ts @@ -6,18 +6,6 @@ import { initContainerConfig } from './container-config.js'; import { log } from './log.js'; import type { AgentGroup } from './types.js'; -// Container path where groups/global is mounted. The symlink we drop -// into each group's dir resolves to this target inside the container. -// It's a dangling symlink on the host — that's fine, host tools don't -// follow it and the container mount makes it valid at read time. -const GLOBAL_MEMORY_CONTAINER_PATH = '/workspace/global/CLAUDE.md'; - -// Symlink name inside the group's dir. Claude Code's @-import only -// follows paths inside cwd, so we can't reference /workspace/global -// directly — we symlink into the group dir and import the symlink. -export const GLOBAL_MEMORY_LINK_NAME = '.claude-global.md'; -export const GLOBAL_CLAUDE_IMPORT = `@./${GLOBAL_MEMORY_LINK_NAME}`; - const DEFAULT_SETTINGS_JSON = JSON.stringify( { @@ -36,11 +24,15 @@ const DEFAULT_SETTINGS_JSON = * every step is gated on the target not already existing, so re-running on * an already-initialized group is a no-op. * - * Called once per group lifetime: at creation, or defensively from + * Called once per group lifetime at creation, or defensively from * `buildMounts()` for groups that pre-date this code path. * * Source code and skills are shared RO mounts — not copied per-group. * Skill symlinks are synced at spawn time by container-runner.ts. 
+ * + * The composed `CLAUDE.md` is NOT written here — it's regenerated on every + * spawn by `composeGroupClaudeMd()` (see `claude-md-compose.ts`). Initial + * per-group instructions (if provided) seed `CLAUDE.local.md`. */ export function initGroupFilesystem(group: AgentGroup, opts?: { instructions?: string }): void { const initialized: string[] = []; @@ -52,29 +44,13 @@ export function initGroupFilesystem(group: AgentGroup, opts?: { instructions?: s initialized.push('groupDir'); } - // groups//.claude-global.md — symlink into the group dir so - // Claude Code's @-import can follow it. Uses lstat to avoid tripping - // existsSync on a dangling symlink (target only resolves inside the - // container). - const globalLinkPath = path.join(groupDir, GLOBAL_MEMORY_LINK_NAME); - let linkExists = false; - try { - fs.lstatSync(globalLinkPath); - linkExists = true; - } catch { - /* missing — recreate */ - } - if (!linkExists) { - fs.symlinkSync(GLOBAL_MEMORY_CONTAINER_PATH, globalLinkPath); - initialized.push('.claude-global.md'); - } - - // groups//CLAUDE.md — written once, then owned by the group - const claudeMdFile = path.join(groupDir, 'CLAUDE.md'); - if (!fs.existsSync(claudeMdFile)) { - const body = [GLOBAL_CLAUDE_IMPORT, '', opts?.instructions ?? `# ${group.name}`].join('\n') + '\n'; - fs.writeFileSync(claudeMdFile, body); - initialized.push('CLAUDE.md'); + // groups//CLAUDE.local.md — per-group agent memory, auto-loaded by + // Claude Code. Seeded with caller-provided instructions on first creation. + const claudeLocalFile = path.join(groupDir, 'CLAUDE.local.md'); + if (!fs.existsSync(claudeLocalFile)) { + const body = opts?.instructions ? 
opts.instructions + '\n' : ''; + fs.writeFileSync(claudeLocalFile, body); + initialized.push('CLAUDE.local.md'); } // groups//container.json — empty container config, replaces the diff --git a/src/index.ts b/src/index.ts index 1ec8619..d3de4d9 100644 --- a/src/index.ts +++ b/src/index.ts @@ -7,6 +7,7 @@ import path from 'path'; import { DATA_DIR } from './config.js'; +import { migrateGroupsToClaudeLocal } from './claude-md-compose.js'; import { initDb } from './db/connection.js'; import { runMigrations } from './db/migrations/index.js'; import { ensureContainerRuntimeRunning, cleanupOrphans } from './container-runtime.js'; @@ -63,6 +64,9 @@ async function main(): Promise { runMigrations(db); log.info('Central DB ready', { path: dbPath }); + // 1b. One-time filesystem cutover — idempotent, no-op after first run. + migrateGroupsToClaudeLocal(); + // 2. Container runtime ensureContainerRuntimeRunning(); cleanupOrphans();
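As a compact illustration of the composition step in `composeGroupClaudeMd()`, the sketch below isolates the import-list assembly as a pure function. `composeBody`, `SHARED_IMPORT`, and the example header string are hypothetical (the diff builds the body inline and uses its own `COMPOSED_HEADER`). The `@./.claude-shared.md` and `@./.claude-fragments/` import paths are taken from the diff.

```typescript
// Hypothetical helper, not part of the diff: mirrors the import-list assembly
// at the end of composeGroupClaudeMd(), without touching the filesystem.
const SHARED_IMPORT = '@./.claude-shared.md';

function composeBody(header: string, fragmentNames: readonly string[]): string {
  const imports = [SHARED_IMPORT];
  // Sort a copy so the output does not depend on directory or config order.
  for (const name of fragmentNames.slice().sort()) {
    imports.push(`@./.claude-fragments/${name}`);
  }
  // The trailing '' yields a final newline after the last import line.
  return [header, ...imports, ''].join('\n');
}

// The header here is a placeholder, not the real COMPOSED_HEADER value.
const example = composeBody('<!-- generated -->', ['mcp-search.md', 'browser.md']);
console.log(example);
```

Keeping this step free of I/O would make the "same inputs produce the same CLAUDE.md" property straightforward to check in a unit test, since the order in which fragments are discovered cannot leak into the output.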