feat: add /add-ollama-provider skill and docs/ollama.md

Adds a new operational skill that routes any agent group to a local Ollama instance instead of the Anthropic API. Ollama speaks the Anthropic /v1/messages endpoint natively, so no new provider code is needed: just env var overrides and a model setting in the shared settings file.

The skill also documents and applies the prerequisite source changes:

- ContainerConfig gains env and blockedHosts fields (container-config.ts)
- container-runner wires those fields as -e and --add-host Docker flags
- Dockerfile home dir set to chmod 777 so containers running as the host uid can write ~/.claude config (discovered during implementation)

docs/ollama.md covers the architecture, OneCLI proxy bypass rationale, network isolation via blockedHosts, model selection tradeoffs for Apple Silicon, and revert instructions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 01:11:46 -07:00
parent 5ed5b72f10
commit 47e3203809
2 changed files with 267 additions and 0 deletions
--- a/.claude/skills/add-ollama-provider/SKILL.md
+++ b/.claude/skills/add-ollama-provider/SKILL.md
@@ -0,0 +1,179 @@
 ---
 name: add-ollama-provider
 description: Route a NanoClaw agent group to a local Ollama model instead of the Anthropic API. Ollama speaks the Anthropic API natively (v1/messages), so no provider code changes are needed — just env var overrides and a model setting. Use when the user wants to run their agent locally, cut API costs, or experiment with open-weight models. See docs/ollama.md for background.
 ---
 # Add Ollama Provider
 Routes an agent group to a local Ollama instance instead of the Anthropic API.
 See `docs/ollama.md` for how this works and the tradeoffs involved.
 ## Prerequisites
 1. **Ollama is installed and running** on the host — verify: `curl -s http://localhost:11434/api/tags`
 2. **A model is pulled** — e.g. `ollama pull gemma4` or `ollama pull qwen3-coder`
 3. **The agent group already exists** — run `/init-first-agent` first if needed
 ## 1. Check source support
 The feature requires two fields in `ContainerConfig` (`env` and `blockedHosts`) and their
 corresponding wiring in `container-runner.ts`. Check if already present:
 ```bash
 grep -c 'blockedHosts' src/container-config.ts src/container-runner.ts
 ```
 If either count is 0, apply the changes in steps 1a and 1b. Otherwise skip to step 2.
 ### 1a. Extend ContainerConfig
 In `src/container-config.ts`, add to the `ContainerConfig` interface:
 ```typescript
 env?: Record<string, string>;
 blockedHosts?: string[];
 ```
 And in `readContainerConfig`, add inside the returned object:
 ```typescript
 env: raw.env,
 blockedHosts: raw.blockedHosts,
 ```
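The pass-through above trusts whatever `container.json` contains. A slightly more defensive variant is sketched here; `RawConfig`, `OllamaFields`, and `pickOllamaFields` are hypothetical names for illustration and are not part of the NanoClaw source.
```typescript
// Hypothetical shape of the parsed container.json; only the two new fields matter here.
interface RawConfig {
  env?: unknown;
  blockedHosts?: unknown;
}

interface OllamaFields {
  env?: Record<string, string>;
  blockedHosts?: string[];
}

// Keep only well-typed values so a malformed container.json cannot turn
// non-string data into Docker arguments later on.
function pickOllamaFields(raw: RawConfig): OllamaFields {
  const out: OllamaFields = {};
  if (raw.env !== null && typeof raw.env === 'object' && !Array.isArray(raw.env)) {
    const env: Record<string, string> = {};
    for (const [key, value] of Object.entries(raw.env as Record<string, unknown>)) {
      if (typeof value === 'string') env[key] = value;
    }
    out.env = env;
  }
  if (Array.isArray(raw.blockedHosts)) {
    out.blockedHosts = raw.blockedHosts.filter((h): h is string => typeof h === 'string');
  }
  return out;
}
```
Whether this extra validation is worth it depends on how much the per-group JSON is hand-edited; the two-line pass-through is enough for the skill to work.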
 ### 1b. Wire into container-runner
 In `src/container-runner.ts`, after the `NANOCLAW_MCP_SERVERS` block, add:
 ```typescript
 // Per-agent-group env overrides — applied last to win over OneCLI values.
 if (containerConfig.env) {
  for (const [key, value] of Object.entries(containerConfig.env)) {
    args.push('-e', `${key}=${value}`);
  }
 }
 // Blocked hosts: resolve to 0.0.0.0 so they are unreachable inside the container.
 if (containerConfig.blockedHosts) {
  for (const host of containerConfig.blockedHosts) {
    args.push('--add-host', `${host}:0.0.0.0`);
  }
 }
 ```
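To see which flags the wiring produces, the two loops can be pulled into a pure function. This is a standalone sketch: `ContainerConfigSubset` and `buildExtraArgs` are hypothetical names, and the real runner pushes onto its existing `args` array instead of returning a new one.
```typescript
// Only the two fields added in step 1a.
interface ContainerConfigSubset {
  env?: Record<string, string>;
  blockedHosts?: string[];
}

// Mirrors the two loops above, but returns the flags instead of mutating `args`.
function buildExtraArgs(containerConfig: ContainerConfigSubset): string[] {
  const args: string[] = [];
  for (const [key, value] of Object.entries(containerConfig.env ?? {})) {
    args.push('-e', `${key}=${value}`);
  }
  for (const host of containerConfig.blockedHosts ?? []) {
    args.push('--add-host', `${host}:0.0.0.0`);
  }
  return args;
}

const extraArgs = buildExtraArgs({
  env: { ANTHROPIC_BASE_URL: 'http://host.docker.internal:11434' },
  blockedHosts: ['api.anthropic.com'],
});
// extraArgs is:
// ['-e', 'ANTHROPIC_BASE_URL=http://host.docker.internal:11434',
//  '--add-host', 'api.anthropic.com:0.0.0.0']
```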
 ### 1c. Fix home directory permissions (if not already done)
 The container may run as your host uid (not uid 1000). Check the Dockerfile:
 ```bash
 grep 'chmod.*home/node' container/Dockerfile
 ```
 If it shows `chmod 755`, change it to `chmod 777` so any uid can write there.
 Then rebuild the container image: `./container/build.sh`
 ## 2. Identify the setup
 Ask the user (plain text, not AskUserQuestion):
 1. **Which agent group?** List available groups: `sqlite3 data/v2.db "SELECT folder, name FROM agent_groups;"`
 2. **Which Ollama model?** List available: `curl -s http://localhost:11434/api/tags | grep -o '"name": *"[^"]*"'`
 3. **Block Anthropic API?** Recommended yes — prevents accidental spend if config drifts.
 Record as `FOLDER`, `MODEL`, and `BLOCK_ANTHROPIC`.
 ## 3. Configure container.json
 Read `groups/<FOLDER>/container.json`. Add (or merge into) an `env` block and optionally `blockedHosts`:
 ```json
 {
  "env": {
    "ANTHROPIC_BASE_URL": "http://host.docker.internal:11434",
    "ANTHROPIC_API_KEY": "ollama",
    "NO_PROXY": "host.docker.internal",
    "no_proxy": "host.docker.internal"
  },
  "blockedHosts": ["api.anthropic.com"]
 }
 ```
 Omit `blockedHosts` if the user declined to block the Anthropic API in step 2.
 **Why these vars:** `ANTHROPIC_BASE_URL` redirects the Anthropic SDK to Ollama.
 `ANTHROPIC_API_KEY=ollama` satisfies the SDK's key requirement (Ollama ignores it).
 `NO_PROXY` bypasses the OneCLI HTTPS proxy for requests to `host.docker.internal`
 so they reach Ollama directly instead of going through the credential gateway.
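The "add or merge" instruction can be sketched as a pure function over the parsed `container.json`. `GroupContainerJson` and `applyOllamaConfig` are hypothetical names, and the index signature stands in for whatever other keys the file already carries.
```typescript
// Hypothetical minimal view of groups/<FOLDER>/container.json.
interface GroupContainerJson {
  env?: Record<string, string>;
  blockedHosts?: string[];
  [key: string]: unknown;
}

const OLLAMA_ENV: Record<string, string> = {
  ANTHROPIC_BASE_URL: 'http://host.docker.internal:11434',
  ANTHROPIC_API_KEY: 'ollama',
  NO_PROXY: 'host.docker.internal',
  no_proxy: 'host.docker.internal',
};

// Returns a new object; unrelated keys and unrelated env vars are preserved.
function applyOllamaConfig(existing: GroupContainerJson, blockAnthropic: boolean): GroupContainerJson {
  const merged: GroupContainerJson = {
    ...existing,
    env: { ...(existing.env ?? {}), ...OLLAMA_ENV },
  };
  if (blockAnthropic) {
    const hosts = existing.blockedHosts ?? [];
    merged.blockedHosts = hosts.includes('api.anthropic.com') ? [...hosts] : [...hosts, 'api.anthropic.com'];
  }
  return merged;
}
```
Merging rather than overwriting matters when the group already defines its own env vars; only the four Ollama-related keys are replaced.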
 ## 4. Set the model
 Read the agent group's shared Claude settings:
 ```bash
 # Find the agent group ID
 AG_ID=$(sqlite3 data/v2.db "SELECT id FROM agent_groups WHERE folder='<FOLDER>';")
 SETTINGS=data/v2-sessions/$AG_ID/.claude-shared/settings.json
 ```
 Add `"model": "<MODEL>"` to that settings file. Create the file if it doesn't exist:
 ```json
 {
  "model": "gemma4:latest"
 }
 ```
 If the file already has content, merge the `model` key in — don't overwrite existing keys.
 **Why here and not container.json:** This skill sets the model through Claude Code's own settings
 file rather than an env var. This file is bind-mounted into the container as `~/.claude/settings.json`.
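The create-or-merge rule for the settings file can be sketched as a function from the current file text to the new file text. `setModelInSettings` is a hypothetical helper, and the `permissions` key in the usage comment is only an example of pre-existing content.
```typescript
// currentText is the file's content, or null when the file does not exist yet.
// Returns the text to write back, with `model` set and all other keys kept.
function setModelInSettings(currentText: string | null, model: string): string {
  const current: Record<string, unknown> =
    currentText === null || currentText.trim() === '' ? {} : JSON.parse(currentText);
  return JSON.stringify({ ...current, model }, null, 2) + '\n';
}

// Existing keys survive the merge:
// setModelInSettings('{"permissions":{"allow":["Bash"]}}', 'gemma4:latest')
// yields an object with both `permissions` and `model`.
```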
 ## 5. Build and restart
 ```bash
 export PATH="/opt/homebrew/bin:$PATH"
 pnpm run build
 launchctl unload ~/Library/LaunchAgents/com.nanoclaw.plist
 launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
 # Linux: systemctl --user restart nanoclaw
 ```
 ## 6. Verify
 Send a message to the agent. Then confirm:
 ```bash
 # Ollama shows the model as active
 curl -s http://localhost:11434/api/ps | grep -o '"name": *"[^"]*"'
 # Container has the blocked host entry and the right env vars
 CTR=$(docker ps --filter "name=nanoclaw-v2-<FOLDER>" --format "{{.Names}}" | head -1)
 docker inspect "$CTR" --format '{{json .HostConfig.ExtraHosts}}'
 docker exec "$CTR" env | grep ANTHROPIC
 ```
 Expected: `api.anthropic.com:0.0.0.0` in ExtraHosts, `ANTHROPIC_BASE_URL=http://host.docker.internal:11434`.
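The expected values can also be checked mechanically. The sketch below assumes the outputs of the two `docker` commands have already been captured; `checkOllamaRouting` is a hypothetical helper, not an existing NanoClaw function.
```typescript
// extraHosts: parsed JSON from `docker inspect ... .HostConfig.ExtraHosts` (null when unset)
// envLines:   lines printed by `docker exec "$CTR" env`
// expectBlocked: whether the user chose to block the Anthropic API in step 2
function checkOllamaRouting(extraHosts: string[] | null, envLines: string[], expectBlocked: boolean): string[] {
  const problems: string[] = [];
  if (expectBlocked && !(extraHosts ?? []).includes('api.anthropic.com:0.0.0.0')) {
    problems.push('api.anthropic.com is not blocked');
  }
  if (!envLines.includes('ANTHROPIC_BASE_URL=http://host.docker.internal:11434')) {
    problems.push('ANTHROPIC_BASE_URL does not point at Ollama');
  }
  return problems;
}
```
An empty result means both checks passed; each string in the result names one mismatch.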
 ## Reverting to Claude
 To switch back to the Anthropic API:
 1. Remove the `env` and `blockedHosts` keys from `groups/<FOLDER>/container.json`
 2. Remove `"model"` from the shared settings file
 3. Restart the service
 No rebuild needed — both files are read at container spawn time.
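Steps 1 and 2 amount to dropping a few keys from two JSON objects. A minimal sketch, with `omitKeys` as a hypothetical helper and the sample objects invented for illustration:
```typescript
// Returns a copy of obj without the listed top-level keys.
function omitKeys(obj: Record<string, unknown>, keys: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    if (!keys.includes(key)) out[key] = value;
  }
  return out;
}

// Sample inputs standing in for the parsed container.json and settings.json.
const revertedContainer = omitKeys({ env: { ANTHROPIC_API_KEY: 'ollama' }, blockedHosts: ['api.anthropic.com'] }, ['env', 'blockedHosts']);
const revertedSettings = omitKeys({ model: 'gemma4:latest' }, ['model']);
// Both results are empty objects here because the samples held nothing else.
```
If the group's `env` block carried unrelated variables before the skill ran, remove only the four Ollama-related entries instead of the whole `env` key.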
 ## Troubleshooting
 **Agent hangs, no response:** Ollama may be loading the model cold (large models take 10–30s).
 Watch `curl -s http://localhost:11434/api/ps` — the model appears once loaded.
 **"model not found" error in container logs:** The model name in settings.json doesn't match
 what Ollama has. Run `ollama list` on the host and use the exact name shown.
 **Responses claim to be Claude:** The model was trained on data that includes Claude conversations.
 Add a line to `groups/<FOLDER>/CLAUDE.md` telling it what model it runs on.
 **Agent responds but Ollama shows no activity:** The proxy bypass may not have taken effect, because some HTTP clients read only the lowercase `no_proxy`.
 Make sure both `NO_PROXY` and `no_proxy` are set in the env block.
--- a/docs/ollama.md
+++ b/docs/ollama.md
@@ -0,0 +1,88 @@
 # Running Agents on Local Ollama
 NanoClaw agents can be routed to a local [Ollama](https://ollama.com) instance instead of the Anthropic API. This cuts API costs to zero and keeps all inference on your hardware.
 ## How It Works
 Ollama exposes an Anthropic-compatible `/v1/messages` endpoint. The Claude Code CLI (which runs inside agent containers) uses the Anthropic SDK, which reads `ANTHROPIC_BASE_URL` to find the API host. Pointing that variable at Ollama is the core of the change: no new provider code and no changes to the agent runtime are needed.
 ```
 ┌─────────────────────────────┐
 │  Agent container            │
 │                             │
 │  Claude Code CLI            │
 │    ↓ ANTHROPIC_BASE_URL     │
 │    http://host.docker.      │      ┌──────────────────┐
 │    internal:11434    ───────┼─────▶│  Ollama :11434   │
 │                             │      │  gemma4:latest   │
 └─────────────────────────────┘      └──────────────────┘
 ```
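 `host.docker.internal` is a special hostname that resolves to the host machine from inside a container, so Ollama running on the host is reachable at that address. Docker Desktop (macOS, Windows) provides it by default; on Linux with plain Docker Engine it usually has to be mapped explicitly, for example with `--add-host=host.docker.internal:host-gateway`, unless the container runner already does so.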
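For orientation, here is roughly what a Messages API request aimed at Ollama looks like once the base URL is overridden. The header names and body fields follow the public Anthropic Messages API; that Ollama accepts them unchanged is this document's premise rather than something the sketch verifies. `buildMessagesRequest` and `MessagesRequest` are hypothetical names.
```typescript
interface MessagesRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so the shape is easy to inspect.
function buildMessagesRequest(baseUrl: string, apiKey: string, model: string, prompt: string): MessagesRequest {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, '')}/v1/messages`,
    headers: {
      'content-type': 'application/json',
      'x-api-key': apiKey, // placeholder value such as "ollama"
      'anthropic-version': '2023-06-01',
    },
    body: JSON.stringify({
      model,
      max_tokens: 256,
      messages: [{ role: 'user', content: prompt }],
    }),
  };
}
```
Passing the result to `fetch(req.url, { method: 'POST', headers: req.headers, body: req.body })` from inside the container is one way to test the route without involving the CLI.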
 ## The OneCLI Complication
 NanoClaw normally runs API calls through a OneCLI HTTPS proxy that injects real credentials in place of a placeholder key. When redirecting to Ollama you need to bypass that proxy so requests go direct. Two env vars handle this:
 - `NO_PROXY=host.docker.internal` — tells the Anthropic SDK's HTTP client to skip the proxy for that hostname
 - `no_proxy=host.docker.internal` — lowercase variant for tools that check the lowercase form
 Both are set in the agent group's `container.json` alongside `ANTHROPIC_BASE_URL`.
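How a client decides to skip the proxy can be illustrated with a simplified matcher. This is a sketch of the common `NO_PROXY` convention (comma-separated hostnames, suffix matching), not the actual logic of the Anthropic SDK's HTTP client; real clients differ on ports, CIDR ranges, and wildcards. `bypassesProxy` is a hypothetical name.
```typescript
// Returns true when hostname matches an entry in the NO_PROXY-style list.
function bypassesProxy(hostname: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;
  const host = hostname.toLowerCase();
  return noProxy
    .split(',')
    .map((entry) => entry.trim().toLowerCase().replace(/^\./, ''))
    .filter((entry) => entry.length > 0)
    .some((entry) => entry === '*' || host === entry || host.endsWith('.' + entry));
}

// With NO_PROXY=host.docker.internal, requests to Ollama skip the proxy
// while requests to any other host still go through it.
```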
 ## Network Isolation
 Setting `ANTHROPIC_BASE_URL` redirects requests but doesn't prevent a misconfigured agent from accidentally reaching `api.anthropic.com` directly. The `blockedHosts` field in `container.json` adds a Docker `--add-host` flag that maps the domain to `0.0.0.0` in the container's `/etc/hosts`, so direct connections to that name from inside the container fail:
 ```json
 "blockedHosts": ["api.anthropic.com"]
 ```
 With this in place, even if the model setting or the base URL override drifts back to Claude defaults, direct calls to `api.anthropic.com` fail immediately rather than silently billing your account.
 ## Model Selection
 The Claude Code CLI reads its model from `~/.claude/settings.json` inside the container, which NanoClaw bind-mounts from `data/v2-sessions/<agent-group-id>/.claude-shared/settings.json`. Set `"model": "gemma4:latest"` (or whatever Ollama model you've pulled) there. Use the exact name from `ollama list`.
 Model selection considerations for Apple Silicon:
 | Model | Parameters | Quality | Speed (M4 Pro) |
 |-------|------|---------|----------------|
 | `gemma4:latest` | 12B | Good general-purpose | Fast |
 | `qwen3-coder:latest` | 32B | Excellent for coding tasks | Moderate |
 | `llama3.2:latest` | 3B | Basic | Very fast |
 The agent uses tool calls extensively (read/write files, shell commands). Models that support tool use reliably work best. Gemma 4 and Qwen 3 Coder both handle structured tool calls well.
 ## What Changes at the Code Level
 Three files need to support this feature. See `/add-ollama-provider` for the exact changes.
 **`src/container-config.ts`** — `ContainerConfig` interface needs `env` and `blockedHosts` fields so the per-group JSON can carry them.
 **`src/container-runner.ts`** — At container spawn time, `env` entries become `-e KEY=VAL` Docker flags (applied after OneCLI's injected vars so they win), and `blockedHosts` entries become `--add-host HOST:0.0.0.0` flags.
 **`container/Dockerfile`** — The container runs as the host user's uid (e.g. 501 on macOS), not as the `node` user (uid 1000). The home directory must be `chmod 777` so any uid can write `~/.claude.json` and `~/.claude/settings.json`.
 ## Tradeoffs
 | | Ollama (local) | Anthropic API |
 |---|---|---|
 | Cost | Free | Pay-per-token |
 | Privacy | Fully local | Data sent to Anthropic |
 | Model quality | Good (open-weight) | Excellent (Claude) |
 | Cold start | 5–30s (model load) | ~1s |
 | Context window | Varies by model | 200k tokens (Sonnet) |
 | Tool use reliability | Good (large models) | Excellent |
 | Hardware req. | 16GB+ RAM | None |
 For personal automation on capable hardware, the tradeoff favors local. For complex multi-step tasks requiring large context or high reliability, Claude is still ahead.
 ## Reverting to Claude
 Remove the `env` and `blockedHosts` keys from `groups/<folder>/container.json`, remove `"model"` from the shared settings file, and restart the service. No rebuild needed.
 ## See Also
 - `/add-ollama-provider`: step-by-step skill to configure any agent group for Ollama
 - [Ollama OpenAI compatibility post](https://ollama.com/blog/openai-compatibility): upstream background on Ollama's API compatibility layer
 - `docs/architecture.md`: how the container spawn and env injection pipeline works