
Gary Walker 54a8648c95 feat: add model management tools to add-ollama-tool skill

Adds four new MCP tools to the existing ollama integration, consolidating
model management (from #1331) into the single add-ollama-tool skill as
requested by @gavrielc:

- ollama_pull_model  — pull a model from the Ollama registry
- ollama_delete_model — delete a local model to free disk space
- ollama_show_model  — inspect modelfile, parameters, and architecture
- ollama_list_running — list models loaded in memory with VRAM/processor info

All four tools follow the existing patterns in this file: OLLAMA_HOST env
var, ollamaFetch() with host.docker.internal fallback, log() and
writeStatus() helpers. No changes to index.ts or container-runner.ts
needed — OLLAMA_HOST is already forwarded via sdkEnv.

Also updates SKILL.md description, tool list, verify steps, and adds a
troubleshooting entry for large-model pull timeouts.

Closes #1331.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-26 12:08:54 +11:00


name: add-ollama-tool
description: Add Ollama MCP server so the container agent can call local models and manage the Ollama model library.

Add Ollama Integration

This skill adds a stdio-based MCP server that exposes local Ollama models as tools for the container agent. Claude remains the orchestrator but can offload work to local models, and can also manage the model library directly.

Tools added:

ollama_list_models — list installed models with name, size, family, and last modified date
ollama_generate — send a prompt to a specified model and return the response
ollama_pull_model — pull (download) a model from the Ollama registry by name
ollama_delete_model — delete a locally installed model to free disk space
ollama_show_model — show model details: modelfile, parameters, template, and architecture info
ollama_list_running — list models currently loaded in memory with memory usage and processor type
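
For orientation, the six tools could map onto Ollama's HTTP API roughly as follows. The endpoint paths and methods come from Ollama's public REST API; the mapping itself and the names ToolName, Endpoint, and endpointFor are assumptions for illustration and are not taken from ollama-mcp-stdio.ts.

```typescript
// Hypothetical mapping from MCP tool name to Ollama REST endpoint.
// The real server may organize this differently.
type ToolName =
  | "ollama_list_models"
  | "ollama_generate"
  | "ollama_pull_model"
  | "ollama_delete_model"
  | "ollama_show_model"
  | "ollama_list_running";

interface Endpoint {
  method: "GET" | "POST" | "DELETE";
  path: string;
}

const ENDPOINTS: Record<ToolName, Endpoint> = {
  ollama_list_models: { method: "GET", path: "/api/tags" },
  ollama_generate: { method: "POST", path: "/api/generate" },
  ollama_pull_model: { method: "POST", path: "/api/pull" },
  ollama_delete_model: { method: "DELETE", path: "/api/delete" },
  ollama_show_model: { method: "POST", path: "/api/show" },
  ollama_list_running: { method: "GET", path: "/api/ps" },
};

// Look up the endpoint a given tool would call.
function endpointFor(tool: ToolName): Endpoint {
  return ENDPOINTS[tool];
}
```

Keeping the mapping in one table makes it easy to check each tool against the Ollama API reference when debugging.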

Phase 1: Pre-flight

Check if already applied

Check if container/agent-runner/src/ollama-mcp-stdio.ts exists. If it does, skip to Phase 3 (Configure).

Check prerequisites

Verify Ollama is installed and running on the host:

ollama list

If Ollama is not installed, direct the user to https://ollama.com/download.

If no models are installed, suggest pulling one:

You need at least one model. I recommend:

ollama pull gemma3:1b    # Small, fast (1GB)
ollama pull llama3.2     # Good general purpose (2GB)
ollama pull qwen3-coder:30b  # Best for code tasks (18GB)

Phase 2: Apply Code Changes

Ensure upstream remote

git remote -v

If upstream is missing, add it:

git remote add upstream https://github.com/qwibitai/nanoclaw.git

Merge the skill branch

git fetch upstream skill/ollama-tool
git merge upstream/skill/ollama-tool

This merges in:

container/agent-runner/src/ollama-mcp-stdio.ts (Ollama MCP server)
scripts/ollama-watch.sh (macOS notification watcher)
Ollama MCP config in container/agent-runner/src/index.ts (allowedTools + mcpServers)
[OLLAMA] log surfacing in src/container-runner.ts
OLLAMA_HOST in .env.example

If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.

Copy to per-group agent-runner

Each existing group keeps a cached copy of the agent-runner source, so copy the merged files into every group's cache:

for dir in data/sessions/*/agent-runner-src; do
  cp container/agent-runner/src/ollama-mcp-stdio.ts "$dir/"
  cp container/agent-runner/src/index.ts "$dir/"
done

Validate code changes

npm run build
./container/build.sh

Both builds must complete without errors before proceeding.

Phase 3: Configure

Set Ollama host (optional)

By default, the MCP server connects to http://host.docker.internal:11434 (the host address under Docker Desktop) and falls back to localhost if that fails. To use a custom Ollama host, add to .env:

OLLAMA_HOST=http://your-ollama-host:11434
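
The host resolution and fallback described above can be sketched as follows. This is a minimal sketch, not the actual ollamaFetch() implementation: resolveHost, fetchWithFallback, and the injected doFetch parameter are hypothetical names, and the assumption that the fallback simply swaps host.docker.internal for localhost on the same port is mine.

```typescript
type Env = Record<string, string | undefined>;
type Fetcher<T> = (url: string) => Promise<T>;

const DEFAULT_HOST = "http://host.docker.internal:11434";

// Use OLLAMA_HOST when set, otherwise the Docker Desktop default.
// Trailing slashes are stripped so paths can be appended directly.
function resolveHost(env: Env): string {
  const configured = (env.OLLAMA_HOST ?? "").trim();
  return (configured || DEFAULT_HOST).replace(/\/+$/, "");
}

// Try the resolved host first; if the request fails and the host is
// host.docker.internal, retry once against localhost (assumed behavior).
async function fetchWithFallback<T>(
  host: string,
  path: string,
  doFetch: Fetcher<T>,
): Promise<T> {
  try {
    return await doFetch(`${host}${path}`);
  } catch (err) {
    if (host.includes("host.docker.internal")) {
      const fallback = host.replace("host.docker.internal", "localhost");
      return doFetch(`${fallback}${path}`);
    }
    throw err; // A custom host gets no fallback in this sketch.
  }
}
```

In a real server, doFetch would wrap the global fetch and env would be process.env; injecting both keeps the sketch testable without a running Ollama instance.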

Restart the service

launchctl kickstart -k gui/$(id -u)/com.nanoclaw  # macOS
# Linux: systemctl --user restart nanoclaw

Phase 4: Verify

Test inference

Tell the user:

Send a message like: "use ollama to tell me the capital of France"

The agent should use ollama_list_models to find available models, then ollama_generate to get a response.

Test model management

Send a message like: "pull the gemma3:1b model" or "which ollama models are currently loaded in memory?"

The agent should call ollama_pull_model or ollama_list_running respectively.

Monitor activity (optional)

Run the watcher script for macOS notifications when Ollama is used:

./scripts/ollama-watch.sh

Check logs if needed

tail -f logs/nanoclaw.log | grep -i ollama

Look for:

[OLLAMA] >>> Generating — generation started
[OLLAMA] <<< Done — generation completed
[OLLAMA] Pulling model: — pull in progress
[OLLAMA] Deleted: — model removed
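
If the grep output gets noisy, the four markers above can be classified programmatically. The marker strings are the ones listed; the OllamaEvent type, the classifyLogLine function, and the sample line in the usage note are hypothetical.

```typescript
type OllamaEvent = "generate_start" | "generate_done" | "pull" | "delete";

// Marker substrings as listed above, paired with an event label.
const MARKERS: Array<[string, OllamaEvent]> = [
  ["[OLLAMA] >>> Generating", "generate_start"],
  ["[OLLAMA] <<< Done", "generate_done"],
  ["[OLLAMA] Pulling model:", "pull"],
  ["[OLLAMA] Deleted:", "delete"],
];

// Return the event for a log line, or null if it is not an Ollama line.
function classifyLogLine(line: string): OllamaEvent | null {
  for (const [marker, event] of MARKERS) {
    if (line.includes(marker)) return event;
  }
  return null;
}
```

For example, a line such as "12:00:01 [OLLAMA] Pulling model: gemma3:1b" (an invented timestamp format) would classify as "pull", while unrelated lines return null.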

Troubleshooting

Agent says "Ollama is not installed"

The agent is trying to run the ollama CLI inside the container instead of using the MCP tools. This usually has one of the following causes:

The MCP server wasn't registered — check container/agent-runner/src/index.ts has the ollama entry in mcpServers
The per-group source wasn't updated — re-copy files (see Phase 2)
The container wasn't rebuilt — run ./container/build.sh
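
When checking index.ts, it helps to know roughly what to look for. The sketch below shows one plausible shape for the registration, assuming a stdio server entry with command and args and the mcp__<server>__<tool> naming that Claude's agent tooling uses for MCP tools. The function name ollamaMcpConfig, the use of node as the command, and the script path in the usage note are assumptions; the actual entry in index.ts may differ.

```typescript
// Hypothetical sketch: the real entry in index.ts may differ in shape and path.
interface StdioMcpServer {
  command: string;
  args: string[];
}

const OLLAMA_TOOLS = [
  "ollama_list_models",
  "ollama_generate",
  "ollama_pull_model",
  "ollama_delete_model",
  "ollama_show_model",
  "ollama_list_running",
] as const;

// Build the two pieces the agent needs: the server entry and the
// fully qualified tool names it is allowed to call.
function ollamaMcpConfig(serverScript: string) {
  const mcpServers: Record<string, StdioMcpServer> = {
    ollama: { command: "node", args: [serverScript] },
  };
  const allowedTools = OLLAMA_TOOLS.map((tool) => `mcp__ollama__${tool}`);
  return { mcpServers, allowedTools };
}
```

Calling ollamaMcpConfig with a placeholder path such as "/path/to/ollama-mcp-stdio.js" yields an ollama key under mcpServers and six mcp__ollama__ tool names. If either half is missing from index.ts, the agent cannot see the tools.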

"Failed to connect to Ollama"

Verify Ollama is running: ollama list
Check Docker can reach the host: docker run --rm curlimages/curl curl -s http://host.docker.internal:11434/api/tags
If using a custom host, check OLLAMA_HOST in .env

Agent doesn't use Ollama tools

The agent may not know about the tools. Try being explicit: "use the ollama_generate tool with gemma3:1b to answer: ..."

ollama_pull_model times out on large models

Large models (7B+) can take several minutes to download. The tool sets stream: false, so the call blocks until the pull completes; this is intentional. For very large pulls, run the host CLI directly: ollama pull <model>
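
To make the blocking behavior concrete, here is a sketch of the request a non-streaming pull might send. The /api/pull endpoint and the stream flag are part of Ollama's REST API; recent API docs name the model field model (older releases used name). The buildPullRequest function and the PullRequest type are hypothetical and not taken from the skill's source.

```typescript
interface PullRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// With stream: false, Ollama sends a single JSON response only after the
// whole download finishes, so the caller waits for the full pull.
function buildPullRequest(host: string, model: string): PullRequest {
  return {
    url: `${host}/api/pull`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, stream: false }),
    },
  };
}
```

Because nothing is returned until the download ends, any client-side timeout shorter than the download time will fire first, which is why the host CLI is the safer route for very large models.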
