- Add three-level isolation model (shared session, same agent, separate agent) with agent-shared session mode for cross-channel shared sessions
- Create /manage-channels skill for wiring channels to agent groups
- Refactor all 12 v2 channel skills: lean SKILL.md + VERIFY.md + REMOVE.md with structured Channel Info section for platform-specific metadata
- Create /add-discord-v2 skill (was missing)
- Add step 5a to setup SKILL.md invoking /manage-channels after channel install
- Update setup/verify.ts to check all 12 channel token types
- Add docs/v2-isolation-model.md explaining the isolation model
- Update v2-checklist.md and v2-setup-wiring.md to reflect completed work

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NanoClaw v2 Checklist
Status: [x] done, [~] partial, [ ] not started
Core Architecture
- Session DB replaces IPC (messages_in / messages_out as sole IO)
- Two-DB split: inbound.db (host-owned) + outbound.db (container-owned) — zero cross-process write contention
- Central DB (agent groups, messaging groups, sessions, routing)
- Host sweep (stale detection via heartbeat file, retry with backoff, recurrence scheduling)
- Active delivery polling (1s for running sessions)
- Sweep delivery polling (60s across all sessions)
- Container runner with session DB mounting
- Per-session container lifecycle and idle timeout
- Session resume (sessionId + resumeAt across queries)
- Graceful shutdown (SIGTERM/SIGINT handlers)
- Orphan container cleanup on startup
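
The host sweep item above combines stale detection with retry backoff. A minimal sketch of those two checks follows. The function names, the 30-second staleness threshold, and the backoff constants are assumptions for illustration, not values taken from the NanoClaw source.

```typescript
// Hypothetical thresholds: the real values are not stated in this checklist.
const STALE_AFTER_MS = 30_000; // assumed heartbeat staleness threshold
const BASE_DELAY_MS = 1_000;   // assumed delay before the first retry
const MAX_DELAY_MS = 60_000;   // assumed upper bound on any retry delay

/** A session counts as stale when its heartbeat file was last touched too long ago. */
function isStale(heartbeatMtimeMs: number, nowMs: number): boolean {
  return nowMs - heartbeatMtimeMs > STALE_AFTER_MS;
}

/** Exponential backoff: the delay doubles per attempt and is capped at MAX_DELAY_MS. */
function retryDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}
```

In a real sweep the heartbeat timestamp would likely come from the file's modification time (for example via `fs.statSync(path).mtimeMs`), and the attempt counter would be stored alongside the message row.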
Agent Runner (Container)
- Poll loop (pending messages, status transitions, idle detection)
- Concurrent follow-up polling while agent is thinking
- Message formatter (chat, task, webhook, system kinds)
- Command categorization (admin, filtered, passthrough)
- Transcript archiving (pre-compact hook)
- XML message formatting with sender, timestamp
- [~] Media handling inbound (formatter references attachments, no download-from-URL)
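
To illustrate the XML message formatting item above, here is a small sketch of a formatter that wraps a message with its sender and timestamp. The `<message>` element name, the attribute names, and the `InboundMessage` shape are assumptions; the checklist only states that messages are formatted as XML with sender and timestamp.

```typescript
// Hypothetical message shape; the four kinds mirror the formatter item in the list above.
interface InboundMessage {
  kind: "chat" | "task" | "webhook" | "system";
  sender: string;
  timestamp: string; // ISO 8601 string, assumed
  text: string;
}

/** Escape the characters that would otherwise break XML text or attribute values. */
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Render one message as a single XML element (element and attribute names are assumed). */
function formatMessage(m: InboundMessage): string {
  return (
    `<message kind="${m.kind}" sender="${escapeXml(m.sender)}" ` +
    `timestamp="${escapeXml(m.timestamp)}">${escapeXml(m.text)}</message>`
  );
}
```

Escaping user-supplied text matters here because the formatted message is fed to the agent as structured input; unescaped angle brackets could be mistaken for message boundaries.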
Agent Providers
- Claude provider (Agent SDK, tool allowlist, message stream, session resume)
- Mock provider (testing)
- Provider factory
- Codex provider
- OpenCode provider
Channel Adapters
- Channel adapter interface (setup, deliver, teardown, typing)
- Chat SDK bridge (generic, works with any Chat SDK adapter)
- Chat SDK SQLite state adapter (KV, subscriptions, locks, lists)
- Discord via Chat SDK
- [~] Slack via Chat SDK (adapter + skill written, not tested)
- [~] Telegram via Chat SDK (adapter + skill written, not tested)
- [~] Microsoft Teams via Chat SDK (adapter + skill written, not tested)
- [~] Google Chat via Chat SDK (adapter + skill written, not tested)
- [~] Linear via Chat SDK (adapter + skill written, not tested)
- [~] GitHub via Chat SDK (adapter + skill written, not tested)
- [~] WhatsApp Cloud API via Chat SDK (adapter + skill written, not tested)
- [~] Resend (email) via Chat SDK (adapter + skill written, not tested)
- [~] Matrix via Chat SDK (adapter + skill written, not tested)
- [~] Webex via Chat SDK (adapter + skill written, not tested)
- [~] iMessage via Chat SDK (adapter + skill written, not tested)
- Backward compatibility with native channels (old adapters still work)
- Channel barrel wired (src/index.ts imports barrel, skills uncomment)
- Setup flow wired to v2 channels (channel skills + /manage-channels for registration + verify.ts checks all tokens)
- Channel Info metadata in each channel skill (type, terminology, how-to-find-id, isolation defaults)
- /manage-channels skill (wire channels to agent groups with three isolation levels)
- Agent-shared session mode (cross-channel shared sessions, e.g. GitHub + Slack)
- Setup vs production channel separation
- Generate visual diagram of customized instance at end of setup
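
The channel adapter interface in the list above names four operations: setup, deliver, teardown, and typing. One possible shape for that contract is sketched below, together with an in-memory adapter of the kind that could back tests. All type names, parameter shapes, and the choice to make `typing` optional are assumptions rather than the project's actual definitions.

```typescript
// Hypothetical outbound message shape.
interface OutboundMessage {
  platformId: string;
  threadId: string | null;
  text: string;
}

// Assumed contract: each method may be synchronous or return a promise.
interface ChannelAdapter {
  setup(): void | Promise<void>;
  deliver(message: OutboundMessage): void | Promise<void>;
  teardown(): void | Promise<void>;
  typing?(platformId: string, threadId: string | null): void | Promise<void>;
}

/** In-memory adapter that records deliveries instead of calling a platform API. */
class InMemoryAdapter implements ChannelAdapter {
  readonly delivered: OutboundMessage[] = [];
  private ready = false;

  setup(): void {
    this.ready = true;
  }

  deliver(message: OutboundMessage): void {
    if (!this.ready) throw new Error("adapter not set up");
    this.delivered.push(message);
  }

  teardown(): void {
    this.ready = false;
  }
}
```

Keeping the contract this small is what would let a generic bridge wrap any platform adapter behind the same four calls.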
Routing
- Inbound routing (platform ID + thread ID -> agent group -> session)
- Auto-create messaging group on first message
- Session resolution (shared vs per-thread modes)
- Message writing to session DB with seq numbering
- Container waking on new message
- [~] Trigger rule matching (router picks highest-priority agent, regex/mention matching TODO)
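
The inbound routing path above (platform ID + thread ID -> agent group -> session) and the shared vs per-thread modes can be sketched as a lookup that produces a session key. The key format, the `MessagingGroup` fields, and returning `null` for unknown platforms are assumptions; per the list above, the real router auto-creates a messaging group on first message instead.

```typescript
type SessionMode = "shared" | "per-thread";

// Hypothetical record linking a platform conversation to an agent group.
interface MessagingGroup {
  platformId: string;
  agentGroupId: string;
  sessionMode: SessionMode;
}

/**
 * Resolve which session an inbound message belongs to.
 * - shared: every thread in the messaging group maps to one session.
 * - per-thread: each thread gets its own session.
 */
function resolveSessionKey(
  groups: Map<string, MessagingGroup>,
  platformId: string,
  threadId: string | null,
): string | null {
  const group = groups.get(platformId);
  if (!group) return null; // the real router would create the group here
  if (group.sessionMode === "per-thread" && threadId !== null) {
    return `${group.agentGroupId}:${platformId}:${threadId}`;
  }
  return `${group.agentGroupId}:${platformId}`;
}
```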
Rich Messaging
- Interactive cards with buttons (ask_user_question)
- Native platform rendering (Discord embeds, buttons)
- Message editing
- Emoji reactions
- File sending from agent (outbox -> delivery)
- File upload delivery (buffer-based via adapter)
- Markdown formatting
- [~] Formatted /usage, /context, /cost output (commands pass through, no rich card formatting)
- Context window visibility: show position in context, approaching compaction, when compaction happens, post-compaction state
- Threading and replies support
MCP Tools (Container)
- send_message (text, optional cross-channel targeting)
- send_file (copy to outbox, write messages_out)
- edit_message
- add_reaction
- send_card
- ask_user_question (blocking poll for response)
- schedule_task (with process_after and recurrence)
- list_tasks
- cancel_task / pause_task / resume_task
- send_to_agent (writes message, routing incomplete)
Scheduling
- One-shot scheduled messages (process_after / deliver_after)
- Recurring tasks via cron expressions
- Host sweep picks up due messages and advances recurrence
- Scheduled outbound messages (no container wake needed)
- [~] Pre-agent scripts (task kind with script field, documented but not verified)
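
The sweep behaviour described above (pick up due messages, then advance recurrence) might look roughly like the following. The row shape and status values are assumptions, and cron parsing is deliberately left to a caller-supplied `nextRun` function, since the checklist does not say which cron library is used.

```typescript
// Hypothetical row shape; field names loosely follow process_after and recurrence above.
interface ScheduledMessage {
  id: number;
  processAfter: number;      // epoch milliseconds
  recurrence: string | null; // cron expression, or null for one-shot
  status: "pending" | "done";
}

// Caller-supplied function that computes the next run time for a cron expression.
type NextRun = (cron: string, afterMs: number) => number;

/** Return snapshots of the messages due now, then reschedule or complete each one. */
function sweepDue(
  messages: ScheduledMessage[],
  nowMs: number,
  nextRun: NextRun,
): ScheduledMessage[] {
  const due: ScheduledMessage[] = [];
  for (const m of messages) {
    if (m.status !== "pending" || m.processAfter > nowMs) continue;
    due.push({ ...m }); // snapshot before mutating the stored row
    if (m.recurrence !== null) {
      m.processAfter = nextRun(m.recurrence, nowMs); // recurring: advance to next run
    } else {
      m.status = "done"; // one-shot: never picked up again
    }
  }
  return due;
}
```

Mutating the row in place is a simplification; an implementation backed by a session DB could equally insert a fresh row for the next occurrence.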
Permissions and Approval Flows
- Admin user ID per group
- Admin-only command filtering in container
- Approval flow (sensitive action -> card to admin -> approve/reject -> execute)
- Role definitions beyond admin (custom roles, per-group permissions)
- Configurable sensitive action list
- Non-main groups requesting sensitive actions
- Agent requests dependency/package install (persists via Dockerfile change, requires approval)
- Agent self-modification flow:
  - Agent requests code changes by delegating to a builder agent
  - Builder agent has write access to the requesting agent's code and Dockerfile
  - Approval modes: approve per-edit as builder works, or approve full diff at the end
  - Diff review card sent to admin showing all proposed changes
  - On approval: apply edits, rebuild container image, restart agent
  - On rejection: discard changes, notify requesting agent
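
The approval flow above (sensitive action -> card to admin -> approve/reject -> execute) can be modelled as a small state transition guarded by the group's admin user ID. The action names, the `ApprovalRequest` shape, and the error messages below are assumptions chosen for illustration.

```typescript
type ApprovalState = "pending" | "approved" | "rejected";

// Hypothetical request record created when an agent asks for a sensitive action.
interface ApprovalRequest {
  action: string;
  requestedBy: string;
  state: ApprovalState;
}

// Assumed sensitive action list; the checklist describes this list as configurable.
const SENSITIVE_ACTIONS = new Set<string>(["install_package", "modify_code"]);

/** Only actions on the sensitive list are routed through the admin. */
function requiresApproval(action: string): boolean {
  return SENSITIVE_ACTIONS.has(action);
}

/** Apply the admin's decision; a request can be decided exactly once. */
function decide(
  request: ApprovalRequest,
  adminUserId: string,
  deciderId: string,
  approve: boolean,
): ApprovalRequest {
  if (deciderId !== adminUserId) throw new Error("only the group admin may decide");
  if (request.state !== "pending") throw new Error("request already decided");
  return { ...request, state: approve ? "approved" : "rejected" };
}
```

Returning a new object rather than mutating the request keeps the pending record available for auditing, which is one reasonable design choice among several.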
Agent-to-Agent Communication
- [~] send_to_agent MCP tool (writes message, host-side routing TODO)
- Host delivery to target agent's session DB
- Agent spawning a new sub-agent
- Internal-only agents (no channel attached)
- Permission delegation from parent to child agent
- Specialist sub-agents (browser agent, dev agent — user's agent delegates with request/approval)
In-Chat Agent Management
- /clear (resets session)
- /compact (triggers context compaction)
- [~] /context (passes through, no rich formatting)
- [~] /usage (passes through, no rich formatting)
- [~] /cost (passes through, no rich formatting)
- Smooth session transitions: load context into new sessions, solve cold start problem
- MCP/package installation from chat
- Browse MCP marketplace / skills repository from chat
Webhook Ingestion
- Generic webhook endpoint for external events
- GitHub webhook handling
- CI/CD notification handling
- Webhook -> messages_in routing
System Actions
- register_group from inside agent (stub exists)
- reset_session from inside agent (stub exists)
Integrations
- Vercel CLI integration in setup process
- Skills for deploying and managing Vercel websites from chat
- Office 365 integration (create/edit documents with inline suggestions)
Memory
- Shared memory with approval flow (write to global memory requires admin approval)
Migration
- v1 -> v2 migration skill
- Database migration (v1 SQLite -> v2 central DB + session DBs)
- Channel credential preservation
- Custom skill/code porting
Testing
- DB layer tests (agent groups, messaging groups, sessions, pending questions)
- Channel registry tests
- Poll loop / formatter tests
- Integration test (container agent-runner)
- Host core tests
- End-to-end flow tests (message in -> agent -> message out -> delivery)
- Delivery polling tests
- Host sweep tests (stale detection, recurrence)
- Multi-channel integration tests
Rollout
- Internal testing across all channels
- Migration skill built and tested
- PR factory migrated as validation
- Blog post / announcement
- Video demos of key flows
- Vercel coordination