feat: add migrate-v2.sh — standalone v1 → v2 migration script

New entry point: `bash migrate-v2.sh` from the v2 checkout.
Replaces the old setup-embedded migration flow with a standalone
4-phase script + rewritten Claude skill for the interactive parts.

Phase 0: Bootstrap (Node/pnpm/deps via setup.sh) + find v1
Phase 1: Core state (env, DB, groups, sessions, tasks)
Phase 2: Channels (clack multiselect, auth copy, code install)
Phase 3: Infrastructure (OneCLI, auth, Docker, skills, container build)
Service switchover: stop v1 → start v2 → test → keep or revert
Phase 4: Handoff → exec claude "/migrate-from-v1"

The skill handles: owner seeding, access policy, CLAUDE.local.md
cleanup, container config validation, fork customization porting.

Key fixes found during testing:
- triggerToEngage: requires_trigger=0 must override non-empty pattern
- unknown_sender_policy defaults to 'public' (strict drops all msgs
  before owner is seeded)
- Service revert must stop v2 (parse unit name from step log, not
  early tsx one-liner that can fail)
- Session continuity: copy JSONL from -workspace-group/ to
  -workspace-agent/ and write continuation:claude into outbound.db
- container_config.additionalMounts written directly to container.json
  (same shape in v1 and v2)
- EXIT trap writes handoff.json; explicit write_handoff before exec

Includes migrate-v2-reset.sh for dev iteration and docs/migration-dev.md
for testing/debugging reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
exe.dev user
2026-05-01 20:13:38 +00:00
parent 36e731c02d
commit 1d73b2986a
13 changed files with 1976 additions and 76 deletions

View File

@@ -1,55 +1,69 @@
---
name: migrate-from-v1
description: Finish migrating a NanoClaw v1 install into v2. Run this after `bash nanoclaw.sh` has completed its automated migration step. Seeds the owner user, applies v1 access defaults, fixes any migration sub-step that didn't finish, and interviews the user about custom v1 code to port forward. Triggers on "migrate from v1", "finish migration", "v1 migration", or automatically after setup when `logs/setup-migration/handoff.json` exists.
description: Finish migrating a NanoClaw v1 install into v2. Run after `bash migrate-v2.sh` completes. Seeds the owner, cleans up CLAUDE.local.md files, reconciles container configs, and helps port custom v1 code. Triggers on "migrate from v1", "finish migration", "v1 migration".
---
# Migrate from v1 to v2
# Finish v1 v2 migration
> ⚠️ **Experimental.** This skill and the setup migration step are early. Remind the user to back up `data/v2.db` + `groups/` before making destructive changes, and prefer small, reversible edits. Not recommended yet for high-stakes production installs.
`bash migrate-v2.sh` already ran the deterministic migration. It handled:
The setup flow's `migration` step (in `setup/migrate-v1.ts`) already ran a best-effort automated pass. Your job is to finish what it couldn't do automatically, then interview the user about any custom code they had in v1 and help port it forward.
- .env keys merged
- v2 DB seeded (agent_groups, messaging_groups, wiring)
- Group folders copied (v1 CLAUDE.md → v2 CLAUDE.local.md)
- Session data copied with conversation continuity
- Scheduled tasks ported
- Channel code installed
- Container skills copied
- Container image built
Read [docs/v1-to-v2-changes.md](../../../docs/v1-to-v2-changes.md) before doing anything — it's the vocabulary for where v1 things moved to in v2.
Your job is the parts that need human judgment: triage any failed steps, seed the owner, clean up CLAUDE.local.md files, reconcile configs, and port any fork customizations.
## What the automation did
Read `logs/setup-migration/handoff.json` first — it has `overall_status`, per-step results in `steps`, and a `followups` list.
The setup flow ran these sub-steps (each as its own progression-log entry):
## Phase 0: Triage failed steps
| Sub-step | What it did |
|----------|-------------|
| `migrate-detect` | Found v1 install on disk (scanned `~/nanoclaw`, `~/.nanoclaw`, `~/Code/nanoclaw`, etc., or `$NANOCLAW_V1_PATH`). |
| `migrate-validate` | Checked v1 DB has expected tables + required columns. |
| `migrate-db` | Seeded `agent_groups` + `messaging_groups` + `messaging_group_agents` from `registered_groups`. Mapped `trigger_pattern`/`requires_trigger``engage_mode`/`engage_pattern`. Did NOT seed `users`/`user_roles`. |
| `migrate-groups` | Copied v1 `groups/<folder>/` to v2. v1 `CLAUDE.md` → v2 `CLAUDE.local.md`. v1 `container_config` JSON → `.v1-container-config.json` sidecar (don't silent-map to v2's `container.json`). |
| `migrate-env` | Merged v1 `.env` keys into v2 `.env` (never overwrote existing keys). |
| `migrate-channel-auth` | Copied non-env auth state per channel (Baileys keystore, matrix state, etc.) based on `CHANNEL_AUTH_REGISTRY` in `setup/migrate-v1/shared.ts`. |
| `migrate-channels` | Ran `setup/install-<channel>.sh` for each channel detected in `registered_groups`. |
| `migrate-tasks` | Ported active v1 `scheduled_tasks` into each session's `inbound.db` as `kind='task'` rows. Inactive tasks dumped to `inactive-tasks.json` for reference. |
Check `handoff.json``overall_status`. If `"success"`, skip to Phase 1.
## Artifacts to read first
If `"partial"`, walk `handoff.steps` — each has `status` and `log` (path to the raw log file). For each failed step:
- `logs/setup-migration/handoff.json`**start here.** Structured summary of every sub-step: `status`, `fields`, `notes`, plus detected channels, group selection, and a top-level `followups` list. The top-level `overall_status` tells you at a glance what kind of session this is.
- `logs/setup.log` — the progression log. Each `migrate-*` sub-step has one entry with status, duration, and a pointer to its raw log.
- `logs/setup-steps/NN-migrate-*.log` — raw per-sub-step stdout+stderr. Read these when a step failed or you need to understand why.
- `logs/setup-migration/schema-mismatch.json` — only exists if `migrate-validate` rejected the v1 DB shape. Describes what was missing.
- `logs/setup-migration/inactive-tasks.json` — v1 scheduled tasks we didn't migrate (completed, stopped, or unmappable schedule types).
1. Read its log file at `handoff.step_logs_dir/<step>.log`.
2. Explain what failed in one sentence.
3. Fix it if mechanical (re-run the step script, hand-write a DB insert, copy a missed file). The step scripts are at `setup/migrate-v2/<step>.ts` and accept `<v1_path>` as the first argument.
4. Use `AskUserQuestion` when a judgment call is needed.
## Flow
Common failures:
- **1b-db failed**: JID couldn't be parsed. Ask the user for the channel type, insert `agent_groups` + `messaging_groups` manually.
- **1d-sessions failed**: v2 DB wasn't seeded yet. Re-run after fixing 1b.
- **1e-tasks failed**: session doesn't exist yet. Re-run after fixing 1d.
- **2c-install-\<channel\> failed**: `git fetch origin channels` may have failed (network). Try again, or ask the user to run manually.
- **3e-container-build failed**: Docker issue. Read the build log, suggest fixes.
### Phase A — always run: owner seeding + access policy
After resolving all failures, proceed to Phase 1.
The automation deliberately did not seed `users`, `user_roles`, or flip `messaging_groups.unknown_sender_policy`. v1 has no ground truth for who the owner is, and no single source for the "anyone can message / only known users" setting. Ask the user.
## Phase 1: Owner and access
1. Read `handoff.json` `detected_channels` to know which channel(s) to address the user on.
2. Use `AskUserQuestion` to ask "Which handle on `<primary channel>` is yours?" with options pulled from context if you have any hints (e.g. recent v1 message senders), plus "Let me type it" and "Use a different channel." Build the user id as `<channel_type>:<handle>`.
3. Insert into v2 central DB (`data/v2.db`):
- `users(id, kind, display_name, created_at)` — use the channel_type as `kind`.
- `user_roles(user_id, role='owner', agent_group_id=NULL, granted_by=NULL, granted_at=now)`.
4. Ask "In v1, could anyone message your assistant, or only known users?" via `AskUserQuestion`:
- "Anyone could message it" → update every row in `messaging_groups` (for migrated channel_types) to `unknown_sender_policy='public'`.
- "Only known users" → leave `unknown_sender_policy='strict'`; walk the user through seeding `agent_group_members` rows for each trusted handle they name.
v2 auto-creates a `users` row for every sender it sees (via `extractAndUpsertUser` in `src/modules/permissions/index.ts`). By the time this skill runs, the owner's row likely already exists — it just needs the `owner` role granted.
Use the DB helpers in `src/db/agent-groups.ts`, `src/db/messaging-groups.ts`, and `src/db/user-roles.ts` rather than hand-rolling SQL — they keep the companion `agent_destinations` and indexes correct. Always init the central DB first:
**User ID format**: always `<channel_type>:<platform_handle>`. Each channel populates this differently:
- **Telegram**: `telegram:<numeric_user_id>` (e.g. `telegram:6037840640`)
- **Discord**: `discord:<snowflake_user_id>` (e.g. `discord:123456789012345678`)
- **WhatsApp**: `whatsapp:<phone>@s.whatsapp.net` (e.g. `whatsapp:14155551234@s.whatsapp.net`)
- **Slack**: `slack:<user_id>` (e.g. `slack:U04ABCDEF`)
- **Others**: `<channel_type>:<platform_id>`
**Steps:**
1. Query `users` table: `SELECT id, kind, display_name FROM users`.
2. If exactly one user exists, confirm: `AskUserQuestion`: "Is `<display_name>` (`<id>`) you?" — Yes / No, let me type it.
3. If multiple users exist, present them as options in `AskUserQuestion`.
4. If no users exist yet (service hasn't received a message), ask the user to send a test message first, then re-query.
5. Once confirmed, check `user_roles` — if the owner role already exists, skip. Otherwise insert:
```sql
INSERT INTO user_roles (user_id, role, agent_group_id, granted_by, granted_at)
VALUES ('<user_id>', 'owner', NULL, NULL, datetime('now'))
```
Use the DB helpers in `src/db/user-roles.ts` — they keep indexes correct. Init the DB first:
```ts
import { initDb } from '../src/db/connection.js';
@@ -60,61 +74,144 @@ const db = initDb(path.join(DATA_DIR, 'v2.db'));
runMigrations(db);
```
### Phase B — branch on `handoff.json: overall_status`
### Access policy
**If `overall_status === 'success'`** and `followups` is empty: go straight to Phase C (customization interview).
After seeding the owner, discuss the access policy. v2's `messaging_groups.unknown_sender_policy` controls who can interact with the bot. `migrate-v2.sh` set it to `public` so the bot would respond during the switchover test, but the user may want to tighten it.
**Otherwise (partial, failed, or non-empty followups)**: walk `handoff.steps` and `handoff.followups` top-to-bottom. For each entry:
Present the options via `AskUserQuestion`:
- Read the step's `fields` and `notes` and its raw log (`logs/setup-steps/NN-<step>.log`).
- Explain the situation to the user in one sentence, then propose a fix.
- Do the fix yourself when it's mechanical (re-running an install script, seeding a missed `agent_destinations` row, re-copying a channel's auth files, manually translating an unsupported `schedule_type`). Use `AskUserQuestion` when a judgment call is needed (is this orphan channel worth keeping? is this v1 container_config still relevant?).
1. **Public** (current) — anyone can message the bot. Good for personal DM bots.
2. **Known users only** — only users in `agent_group_members` can trigger the bot. Others are silently dropped.
3. **Approval required** — unknown senders trigger an approval request to the owner. Good for group chats where you want to vet new members.
Common cases:
If the user picks option 2 or 3, seed the known users from v1's message history. The v1 database is at `<handoff.v1_path>/store/messages.db`. It has a `messages` table with `sender` and `sender_name` columns. For each group:
- **`migrate-validate` status=failed**: the v1 DB had an unexpected shape. Read `schema-mismatch.json`. If tables are missing, the user may have run a very old or customized v1 — ask before trying to salvage. If only columns are missing, you can often proceed by hand-writing the SELECT with the columns that exist.
- **`migrate-db` status=partial, SKIPPED>0**: some `registered_groups` rows didn't seed. The `notes` field of the step entry names each failed folder. Most commonly: a JID we couldn't parse. Ask the user whether to manually wire each.
- **`migrate-channels` status=partial, some entries `not_supported`**: v1 had channels v2 doesn't ship a skill for yet. Ask the user whether to keep the `messaging_groups` rows (they'll stay orphaned until v2 grows the adapter) or delete them.
- **`migrate-channel-auth` has `files_missing`**: for WhatsApp specifically, encryption sessions often can't survive the copy — tell the user a fresh pair may be needed via `/add-whatsapp`.
- **Per-folder `.v1-container-config.json` sidecars exist**: read each, discuss with the user, and translate to v2's `groups/<folder>/container.json` format.
```sql
-- v1: unique senders per chat (excluding bot messages)
SELECT DISTINCT sender, sender_name
FROM messages
WHERE chat_jid = '<v1_jid>' AND is_from_me = 0 AND sender IS NOT NULL
```
### Phase C — customizations (fork-aware)
The `sender` value is a platform handle (e.g. `6037840640` for Telegram). Build the v2 user ID by inferring the channel type from the chat JID prefix (use `parseJid` from `setup/migrate-v1/shared.ts`) and combining: `<channel_type>:<sender>`.
NanoClaw recommends running on a fork, so most real v1 installs have at least some customizations.
For each sender:
1. Upsert into `users(id, kind, display_name)` if not already present.
2. Insert into `agent_group_members(user_id, agent_group_id)` for each agent group wired to that messaging group.
**Start with divergence detection.** In the v1 repo at `handoff.v1_path`:
Show the user the list of senders being imported and let them deselect any they don't want.
Then update the messaging groups:
```sql
UPDATE messaging_groups SET unknown_sender_policy = '<chosen_policy>'
WHERE id IN (SELECT id FROM messaging_groups WHERE channel_type IN (<migrated_channels>))
```
## Phase 2: Clean up CLAUDE.local.md
The migration copied v1's entire CLAUDE.md into CLAUDE.local.md for each group. This file now contains v1 boilerplate that v2 handles through its own composed fragments (`container/CLAUDE.md` + `.claude-fragments/module-*.md`). The user's customizations are buried inside.
For each group that has a `CLAUDE.local.md`:
1. Read the file.
2. Read the v1 template it was based on. Determine which template by checking the v1 install:
- If the group had `is_main=1` in v1's `registered_groups`, the template was `groups/main/CLAUDE.md`
- Otherwise, the template was `groups/global/CLAUDE.md`
- The v1 path is in `handoff.json` → `v1_path`
3. Diff the file against the template. Identify sections that are:
- **Stock boilerplate** (identical to template) — remove. v2's fragments cover this.
- **User customizations** (added sections, modified sections) — keep.
4. The following v1 sections are now handled by v2 fragments and should be removed even if slightly modified:
- "What You Can Do" → v2 runtime system prompt
- "Communication" / "Internal thoughts" / "Sub-agents" → `container/CLAUDE.md` + `module-core.md`
- "Your Workspace" / workspace path references → `container/CLAUDE.md`
- "Memory" (the stock version) → `container/CLAUDE.md`
- "Message Formatting" → `container/CLAUDE.md`
- "Admin Context" → v2 uses `user_roles`, not is_main
- "Authentication" → v2 uses OneCLI
- "Container Mounts" → v2 mounts are different
- "Managing Groups" / "Finding Available Groups" / "Registered Groups Config" → v2 entity model, no IPC
- "Global Memory" → v2 has `.claude-shared.md` symlink
- "Scheduling for Other Groups" → `module-scheduling.md`
- "Task Scripts" → `module-scheduling.md`
- "Sender Allowlist" → v2 uses `unknown_sender_policy` + `user_roles`
5. Fix path references in kept sections:
- `/workspace/group/` → `/workspace/agent/`
- `/workspace/project/` → these paths don't exist in v2; discuss with the user
- `/workspace/ipc/` → gone; remove references
- `/workspace/extra/` → v2 uses `container.json` `additionalMounts`; keep but note the path may change
6. Keep the `# Name` heading and first paragraph (identity) — this is the user's agent personality.
7. Show the user the proposed new CLAUDE.local.md before writing it. Use `AskUserQuestion`: "Here's what I'd keep — look right?" with options to approve, edit, or keep the original.
If a CLAUDE.local.md has no user customizations (pure template copy), write a minimal file with just the identity heading.
## Phase 3: Container config
`migrate-v2.sh` writes `container.json` directly from v1's `container_config` (the `additionalMounts` shape is identical). If the v1 config was unparseable, it falls back to a `.v1-container-config.json` sidecar.
For each group, check:
1. If `container.json` exists, read it and verify the `additionalMounts` host paths are still valid on this machine. Flag any that don't exist.
2. If `.v1-container-config.json` exists (parse failure fallback), read it, discuss with the user, and write a proper `container.json`. Then delete the sidecar.
3. Check for `env` or `packages` fields — `env` may overlap with OneCLI vault, `packages` (apt/npm) are portable.
## Phase 4: Fork customizations
Check whether the user's v1 install was a customized fork.
```bash
cd <v1_path>
git remote -v # identify the upstream remote
git log --oneline <upstream>/main..HEAD # commits ahead of upstream
git remote -v
git log --oneline <upstream>/main..HEAD 2>/dev/null
```
If the log is **empty**: stock v1. Tell the user "no customizations to port" and skip the rest of Phase C.
If no commits ahead of upstream: stock v1, skip this phase.
If the log has commits, show them to the user and offer a scope via `AskUserQuestion`:
If there are commits:
1. **Mechanical** (recommended) — copy the portable categories (skills, docs), stash the rest as reference.
2. **Full interview** — walk each commit with me, decide one-by-one. Use `Explore` sub-agents for diffs > 10 files.
3. **Reference only** — stash everything to `docs/v1-fork-reference/`, copy nothing now.
**Portability rules of thumb:**
- **Portable**: `container/skills/*`, `.claude/skills/*`, `docs/*`, top-level config. Scan each with `scanForV1Patterns` (in `setup/migrate-v1/shared.ts`) before copying — clean ones land as-is, dirty ones get a followup.
- **Not portable**: `src/*` (host) and `container/agent-runner/src/*` (agent-runner). v2's architecture is fundamentally different (Node host with split session DBs vs v1's single process + IPC file queue). Stash to `docs/v1-fork-reference/` with a README explaining the v1→v2 mapping — **don't translate**. Mechanical translation is a trap; let the user rebuild the feature on v2 primitives.
- **Already handled**: `groups/*``migrate-groups` copied these and flagged v1 patterns. Don't redo in Phase C.
- **Case by case**: `package.json` deps — check whether v2 already has each; never add to v2's lockfile without approval (supply-chain `minimumReleaseAge` applies).
When stashing, write `docs/v1-fork-reference/README.md` with commits list, stashed source files, and the suggested porting plan.
1. Show the commit list to the user.
2. `AskUserQuestion`: "How do you want to handle your v1 customizations?"
- **Copy portable items** (recommended) — copy `container/skills/*`, `.claude/skills/*`, `docs/*`. Scan each with `scanForV1Patterns` from `setup/migrate-v1/shared.ts`.
- **Full walkthrough** — go commit by commit, decide together.
- **Reference only** — stash to `docs/v1-fork-reference/` for later.
3. Source code (`src/*`, `container/agent-runner/src/*`) is NOT portable — v2's architecture is fundamentally different. Stash to `docs/v1-fork-reference/` with a README explaining what each file did. Don't translate.
## Principles
- **Never silently copy code.** Read, explain, propose, apply. Show diffs before applying when non-trivial.
- **Credentials are masked when displayed** (first 4 + `...` + last 4 characters). The handoff file doesn't contain values; keep it that way.
- **The v1 checkout is read-only.** We never delete or modify `~/nanoclaw`. If the user wants to retire it later, that's a separate conversation.
- **No migration re-runs.** The `migrate-*` sub-steps are idempotent, but re-running them from inside this skill is almost always the wrong move — finish by hand. Re-running is for when the user re-runs `bash nanoclaw.sh`.
- **`handoff.json` is source of truth across context compactions.** If the conversation gets compacted mid-work, re-read it and `git status` to recover state. Do not maintain a separate state file.
- **v1 checkout is read-only.** Never modify files under `handoff.v1_path`.
- **Show before writing.** Show diffs/proposed content before modifying CLAUDE.local.md or container.json.
- **Mask credentials** when displaying (first 4 + `...` + last 4 characters).
- **`handoff.json` is the recovery point.** If context gets compacted, re-read it and `git status` to recover state.
## When you're done
## Setup steps you can run
- Delete `logs/setup-migration/handoff.json` once every followup is cleared and the user confirms. The file is a working document, not a record — if the user wants a record, offer to move it to `docs/migration-<date>.md` before deleting.
- Tell the user: if the service is running (check `launchctl list | grep nanoclaw` on macOS or `systemctl --user status nanoclaw*` on Linux), restart it so the seeded `users` / `user_roles` / any channel installs take effect.
The setup flow at `setup/index.ts` has individual steps you can invoke if something is missing or failed:
```bash
pnpm exec tsx setup/index.ts --step <name>
```
| Step | When to use |
|------|-------------|
| `onecli` | OneCLI not installed or not healthy |
| `auth` | No Anthropic credential in vault |
| `container` | Container image needs rebuild |
| `service` | Service not installed or not running |
| `mounts` | Mount allowlist missing |
| `verify` | End-to-end health check (run after everything else) |
| `environment` | System check (Node, dirs) |
## When done
1. Run the verify step to confirm everything works:
```bash
pnpm exec tsx setup/index.ts --step verify
```
2. Delete `logs/setup-migration/handoff.json` — offer to save as `docs/migration-<date>.md` first.
3. Restart the service if running so changes take effect:
```bash
# Linux
systemctl --user restart nanoclaw-v2-*
# macOS
launchctl kickstart -k gui/$(id -u)/com.nanoclaw-v2-*
```