add startup circuit breaker and troubleshooting docs

Backs off on rapid restarts to avoid exhausting Discord gateway identify
limits and triggering Cloudflare IP bans. Resets on clean shutdown so only
crashes accumulate the counter. Also adds a troubleshooting section to
CLAUDE.md with the most useful diagnostic locations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Daniel Milliner
2026-04-28 14:01:32 +00:00
parent ae9bcb7c33
commit 2bf296b04a
3 changed files with 95 additions and 1 deletions


@@ -186,7 +186,17 @@ launchctl kickstart -k gui/$(id -u)/com.nanoclaw # restart
systemctl --user start|stop|restart nanoclaw
```
Host logs: `logs/nanoclaw.log` (normal) and `logs/nanoclaw.error.log` (errors only — some delivery/approval failures only show up here).
## Troubleshooting

Check these first when something goes wrong:

| What | Where |
|------|-------|
| Host logs | `logs/nanoclaw.error.log` first (delivery failures, crash-loop backoff, warnings), then `logs/nanoclaw.log` for the full routing chain |
| Setup logs | `logs/setup.log` (overall), `logs/setup-steps/*.log` (per-step: bootstrap, environment, container, onecli, mounts, service, etc.) |
| Session DBs | `data/v2-sessions/<agent-group>/<session>/`: `inbound.db` (`messages_in`: did the message reach the container?), `outbound.db` (`messages_out`: did the agent produce a response?) |

Note: container logs are lost after the container exits (`--rm` flag). If the agent silently failed inside the container, there's no persistent log to inspect.
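
Because container logs do not survive, the session DBs are often the only record of what happened. The sketch below is one way to answer the two questions in the table above. It assumes the `.db` files are SQLite databases, which the docs do not state explicitly. The helper names `count_rows` and `session_summary` are hypothetical and not part of the codebase. It only counts rows, because the column layout of `messages_in` and `messages_out` is not documented here.

```python
import sqlite3
from pathlib import Path


def count_rows(db_path, table):
    """Return the row count of `table`, or None if the file or table is missing."""
    path = Path(db_path)
    if not path.is_file():
        return None
    # Open read-only so inspection cannot modify a session database.
    conn = sqlite3.connect(f"file:{path.as_posix()}?mode=ro", uri=True)
    try:
        found = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if found is None:
            return None
        # The table name was matched against sqlite_master above,
        # so interpolating it into the query is safe here.
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()


def session_summary(session_dir):
    """Count inbound and outbound messages for one session directory."""
    session = Path(session_dir)
    return {
        "messages_in": count_rows(session / "inbound.db", "messages_in"),
        "messages_out": count_rows(session / "outbound.db", "messages_out"),
    }
```

Calling `session_summary("data/v2-sessions/<agent-group>/<session>")` with the placeholders filled in returns a small dict. A non-zero `messages_in` with a zero `messages_out` suggests the message reached the container but the agent produced no response. A `None` value means the file or table was not found.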
## Supply Chain Security (pnpm)