Commit Graph

402 Commits

Author SHA1 Message Date
gavrielc
8771e259a8 style(cli): apply prettier formatting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 00:42:33 +03:00
gavrielc
a597b42648 feat(cli): add remaining resources, fix descriptions from code review
New read-only resources:
- destinations (agent-to-agent ACL + routing map)
- user-dms (DM channel cache)
- dropped-messages (audit trail for dropped messages)
- approvals (in-flight approval cards)

Description fixes from reading source:
- messaging-groups: add denied_at column (router checks it)
- sessions: fix container_status (idle is unused, stopped is
  auto-restarted by sweep)
- wirings: add note that threaded adapters force per-thread

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 00:40:15 +03:00
gavrielc
6865811147 feat(cli): add CRUD helper, resource definitions, and help command
Resource-first CLI: `nc groups list`, `nc wirings get <id>`, etc.
Seven resources defined (groups, messaging-groups, wirings, users,
roles, members, sessions) with full column documentation that serves
as the single source of truth for help output and arg validation.

- CRUD helper auto-registers list/get/create/update/delete from
  declarative resource definitions with generic SQL
- Custom operations for composite-PK resources (roles grant/revoke,
  members add/remove)
- Access model: open (reads) / approval (writes) / hidden
- `nc help` lists resources; `nc <resource> help` shows fields
- Positional target IDs: `nc groups get <id>`
- Removed unused priority column from wirings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 00:33:10 +03:00
gavrielc
bc19b716bf feat(cli): wire nc CLI commands into container agent
Add delivery action handler (cli_request) so the host dispatches CLI
commands arriving from container agents via outbound.db and writes
responses back to inbound.db. Add nc MCP tool in the agent-runner
following the ask_user_question blocking pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 23:48:39 +03:00
gavrielc
13f6fc2093 merge: catch up nc-cli to main
Resolve conflict in src/index.ts shutdown sequence — keep both
stopCliServer() from nc-cli and try/finally + resetCircuitBreaker()
from main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 17:24:26 +03:00
gavrielc
24d719fb88 Merge pull request #2209 from cfis/fix/host-sweep-test-uses-in-memory-db
fix(host-sweep): orphan-claim delete missed in tests (regression from #2183)
2026-05-05 15:57:31 +03:00
gavrielc
a870e7ebf2 fix: keep resetStuckProcessingRows private, restore test wrapper
The test wrapper forwards the in-memory outDb as the writable handle,
avoiding the filesystem reopen that fails in CI. The function stays
private — the optional writableOutDb param is an internal detail, not
a public API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-05 15:56:08 +03:00
Axel McLaren
6262211af1 Add namespacedPlatformId exclusion for DeltaChat
(cherry picked from commit 5987fdc189f60b5afa76fd08c3d01ccc8a7a3a43)
2026-05-04 06:26:46 -07:00
Gabi Simons
faff9ac0e3 Merge branch 'main' into fix/host-sweep-test-uses-in-memory-db 2026-05-03 17:53:25 +03:00
Gabi Simons
60526c971b Merge branch 'main' into fix/media-only-messages 2026-05-03 15:47:11 +03:00
Ziv Daniel
0e9dadfaee fix: accept media-only messages with empty text in onNewMessage
/./ requires at least one character and silently drops messages with no
text (e.g. Telegram photo/video/file sent without a caption). Switching
to /[\s\S]*/ matches the empty string too, so media-only messages now
reach the router and then the agent.
2026-05-03 15:40:46 +03:00
Charlie Savage
e4181f5451 fix(host-sweep): regression in #2183 — orphan-claim delete missed in tests
#2183 added orphan-claim cleanup that reopens `outbound.db` by session
path (`openOutboundDbRw(session.agent_group_id, session.id)`) so the
delete runs against a writable handle even when callers pass a readonly
one. That works for the production caller — there's a real on-disk
session DB at the expected path.

The test wrapper `_resetStuckProcessingRowsForTesting` (introduced in
the same series, #2151) is called with in-memory DBs that have no
on-disk path. The reopen creates a fresh empty file at
`<DATA_DIR>/v2-sessions/ag-test/sess-test/outbound.db`, runs the delete
against that, and leaves the in-memory `outDb` (which the test reads
afterward) untouched. The two `resetStuckProcessingRows — orphan claim
cleanup` tests assert `getProcessingClaims(outDb).toEqual([])` after
the call and fail on the row that's still there.

Fix: drop the `_…ForTesting` wrapper, export `resetStuckProcessingRows`
directly with an optional `writableOutDb` parameter. When omitted
(production), the function reopens `outbound.db` RW by session path —
existing behavior, existing safety guarantee. When provided (tests, or
any future caller that already holds a writable handle), the function
uses it directly and skips the reopen. The optional parameter has a
real meaning, not a "for tests" hack.

Public API surface change: `_resetStuckProcessingRowsForTesting` is
gone, `resetStuckProcessingRows` is now exported. No other callers
inside the repo besides the test.
2026-05-02 22:54:08 -07:00
Charlie Savage
8d022fd9da fix(host-sweep): reopen outbound DB as writable for orphan claim cleanup
PR #2151 added deleteOrphanProcessingClaims() to resetStuckProcessingRows(),
but outDb is always opened readonly (openOutboundDb uses immutable: true).
The write silently failed, leaving orphan processing_ack rows that block
future message delivery for the session.

Fix: add openOutboundDbRw() alongside the existing readonly opener and use
it in resetStuckProcessingRows() to open a short-lived writable handle just
for the delete. The readonly handle is still used for all reads above.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 23:44:07 -07:00
gavrielc
dab4fb497b Merge branch 'main' into fix/host-sweep-orphan-processing-ack 2026-05-01 18:42:04 +03:00
gavrielc
fc3c11b6b9 fix(session-manager): apply outbox path-confinement to inbound attachments
Mirrors the four defenses on the outbound side onto extractAttachmentFiles:

  1. Reject unsafe messageId via isSafeAttachmentName before any inbox path
     is built. WhatsApp passes msg.key.id through raw and that field is
     client generated, so a peer can craft it; future end to end encrypted
     adapters will have the same property.
  2. lstatSync on the inbox dir refuses a pre placed symlink before
     mkdirSync would silently follow it.
  3. realpathSync + isPathInside contains the resolved dir under the
     session inbox root.
  4. writeFileSync uses the wx flag so a pre placed symlink at the file
     path is refused atomically by the kernel; EEXIST surfaces as a
     logged skip.

Threat: the session dir is mounted writable into the container at
/workspace, so a compromised agent can pre place inbox/<future msgId>/
as a symlink and wait for a chat message with a matching id to redirect
the host write. The four guards together close that window.

Consolidates with the existing isSafeAttachmentName helper from
attachment-safety.ts rather than introducing a duplicate basename
validator inside session-manager.

Co-Authored-By: Daisuke Tsuji <dim0627@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 01:27:09 +03:00
hinotoi-agent
852009dcb1 fix(container): confine outbound attachment paths 2026-05-01 01:27:09 +03:00
Gabi Simons
5cf5840426 Merge branch 'main' into feat/channel-approval-flow 2026-04-30 14:11:21 +03:00
Ethan
7ce9922cde fix(host-sweep): clear orphan processing_ack on kill to prevent claim-stuck loop
When the host kills a container (absolute-ceiling, claim-stuck, or crashed),
resetStuckProcessingRows reset messages_in but left orphan rows in
processing_ack. The next sweep tick spawned a fresh container and, on the
same tick, ran enforceRunningContainerSla against outbound.db that still
contained the previous container's claim with a hours-old status_changed
timestamp — instant kill-claim, before the agent-runner could open
outbound.db to run its own clearStaleProcessingAcks(). Loop until tries
hit MAX_TRIES.

Add deleteOrphanProcessingClaims() in session-db and call it at the end of
resetStuckProcessingRows. Safe to write outbound.db here because the host
only enters this path after killContainer (or when no container is running).

Tests in host-sweep.test.ts cover the helper plus the regression: orphan
claim from a 2h-old kill is now removed atomically with the messages_in
reset, so the next sweep tick sees an empty claims list and the freshly
respawned container survives long enough to start its agent-runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:54:42 +02:00
gavrielc
d2151ae848 Merge branch 'main' into fix/session-manager-attachment-extensions 2026-04-30 10:39:50 +03:00
gavrielc
6e5e568da1 sanitize agent sent file names to prevent path traversal 2026-04-30 10:33:46 +03:00
gavrielc
2a3be9ec7f extract attachment-naming, harden mimeType guard, add tests
Move the MIME/type-to-extension maps and derivation helpers out of
session-manager.ts into a dedicated attachment-naming module — keeps
session-manager focused on session lifecycle and gives the helpers
a natural home for unit tests alongside the existing attachment-safety
module.

Two small fixes alongside the extraction:

- extForMime now guards `typeof mime !== 'string'` before .split, so a
  buggy bridge passing `mimeType: { ... }` (object) no longer crashes
  the inbound write loop.
- deriveAttachmentName computes Date.now() once per call instead of
  twice, and tightens the explicit-name check to a string-and-truthy
  guard so non-string values fall through to derivation.

Adds attachment-naming.test.ts with 11 cases covering MIME normalization
(case + parameters), Telegram type fallback, the non-string defensive
guard, and the bare-timestamp fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 09:41:24 +03:00
Gabi Simons
2b1b138a44 Merge branch 'main' into feat/channel-approval-flow 2026-04-30 09:37:54 +03:00
gavrielc
594d1b4055 style(cli): apply prettier formatting
Pre-commit hook ran prettier on the prior commit but left the reformats
unstaged. Folding them in here so the branch is clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:03:47 +03:00
gavrielc
3a3d2ee644 feat(cli): scaffold nc CLI with list-groups command
Adds a transport-agnostic CLI control plane shared between three eventual
callers (host shell, Claude in project, container agent) — though only the
host-side socket transport is wired in this commit. Container DB transport
and approval flow land alongside the first risky command.

- src/cli/frame.ts:        wire format (RequestFrame, ResponseFrame, CallerContext)
- src/cli/registry.ts:     command registry with RiskClass
- src/cli/dispatch.ts:     transport-agnostic dispatcher
- src/cli/transport.ts:    Transport interface
- src/cli/socket-client.ts: SocketTransport against data/nc.sock
- src/cli/socket-server.ts: host-side listener (chmod 0600, line-delimited JSON)
- src/cli/format.ts:       human table / --json output modes
- src/cli/client.ts:       `nc` argv -> frame -> transport -> stdout
- src/cli/commands/list-groups.ts: first command (riskClass: safe)
- bin/nc:                  bash launcher (resolves project root via symlink)
- src/index.ts:            start/stop server + import command barrel

`data/nc.sock` is intentionally separate from `data/cli.sock` (which the
existing chat-style channel adapter still owns).

Verified end-to-end: `nc list-groups`, `nc list groups`, `--json`,
unknown-command error, host-down ENOENT message with start instructions.
typecheck clean; eslint reports only the same `no-catch-all` warnings the
rest of the codebase has.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 18:03:16 +03:00
robbyczgw-cla
b9d302524e fix(session-manager): derive attachment extension from mimeType and att.type
When a channel bridge passes an attachment without an explicit `name`,
extractAttachmentFiles fell back to `attachment-<ts>` with no extension.
Agents could not tell whether the file was a JPEG, PDF, or audio clip,
and tools keyed on extension (image viewers, exiftool, etc.) misbehaved.

Two cases are now covered:

1. Channels that set `mimeType` but no `name` (Discord/Slack documents,
   Telegram document uploads). A small MIME-to-extension table covers
   the common content types — image/*, audio/*, video/*, pdf, zip,
   txt, json. Unknown MIMEs fall back to the unsuffixed name.

2. Channels that set `att.type` but no `mimeType` (Telegram photos,
   stickers, voice, animations). The chat-sdk bridge sets a coarse
   media-class (`photo` / `sticker` / `voice` / `video` /
   `animation`) which is reliable enough to derive a canonical
   extension. Telegram GIFs are MP4 under the hood.

The existing isSafeAttachmentName security guard is preserved — the
derived name still passes through it before disk I/O. The new lookup
tables emit static values from internal maps and cannot construct a
path-traversal payload; attacker-controlled att.name continues to flow
through the same validator.
2026-04-29 15:01:09 +00:00
gavrielc
3c620bc8d0 Merge branch 'fix/credential-failure-ux' of https://github.com/qwibitai/nanoclaw into fix/credential-failure-ux 2026-04-29 17:52:17 +03:00
gavrielc
d5b48e4742 fix(credentials): address review feedback
- wakeContainer now never throws — returns Promise<boolean>, catches
  internally. Closes the regression risk for the 5 awaited callers in
  agent-to-agent, interactive, and approvals/response-handler that the
  previous version left unwrapped. Router uses the boolean to stop the
  typing indicator on transient failure; host-sweep just awaits.
- Tighten AUTH_REQUIRED_RE: anchor to start-of-string with the specific
  `·` (U+00B7) separator the CLI uses, so an agent that quotes the
  banner mid-sentence in a normal reply doesn't trip the classifier.
- Log a one-line note from writeAuthRequiredMessage so substitutions
  are visible when debugging "user got the credentials message but I
  don't see why."
- Add unit tests for ClaudeProvider.isAuthRequired covering both banner
  variants, trailing content, mid-sentence quoting, leading-prose
  quoting, alternate separators, and unrelated text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 17:51:32 +03:00
gavrielc
1dd8fabde9 Merge branch 'main' into fix/credential-failure-ux 2026-04-29 17:42:25 +03:00
gavrielc
5f34e26240 fix(credentials): translate auth errors and require OneCLI for spawn
Two related fixes for the case where credentials aren't usable:

1. Replace Claude Code's "Not logged in / Invalid API key · Please run
   /login" output with a host-aware message. The user can't run /login
   from chat, so the raw text is unhelpful. Provider gains an optional
   isAuthRequired() classifier; the poll-loop substitutes the message
   on both result-text and error paths.

2. Treat OneCLI gateway failure as a transient hard error instead of
   spawning a credential-less container. The catch in container-runner
   now propagates; router and host-sweep wrap wakeContainer to log and
   leave the inbound row pending so the next 60s sweep tick retries.
   Router also stops the typing indicator on failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 17:02:15 +03:00
Gabi Simons
5dd15c0014 Merge branch 'main' into feat/channel-approval-flow 2026-04-29 16:34:31 +03:00
Gabi Simons
db19837740 feat(permissions): richer channel-approval flow with agent selection and free-text naming
Replace the hardcoded Approve/Ignore card with a multi-step flow:
- Single agent: "Connect to [name]" / "Connect new agent" / "Reject"
- Multiple agents: "Choose existing agent" (follow-up list) / "Connect new agent" / "Reject"
- "Connect new agent" prompts for a free-text name via DM, creates immediately on reply
- Add setMessageInterceptor router hook for capturing free-text replies
- Add resolveChannelName optional method to ChannelAdapter interface

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 13:34:10 +00:00
gavrielc
336e01d2a1 fix circuit-breaker off-by-one, ENOENT, and reset-on-throw + tests
- getDelay indexed by attempt (1-based) into a 0-indexed array, so the
  leading 0 was unreachable and every "after a crash" delay was shifted
  up one slot. Use attempt - 1 so the documented schedule (0s → 0s →
  10s → 30s → 2min → 5min → 15min cap) actually holds.
- enforceStartupBackoff runs before initDb (which creates DATA_DIR), so
  on a fresh checkout fs.writeFileSync hit ENOENT. write() now
  mkdirSync's DATA_DIR first.
- shutdown() didn't run resetCircuitBreaker if teardownChannelAdapters
  threw, so a graceful exit with a teardown error would be counted as a
  crash on the next start. Wrap teardown in try/finally.
- Adds src/circuit-breaker.test.ts: state transitions, full schedule
  (parameterized), reset-window expiry, malformed file, and the
  fresh-install path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 22:51:11 +03:00
Daniel Milliner
2bf296b04a add startup circuit breaker and troubleshooting docs
Backs off on rapid restarts to avoid exhausting Discord gateway identify
limits and triggering Cloudflare IP bans. Resets on clean shutdown so only
crashes accumulate the counter. Also adds a troubleshooting section to
CLAUDE.md with the most useful diagnostic locations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 14:07:24 +00:00
gavrielc
7e37b13aab Fix path traversal in attachment handling on channel-inbound path 2026-04-28 13:26:44 +03:00
gavrielc
6591062fbb refactor: route custom Anthropic endpoint through OneCLI vault
The original approach passed ANTHROPIC_AUTH_TOKEN into the container
as an env var and disabled the proxy for the custom host (NO_PROXY) —
which works, but bypasses OneCLI entirely for that credential. The
container holds the raw secret, the gateway loses audit/rotation, and
we lose the rest of the vault's protections for this cohort.

OneCLI-native version: store the token as a generic secret with header
injection (--header-name Authorization --value-format 'Bearer {value}'
+ host-pattern matching the base URL hostname). The container only
needs ANTHROPIC_BASE_URL plus a placeholder ANTHROPIC_AUTH_TOKEN — the
proxy rewrites the Authorization header on the wire.

setup/lib/setup-config.ts — adds --anthropic-auth-token alongside the
existing --anthropic-base-url.

setup/auto.ts — runAuthStep short-circuits the auth-method prompt when
both NANOCLAW_ANTHROPIC_BASE_URL and NANOCLAW_ANTHROPIC_AUTH_TOKEN are
set: creates the OneCLI generic secret, writes ANTHROPIC_BASE_URL to
.env (so the runtime reads it), and appends `import './claude.js';` to
src/providers/index.ts (so the provider only registers when the user
has configured a custom endpoint — no branching for everyone else).

src/providers/claude.ts — drops ANTHROPIC_AUTH_TOKEN/NO_PROXY
passthrough. Reads ANTHROPIC_BASE_URL from .env, sets a placeholder
ANTHROPIC_AUTH_TOKEN in container env so the SDK includes an
Authorization header for OneCLI to overwrite.

src/providers/index.ts — removes the unconditional import; setup
appends it on demand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 00:34:31 +03:00
KeXin95
26fc3ff322 feat: pass ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN into agent containers
Users with a custom Anthropic-compatible endpoint (ANTHROPIC_BASE_URL) were
getting 401s because the OneCLI proxy injects ANTHROPIC_API_KEY=placeholder
and forwards to api.anthropic.com, overriding the custom endpoint and key.

Add a claude provider host config that reads ANTHROPIC_BASE_URL,
ANTHROPIC_AUTH_TOKEN, and CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC from .env
and passes them into the container. Also sets NO_PROXY for the custom host so
the OneCLI proxy doesn't intercept those requests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-27 00:32:16 +03:00
gavrielc
2825f657ca Merge branch 'main' into fix/register-channel-wiring 2026-04-24 17:20:29 +03:00
gavrielc
f804ebf2e9 Merge branch 'main' into fix/session-state-per-provider-and-agent-route-files 2026-04-24 17:13:06 +03:00
grtwrn
fc375ca72b fix(register): wire channels with correct engage fields, skip prefix for native IDs
setup/register.ts had two bugs that prevented new channels from being
registered via `/manage-channels`:

1. createMessagingGroupAgent was called with the legacy field names
   `trigger_rules` and `response_scope`. The SQL INSERT expects
   `engage_mode` / `engage_pattern` / `sender_scope` / `ignored_message_policy`
   (migration 010). Every register call failed with
   `RangeError: Missing named parameter "engage_mode"` after the agent
   and messaging group were partially created — leaving an orphaned pair.

   Now mirrors scripts/init-first-agent.ts:wireIfMissing:
   - Groups (is_group=1) default to engage_mode='mention' (bot only
     responds when addressed).
   - DMs (is_group=0) default to engage_mode='pattern' with '.' (respond
     to every message).
   - An explicit --trigger overrides the pattern regex.

2. The "normalize platform_id" block unconditionally prefixed
   "<channel>:" even for native IDs like WhatsApp JIDs
   ("120363408974444974@g.us"), iMessage emails ("user@example.com"),
   or Signal phones ("+15551234567") / Signal groups ("group:abc"). But
   the router (src/router.ts:158) looks up messaging_groups by the raw
   event.platformId from the adapter, which for these native adapters
   never has a prefix. So the prefixed row was never matched — the
   message was silently dropped with no "Message routed" log.

   Extracted scripts/init-first-agent.ts:namespacedPlatformId into
   src/platform-id.ts so both setup paths use the same heuristic (skip
   the prefix for IDs containing '@', starting with '+', or starting
   with 'group:'). Prevents future drift between the two paths.

Tested by: re-running `setup/index.ts --step register` for a WhatsApp
group JID, confirming the row is created with correct engage fields
and matching platform_id, then sending a test message and observing
"Message routed" with the right agent group.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 17:06:10 +03:00
glifocat
3d6837c411 chore(format): apply prettier to chat-sdk-bridge.ts
Two long-line violations introduced in d121cd1 (isGroup plumbing)
exceed the printWidth limit. CI format:check fails on every PR
opened against main until this is fixed; the fix is isolated here
so no behavior change is mixed in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 12:12:05 +02:00
Adam
fd03b89333 fix(agent-route): reject unsafe attachment filenames to prevent path traversal
Filenames in forwardAttachedFiles arrived from the source agent's
messages_out content and were used directly in path.join on both
source outbox read and target inbox write. A value like `../evil.sh`
could escape `inbox/<a2a-id>/` on the target session (and similarly
the source outbox on read), breaking session isolation — an
adversarial or hallucinating sub-agent could overwrite files in
a sibling session.

Adds isSafeAttachmentName(name) — exported so it's unit-testable —
which rejects empty, `.`, `..`, anything containing `/`, `\`, or
NUL, and anything path.basename would strip. Guard runs before any
I/O. Unsafe names are dropped with a warning log, same pattern as
missing-source-file handling; a bad filename in one attachment
doesn't kill the whole route's text delivery.

Addresses Codex Review P1 on qwibitai/nanoclaw#1967.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:45:08 +10:00
Adam
672e228876 fix(agent-route): forward file attachments between agents
Before: `send_file(to='parent')` from a sub-agent wrote the bytes to
the sub-agent's own session outbox, but agent-to-agent routing copied
only the content JSON — the target's inbound message referenced
`files: ['x.png']` but the bytes lived in a session directory the
target couldn't mount. Parent agents orchestrating sub-agents (e.g.
Design Team delegating illustration work to an Illustrator sub-agent
on Codex) received file-reference messages with nothing to forward.

Fix: on route, if the source's content has `files`, copy each referenced
file from `<source>/outbox/<src-msg-id>/` to
`<target>/inbox/<a2a-msg-id>/`, and emit `attachments` (the existing
formatter convention — see formatter.ts:223) with `localPath` relative
to `/workspace/`. The target formatter already renders these as
`[file: <name> — saved to /workspace/inbox/<a2a-id>/<name>]`, so the
target agent sees the path and can call `send_file(path=…, to=…)` to
forward onward.

Convention matches what session-manager.ts:256 already does for
base64-encoded channel-inbound attachments — same inbox layout, same
content shape. Nothing on the formatter/agent side needed to change.

## Scope

- `forwardAttachedFiles(source, target)` — pure-ish helper that copies
  files and returns the attachments array.
- `forwardFileAttachments(msg, …)` — wraps the helper for the route
  path: parses content, copies files if present, merges into any
  existing `attachments`, re-serialises.
- `routeAgentMessage` — uses the rewritten content when writing the
  target's inbound row.
- Log line now includes `forwardedFileCount` for observability.

Missing source files are skipped with a warning rather than killing
the route — a bad filename in a batch shouldn't drop the
accompanying text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:34:29 +10:00
exe.dev user
5845a5a980 fix(container-runner): honor agent_provider DB columns with session override
resolveProviderContribution read only containerConfig.provider (from each
group's container.json) and ignored both agent_groups.agent_provider and
sessions.agent_provider. The provider-install skills (opencode, codex)
and CLAUDE.md document those DB columns as the source of truth with
session-overrides-group precedence, but the code never consulted them —
so setting `agent_provider = 'codex'` on a group had no effect, and the
only way to route to a non-default provider was to edit the per-group
JSON directly. Discovered while wiring up Codex: DB update landed but
the spawned container kept running Claude.

Extract a pure `resolveProviderName(session, group, containerConfig)`
with the documented precedence:

    sessions.agent_provider
      → agent_groups.agent_provider
      → container.json `provider`
      → 'claude'

`resolveProviderContribution` now calls it. The container.json fallback
stays so existing installs that only set provider in JSON keep working.
Empty strings treated as unset to avoid footguns when a DB-backed form
writes '' for "no override."

Added unit tests covering precedence, null-fallthrough, empty-string
fallthrough, and case normalization.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:47:10 +00:00
gavrielc
c1d0395d11 Merge branch 'main' into main 2026-04-23 23:04:35 +03:00
gavrielc
f351e46008 refactor(approvals): persist title+options on channel/sender approval tables
getAskQuestionRender used to hardcode the card title and option labels
for pending_channel_approvals and pending_sender_approvals in the
DB-access layer, duplicating wording that already lived in the approval
modules. That caused a visible drift between the initial card title —
picked per event in channel-approval.ts ("📣 Bot mentioned in new chat"
vs. "💬 New direct message") — and the post-click render, which
always showed the constant "📣 Channel registration".

Mirror the pattern already used by pending_approvals: add title /
options_json columns on both pending_*_approvals tables via migration
013, have the approval modules write them at creation time, and let
getAskQuestionRender just SELECT.

- Migration 013 ALTERs the two tables to add title + options_json.
- PendingChannelApproval / PendingSenderApproval types and their
  create functions grow the two fields.
- channel-approval.ts / sender-approval.ts normalize options once
  and pass both title and options_json into the insert.
- getAskQuestionRender drops the hardcoded render objects and reads
  the stored values.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:54:47 +03:00
gavrielc
ffd38f660a Merge branch 'main' into fix/pending-rows-idempotent 2026-04-23 22:37:22 +03:00
exe.dev user
97868af5a7 fix(delivery): make pending_questions/approvals insert idempotent
createPendingQuestion and createPendingApproval both run before the
adapter delivery call. When delivery fails and the retry loop reinvokes
deliverMessage with the same questionId/approvalId, the second attempt
hit UNIQUE constraint on the pending_questions.question_id (or
pending_approvals.approval_id) and threw — so the retry never reached
the send step, and every subsequent retry failed the same way until
max-attempts marked the message permanently failed.

Switch both inserts to INSERT OR IGNORE. Return bool indicating whether
a new row was actually inserted so delivery.ts can avoid logging
"Pending question created" twice for the same card.

Symptom that surfaced this: a send-layer ValidationError on one attempt
followed by SqliteError on every subsequent attempt, with the user
seeing neither the card nor a follow-up. Seen in conjunction with the
Telegram 64-byte callback_data limit (fixed separately in
#1942/chat-sdk-bridge), but the idempotency gap applies to any
transient delivery failure — rate limits, network blips, adapter 5xx —
and is worth fixing on its own.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:05:41 +00:00
exe.dev user
ff277c0d49 fix(chat-sdk-bridge): encode option index in callback_data for Telegram 64-byte cap
ask_question cards failed to deliver on Telegram whenever any option had
a non-trivial value (e.g. an ISO datetime, a URL, or a long token).
Telegram limits inline-keyboard callback_data to 64 bytes, and the
previous encoding embedded both the questionId and the full option
value in each button's actionId plus a second copy as value, producing
payloads well over the cap. The adapter threw ValidationError, delivery
was marked permanently failed, and the agent sat waiting on an answer
that never reached the user.

Fix:
  - Button id is now `ncq:<questionId>:<index>` and button value is the
    stringified index. Callback payloads shrink from ~100 bytes to ~40
    and fit Telegram's cap for any option list with <100 items.
  - Both callback-decode sites (Chat SDK `onAction` for Telegram/Slack/
    etc., and the Discord Gateway interaction handler) resolve the
    index back to the real option value via
    `getAskQuestionRender(questionId)` before dispatching to the host's
    onAction — so response handlers (pending_questions, pending_approvals)
    are unchanged and still receive the canonical value.
  - `resolveSelectedOption` helper has a backward-compat fallback:
    non-numeric tails are treated as literal values so any card
    delivered under the old encoding still resolves if the user clicks
    it after deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:56:21 +00:00
Gabi Simons
a8eb82d529 Merge branch 'main' into main 2026-04-23 18:24:24 +03:00
exe.dev user
237876c2c6 chore(format): wrap session-manager import in container-runner
Pre-commit prettier reformatted this in the working tree but didn't
re-stage. Keeping it in a separate commit to avoid amending a prior
commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:12:56 +00:00