fix(poll-loop): slash commands silently broken on warm containers

The follow-up poller filtered /clear out of every tick without acking
the row, and pushed every other slash command through plain
formatMessages() (XML wrapping). On a warm container the outer
while(true) loop never regains control, so:

  - /clear sat pending in messages_in forever (no response at all)
  - /compact, /cost, /context, /files, /remote-control arrived at the
    SDK as XML-wrapped user text and were never dispatched as commands

Both modes are invisible to host monitoring: rows are either left
pending without a processing_ack claim, or marked completed normally;
heartbeat keeps firing inside the SDK event loop.

When the follow-up poller observes any slash command (admin or
passthrough — categorizeMessage decides), end the active query so the
current turn winds down cleanly and the outer loop wakes, re-fetches
the same pending set, and runs them through the canonical path
(/clear handler + formatMessagesWithCommands raw dispatch). Leave the
rows untouched so the outer-loop fetch sees the same set the poller
saw.

Cost: each slash command on a warm container forces close+reopen of
the SDK stream — a few seconds of subprocess startup. The Anthropic
prompt cache is server-side with a 5-min TTL keyed on prefix hash, so
stream lifecycle does not affect cache lifetime; close+reopen within
5 min still gets cache hits.

Also corrects the warm-stream rationale comment on processQuery, which
implied keeping the stream open preserved cache warmth — it doesn't.

Testing evidence — cache stays warm across stream close+reopen:

  Turn 1 (warm session):
    Usage: in=6 out=245 cache_create=92 cache_read=22996
    Full cache hit (22996 tokens).

  Turn 2 — /clear arrives:
    Pending slash command — ending stream so outer loop can process
    Clearing session (resetting continuation)
    Usage: in=6 out=95 cache_create=9393 cache_read=13600
    System prompt + tool defs (~13600 tokens) still hit cache;
    conversation history is gone (continuation reset) so the new turn
    writes fresh context.

  Turn 3 — /cost arrives:
    Pending slash command — ending stream so outer loop can process
    Usage: in=0 out=0 cache_create=0 cache_read=0 wall=0.0s api=0.0s
    /cost is a CLI built-in: dispatched locally by the SDK, no API
    call. Pre-fix this would have arrived as XML-wrapped user text
    and never dispatched — confirms the broader fix works.

  Turn 4 (next chat after /cost):
    Usage: in=6 out=142 cache_create=328 cache_read=22993
    Full cache hit again (22993 tokens read, 328 written). Despite the
    /cost-induced stream close+reopen, the server-side prompt cache
    survived: the new sdkQuery() resumed the same continuation, the
    request prefix matched the cached entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Mike Nolet
2026-05-02 07:33:07 +02:00
parent 1b08b58fcd
commit 1ebb2dc8d2
2 changed files with 38 additions and 10 deletions

View File

@@ -66,6 +66,18 @@ export function isClearCommand(msg: MessageInRow): boolean {
return text.toLowerCase().startsWith('/clear');
}
/**
* True for any chat that needs the outer loop's command path: /clear plus
* admin/passthrough slash commands the SDK can only dispatch when they are
* a query's first input. Used by the follow-up poller to bail out and let
* the outer loop reopen the query.
*/
export function isRunnerCommand(msg: MessageInRow): boolean {
if (msg.kind !== 'chat' && msg.kind !== 'chat-sdk') return false;
const cat = categorizeMessage(msg).category;
return cat === 'admin' || cat === 'passthrough';
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
function extractSenderId(msg: MessageInRow, content: any): string | null {
const raw: string | null = content?.senderId || content?.author?.userId || null;