@ThomasK33

Summary

  • Stream agent_report tool args from tool-call-delta so sub-agent workspaces show a live, in-flight report preview.
  • Add a dedicated agent_report tool renderer with a compaction-like fade preview while executing.
  • Delay auto-deleting reported leaf task workspaces by 5s (and disable input + show a banner) so the final output is visible before cleanup.

Testing

  • make static-check
  • bun test src/node/services/taskService.test.ts
  • bun test src/common/utils/tools/agentReportArgsText.test.ts

📋 Implementation Plan

Live-stream sub-agent agent_report output in UI

Problem

  • When a sub-agent calls the agent_report tool, the UI buffers the entire tool call.
  • The buffered call shows no visible progress, so the sub-agent workspace looks “stuck”; it then often disappears immediately after completion.

Goal

  • Make report generation feel alive by streaming partial report text into the sub-agent workspace UI while the agent_report tool call is in-flight.
    • (Optional follow-up) Mirror the same stream into the parent workspace’s task UI so you don’t need to click into the child workspace.
  • Preserve the final, full report for history/traceability.
  • UX should resemble the existing “compaction” streaming experience (live preview + fade/scroll treatment).

Non-goals

  • Changing the semantics of agent_report (it is still one logical tool call that completes once).
  • Reworking the entire sub-agent lifecycle / scheduling.

Recommended approach: stream agent_report from tool-call-delta (frontend-first)

Why this works: the backend already emits tool-call-delta events containing the model’s incremental argsTextDelta, but the renderer currently ignores them for display. We can reconstruct a “live” preview from those deltas and render it with the same fade/preview affordances as compaction.

1) Add a tiny incremental extractor for reportMarkdown

  • Add a small pure helper in common code (so it can be unit-tested easily):
    • Input: accumulated argsText (string)
    • Output: { reportMarkdown: string | null, title: string | null } where values are best-effort and may be incomplete.
  • Implementation detail: scan for JSON key(s) ("reportMarkdown", optionally "title") and decode a partial JSON string (handles escapes like \\n, \\", and tolerates truncated escape sequences).

Net new LoC (product): ~80–140
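
A minimal sketch of such an extractor, assuming the tool args arrive as a growing JSON text fragment. The names extractAgentReportArgs, extractStringField, and decodePartialJsonString are hypothetical and not taken from the PR; the decoder simply stops at the first truncated or unknown escape and returns what it has so far.

```typescript
// Hypothetical helper names; this is an illustration, not the PR's actual code.
export interface PartialReportArgs {
  reportMarkdown: string | null;
  title: string | null;
}

// JSON's single-character escapes (RFC 8259), keyed by the char after the backslash.
const SIMPLE_ESCAPES: Record<string, string> = {
  '"': '"', "\\": "\\", "/": "/", b: "\b", f: "\f", n: "\n", r: "\r", t: "\t",
};

// Decode a JSON string body beginning just after its opening quote.
// Stops at the closing quote, at end of input, or at a truncated/unknown escape.
function decodePartialJsonString(text: string, start: number): string {
  let out = "";
  let i = start;
  while (i < text.length) {
    const ch = text[i];
    if (ch === '"') break; // closing quote: the value is complete
    if (ch !== "\\") {
      out += ch;
      i += 1;
      continue;
    }
    const next = text[i + 1];
    if (next === undefined) break; // lone trailing backslash, wait for more input
    if (next === "u") {
      const hex = text.slice(i + 2, i + 6);
      if (!/^[0-9a-fA-F]{4}$/.test(hex)) break; // truncated or invalid \uXXXX
      out += String.fromCharCode(parseInt(hex, 16));
      i += 6;
      continue;
    }
    const mapped = SIMPLE_ESCAPES[next];
    if (mapped === undefined) break; // unknown escape: stop rather than guess
    out += mapped;
    i += 2;
  }
  return out;
}

// Find `"key": "` in the raw args text and decode whatever follows.
// Escaped occurrences inside another string value (\"key\") do not match.
function extractStringField(argsText: string, key: string): string | null {
  const match = new RegExp(`"${key}"\\s*:\\s*"`).exec(argsText);
  if (match === null) return null;
  return decodePartialJsonString(argsText, match.index + match[0].length);
}

export function extractAgentReportArgs(argsText: string): PartialReportArgs {
  return {
    reportMarkdown: extractStringField(argsText, "reportMarkdown"),
    title: extractStringField(argsText, "title"),
  };
}
```

Because the function is re-run on the full accumulated argsText after every delta, it needs no state of its own; a chunk boundary inside an escape only shortens the preview until the next delta arrives.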

2) Teach StreamingMessageAggregator to materialize/patch an in-flight tool part

Files/symbols:

  • src/browser/utils/messages/StreamingMessageAggregator.ts
    • handleToolCallDelta() (currently token-tracking only)
    • handleToolCallStart() (currently skips duplicates)

Changes:

  1. Maintain runtime-only state (not persisted) such as:
    • pendingToolArgsTextByToolCallId: Map<string, string>
    • toolCallIdsWithDeltas: Set<string> (to avoid double-counting tokens at start)
  2. In handleToolCallDelta():
    • If data.toolName !== "agent_report", keep current behavior (no UI changes).
    • Coerce data.delta → string; if empty, bail.
    • Ensure a tool part exists even before tool-call-start:
      • If the message has no matching dynamic-tool part yet, create one with:
        • state: "input-available"
        • timestamp: data.timestamp (so it appears in-order)
        • input: { reportMarkdown: "" } (placeholder)
      • Also start timing (pendingToolStarts.set(toolCallId, timestamp)) so the tool duration includes “report writing”.
    • Append delta into pendingToolArgsTextByToolCallId.
    • Run the extractor and patch the tool part input to { reportMarkdown, title? } (best-effort).
    • invalidateCache() so the tool UI updates live.
  3. In handleToolCallStart():
    • If a tool part already exists (created from deltas), update its input with data.args instead of skipping.
    • Avoid double-counting tool-input tokens:
      • If toolCallIdsWithDeltas contains this toolCallId, skip trackDelta(..., "tool-args") for the start event.

Net new LoC (product): ~60–120
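
The delta/start handling above could look roughly like the following. This is a standalone sketch, not the real StreamingMessageAggregator: the class name AgentReportDeltaState, the event and part interfaces, and the cacheInvalidations counter (a stand-in for invalidateCache()) are assumptions made for illustration, and the extractor is injected so the sketch stays self-contained.

```typescript
// Illustrative event and part shapes; the real types in the codebase are richer.
interface ToolCallDeltaEvent {
  toolCallId: string;
  toolName: string;
  delta: unknown;
  timestamp: number;
}

interface ToolCallStartEvent {
  toolCallId: string;
  toolName: string;
  args: unknown;
  timestamp: number;
}

interface ToolPart {
  type: "dynamic-tool";
  toolCallId: string;
  toolName: string;
  state: "input-available";
  timestamp: number;
  input: unknown;
}

type Extractor = (argsText: string) => { reportMarkdown: string | null; title: string | null };

class AgentReportDeltaState {
  readonly parts = new Map<string, ToolPart>();
  readonly pendingToolStarts = new Map<string, number>();
  cacheInvalidations = 0; // stand-in for invalidateCache()
  private readonly pendingToolArgsTextByToolCallId = new Map<string, string>();
  private readonly toolCallIdsWithDeltas = new Set<string>();
  private readonly extract: Extractor;

  constructor(extract: Extractor) {
    this.extract = extract;
  }

  handleToolCallDelta(data: ToolCallDeltaEvent): void {
    if (data.toolName !== "agent_report") return; // other tools keep current behavior
    const delta = typeof data.delta === "string" ? data.delta : String(data.delta ?? "");
    if (delta === "") return;

    const part = this.ensurePart(data.toolCallId, data.toolName, data.timestamp);
    const argsText = (this.pendingToolArgsTextByToolCallId.get(data.toolCallId) ?? "") + delta;
    this.pendingToolArgsTextByToolCallId.set(data.toolCallId, argsText);
    this.toolCallIdsWithDeltas.add(data.toolCallId);

    // Patch the part with a best-effort preview of the args seen so far.
    const { reportMarkdown, title } = this.extract(argsText);
    part.input =
      title === null
        ? { reportMarkdown: reportMarkdown ?? "" }
        : { reportMarkdown: reportMarkdown ?? "", title };
    this.cacheInvalidations += 1;
  }

  // Returns true when the caller should still count tool-args tokens for this start event.
  handleToolCallStart(data: ToolCallStartEvent): boolean {
    const part = this.ensurePart(data.toolCallId, data.toolName, data.timestamp);
    part.input = data.args; // final args replace the best-effort preview
    this.cacheInvalidations += 1;
    return !this.toolCallIdsWithDeltas.has(data.toolCallId);
  }

  private ensurePart(toolCallId: string, toolName: string, timestamp: number): ToolPart {
    let part = this.parts.get(toolCallId);
    if (part === undefined) {
      part = {
        type: "dynamic-tool",
        toolCallId,
        toolName,
        state: "input-available",
        timestamp,
        input: { reportMarkdown: "" }, // placeholder until the first extraction
      };
      this.parts.set(toolCallId, part);
      this.pendingToolStarts.set(toolCallId, timestamp); // duration includes report writing
    }
    return part;
  }
}
```

Returning a boolean from handleToolCallStart() is one way to express the "skip trackDelta(..., "tool-args")" rule; the real method may instead perform the token tracking inline.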

3) Render a dedicated AgentReportToolCall with “compaction-like” preview

Files:

  • Add src/browser/components/tools/AgentReportToolCall.tsx
  • Update src/browser/components/Messages/ToolMessage.tsx to route toolName === "agent_report" to the new component.

UI behavior:

  • Default expanded while status === "executing".
  • While executing:
    • Show a live-updating preview of args.reportMarkdown.
    • Wrap preview in CompactingMessageContent (max-height + fade) to mimic compaction.
    • Render as plain text (recommended for perf) or MarkdownCore in streaming mode if we want full fidelity.
  • When completed:
    • Render the final markdown normally (static render).
    • Show title if present (fallback: first # Heading in markdown).

Net new LoC (product): ~120–220
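
The title fallback for the completed state can be kept as a small pure function, separate from the component. The sketch below assumes a hypothetical helper named resolveReportTitle; it only looks for a level-1 ATX heading and does not skip fenced code blocks, so a line such as "# comment" inside a fence would be picked up.

```typescript
// Hypothetical helper: prefer the explicit title, else the first "# Heading" line.
export function resolveReportTitle(
  title: string | null | undefined,
  reportMarkdown: string,
): string | null {
  const explicit = title?.trim();
  if (explicit) return explicit;

  for (const line of reportMarkdown.split("\n")) {
    // One "#", at least one space, then the heading text (trailing whitespace dropped).
    const match = /^#\s+(.*\S)\s*$/.exec(line);
    if (match !== null) return match[1];
  }
  return null; // no title and no level-1 heading
}
```

AgentReportToolCall could call this with args.title and args.reportMarkdown once the tool call has completed, and omit the title row when it returns null.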

4) Tests + validation

  1. Unit tests (pure):
    • New test file beside the extractor (e.g. *.test.ts).
    • Cases:
      • chunk boundaries split inside \\n, \\uXXXX, and \\"
      • reportMarkdown appears after other fields
      • missing/invalid JSON → no crash, just null
  2. (Optional) small unit test for StreamingMessageAggregator:
    • Feed a sequence of stream-start → tool-call-delta (agent_report) events and assert getDisplayedMessages() contains a tool message whose args update.

Manual QA:

  • Spawn a sub-agent that writes a long report; confirm:
    • report preview streams in child workspace
    • tool duration looks non-zero
    • once complete, parent still receives the final report and auto-cleanup still works

Required UX fix: delay auto-delete of reported task workspaces (5s)

Background: TaskService.cleanupReportedLeafTask() currently auto-deletes leaf task workspaces immediately once they reach taskStatus: "reported". That makes the workspace appear “stuck” and then suddenly vanish.

5) Backend: schedule deletion without blocking report propagation

Files/symbols:

  • src/node/services/taskService.ts
    • finalizeAgentTaskReport() (must still deliver report immediately)
    • cleanupReportedLeafTask() (currently deletes immediately)

Plan:

  1. Do not change the report delivery order:
    • Keep deliverReportToParent() + resolveWaiters() happening before any deletion work.
  2. Change cleanup so the first call schedules deletion and returns immediately:
    • Add an in-memory guard like scheduledTaskWorkspaceDeletes: Map<string, NodeJS.Timeout>.
    • In cleanupReportedLeafTask(workspaceId), if the workspace is eligible for deletion:
      • setTimeout(() => void this.cleanupReportedLeafTask(workspaceId, { skipDelay: true }), 5000)
      • return without deleting anything.
  3. In the delayed pass (skipDelay: true), run the existing leaf-first deletion loop:
    • This lets us “perform all the deletes” (the leaf + any newly-leaf reported ancestors) in one go.

Defensive notes:

  • Timer firing should be a best-effort no-op if the workspace is already gone.
  • Re-check eligibility at delete time (status still reported, no children).

Net new LoC (product): ~40–80
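
One way to sketch the schedule-once guard, assuming the delayed pass is passed in as a callback. DelayedWorkspaceDeleteScheduler and its injectable setTimer parameter are hypothetical, introduced so the timer can be faked in tests; the map stores cancel functions rather than NodeJS.Timeout values, and the eligibility re-check and leaf-first loop are left to the callback.

```typescript
// A timer factory returns a function that cancels the pending timer.
type SetTimer = (fn: () => void, ms: number) => () => void;

const defaultSetTimer: SetTimer = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

// Hypothetical class name; illustrates the in-memory guard only.
class DelayedWorkspaceDeleteScheduler {
  private readonly scheduledTaskWorkspaceDeletes = new Map<string, () => void>();
  private readonly runDelete: (workspaceId: string) => void;
  private readonly delayMs: number;
  private readonly setTimer: SetTimer;

  constructor(
    runDelete: (workspaceId: string) => void,
    delayMs: number = 5000,
    setTimer: SetTimer = defaultSetTimer,
  ) {
    this.runDelete = runDelete;
    this.delayMs = delayMs;
    this.setTimer = setTimer;
  }

  // Returns true if a deletion was newly scheduled, false if one was already pending.
  schedule(workspaceId: string): boolean {
    if (this.scheduledTaskWorkspaceDeletes.has(workspaceId)) return false;
    const cancel = this.setTimer(() => {
      this.scheduledTaskWorkspaceDeletes.delete(workspaceId);
      // The callback is expected to re-check eligibility and no-op if the workspace is gone.
      this.runDelete(workspaceId);
    }, this.delayMs);
    this.scheduledTaskWorkspaceDeletes.set(workspaceId, cancel);
    return true;
  }

  // Cancel a pending deletion, e.g. if the workspace is removed manually first.
  cancel(workspaceId: string): void {
    const cancel = this.scheduledTaskWorkspaceDeletes.get(workspaceId);
    if (cancel !== undefined) {
      cancel();
      this.scheduledTaskWorkspaceDeletes.delete(workspaceId);
    }
  }
}
```

In TaskService, the callback would presumably be something like (id) => void this.cleanupReportedLeafTask(id, { skipDelay: true }), so report delivery and resolveWaiters() are never awaited behind the timer.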

6) Frontend: disable input + show “will delete in 5 seconds” message

Files:

  • src/browser/components/AIView.tsx
  • src/browser/components/ChatInput/index.tsx (likely no changes; reuse disabledReason)

Plan:

  1. Detect the state:
    • isReportedAgentTask = Boolean(meta?.parentWorkspaceId) && meta?.taskStatus === "reported"
  2. When isReportedAgentTask:
    • Disable the ChatInput (disabled={... || isReportedAgentTask})
    • Provide disabledReason="Completed — this agent task workspace will be deleted in 5 seconds."
    • Render a small banner near the input mirroring the same message (so it’s visible even if the placeholder is missed).
  3. Confirm the parent still gets the agent_report result immediately:
    • The 5s delay is only for workspaceService.remove(...); it must not block the tool call’s resolution.

Net new LoC (product): ~20–60
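
The detection in step 1 and the disable/banner decision in step 2 can be expressed as two small pure functions, which keeps the AIView.tsx change easy to test. WorkspaceMetaLike, ChatInputState, and resolveChatInputState are hypothetical names used for this sketch; only the isReportedAgentTask expression comes from the plan.

```typescript
// Minimal stand-in for the workspace metadata fields the plan relies on.
interface WorkspaceMetaLike {
  parentWorkspaceId?: string | null;
  taskStatus?: string;
}

interface ChatInputState {
  disabled: boolean;
  showDeletionBanner: boolean;
}

// True for a child (task) workspace whose report has already been delivered.
function isReportedAgentTask(meta: WorkspaceMetaLike | null | undefined): boolean {
  return Boolean(meta?.parentWorkspaceId) && meta?.taskStatus === "reported";
}

// Combine the existing disabled condition with the reported-task state.
function resolveChatInputState(
  meta: WorkspaceMetaLike | null | undefined,
  disabledForOtherReasons: boolean,
): ChatInputState {
  const reported = isReportedAgentTask(meta);
  return {
    disabled: disabledForOtherReasons || reported,
    showDeletionBanner: reported,
  };
}
```

The component would then pass disabled to ChatInput, supply the disabledReason text from step 2 when showDeletionBanner is true, and render the banner from the same flag.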


Optional follow-ups

A) Mirror the streaming preview into the parent workspace

  • Forward agent_report deltas from child → parent (new event plumbing).
  • Likely best implemented as a new UI-only event type (similar to bash-output) rather than overloading tool-call-delta.

Alternatives considered

  1. Backend emits a new agent-report-output event
    • Pro: parser lives in one place; parent mirroring becomes easier.
    • Con: new ORPC schema + event type + backend state; more moving parts.
  2. Prompt-only workaround: stream report as assistant text, then call agent_report
    • Pro: almost no UI work.
    • Con: duplicates the report in history and increases token usage; still doesn’t fix tool-call buffering in general.

Generated with mux • Model: openai:gpt-5.2 • Thinking: xhigh

ThomasK33 and others added 4 commits December 23, 2025 18:46
Change-Id: I520925e1dc97aa8e35baaa84bc2da70e604c1edf
Signed-off-by: Thomas Kosiewski <[email protected]>
Makes the Code Review "diff refresh" mechanism robust and predictable,
driven by file-modifying tool-call events (`bash`, `file_edit_*`) rather
than polling or filesystem watchers.

## Key Changes

### RefreshController improvements
- **Coalescing rate-limit** instead of resetting debounce — prevents
starvation when tool events are frequent
- **Manual refresh always works** — `requestImmediate()` bypasses pause,
hidden, and min-interval checks
- **Loop prevention** — 500ms minimum interval between refreshes
(scheduled events only, not manual)
- **Trigger tracking** — `LastRefreshInfo` records what caused each
refresh

### Review panel refresh
- **Non-blocking manual refresh** — immediately bumps `refreshTrigger`,
origin fetch runs in background
- **Origin fetch deduplication** — file tree and diff loads share the
same `git fetch` call
- **Refresh tooltip** — shows "Last: 2m ago via manual" for debugging

### GitStatusStore
- **Active workspace priority** — 1s debounce vs 3s for background
workspaces
- **File modification subscription** — refreshes on tool completion
events

## Bug Fixes
- Fixed storybook test failure: `requestImmediate()` was blocked by
MIN_REFRESH_INTERVAL when components subscribed immediately after
`syncWorkspaces()` — now bypasses interval check for manual/immediate
requests

Signed-off-by: Thomas Kosiewski <[email protected]>

---
_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking:
`high`_
Change-Id: I8aef86611e4dda560cca7b2e727b081570c2373a
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I4db05b84cbb17d2046c853957f685ad9dc7a59d7
Signed-off-by: Thomas Kosiewski <[email protected]>