fix(agent): auto-retry mid-stream + tighten system prompt by code-crusher · Pull Request #13 · MatterAIOrg/OrbCode

code-crusher · 2026-07-02T09:02:39Z

Summary

Two changes shipped on release/v0.3.4 and rolled up into one PR against main.

1. `fix(agent): allow auto-retry mid-stream when partial output can be rolled back`

Previously streamWithRetry only retried before the first chunk, because once any text or reasoning had streamed, re-issuing the request would duplicate on-screen output. The user-visible effect was that a dropped connection after partial progress surfaced as a failed step, even though the model was happy to continue.

This adds an optional onRestart callback to streamWithRetry. When the caller can cleanly undo the partial output (clear buffers, reset accumulators), the callback returns true and the stream is re-issued. If the callback is absent or returns false, the error propagates as before.
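
As a rough illustration of the contract described above, here is a minimal sketch of a retry wrapper with an `onRestart` hook. The type names, the `open` factory, and the `maxRetries` option are assumptions made for this example; the real signature in src/core/agent.ts may differ, and backoff and error classification are left out for brevity.

```typescript
// Hypothetical shapes for illustration only.
type StreamFactory<T> = () => AsyncIterable<T>;

interface RetryOptions {
  maxRetries: number;
  // Called when a failure happens after output has already streamed.
  // Returns true if the caller rolled back its partial state.
  onRestart?: () => boolean;
}

async function* streamWithRetry<T>(
  open: StreamFactory<T>,
  opts: RetryOptions,
): AsyncGenerator<T> {
  let attempt = 0;
  for (;;) {
    let emitted = false;
    try {
      for await (const chunk of open()) {
        emitted = true;
        yield chunk;
      }
      return; // stream finished cleanly
    } catch (err) {
      attempt += 1;
      if (attempt > opts.maxRetries) throw err;
      // Mid-stream failure: retry only if the caller undid what was shown.
      const rolledBack = opts.onRestart?.() ?? false;
      if (emitted && !rolledBack) throw err;
    }
  }
}
```

A caller that accumulates chunks into a buffer would clear that buffer inside `onRestart` and return true, so the re-issued stream starts from an empty state instead of appending to stale text.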

- The main agent loop installs rollbackForRetry, which resets assistantText, reasoningOpen, reasoningStart, reasoningDetails, and pending tool calls, then emits a new stream-reset event. It declines the restart if a reasoning row was already committed to the transcript, since that cannot be undone.
- The compaction path installs a simpler reset that just clears its in-memory summary buffer (compaction only streams text and commits once at the end).
- The UI handler for stream-reset clears textBufferRef, streamingText, reasoningBufferRef, and streamingReasoning, and resets the busy label to Working so the spinner reflects the restarted attempt.
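
The rollback performed by the main agent loop could be sketched roughly as follows. The `TurnState` shape, the `reasoningCommitted` flag, and the `makeRollbackForRetry` factory are assumptions for illustration; only the field names and the stream-reset event come from the description above.

```typescript
// Hypothetical event and state shapes, named after the fields listed above.
type AgentEvent = { type: "stream-reset" };

interface TurnState {
  assistantText: string;
  reasoningOpen: boolean;
  reasoningStart: number | null;
  reasoningDetails: unknown[];
  pendingToolCalls: unknown[];
  // Assumed flag: true once a reasoning row was written to the transcript.
  reasoningCommitted: boolean;
}

function makeRollbackForRetry(
  state: TurnState,
  emit: (event: AgentEvent) => void,
): () => boolean {
  return function rollbackForRetry(): boolean {
    // A committed reasoning row cannot be undone, so decline the restart.
    if (state.reasoningCommitted) return false;

    // Drop everything accumulated during the failed attempt.
    state.assistantText = "";
    state.reasoningOpen = false;
    state.reasoningStart = null;
    state.reasoningDetails = [];
    state.pendingToolCalls = [];

    // Let the UI clear its own partial streaming buffers.
    emit({ type: "stream-reset" });
    return true;
  };
}
```

Returning a boolean keeps the decision with the caller: the retry wrapper does not need to know what state exists, only whether it was safely discarded.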

Files: src/core/events.ts, src/core/agent.ts, src/ui/App.tsx.

2. `refactor(prompts): rewrite system prompt for speed and editing discipline`

Replaces the "always gather exhaustive context" guidance with a "gather enough, then act" principle. The model is now told that a small, localized change typically needs about 3-6 tool calls and that further exploration after the edit point is identified is waste. The TODO list rule is scoped to multi-step tasks (3+ steps) instead of being mandatory for any work.

The file_edit / multi_file_edit section adds an explicit editing-discipline block: copy old_string verbatim from a same-turn read, treat earlier reads as stale after a successful edit, and never guess at a corrected old_string when a multi_file_edit batch fails.
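
To show why a stale or guessed old_string is risky, here is a small sketch of an exact-match edit. It assumes that file_edit locates old_string by exact substring match and requires it to be unique; the PR does not state how the tool matches, so `applyExactEdit` and its result type are purely illustrative.

```typescript
type EditResult =
  | { ok: true; content: string }
  | { ok: false; reason: string };

// Replace exactly one occurrence of oldString; refuse anything ambiguous.
function applyExactEdit(
  content: string,
  oldString: string,
  newString: string,
): EditResult {
  if (oldString.length === 0) {
    return { ok: false, reason: "old_string is empty" };
  }
  const first = content.indexOf(oldString);
  if (first === -1) {
    // Typical result of copying from a read that predates an earlier edit.
    return { ok: false, reason: "old_string not found; re-read the file" };
  }
  if (content.indexOf(oldString, first + 1) !== -1) {
    return { ok: false, reason: "old_string matches more than once" };
  }
  const updated =
    content.slice(0, first) + newString + content.slice(first + oldString.length);
  return { ok: true, content: updated };
}
```

Under this assumption, repeating an edit with text copied before the previous edit fails with "not found", which is the situation the "treat earlier reads as stale" rule is meant to avoid.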

The read_file and search_files sections collapse their repetitive parameter tables and examples into a short reference plus a "Reading Strategy" / "Search Hygiene" ruleset (read whole regions in one call, budget re-reads, verify the output matches the parameters sent, exclude test/spec/mock paths by default, scope path narrowly).

Two new cross-cutting sections are added: Verifying tool results and avoiding loops (check that outputs match the sent parameters; do not repeat an identical failing call) and Plan before editing (write the full change plan once, then execute edits in one batched pass with a single typecheck/build at the end).
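
The "do not repeat an identical failing call" rule is prompt guidance for the model rather than code in this PR. As an illustration of the idea only, a runtime equivalent might look like the sketch below; `FailedCallGuard` and its methods are hypothetical names.

```typescript
// Remembers tool calls that failed so an identical call is not repeated.
class FailedCallGuard {
  private readonly failed = new Set<string>();

  private static fingerprint(
    tool: string,
    params: Record<string, unknown>,
  ): string {
    // Sort top-level keys so parameter order does not change the fingerprint.
    // Nested objects are not normalized in this sketch.
    const sorted = Object.keys(params)
      .sort()
      .map((key) => [key, params[key]]);
    return `${tool}:${JSON.stringify(sorted)}`;
  }

  recordFailure(tool: string, params: Record<string, unknown>): void {
    this.failed.add(FailedCallGuard.fingerprint(tool, params));
  }

  // True when this exact call already failed and should be changed first.
  shouldSkip(tool: string, params: Record<string, unknown>): boolean {
    return this.failed.has(FailedCallGuard.fingerprint(tool, params));
  }
}
```

A call with any changed parameter produces a different fingerprint, so adjusting the request after a failure is still allowed.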

Also fixes a "prefer to let the user to that" typo and a few list-formatting inconsistencies in the TODO list section.

Files: src/prompts/system.ts.

Test plan

- Start a long generation, drop the network mid-stream, confirm the agent auto-retries without duplicating output and finishes the step.
- Confirm a reasoning row that has already been committed is not silently lost on a mid-stream retry (it should surface as an error rather than a duplicated row).
- Run the compaction path against a large transcript and confirm a mid-stream drop still produces a single committed summary.
- Spot-check the system prompt on a 1-line edit (no todo list), a 5-step task (todo list), and a wide refactor (no exhaustive re-search after the edit point is identified).

Bump @matterailab/orbcode from 0.3.3 to 0.3.4 in package.json and fold the AGENTS.md context cap bump (~60 -> ~150 lines, covering project structure, architecture, business-logic mapping, and code patterns/conventions without truncation) into the 0.3.3 changelog entry it shipped under.

…oning phase on first content

Two related correctness/resilience fixes in the agent's per-turn streaming pipeline:

1. Transient stream failures are now retried automatically. Connection drops before the first chunk (DNS/socket reset/TLS, plus 5xx, 408, 429) are retried up to 3 times with exponential backoff capped at 8s. Other 4xx client errors are not retried. Retries only apply before any output is produced: once chunks have streamed we can't safely retry without duplicating on-screen content, so the error propagates. A user abort is never retried, and the backoff delay is interruptible so Ctrl+C doesn't get stuck waiting it out. A 'Connection to the model failed (...). Retrying n/3 in Ns…' line is emitted via the system event channel so the user sees progress.

2. The 'Thought for Ns' timer now reflects only the thinking phase. Previously, a single boolean 'hadReasoning' flag was set on the first reasoning delta and only checked after the stream ended, so a reasoning segment followed by text would report the entire reasoning+answer span as thinking time. Reasoning is now modeled as an open/close segment: it opens on the first reasoning delta and closes on the first text delta, tool call, or stream end. This matches the on-screen 'Thinking' block behavior and supports interleaved reasoning/content correctly.
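
The retry policy from the first fix can be sketched as a few small helpers. The retry count, the 8s cap, the retryable status codes, and the message format come from the commit message; the 1s base delay and all function names are assumptions, since the commit does not state them.

```typescript
const MAX_RETRIES = 3; // from the commit message
const MAX_DELAY_MS = 8_000; // cap from the commit message
const BASE_DELAY_MS = 1_000; // assumption: the base delay is not stated

// 5xx, 408, and 429 are treated as transient; other statuses are not.
function isRetryableStatus(status: number): boolean {
  return (status >= 500 && status <= 599) || status === 408 || status === 429;
}

// attempt is 1-based: 1s, 2s, 4s, then capped at 8s.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}

function retryMessage(reason: string, attempt: number, delayMs: number): string {
  const seconds = Math.round(delayMs / 1000);
  return `Connection to the model failed (${reason}). Retrying ${attempt}/${MAX_RETRIES} in ${seconds}s…`;
}

// Resolves after ms, or rejects immediately when the signal aborts,
// so a user abort never waits out the backoff.
function interruptibleDelay(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("aborted"));
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}
```

Clearing the timer on abort matters as much as rejecting the promise: otherwise a pending timeout could keep the process alive after the user has already cancelled.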

Adds a '/task' slash command that lets the user pull a prior session from the same directory into the current conversation as context.

Behavior:
- '/task' (no argument) opens a SessionPicker over all sessions for the current cwd except the active one.
- On selection, the previous task's user/assistant messages are extracted (user messages unwrapped from <user_query> tags) and wrapped in a <previous_task title='...'> block inside a prompt asking the model to summarize it. The summary is then presented in the current conversation as the reference.
- Conversations longer than ~8000 chars are truncated with a marker so the prompt stays well under context limits.
- If no previous tasks exist in this directory, a friendly info row is shown instead of opening an empty picker.

Implementation:
- New 'taskPickerSessions' state in App holds the candidate list when the picker is open; it's added to the existing 'no-modal' guard so other modals (MCP picker, link manager, etc.) don't stack.
- 'handleTaskSelect' reuses the existing 'runTurn' path: the prompt is the user message, and the model produces the summary.
- SessionPicker gains an optional 'title' prop (default unchanged) so the same component reads correctly for both '/resume' and '/task'.
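
The prompt construction described under Behavior might look roughly like this. The `Message` shape, the instruction wording, the truncation marker text, and the function names are all assumptions; only the <user_query> unwrapping, the <previous_task title='...'> wrapper, and the ~8000 character cap come from the commit message.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

const MAX_CHARS = 8000; // approximate cap from the commit message
const TRUNCATION_MARKER = "\n[... truncated ...]"; // hypothetical marker text

// Strip the <user_query> wrapper if present; otherwise return text unchanged.
function unwrapUserQuery(text: string): string {
  const match = /<user_query>([\s\S]*?)<\/user_query>/.exec(text);
  return match?.[1]?.trim() ?? text;
}

function buildTaskPrompt(title: string, messages: Message[]): string {
  let transcript = messages
    .map((m) => {
      const body = m.role === "user" ? unwrapUserQuery(m.content) : m.content;
      return `${m.role}: ${body}`;
    })
    .join("\n\n");

  // Keep the prompt well under context limits.
  if (transcript.length > MAX_CHARS) {
    transcript = transcript.slice(0, MAX_CHARS) + TRUNCATION_MARKER;
  }

  // Escape apostrophes so the title cannot break the attribute quoting.
  const safeTitle = title.replace(/'/g, "&apos;");
  return [
    "Summarize the previous task below so it can serve as reference for this conversation.",
    `<previous_task title='${safeTitle}'>`,
    transcript,
    "</previous_task>",
  ].join("\n");
}
```

Because the result is an ordinary user prompt, it can be passed through the same turn-running path as any other message, which is the reuse the Implementation notes describe.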


matterai-app · 2026-07-02T09:03:13Z

Summary By MatterAI

🔄 What Changed

Implemented automatic retry logic for transient LLM streaming failures with exponential backoff and state rollback. Introduced a new /task slash command to reference and summarize previous conversations. Significantly refactored and tightened the system prompt to improve context gathering and editing discipline.

🔍 Impact of the Change

Improves agent resilience against network instability and API timeouts. Enhances user experience by allowing seamless context injection from past tasks. Reduces token usage and improves instruction following through a more concise system prompt.

📁 Total Files Changed


| File | ChangeLog |
| --- | --- |
| Version Bump `package.json` | Incremented version to 0.3.4. |
| Retry Logic `src/core/agent.ts` | Added `streamWithRetry` with backoff and state rollback safeguards. |
| Event Type `src/core/events.ts` | Added `stream-reset` event to handle UI buffer clearing during retries. |
| Prompt Refactor `src/prompts/system.ts` | Tightened instructions, emphasizing context gathering and verbatim editing. |
| Task Command `src/ui/App.tsx` | Implemented `/task` command and logic to summarize previous sessions. |
| UI Component `src/ui/components/SessionPicker.tsx` | Added customizable title prop to the session selection component. |

🧪 Test Added/Recommended

🔒 Security Vulnerabilities

No critical security vulnerabilities detected. Input validation for the /task command is handled via internal session listing.

matterai-app Bot added 5 commits July 1, 2026 17:17

code-crusher merged commit f0b29b4 into main Jul 2, 2026
1 check was pending

code-crusher deleted the release/v0.3.4 branch July 2, 2026 09:04
