feat(agent-core): detect stalled turns and force text-only recovery #1312
flame4 wants to merge 2 commits into
🦋 Changeset detected. Latest commit: 40579ac. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Force-pushed from e2f029d to 6e30841.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b95a98d184
```ts
private async runGitStatus(cwd: string): Promise<string> {
  try {
    const proc = await this.agent.kaos.exec('git', '-C', cwd, 'status', '--porcelain');
```
Detect content changes in already-dirty files
When a turn keeps editing a file that is already modified or untracked, `git status --porcelain` stays identical (for example, `M src/foo.ts`) even though the file contents changed; `Edit`/`Write` successes also often return short outputs below the 60-character information-gain threshold. In that common single-file refactor case, eight real edits can be classified as stalled and the next step is forced into text-only mode, preventing the agent from making further needed changes. Please include a content-sensitive signal (e.g. diff/hash/mtime for dirty paths) or otherwise count successful write/edit tool results as progress.
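One way to picture the content-sensitive signal suggested above is to pair the porcelain output with a per-file content hash. The sketch below is illustrative only: `workspaceFingerprint`, `dirtyPaths`, `fnv1a`, and the `ReadFile` callback are hypothetical names, not code from this PR, and the parser assumes plain porcelain v1 lines without renames or quoted paths.

```ts
// Hypothetical sketch; none of these names come from the PR.
// Returns file contents, or undefined if the path cannot be read.
type ReadFile = (path: string) => string | undefined;

// Small non-cryptographic hash (32-bit FNV-1a), used here as a stand-in
// for whatever hash a real implementation would choose.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Porcelain v1 lines are two status characters, a space, then the path.
// Assumption: no renames ("old -> new") and no quoted paths.
function dirtyPaths(porcelain: string): string[] {
  return porcelain
    .split("\n")
    .filter((line) => line.length > 3)
    .map((line) => line.slice(3));
}

// Combines the status text with one content hash per dirty path, so the
// result changes when a dirty file is edited again.
function workspaceFingerprint(porcelain: string, readFile: ReadFile): string {
  const parts = dirtyPaths(porcelain)
    .sort()
    .map((path) => {
      const content = readFile(path);
      return `${path}:${content === undefined ? "missing" : fnv1a(content)}`;
    });
  return `${porcelain}\n${parts.join("\n")}`;
}
```

With a fingerprint like this, two snapshots taken around an edit to `src/foo.ts` differ even though the status line itself does not change. The cost is one file read per dirty path per step, which may matter in large working trees.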
Force-pushed from 6e30841 to 75f04a8.
Add a ProgressDetector that watches external state (git status, background tasks) and information gain (new non-trivial tool outputs) to detect when a turn is spinning without progress. After 8 consecutive idle steps, the harness injects a system reminder and forces the next model step to run with no tools available, requiring a text-only response.

Successful Edit and Write tool results are now counted as progress even when their output is short, so repeated edits to the same already-dirty file are not misclassified as stalled.

The stall threshold and minimum information-gain length are configurable via loop_control.progress_stall_threshold and loop_control.progress_min_info_gain_length.

This prevents the no-op tool loops seen with commands like Bash(:), Read /dev/null, and echo placeholders, where the model keeps emitting tool calls instead of responding to the user.

- packages/agent-core/src/agent/turn/progress-detector.ts (new)
- packages/agent-core/src/agent/turn/index.ts
- packages/agent-core/src/loop/turn-step.ts
- packages/agent-core/src/loop/types.ts
- packages/agent-core/src/config/schema.ts
- packages/agent-core/test/agent/turn/progress-detector.test.ts (new)
- packages/agent-core/test/config/configs.test.ts

Co-authored-by: Kimi <kimi@moonshot.cn>
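The stall rule in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's `ProgressDetector`: the `StallTracker` class, its `record` method, and the `ToolResult` and `StepObservation` shapes are hypothetical. Only the defaults (8 and 60) and the rule that successful `Edit`/`Write` results count as progress come from the text above. Treating the first snapshot as a baseline and comparing output length with `>=` are assumptions.

```ts
// Hypothetical shapes for illustration; not the PR's actual types.
interface ToolResult {
  tool: string; // e.g. "Bash", "Read", "Edit", "Write"
  ok: boolean;
  output: string;
}

interface StepObservation {
  externalState: string; // e.g. git status text plus a background-task summary
  results: ToolResult[];
}

class StallTracker {
  private readonly stallThreshold: number;
  private readonly minInfoGainLength: number;
  private readonly seenOutputs = new Set<string>();
  private lastState: string | undefined = undefined;
  private idleSteps = 0;

  constructor(stallThreshold = 8, minInfoGainLength = 60) {
    this.stallThreshold = stallThreshold;
    this.minInfoGainLength = minInfoGainLength;
  }

  // Records one step; returns true once enough consecutive idle steps pass.
  record(step: StepObservation): boolean {
    // Assumption: the first snapshot is only a baseline, not a change.
    let progressed =
      this.lastState !== undefined && step.externalState !== this.lastState;
    this.lastState = step.externalState;

    for (const result of step.results) {
      // Successful Edit/Write results count even when the output is short.
      const wrote =
        result.ok && (result.tool === "Edit" || result.tool === "Write");
      // Information gain: an output that is long enough and not seen before.
      const output = result.output.trim();
      const isNew =
        output.length >= this.minInfoGainLength &&
        !this.seenOutputs.has(output);
      if (isNew) this.seenOutputs.add(output);
      if (wrote || isNew) progressed = true;
    }

    this.idleSteps = progressed ? 0 : this.idleSteps + 1;
    return this.idleSteps >= this.stallThreshold;
  }
}
```

Under these assumptions, a run of `Bash(:)` calls with empty output trips the tracker on the eighth step, while a single successful `Edit` resets the counter.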
Force-pushed from 75f04a8 to 40579ac.
@chatgpt-codex-connector Thanks for the review. Addressed the P2 feedback:
To use Codex here, create an environment for this repo.
Related Issue
Resolve #1314
Problem
In long-running turns the model can fall into a tool-use loop where it emits placeholder/no-op calls instead of answering the user. The existing `ToolCallDeduplicator` only catches exact same-step duplicates, so loops with varied but meaningless calls are not stopped.

Anonymized excerpt from a stuck turn (session `0a8e1647-edc1-4cf4-a25b-d11a6cbba943`):

The turn kept running `Bash(':')`, `Bash('true')`, `Read('/dev/null')`, and `echo` placeholders without changing any file or returning useful new information, eventually exhausting `max_steps_per_turn`.

What changed
Added a `ProgressDetector` that measures progress from external, observable state instead of interpreting model intent:

- `git status --porcelain` in the working directory.
- `Edit` and `Write` tool results are always counted as progress, even when their output is short, so repeated edits to an already-dirty file are not misclassified as stalled.

If a configurable number of consecutive steps pass without either signal, the harness:

- injects a system reminder;
- forces the next model step to run with no tools (`{ tools: [] }`), so the model can only produce text.

Two new `loop_control` options are exposed:

| Option | Default |
| --- | --- |
| `progress_stall_threshold` | 8 |
| `progress_min_info_gain_length` | 60 |

Example `config.toml`:

Files changed
- `packages/agent-core/src/agent/turn/progress-detector.ts` (new)
  - `ProgressDetector` class with snapshot-taking, output hashing, and stall counting.
- `packages/agent-core/src/agent/turn/index.ts`
  - Creates a `ProgressDetector` per turn with config-driven threshold and min length.
  - `afterStep` records progress and triggers `forceTextMode` after threshold.
  - `beforeStep` injects reminder and returns `{ tools: [] }` when `forceTextMode` is set.
- `packages/agent-core/src/loop/types.ts` / `turn-step.ts`
  - Adds `BeforeStepResult.tools` so the harness can override the tool set.
- `packages/agent-core/src/config/schema.ts`
  - Adds `progressStallThreshold` and `progressMinInfoGainLength` to `LoopControlSchema`.
- `packages/agent-core/test/agent/turn/progress-detector.test.ts` (new)
- `packages/agent-core/test/config/configs.test.ts`
  - Updates the `COMPLETE_TOML` fixture and round-trip assertions for the new `loop_control` keys.

Checklist

- `gen-changesets` skill, or this PR needs no changeset. (Added `.changeset/progress-detector-stalled-turns.md` for `@moonshot-ai/kimi-code`.)
- `gen-docs` skill, or this PR needs no doc update. (No user-facing docs change; behavior is internal to the agent loop.)

Co-authored-by: Kimi <kimi@moonshot.cn>
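The `afterStep`/`beforeStep` hand-off listed under "Files changed" might look roughly like the sketch below. It is a simplified illustration under stated assumptions: `TurnHooks`, `resolveTools`, the `Tool` shape, and the `reminder` field are hypothetical, and clearing `forceTextMode` after one step is an assumption. Only the `tools` override on the before-step result and the `forceTextMode` flag are described in this PR.

```ts
// Hypothetical shapes for illustration; not the PR's actual types.
interface Tool {
  name: string;
}

interface BeforeStepResult {
  reminder?: string; // hypothetical field carrying the injected reminder
  tools?: Tool[]; // when present, replaces the tool set for the next step
}

class TurnHooks {
  private forceTextMode = false;

  // `stalled` stands in for the detector's verdict on the finished step.
  afterStep(stalled: boolean): void {
    if (stalled) this.forceTextMode = true;
  }

  beforeStep(): BeforeStepResult {
    if (!this.forceTextMode) return {};
    // Assumption: the override applies to a single step, then clears.
    this.forceTextMode = false;
    return {
      reminder: "No progress detected in recent steps. Reply to the user in text.",
      tools: [],
    };
  }
}

// An empty array must mean "no tools", so only an absent `tools` field
// falls back to the defaults.
function resolveTools(defaults: Tool[], result: BeforeStepResult): Tool[] {
  return result.tools ?? defaults;
}
```

The design point is that `tools: []` and a missing `tools` field mean different things. A length check would wrongly treat the empty override as "use the defaults", which is why the sketch falls back only on `undefined`.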