agentscodextoken-efficiency

Why Your AI Coding Agent Doesn't Know When It's Done

29 May 2026

AI coding agents burn tokens in polling loops and still declare victory while CI is red and review threads are unresolved — a structured control plane replaces that nondeterminism with a single deterministic done-signal at every stage of the dev loop.

Your AI coding agent just opened a PR, told you it was ready, and went quiet. But CI is red, three review threads are unresolved, and it has absolutely no idea.

This is a specific failure mode with a specific cause. The model is not bad at coding. Agents operating without a structured feedback channel have no reliable way to distinguish I think I am done from I am objectively done. Those are very different states, and for software that ships to production, the gap between them is where incidents live.

The victory lap problem

An agent that just pushed a commit has genuinely limited information about whether the task is complete. It can infer from its own context that the code change looks correct. It cannot reliably know whether CI passed on the exact branch it pushed to, whether a reviewer left a blocking comment three minutes ago, or whether the merge queue rejected the PR because of a conflict it did not anticipate.

So the agent makes a judgment call. On small tasks with short context windows, that call is often right. On medium-to-large PRs, it fails in predictable ways. One Hacker News thread on this exact problem attracted 35 comments, with the creator reporting that Claude Code would push changes, respond to reviews, wait for CI, but never really know when it was done — and for larger PRs it would run out of context window or just get lazy and declare everything good when tests are still failing.

This is not an edge case. It is the default behavior of any agent that treats done as a feeling derived from its own output rather than a fact derived from external, structured signals. Users on r/ClaudeCode describe agents cycling endlessly through failure loops — Development, CR fail, rewrite, CR fail, rewrite, CR pass, PR fail, rewrite — with no convergence signal, requiring human intervention every time.

The token tax of uncertainty

The other symptom is waste. When an agent lacks a deterministic done-signal, it substitutes polling loops and context re-reads. It checks CI by re-reading terminal output in full. It re-reads the PR description to reconstruct what it was supposed to do. It spawns sub-agents to explore whether tests are actually passing. One r/ClaudeAI user described Claude Code spawning wild Explore agents that burn an insane amount of tokens before launching a Plan agent. This is uncertainty laundering, not reasoning.

The pattern reproduces across tasks. Agents without a structured feedback channel spend a disproportionate share of their budget reconstructing state they should already have access to. That is the mechanical explanation for the 93% reduction in reasoning tokens AgentRail achieves versus plain Codex on the same tasks. The reasoning tokens do not disappear because the model gets smarter. They disappear because the agent no longer needs to reason about state it receives as structured data.

What a control plane changes

The fix is to make done a deterministic fact rather than an agent judgment. A control plane covering the full dev loop — issue intake to PR submission to CI feedback to review to ship — gives the agent a structured API to query at each stage. The agent does not poll CI output and parse it as free text. It calls a single endpoint and gets a typed response: passing or failing, and if failing, which specific checks. It does not scan review threads for actionable comments. It receives structured review state: unresolved threads, blocking labels, approval status.

With that signal, the agent has a clear done-condition at every stage and stops iterating the moment the condition is met. The false all-clear signals disappear because the agent is no longer guessing at state it now has direct access to.

AgentRail is built around this architecture. It is a local-first control plane for Claude Code, Codex, and Cursor that replaces the nondeterministic polling pattern with a single deterministic signal at every gate in the dev loop. Install it with npm install -g @agentrail-core/cli and then agentrail init. Read more at https://agentrail.app

If your agents are declaring victory while CI burns, the model is not the bottleneck. The missing piece is structure.