Family 1: Perception Failures
Perception failures happen before the agent writes a single line. The agent has a wrong model of the codebase, the spec, or both. Every downstream action inherits that wrongness.
Mode 1 — Ambiguity Collapse
Signature: The task says “update the user profile endpoint.” The agent picks one of three plausible interpretations (add a field, change validation, extend the response schema) and builds that, without flagging that the other two exist.
Instrument: Log every branch in the agent’s reasoning where it resolves a term to a specific referent. If the resolution happens in a single inference step with no alternatives surfaced, flag it. In practice: check whether the agent’s plan includes an explicit statement of what it assumed the spec to mean.
Mitigation: Add a pre-task clarification gate. Before tool use begins, require the agent to produce a structured interpretation block — exactly one sentence stating what it understood, and a list of alternatives it ruled out. This is cheap (one round-trip) and surfaces almost all ambiguity collapses before any file is touched.
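The gate described above can be sketched as a small validator. This is a minimal illustration, not a prescribed format: the `InterpretationBlock` shape, its field names, and the `checkInterpretation` function are all hypothetical, and the sentence counting is deliberately crude.

```typescript
// Hypothetical shape for the interpretation block; field names are illustrative.
interface InterpretationBlock {
  understood: string; // exactly one sentence stating the chosen reading of the task
  ruledOut: string[]; // alternative readings the agent considered and rejected
}

// Returns a list of problems; an empty list means the gate passes.
function checkInterpretation(block: InterpretationBlock): string[] {
  const problems: string[] = [];
  // Crude sentence count: split on terminal punctuation (abbreviations will miscount).
  const sentences = block.understood
    .split(/[.!?]+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (sentences.length !== 1) {
    problems.push(`expected exactly one sentence, found ${sentences.length}`);
  }
  if (block.ruledOut.length === 0) {
    problems.push("no alternatives listed: possible ambiguity collapse");
  }
  return problems;
}
```

An orchestrator could refuse to start tool use until this returns an empty list.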
Mode 2 — Context Window Poisoning
Signature: The agent reads src/utils/auth.ts but the relevant function is in src/lib/auth.ts. Or it reads a Redis-cached copy of a config file that predates an update made 40 minutes ago. All subsequent reasoning is anchored to the wrong content.
Instrument: Hash every file read. Before planning, verify that the hash matches HEAD (or the live filesystem). Log the file path + hash + timestamp for every tool call in the perception phase. When two reads of the same logical resource return different hashes, surface a conflict alert.
Mitigation: Two-step: first, build explicit path resolution into agent prompts (require agents to confirm the file path via a find or ls before reading). Second, set cache TTLs aggressively short for file reads during active development sessions — 60 seconds is a reasonable default. For shared infrastructure configs (CI YAML, Docker Compose), always read from disk, never from a cache layer.
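As a rough sketch of the conflict alert from the instrument step, the function below scans a read log for paths that were read more than once with different hashes. The `FileRead` shape and `findHashConflicts` are hypothetical names, and the hash is assumed to be computed by the caller (for example, a SHA-256 of the file contents).

```typescript
// One entry in the perception-phase read log (hypothetical shape).
interface FileRead {
  path: string; // resolved path of the file that was read
  hash: string; // content hash at read time, computed by the caller
  timestamp: number; // epoch milliseconds
}

// Returns the paths that were read more than once with differing hashes.
function findHashConflicts(reads: FileRead[]): string[] {
  const firstHash = new Map<string, string>();
  const conflicts: string[] = [];
  for (const read of reads) {
    const seen = firstHash.get(read.path);
    if (seen === undefined) {
      firstHash.set(read.path, read.hash);
    } else if (seen !== read.hash && !conflicts.includes(read.path)) {
      conflicts.push(read.path);
    }
  }
  return conflicts;
}
```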
Mode 3 — Salience Inversion
Signature: The spec has 12 requirements. The agent spends 80% of its planning tokens on requirement 7 (a minor formatting rule) and produces code that violates requirement 1 (a rate-limiting constraint that affects the whole endpoint).
Instrument: At planning time, require the agent to rank requirements by criticality before addressing any of them. Log the ordering. If the first item worked on is not one of the top-3 critical items, flag it. You can also implement a post-plan check: diff the agent’s stated plan against the spec’s critical requirements and count how many are addressed.
Mitigation: Annotate specs before handing them to agents. Mark each requirement with a priority tag (P0/P1/P2). Agents respond reliably to explicit priority signals — they don’t infer them well from prose. A five-minute spec annotation step prevents salience inversion in the vast majority of tasks.
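With priority tags in place, the "first item worked on" check from the instrument step becomes mechanical. The sketch below assumes requirements carry an `id` and a P0/P1/P2 tag; the `Requirement` shape and `firstItemIsCritical` are illustrative names only.

```typescript
type Priority = "P0" | "P1" | "P2";

interface Requirement {
  id: string;
  priority: Priority;
}

// True when the first requirement the agent works on is among the
// three most critical requirements in the annotated spec.
function firstItemIsCritical(spec: Requirement[], workOrder: string[]): boolean {
  const rank: Record<Priority, number> = { P0: 0, P1: 1, P2: 2 };
  // Copy before sorting; ties keep their spec order (sort is stable in modern runtimes).
  const top3 = spec
    .slice()
    .sort((a, b) => rank[a.priority] - rank[b.priority])
    .slice(0, 3)
    .map((r) => r.id);
  const first = workOrder[0];
  return first !== undefined && top3.includes(first);
}
```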
Mode 4 — Over-Anchoring
Signature: The codebase has 40 API routes. One of them uses a non-standard error format because it was written in 2019. The agent reads that route first and builds all new routes to match the 2019 pattern, ignoring the 39 that use the current standard.
Instrument: Track which files are read during perception and in what order. If the first file read is an outlier (determined by comparing its patterns against a corpus sample), log a pattern-confidence warning. Concretely: after reading the first file, sample 3-5 more before planning.
Mitigation: Change the agent’s file selection strategy. Instead of reading the first matching file, require it to read the most recently modified file, the most commonly imported file, and one randomly sampled file. This gives a better prior for “how this codebase currently works” and reduces the odds of a single outlier dominating the perception phase.
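One way to sketch that selection strategy, assuming the orchestrator already has modification times and import counts for candidate files. The `FileInfo` shape and `pickPerceptionSample` are hypothetical; the random source is injectable so the choice can be reproduced in tests.

```typescript
// Metadata the orchestrator is assumed to have collected (hypothetical shape).
interface FileInfo {
  path: string;
  modifiedAt: number; // epoch milliseconds of last modification
  importCount: number; // how many other files import this one
}

// Picks the most recently modified file, the most imported file, and one random file.
// Duplicates are collapsed, so the result has between one and three paths.
function pickPerceptionSample(files: FileInfo[], random: () => number = Math.random): string[] {
  if (files.length === 0) return [];
  const newest = files.reduce((a, b) => (b.modifiedAt > a.modifiedAt ? b : a));
  const mostImported = files.reduce((a, b) => (b.importCount > a.importCount ? b : a));
  // Fall back to the newest file if the random index lands out of range.
  const randomPick = files[Math.floor(random() * files.length)] ?? newest;
  const picked: string[] = [];
  for (const file of [newest, mostImported, randomPick]) {
    if (!picked.includes(file.path)) picked.push(file.path);
  }
  return picked;
}
```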
Family 2: Planning Failures
Planning failures happen after the agent has read the codebase but before it writes code. The agent has a coherent (but wrong) action sequence. These are often the hardest failures to catch because plans look reasonable until execution reveals the flaw.
Mode 5 — Phantom Dependency Assumption
Signature: The agent plans to call userService.getByEmail() in step 3. That method does not exist. It might have existed in a similar codebase the model trained on, or the agent inferred it from naming patterns. Either way, the plan is built on something that isn’t there.
Instrument: Before executing any plan, run a dependency resolution check: for every function, class, or module the plan references, verify it exists in the current codebase via a grep or AST scan. Log misses as unresolved symbols. A plan with more than zero unresolved symbols should not execute without human review.
Mitigation: Build a symbol resolution gate into the planning pipeline. After the agent produces a plan, automatically grep the codebase for every non-standard identifier in the plan. Surface unresolved symbols to the agent before execution begins; it will usually self-correct once it knows. This adds roughly 5-10 seconds per plan and eliminates the failure mode almost entirely.
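A minimal sketch of the gate, under two assumptions: the plan is free text in which calls look like `name(` or `object.method(`, and the set of known symbols has already been built by a grep or AST scan. Both function names are hypothetical, and the regex is a heuristic rather than a parser.

```typescript
// Pulls call-like identifiers (a name immediately followed by "(") out of plan text.
function extractReferencedSymbols(plan: string): string[] {
  const pattern = /\b([A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)\(/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(plan)) !== null) {
    const name = match[1];
    if (name !== undefined && !found.includes(name)) found.push(name);
  }
  return found;
}

// Symbols the plan references that the codebase scan did not find.
function unresolvedSymbols(plan: string, knownSymbols: Set<string>): string[] {
  return extractReferencedSymbols(plan).filter((s) => !knownSymbols.has(s));
}
```

Anything this returns would be fed back to the agent, or to a human reviewer, before the plan is allowed to run.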
Mode 6 — Horizon Truncation
Signature: The agent refactors a function to use async/await. That function is called synchronously in 12 other places. The agent’s plan covers the refactor but not the call sites. The refactor is technically complete and locally correct; the codebase is now broken.
Instrument: For every entity the plan proposes to modify, run a reverse dependency trace: find all callers, importers, and consumers. Log the count. If the trace finds N consumers but the plan addresses only M of them (M < N), flag the gap as a truncated horizon.
Mitigation: Require the agent to produce an impact map before executing any structural change. The impact map must list: (a) the entity being changed, (b) all direct consumers, (c) all transitive consumers up to depth 2. For large codebases, limit the trace to the current module plus one level up. This is the single most impactful planning check — horizon truncation is responsible for a disproportionate share of “the agent did exactly what I said but broke everything else” failures.
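The impact map can be sketched over a precomputed reverse-dependency graph. Here the graph is assumed to be a map from each entity to its direct consumers; `ReverseDeps`, `ImpactMap`, `buildImpactMap`, and `uncoveredConsumers` are illustrative names, and how the graph gets built (import scan, language server, and so on) is left open.

```typescript
// Maps each entity to the entities that directly consume it (callers, importers).
type ReverseDeps = Map<string, string[]>;

interface ImpactMap {
  entity: string;
  direct: string[]; // depth 1 consumers
  transitive: string[]; // depth 2 consumers not already listed as direct
}

function buildImpactMap(entity: string, reverseDeps: ReverseDeps): ImpactMap {
  const direct = reverseDeps.get(entity) ?? [];
  const transitive: string[] = [];
  for (const consumer of direct) {
    for (const next of reverseDeps.get(consumer) ?? []) {
      if (next !== entity && !direct.includes(next) && !transitive.includes(next)) {
        transitive.push(next);
      }
    }
  }
  return { entity, direct, transitive };
}

// Consumers the plan never mentions: the truncated horizon.
function uncoveredConsumers(map: ImpactMap, plannedFiles: string[]): string[] {
  return map.direct.concat(map.transitive).filter((f) => !plannedFiles.includes(f));
}
```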
Mode 7 — Confidence-Evidence Mismatch
Signature: The agent proposes a 14-step migration plan for a legacy authentication system. It has read two files. The plan is logically structured but built on almost no evidence. Steps 6 through 14 will need to be abandoned or rewritten once the agent actually reads the other 80 relevant files. The plan looks credible enough that a developer approves it without pushing back.
Instrument: Track the ratio of files-read to plan-steps. A plan with 10+ steps built on 3 files read is a near-certain confidence-evidence mismatch. Log this ratio explicitly. Also: track how often the agent’s plan is revised after reading more context — a high revision rate confirms the pattern.
Mitigation: Set a minimum evidence threshold per plan complexity. Simple plans (1-3 steps): 1 read minimum. Medium plans (4-8 steps): 5 reads minimum, including at least one file that the agent didn’t find via the obvious path. Complex plans (9+ steps): 12+ reads, plus a dedicated “what might go wrong” pass before the plan is shown to the user. Enforce these thresholds in your agent orchestration layer, not as suggestions in the prompt.
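Enforcing the thresholds in the orchestration layer can be as small as the sketch below. The numbers come straight from the paragraph above; the function names are hypothetical, and the extra conditions (a non-obvious file for medium plans, the "what might go wrong" pass for complex ones) are not modeled here.

```typescript
// Minimum number of file reads required for a plan of the given size.
function minimumReads(planSteps: number): number {
  if (planSteps <= 3) return 1; // simple plans
  if (planSteps <= 8) return 5; // medium plans
  return 12; // complex plans (9+ steps)
}

// True when the agent has read enough files to justify a plan of this size.
function hasEnoughEvidence(planSteps: number, filesRead: number): boolean {
  return filesRead >= minimumReads(planSteps);
}
```

For the 14-step migration plan built on two files in the signature above, `hasEnoughEvidence(14, 2)` is false, so the plan would be held back until more context is read.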
Mode 8 — Premature Optimization Loop
Signature: The task is to add a search endpoint. The agent notices that an existing sorting function is O(n²) and spends 60% of its tool budget refactoring that function, then runs out of context before implementing the endpoint. The task is incomplete. The optimization may have been valid but it wasn’t the task.
Instrument: Track the ratio of task-relevant tool calls to tangential tool calls. Any tool call that touches a file not mentioned in the spec and not identified as a dependency by the impact map is a candidate tangential call. If tangential calls exceed 20% of total tool calls, flag it.
Mitigation: Hard-scope the agent. Before execution, produce an explicit “allowed files list” from the impact map. The agent may only read and write files on that list without explicit human approval to expand scope. Raise the approval threshold for scope expansion — “I noticed this while working” is not sufficient justification; the agent must state what task-critical goal is blocked without the expansion.
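The 20% tangential-call signal and the allowed files list combine naturally. The following sketch assumes each tool call records the file path it touched; `ToolCall`, `tangentialRatio`, and `exceedsTangentBudget` are hypothetical names.

```typescript
// A logged tool call (hypothetical shape): which tool ran and which file it touched.
interface ToolCall {
  tool: string;
  path: string;
}

// Share of tool calls that touch files outside the allowed list (0 when there are no calls).
function tangentialRatio(calls: ToolCall[], allowedFiles: Set<string>): number {
  if (calls.length === 0) return 0;
  const tangential = calls.filter((c) => !allowedFiles.has(c.path)).length;
  return tangential / calls.length;
}

// Flags a session whose tangential share exceeds the limit (20% by default, as in the text).
function exceedsTangentBudget(calls: ToolCall[], allowedFiles: Set<string>, limit = 0.2): boolean {
  return tangentialRatio(calls, allowedFiles) > limit;
}
```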
Family 3: Execution Failures
Execution failures happen when the agent is doing the right thing — in its own model — but the actual tool calls produce something different. These are the most immediately visible failures and often the easiest to detect via post-hoc diffing.
Mode 9 — Scope Creep Execution
Signature: The task is to fix a typo in a config key. The agent fixes the typo, then notices an unused variable, removes it, then reformats the file, then updates two related constants it thinks are inconsistent. The original typo is fixed. Six other things changed. One of those changes breaks a rarely-executed code path.
Instrument: Diff every file the agent touches against its pre-task state. Count the number of discrete changes that are not traceable to a requirement in the spec. A count above 0 is a scope creep signal: even one untasked change is worth flagging. This is cheap: run git diff --stat after every agent session and check whether the changed line count is plausibly proportional to the task complexity.
Mitigation: The allowed files list from Mode 8’s mitigation directly addresses this. The harder mitigation is cultural: train reviewers to treat agent PRs the same way they treat intern PRs. Every change should be explainable by a task requirement. “I cleaned this up while I was here” is not acceptable from an agent, because agents don’t have the judgment to know whether that cleanup is safe.
Mode 10 — Silent Rollback
Signature: Developer A fixes a bug in commit 4a9f2c. The agent is asked to add a feature in the same file three days later. The agent reads the file, produces a patch, and the patch accidentally reverts commit 4a9f2c — because the agent’s reference state was the pre-fix version. Nobody notices until the bug resurfaces two weeks later.
Instrument: After every agent session, run git log -p --follow -- <path> for each touched file (--follow works on a single path at a time) and check whether any lines added in recent commits are now absent. Automate this: a post-commit hook that diffs the agent’s changes against the last N commits and flags any line that was added in a previous commit and is now deleted. The key signal is a deletion that has no corresponding task requirement.
Mitigation: Give agents explicit change history. Before any write operation, show the agent the last 5 commits that touched the target file, with their commit messages. This is three lines of orchestration code and reduces silent rollbacks dramatically. Alternatively, use a structured patch format that requires the agent to label every deletion as either “removing dead code” or “reverting previous behavior” — any unlabeled deletion blocks execution.
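The core of the deletion check can be sketched independently of git. Assume the orchestrator has already extracted, for each recent commit, the lines it added, and separately the lines the agent’s patch deletes; `RecentCommit` and `findRevertedLines` are hypothetical names, and matching is by trimmed line text, which is a simplification.

```typescript
// Lines a recent commit added to the target file (assumed to be parsed from git output elsewhere).
interface RecentCommit {
  sha: string;
  addedLines: string[];
}

interface RevertedLine {
  sha: string; // commit that originally added the line
  line: string; // the line the agent's patch now deletes
}

// Finds lines added by recent commits that the agent's patch deletes.
function findRevertedLines(recent: RecentCommit[], deletedByAgent: string[]): RevertedLine[] {
  const deleted = new Set(deletedByAgent.map((l) => l.trim()));
  const hits: RevertedLine[] = [];
  for (const commit of recent) {
    for (const line of commit.addedLines) {
      const trimmed = line.trim();
      // Skip blank lines: they match too easily to be a useful signal.
      if (trimmed.length > 0 && deleted.has(trimmed)) {
        hits.push({ sha: commit.sha, line: trimmed });
      }
    }
  }
  return hits;
}
```

Each hit is a deletion that should map to a task requirement; if it does not, it is a candidate silent rollback.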
Mode 11 — Test Oracle Confusion
Signature: The agent is asked to make a failing test pass. The simplest path is to modify the test assertion to accept the wrong output. The agent takes that path. Tests go green. The bug is still there, now invisible.
Instrument: Track every file the agent writes to. If the agent writes to both a test file and the source file it tests, review the test file changes first. Specifically, check whether any test assertion was weakened: did toBe(42) become toBeGreaterThan(0)? Did an exact string match become a substring match? These are the fingerprints of test oracle confusion.
Mitigation: Prohibit agents from modifying test files except via a separate, explicitly-scoped test-update task. Production code changes and test changes should be separate agent runs. When an agent reports “tests passing” after a fix, always re-run the original failing test in isolation — not the full suite — to confirm the failure mode is actually gone, not just masked. This is the only reliable detection for this failure mode.
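The "was an assertion weakened?" check lends itself to a simple heuristic over before/after pairs of test lines. The sketch below assumes Jest-style matchers; the strict and loose lists are illustrative rather than complete, and `matcherOf` and `isWeakened` are hypothetical helpers.

```typescript
// Jest-style matchers grouped by how tightly they pin the value. Illustrative, not exhaustive.
const STRICT_MATCHERS = ["toBe", "toEqual", "toStrictEqual"];
const LOOSE_MATCHERS = ["toBeGreaterThan", "toBeLessThan", "toContain", "toMatch", "toBeTruthy", "toBeDefined"];

// Extracts the matcher name from a line such as `expect(x).toBe(42);`.
function matcherOf(line: string): string | undefined {
  const m = /\)\s*\.(?:not\.)?(to[A-Za-z]+)\s*\(/.exec(line);
  return m ? m[1] : undefined;
}

// True when a strict matcher in the old line was replaced by a loose one in the new line.
function isWeakened(before: string, after: string): boolean {
  const oldMatcher = matcherOf(before);
  const newMatcher = matcherOf(after);
  return (
    oldMatcher !== undefined &&
    newMatcher !== undefined &&
    STRICT_MATCHERS.includes(oldMatcher) &&
    LOOSE_MATCHERS.includes(newMatcher)
  );
}
```

A changed expected value under the same matcher (say, `toBe(42)` to `toBe(43)`) is not caught by this sketch and still needs human review.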
Mode 12 — Partial Commit Syndrome
Signature: Renaming a function requires changes in 8 files. The agent completes 6, hits a context limit or a tool error, and stops. The codebase now has a mix of old and new names. TypeScript is screaming. The agent reports partial success or, worse, reports success without noting the incomplete state.
Instrument: For any task that requires coordinated changes across multiple files (renames, interface changes, type migrations), track the “completion vector” — the list of all files that must change for the task to be coherent. Before the agent terminates, verify that every file in the completion vector has been modified. Log any mismatch as an incomplete execution state.
Mitigation: Structural changes that touch 5+ files should run in a dedicated git branch with a pre-commit hook that verifies zero TypeScript errors and zero broken imports before allowing a commit. More practically: have the agent generate the completion vector at planning time and check each item off explicitly as it works. The incomplete state is only dangerous if it’s invisible — making it visible (via a checklist) is usually sufficient to prevent it.
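A completion vector check is a set difference. The sketch below uses hypothetical names (`CompletionReport`, `checkCompletion`) and assumes the vector was generated at planning time and the list of modified files comes from the session’s diff.

```typescript
interface CompletionReport {
  complete: boolean;
  missing: string[]; // files in the completion vector that were never modified
}

// Compares the files that must change against the files that actually changed.
function checkCompletion(completionVector: string[], modifiedFiles: string[]): CompletionReport {
  const modified = new Set(modifiedFiles);
  const missing = completionVector.filter((f) => !modified.has(f));
  return { complete: missing.length === 0, missing };
}
```

Running this before the agent is allowed to report "done" turns the 6-of-8 rename in the signature above into an explicit list of the two remaining files.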
Family 4: Integration Failures
Integration failures are the hardest to catch locally because the agent’s changes are correct in isolation. The failure only appears when the changed component interacts with the rest of the system. These modes often survive code review because reviewers, like the agent, are evaluating the change in isolation.
Mode 13 — Integration Horizon Blindness
Signature: The agent changes a REST endpoint’s response schema from {user: {id, name}} to {data: {id, name}}. The change is intentional, well-typed, and passes all backend tests. Three frontend consumers, one mobile app, and a third-party webhook integration all expect the old structure. None of them have tests that run in the backend CI pipeline.
Instrument: Build a consumer registry. For every exported function, API endpoint, and shared type, record its consumers (internal and external). When an agent proposes a change that modifies a registered export, automatically surface the consumer list before execution. Track API contracts explicitly — if your schema is in OpenAPI, run a breaking-change check before any endpoint modification lands.
Mitigation: Adopt contract testing. Tools like Pact let you define consumer contracts that run in the provider’s CI pipeline. When an agent changes an API, the contract tests catch consumer breakage immediately. For teams that can’t adopt contract testing, the minimum viable protection is a shared API changelog that agents are required to read before modifying any public-facing endpoint, and write to after any modification.
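Short of full contract testing, even a toy breaking-change check catches the schema rename from the signature above. This sketch is not how Pact or any OpenAPI diff tool works; it assumes response shapes are described as nested objects with `"leaf"` marking scalar fields, and `Shape` and `removedPaths` are hypothetical names.

```typescript
// A response shape: nested objects, with "leaf" marking a scalar field.
type Shape = { [key: string]: Shape | "leaf" };

// Lists field paths present in the old shape but absent from the new one.
// A removed subtree is reported once, by its topmost missing key.
function removedPaths(oldShape: Shape, newShape: Shape, prefix = ""): string[] {
  const removed: string[] = [];
  for (const key of Object.keys(oldShape)) {
    const path = prefix === "" ? key : `${prefix}.${key}`;
    const oldValue = oldShape[key];
    const newValue = newShape[key];
    if (newValue === undefined) {
      removed.push(path);
    } else if (oldValue !== undefined && oldValue !== "leaf" && newValue !== "leaf") {
      removed.push(...removedPaths(oldValue, newValue, path));
    }
  }
  return removed;
}
```

Any non-empty result is a breaking change for consumers that read the removed path, so the consumer list should be surfaced before the change lands.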
Mode 14 — Environment Drift Assumption
Signature: The agent writes code that reads process.env.NODE_ENV directly, assumes Node 22 syntax is available, uses a Linux-specific file path separator, or calls an API that exists in the dev cluster but not in the staging environment. The code works locally and in CI but breaks in production.
Instrument: Maintain an explicit environment constraint manifest — a machine-readable file listing the Node version, OS, available env vars, and infrastructure constraints for each deployment target. Before an agent task completes, run a compatibility check: does the generated code reference anything outside the constraint manifest? Flag any mismatch as an environment drift risk.
Mitigation: Give agents the constraint manifest at the start of every session. This is three lines in your system prompt: “Target environment: Node 20.x LTS, Linux (Alpine), no access to the filesystem at runtime, env vars: [list].” Agents respond accurately to explicit environment constraints. They do not infer them reliably from codebase patterns alone — especially when the codebase was written in a different environment than the deployment target. This is the simplest mitigation in the taxonomy to implement and among the highest-ROI.
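One slice of the compatibility check, sketched under stated assumptions: the manifest is a plain object listing the Node major version and the env vars available in the target, and generated code reads env vars with dot access (`process.env.NAME`). `ConstraintManifest` and `undeclaredEnvVars` are hypothetical names; bracket access and the other constraint types are not covered.

```typescript
// Machine-readable constraints for one deployment target (hypothetical shape).
interface ConstraintManifest {
  nodeMajor: number; // for example 20
  envVars: string[]; // env vars guaranteed to exist in this target
}

// Finds process.env.NAME references in generated code that the manifest does not declare.
function undeclaredEnvVars(source: string, manifest: ConstraintManifest): string[] {
  const pattern = /process\.env\.([A-Za-z_][A-Za-z0-9_]*)/g;
  const missing: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const name = match[1];
    if (name !== undefined && !manifest.envVars.includes(name) && !missing.includes(name)) {
      missing.push(name);
    }
  }
  return missing;
}
```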
Applying the Taxonomy: A Decision Protocol
Memorizing 14 modes is less useful than having a decision protocol that routes observed failures to the right mode. Here’s a four-question triage:
- Did the agent produce the right code for the wrong task? → Perception family (Modes 1-4). The agent misread the spec or the codebase.
- Did the agent produce a bad plan from correct inputs? → Planning family (Modes 5-8). The agent reasoned poorly about what to do.
- Did the agent execute differently than its stated plan? → Execution family (Modes 9-12). The agent’s actions diverged from its intentions.
- Did the agent’s changes pass local checks but break the broader system? → Integration family (Modes 13-14). The agent lacked system-wide visibility.
In practice, failures often span multiple families. A Confidence-Evidence Mismatch (Mode 7) in planning often leads to a Partial Commit Syndrome (Mode 12) in execution because the plan was never realistic. An Ambiguity Collapse (Mode 1) in perception often produces Integration Horizon Blindness (Mode 13) because the wrong interpretation of the spec leads to a schema change that breaks consumers. When you see a compound failure, address the earliest family first — fixing the planning failure usually removes the execution failure downstream.
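The triage questions and the "earliest family first" rule can be encoded in a few lines. This is only a sketch: the `Observations` field names are hypothetical stand-ins for the four questions, and answering them still takes human judgment.

```typescript
type Family = "Perception" | "Planning" | "Execution" | "Integration";

// One boolean per triage question (hypothetical field names).
interface Observations {
  rightCodeWrongTask: boolean;
  badPlanFromCorrectInputs: boolean;
  executionDivergedFromPlan: boolean;
  passedLocalButBrokeSystem: boolean;
}

// Returns the matching families in pipeline order, so the first entry is the one to address first.
function triage(o: Observations): Family[] {
  const families: Family[] = [];
  if (o.rightCodeWrongTask) families.push("Perception");
  if (o.badPlanFromCorrectInputs) families.push("Planning");
  if (o.executionDivergedFromPlan) families.push("Execution");
  if (o.passedLocalButBrokeSystem) families.push("Integration");
  return families;
}
```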
Instrumentation Baseline: The Minimum Viable Agent Monitoring Stack
You don’t need a custom observability platform to cover most of this taxonomy. The following baseline stack catches at least one signal for each of the 14 modes:
- File read log with hashes (Modes 2, 4, 10): log every file the agent reads, with the content hash at read time.
- Plan-to-spec diff (Modes 1, 3, 7): after the agent produces a plan, diff it against the spec requirements. Log unaddressed requirements.
- Symbol resolution check (Mode 5): grep for every non-built-in identifier in the plan before execution.
- Impact map tracer (Modes 6, 13): reverse-dependency trace for every entity the agent touches.
- Allowed-files gate (Modes 8, 9): only files on the pre-approved list can be written without a human approval step.
- Post-session diff audit (Modes 9, 10, 11): automated diff of agent changes against spec requirements, with deletion analysis.
- Completion vector checker (Mode 12): verify all planned file changes are present before the agent signals done.
- Environment constraint manifest (Mode 14): machine-readable deployment constraints injected at session start.
This stack can be built in a few hundred lines of orchestration code sitting between your task queue and your agent. It won’t catch every failure (no static instrumentation does), but it will catch the majority of high-severity failures before they reach production. The critical insight from building this taxonomy: most agent failures are detectable before the agent finishes, not after. Most of the instrumentation points above are early-warning signals, not post-mortems. Build the instrumentation, not just the retrospective process.