
Lab 03 - Failure mode tour: induce four classic agent bugs

Read theory/01-react-and-planning.md and theory/02-memory.md first. Do not consult solutions/.

Objective

Deliberately induce four classic agent failure modes (looping, budget exhaustion, hallucinated tools, scratchpad leakage), observe each, document the symptom, and verify that your GrammarTutorAgent either prevents or gracefully reports each one. This is the negative-space lab: a tour of what goes wrong, so you know what to defend against.

Setup

Use the agent from Lab 01. Construct adversarial Planner variants that trigger each failure mode. No new infrastructure needed; just MockPlanner subclasses that emit the wrong steps.

Tasks

Failure mode 1: looping (same action repeated)

Construct a LoopingPlanner that emits the same ToolCall(tool="conjugate", args={"verb": "go", "person": "3sg", "tense": "past_simple"}) on every step.

Run the agent:

agent = GrammarTutorAgent(planner=LoopingPlanner(), ...)
result = agent.correct("He goed to school.")

Expected behavior:

The agent's duplicate-action detector should trigger on step 2.
The agent returns a CorrectionResult with corrected=None and rationale=["agent looped"]. in_scope may still be True: the agent failed to decide, which is different from judging the sentence out of scope.
Total steps spent: ≤ 2 (not the full max_steps).

Assertions:

assert len(result.tool_trace) <= 2
assert "loop" in " ".join(result.rationale).lower()

If your agent runs through all 8 steps before halting, the duplicate detection isn't wired in.
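The length-1 duplicate check described above can be sketched as follows. This is a minimal illustration, not the Lab 01 implementation: the ToolCall stand-in, the next_step method name, and run_with_loop_guard are all assumptions made so the snippet runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Stand-in for the lab's ToolCall; the real class may differ."""
    tool: str
    args: dict = field(default_factory=dict)

    def key(self) -> tuple:
        # Hashable fingerprint of the action, independent of argument order.
        return (self.tool, tuple(sorted(self.args.items())))


class LoopingPlanner:
    """Test only: emits the same ToolCall on every step."""

    def next_step(self, trace: list) -> ToolCall:  # hypothetical method name
        return ToolCall(
            tool="conjugate",
            args={"verb": "go", "person": "3sg", "tense": "past_simple"},
        )


def run_with_loop_guard(planner, max_steps: int = 8):
    """Hypothetical agent loop: halt when an action repeats back to back."""
    trace, previous = [], None
    for _ in range(max_steps):
        step = planner.next_step(trace)
        trace.append(step)
        if previous is not None and step.key() == previous:
            return trace, [f"agent looped: repeated {step.tool}"]
        previous = step.key()
    return trace, ["could not decide within budget"]


trace, rationale = run_with_loop_guard(LoopingPlanner())
```

With this guard the repeat is caught when the second identical action arrives, so the trace holds two entries and the rationale mentions the loop, which is what the assertions above check.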

Failure mode 2: budget exhaustion (no looping, just slow)

Construct an IndecisivePlanner that emits a different ToolCall on every step and never a FinalAnswer. In effect, the planner cannot decide.

Run the agent. Expected: the agent runs all max_steps (default 8) and returns a budget_exhausted result with a clear rationale.

Assertions:

assert len(result.tool_trace) == 8  # full budget spent
assert "budget" in " ".join(result.rationale).lower()
assert result.corrected is None
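One way to build such a planner and the budget check around it is sketched below. The ToolCall stand-in, the "lookup" tool name, the next_step method, and run_with_budget are hypothetical and exist only to make the example self-contained; your agent's loop will look different.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToolCall:
    """Stand-in for the lab's ToolCall; the real class may differ."""
    tool: str
    args: dict = field(default_factory=dict)


class IndecisivePlanner:
    """Test only: a different ToolCall every step, never a final answer."""

    def __init__(self):
        self._counter = count()

    def next_step(self, trace):  # hypothetical method name
        n = next(self._counter)
        # Varying the args keeps a duplicate-action detector from firing.
        return ToolCall(tool="lookup", args={"query": f"attempt-{n}"})


def run_with_budget(planner, max_steps=8):
    """Hypothetical agent loop returning (trace, corrected, rationale)."""
    trace = []
    for _ in range(max_steps):
        step = planner.next_step(trace)
        if not isinstance(step, ToolCall):
            # Anything that is not a tool call is treated as the final answer.
            return trace, step, []
        trace.append(step)
    # The loop ran out of steps without a final answer.
    return trace, None, ["could not decide within budget"]


trace, corrected, rationale = run_with_budget(IndecisivePlanner())
```

Because every emitted action differs from the previous one, only the step budget stops the run, which separates this failure mode from looping.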

Failure mode 3: hallucinated tool name

If your planner is implemented per Lab 00 (with JSONSchemaMask), the planner cannot emit an unknown tool — the enum mask prevents it. To test the safety net, construct a HallucinatingPlanner that bypasses the mask (returns a hand-built ToolCall(tool="this_tool_does_not_exist", args={})).

Run the agent. Expected: the dispatcher (Phase 31 MCP client) raises a clean ToolNotFoundError; the agent catches this and either:

(a) Reports a clean failure (returns CorrectionResult with rationale "unknown tool: this_tool_does_not_exist"), OR
(b) Re-prompts the planner with the error as an observation and tries again.

Default: option (a) for Phase 32. Option (b) is a retry pattern worth knowing about but adds complexity.

Assertions:

assert result.corrected is None
assert any("unknown" in r.lower() or "does not exist" in r.lower() for r in result.rationale)
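Option (a) can be sketched as a try/except around the dispatch call. The ToolNotFoundError class, the TOOLS registry, and the dispatch and correct_once functions below are stand-ins invented for illustration; the Phase 31 MCP client's actual error type and interface may differ.

```python
class ToolNotFoundError(LookupError):
    """Stand-in for the dispatcher's error type."""


# Hypothetical registry with one toy tool that always answers "went".
TOOLS = {"conjugate": lambda verb, person, tense: "went"}


def dispatch(tool, args):
    """Look the tool up by name and call it, or fail loudly."""
    if tool not in TOOLS:
        raise ToolNotFoundError(tool)
    return TOOLS[tool](**args)


def correct_once(tool, args):
    """Option (a): convert the dispatcher error into a clean failure result."""
    try:
        observation = dispatch(tool, args)
    except ToolNotFoundError as exc:
        return {"corrected": None, "rationale": [f"unknown tool: {exc.args[0]}"]}
    return {"corrected": observation, "rationale": ["ok"]}


result = correct_once("this_tool_does_not_exist", {})
```

The key design point is that the exception never escapes the agent: the caller always receives a result object whose rationale names the offending tool.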

Failure mode 4: scratchpad leakage

This is the one that hides until production: the scratchpad is accidentally shared across correct() calls. To test:

agent = GrammarTutorAgent(planner=MockPlanner(scripts_for_two_sentences), ...)
result_a = agent.correct("He goed to school.")
result_b = agent.correct("I has a book.")

# The scratchpad for B should contain only B's steps.
# If the bug is present, B's trace will contain steps from A's correction.
for step in result_b.tool_trace:
    assert step.args.get("verb") in {"have", "has", None}, \
        f"scratchpad leaked: step references '{step.args.get('verb')}' from previous correction"

Expected: the test passes (scratchpad is local to each correct() call).

If the test fails, the bug is the one described in theory/02-memory.md §"A common pitfall: memory leak across corrections" — fix by constructing ScratchpadMemory() inside correct().
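The difference between the buggy and the fixed layout can be seen in a toy version. Everything here is a simplified assumption: ScratchpadMemory is reduced to a list wrapper, and correct() takes the verb as an explicit argument only so the example needs no planner.

```python
class ScratchpadMemory:
    """Toy stand-in: just records the steps taken."""

    def __init__(self):
        self.steps = []

    def add(self, step):
        self.steps.append(step)


class LeakyAgent:
    """Buggy: one scratchpad created in __init__ and shared by every call."""

    def __init__(self):
        self.scratchpad = ScratchpadMemory()

    def correct(self, sentence, verb):
        self.scratchpad.add({"tool": "conjugate", "verb": verb})
        return list(self.scratchpad.steps)


class IsolatedAgent:
    """Fixed: a fresh scratchpad is constructed inside correct()."""

    def correct(self, sentence, verb):
        scratchpad = ScratchpadMemory()
        scratchpad.add({"tool": "conjugate", "verb": verb})
        return list(scratchpad.steps)


leaky = LeakyAgent()
leaky.correct("He goed to school.", "go")
leaky_trace_b = leaky.correct("I has a book.", "have")

isolated = IsolatedAgent()
isolated.correct("He goed to school.", "go")
isolated_trace_b = isolated.correct("I has a book.", "have")
```

In the leaky version the second trace still carries the "go" step from the first sentence, while the isolated version contains only the "have" step.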

Task 5: write a short post-mortem note

In learners/borja/phase-32/notes.md, write a 2-3 paragraph summary:

Which of the 4 failure modes did your implementation already handle correctly?
Which required a code change?
What other failure modes should you be testing for, but aren't yet? (Possible answers: planner emitting malformed args, tool returning ill-typed data, MCP timeout mid-call, agent state corrupted across runs.)

This note becomes part of PHASE_32_REPORT.md.

Measurements to capture

For each of the 4 failure modes: assertion pass/fail, observed agent behavior, time spent before halting.
Total runtime of the failure-mode tour (should be on the order of seconds, not minutes; looping must terminate fast).

Save to experiments/<date>-phase-32-failure-tour/results.json.
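A small harness along these lines could collect the measurements and write the file. The record field names, the measure and save_results helpers, and the choice of an ISO date for the <date> part of the path are assumptions; adapt them to whatever your report tooling expects.

```python
import json
import time
from datetime import date
from pathlib import Path


def measure(name, test_fn):
    """Run one failure-mode test and record pass/fail plus elapsed time."""
    start = time.perf_counter()
    try:
        test_fn()
        passed, observed = True, "assertions held"
    except AssertionError as exc:
        passed, observed = False, f"assertion failed: {exc}"
    return {
        "failure_mode": name,
        "passed": passed,
        "observed": observed,
        "seconds": round(time.perf_counter() - start, 4),
    }


def save_results(records, root="experiments"):
    """Write records to <root>/<ISO date>-phase-32-failure-tour/results.json."""
    out_dir = Path(root) / f"{date.today().isoformat()}-phase-32-failure-tour"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "results.json"
    out_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return out_path
```

Calling measure once per failure mode and passing the resulting list to save_results yields one JSON file per run date.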

Acceptance

All 4 failure-mode tests written.
Looping detector triggers on step 2.
Budget exhaustion halts cleanly at step max_steps.
Unknown tool returns a clean failure result.
Scratchpad is local to each correct() call (no leakage).
Post-mortem note in learners/borja/phase-32/notes.md.
Failure-mode results saved.

Pitfalls to expect

Looping detection thresholds. Detecting "same action twice in a row" catches the simplest loops. A more sophisticated detector looks for cycles of length 2+ (A → B → A → B). Phase 32 implements length-1 only; the higher-order cycles are an extension. Document the limit.
Budget-exhaustion result confusion. If corrected=None and in_scope=True, what does the user see? Phase 32's policy: the rationale should explicitly say "could not decide within budget" — don't conflate this with "no correction needed."
Scratchpad leak detection is fragile. The test in Failure mode 4 relies on sentences A and B using different verbs. If the two sentences happen to share a verb, a leaked step from A is indistinguishable from a legitimate step for B, and the test passes by coincidence. Use clearly distinct verbs in the test (go for A, have for B).
HallucinatingPlanner should be obviously synthetic. Mark it with a docstring: "Test only; bypasses JSONSchemaMask to verify dispatcher's defence." Otherwise a future maintainer might mistake it for a real planner template.
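For the higher-order cycle extension mentioned in the first pitfall, one possible approach is to compare the last p actions with the p actions before them. The find_cycle helper and its max_period parameter are hypothetical and not part of the Phase 32 code; actions are assumed to be comparable with ==.

```python
def find_cycle(actions, max_period=3):
    """Return the period of a repeating suffix, or None if there is none.

    A suffix of length 2 * p whose two halves are equal means the last p
    actions repeated the p actions immediately before them.
    """
    for period in range(1, max_period + 1):
        if len(actions) >= 2 * period:
            if actions[-period:] == actions[-2 * period:-period]:
                return period
    return None


same_twice = find_cycle(["A", "A"])           # length-1 loop
ping_pong = find_cycle(["A", "B", "A", "B"])  # length-2 cycle
no_cycle = find_cycle(["A", "B", "C"])        # all distinct
```

A period of 1 corresponds to the detector Phase 32 already requires. Larger periods need 2 * p steps before they can fire, so max_period should stay well below max_steps.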

Next: Phase 33 — Inference Serving: From FastAPI to Continuous Batching (after /quiz 32 and /phase-report 32).