English · Español

Phase 32 — Quizzes¶

🇪🇸 Espejo legible de data/quizzes/phase-32-agents.yaml.

Source of truth: data/quizzes/phase-32-agents.yaml.

q-32-01 — The five states of the §A13 grammar-tutor loop (free)¶

In order, name the five states of the §A13 grammar-tutor agent loop and indicate which one is the unique exit edge to a final answer.

Answer

**`observe → reason → tool_call → observe → … → answer`**. The `reason → answer` transition is the unique success-terminal edge; the failure-terminal is the cap-exceeded path returning the fallback message.

q-32-02 — Why caps are non-negotiable (free)¶

Your colleague suggests removing max_turns because "the LLM will know when to stop". Refute in two sentences.

Answer

LLMs are trained to be helpful, which on uncertain inputs translates to "call one more tool", not "give up". Without a hard cap the agent's **termination** guarantee depends on the LLM's calibration — a property we cannot structurally guarantee, especially on adversarial inputs.

q-32-03 — The four canonical agent failure modes¶

Match each failure mode to its mitigation: (i) infinite loop, (ii) hallucinated tool name, (iii) tool-error blindness (retrying with same args), (iv) wrong-answer-with-citations.

(i) → hard max_turns / max_tool_calls cap
(ii) → JSON-Schema mask with tool: enum: [...]
(iii) → prepend [error] markers in scratchpad observations
(iv) → final-answer prompt must reference the tool result by id

Answer

**All four.** Each failure has a specific structural mitigation; layered together they form the agent's correctness contract.

q-32-04 — Where alignment plugs in¶

Per the cross-reference to extension track X3 (RLHF/DPO), what does X3 improve in the Phase 32 agent that the Phase 32 loop itself cannot?

X3 reduces tool-call latency.
X3 trains the reason step's quality and the "give up" calibration, by training the underlying LLM on preference pairs.
X3 replaces the MCP server with a faster transport.
X3 removes the need for caps on the loop.

Answer

**Choice 2.** The agent loop is *structural*; the LLM's reasoning quality is *empirical*. X3 improves the empirical part. The structural caps and masks stay regardless.

q-32-05 — Out-of-scope handling on §A13¶

The agent receives the prompt "Conjugate swim in past simple". Since swim is not in the §A13 20-verb list, what is the correct behaviour?

Generate the best-guess form ('swam') from the model's parametric memory.
Decline politely, stating that the verb is not in the supported set, and offer the closest in-scope verb if any.
Call the conjugate tool with arguments that fail schema validation, expecting an error response.
Loop until the model invents a plausible answer.

Answer

**Choice 2.** Out-of-scope inputs must be declined explicitly. The §A13 microscopic scope is a contract; honest declination is the correct behaviour and counts as a SUCCESS in the eval harness, not a failure.