Phase 29 — Quizzes
Readable mirror of data/quizzes/phase-29-rag.yaml, which is the source of truth. Answers are behind <details> blocks.
q-29-01 — Why RAG over fine-tuning the facts in (free)
A teammate proposes "just fine-tune Mini-GPT on the irregular-verb table until it memorizes it; we don't need RAG." List two reasons this is the wrong choice at scale, even though it might work on the §A13 KB.
Answer
(a) **Updates**: every KB change requires re-fine-tuning. (b) **Citations**: the model can't say *which* fact it used. At scale (millions of docs, weekly updates), RAG dominates; fine-tuning is for behaviour, not facts.
q-29-02 — Why FlatVectorStore over HNSW for Phase 29
The §A13 KB has ~50 chunks. We use brute-force cosine search (FlatVectorStore), not HNSW. What is the approximate crossover scale above which a graph-based approximate index such as HNSW becomes faster than the linear scan?
- ≈ 100 vectors
- ≈ 1 000 vectors
- ≈ 10⁴ vectors
- ≈ 10⁶ vectors
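For reference, the brute-force cosine search named in the question can be sketched in a few lines of numpy. This is a minimal illustration, not the actual FlatVectorStore code: the function name `flat_cosine_search`, its signature, and the toy vectors are all assumptions made for the example.

```python
import numpy as np

def flat_cosine_search(query, vectors, k=3):
    """Return (indices, scores) of the k rows of `vectors` closest to `query`.

    Hypothetical helper for illustration; not the FlatVectorStore API.
    """
    # Normalise both sides so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    m = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = m @ q                    # one pass over every stored vector: O(n * d)
    top = np.argsort(-scores)[:k]     # highest similarity first
    return top, scores[top]

# Toy data: four 2-dimensional "chunk embeddings" (made up for the example).
kb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
idx, sims = flat_cosine_search(np.array([2.0, 0.1]), kb, k=2)
print(idx.tolist(), sims.round(3).tolist())
```

Every query touches every stored vector, so the cost grows linearly with the size of the KB; there is no index to build or tune.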
Answer
**Choice 3 (≈ 10⁴).** HNSW's logarithmic search only pays off against its per-node overhead above roughly 10⁴ vectors. Below that, brute-force search over contiguous numpy arrays wins on raw throughput and is much easier to debug.
q-29-03 — Reciprocal Rank Fusion constant (free)
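RRF combines two ranked lists with RRF(c) = Σ_i 1/(60 + rank_i(c)), where the sum runs over the input lists i and rank_i(c) is the 1-based rank of candidate c in list i. Why 60 specifically, and what would change with a much smaller constant like 5?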
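The fusion formula in the question can be sketched as follows. This is a minimal version under a few assumptions: ranks are 1-based, a candidate missing from a list contributes nothing for that list, and the function name `rrf_fuse` and the chunk ids are hypothetical.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of candidate ids with Reciprocal Rank Fusion.

    `rankings` is a list of lists, each ordered best-first.
    `k` is the smoothing constant from the formula (60 by default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, cand in enumerate(ranking, start=1):  # 1-based ranks
            scores[cand] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Two toy rankings of the same three chunks (ids are made up).
dense = ["c1", "c2", "c3"]
sparse = ["c2", "c3", "c1"]
fused = rrf_fuse([dense, sparse])
print([cand for cand, _ in fused])
```

Keeping `k` as a parameter makes it easy to rerun the same rankings with other constants and compare the resulting order.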
Answer
**60** is Cormack et al.'s empirically robust default that damps the gap between top-1 and top-5 ranks. **5** would make rank-1 contributions dominate, amplifying single-source mistakes. Much larger constants flatten the fusion, making it less discriminative.
q-29-04 — What skipping retrieval costs you
You ablate the retriever — rag_answer sends the bare query to Mini-GPT. Which symptoms should you observe on the §A13 lookup eval set?
- Accuracy drops by ≥ 30 percentage points.
- Faithfulness metric drops to ~0 (nothing to cite).
- Mini-GPT regularizes irregular verbs (e.g., 'writed' for 'wrote').
- Latency increases by ≥ 10× (no retrieval to short-circuit).
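To make the ablation concrete, here is a rough sketch of what "sending the bare query" could look like at the prompt level. Only the name rag_answer comes from the quiz; the helpers `build_prompt`, `rag_prompt`, and `toy_retrieve`, the prompt template, and the three-entry KB are all hypothetical stand-ins for illustration.

```python
def build_prompt(query, retrieved_chunks):
    """Prepend numbered context chunks to the query, if there are any."""
    if not retrieved_chunks:
        return query  # ablated path: the model sees only the bare query
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return f"Context:\n{context}\n\nQuestion: {query}"

def rag_prompt(query, retrieve=None, k=3):
    """Build the model input; pass retrieve=None to ablate the retriever."""
    chunks = retrieve(query, k) if retrieve is not None else []
    return build_prompt(query, chunks)

def toy_retrieve(query, k):
    """Keyword lookup over a made-up three-entry KB (assumption, not §A13)."""
    kb = ["write -> wrote", "go -> went", "walk -> walked"]
    words = query.lower().replace("?", "").split()
    return [c for c in kb if c.split(" -> ")[0] in words][:k]

question = "What is the past tense of write?"
with_rag = rag_prompt(question, toy_retrieve)
ablated = rag_prompt(question)
print(with_rag)
print(ablated)
```

Running the same eval set through both paths is what lets you compare the two configurations on equal terms.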
Answer
**Choices 1, 2, 3.** Accuracy and faithfulness collapse because parametric memory is insufficient, and without the KB entry Mini-GPT falls back on the regular -ed pattern. Latency actually *decreases* (no retrieval step), which is the perverse incentive that makes skipping retrieval tempting until you check correctness.
q-29-05 — Faithfulness ≠ accuracy (free)
Define both metrics in one sentence each, then give one example scenario where you'd have high faithfulness AND low accuracy.