Phase 33 — Quizzes

This page is a human-readable mirror of the question bank. The canonical source is the YAML file data/quizzes/phase-33-inference-serving.yaml, which the Phase 41 portal seeder loads; if this page and the YAML disagree, the YAML wins.

q-33-01 — Which term dominates the §A13 grammar tutor's latency on i5-8250U?

For a typical single-client §A13 grammar-tutor request on Borja's i5-8250U (NumPy backend, K≈20 decode tokens), which component of the latency budget dominates p50?

  • JSON parsing and Pydantic validation
  • BPE tokenization of the input sentence
  • The auto-regressive decode loop (K · t_decode_step) ← correct
  • TLS handshake on the inbound connection

Why: Theory 05's budget puts ~130 ms of the ~150 ms p50 in the decode loop.
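The arithmetic behind that split can be checked in a few lines. This is only a sketch: it reuses K≈20 from the question and the ~6.5 ms cached decode step quoted in q-33-03, and it treats everything that is not decode as a single remainder term, which is an assumption made here for illustration rather than a breakdown taken from Theory 05.

```python
# Rough latency budget for one request (all figures in milliseconds).
# K and T_DECODE_STEP_MS are the approximate values quoted on this page;
# the "other" term is simply whatever remains of the ~150 ms p50.
K = 20                   # decode tokens per request (K ≈ 20)
T_DECODE_STEP_MS = 6.5   # per-step decode time with the KV-cache (~6.5 ms)
P50_TOTAL_MS = 150.0     # approximate end-to-end p50

decode_ms = K * T_DECODE_STEP_MS
other_ms = P50_TOTAL_MS - decode_ms
decode_share = decode_ms / P50_TOTAL_MS

print(f"decode loop: {decode_ms:.0f} ms ({decode_share:.0%} of p50)")
print(f"everything else: {other_ms:.0f} ms")
```

With these inputs the decode loop accounts for 130 ms, so parsing, tokenization, and connection setup together share the remaining ~20 ms.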

q-33-02 — Why does throughput collapse without batching at C=8?

Select every reason why the single-request handler that calls model.forward() directly fails to sustain 8 concurrent clients on a 4C/8T CPU.

  • Each request gets its own matmul; BLAS overhead is paid per-request, not amortized. ← correct
  • Requests serialize on the GIL-bound NumPy thread; concurrency doesn't increase parallelism. ← correct
  • FastAPI's event loop becomes the bottleneck, not the model.
  • TCP backlog overflows before the model is even reached.
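The amortization argument in the first correct option can be illustrated with a toy NumPy layer. The sizes and the names `W`, `requests`, and `batch` are hypothetical and do not come from the course code; the point is only that C separate matrix-vector products and one stacked matrix-matrix product compute the same outputs, while the batched form dispatches to BLAS once instead of C times.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL, D_OUT, C = 256, 256, 8   # hypothetical sizes; C = concurrent clients

W = rng.standard_normal((D_MODEL, D_OUT)).astype(np.float32)
requests = [rng.standard_normal(D_MODEL).astype(np.float32) for _ in range(C)]

# Unbatched: one matrix-vector product (and one BLAS dispatch) per request.
unbatched = [x @ W for x in requests]

# Batched: stack the C inputs and pay the dispatch overhead once.
batch = np.stack(requests)   # shape (C, D_MODEL)
batched = batch @ W          # shape (C, D_OUT)

# Same numbers either way, up to float32 rounding.
assert np.allclose(np.stack(unbatched), batched, rtol=1e-4, atol=1e-3)
```

Timing the two variants on the target machine (for example with `time.perf_counter`) is the way to see how large the gap is in practice; no figure is claimed here.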

q-33-03 — What does the KV-cache buy at the §A13 grammar-tutor scale?

Free response. Acceptable answers contain the word "decode".

The cache avoids re-computing attention over the prefix on every decode step, dropping t_decode_step from ~18 ms to ~6.5 ms, so each cached step costs well under half of an uncached one.
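A minimal single-head sketch of the idea follows. It is not the course's model: the dimensions, the random weights `Wq`, `Wk`, `Wv`, and the helper `attend` are all made up for illustration. It shows that projecting only the newest token and appending to stored keys and values yields the same attention outputs as re-projecting the whole prefix at every step.

```python
import numpy as np

def attend(q, K, V):
    """Single-head attention for one query over all available positions."""
    scores = K @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ V

rng = np.random.default_rng(0)
D, T = 16, 12                      # hypothetical head size and sequence length
Wq, Wk, Wv = (rng.standard_normal((D, D)) for _ in range(3))
xs = rng.standard_normal((T, D))   # stand-in token embeddings

# No cache: every step re-projects K and V for the whole prefix.
no_cache = []
for t in range(T):
    prefix = xs[: t + 1]
    no_cache.append(attend(xs[t] @ Wq, prefix @ Wk, prefix @ Wv))

# With cache: project only the new token and append to the stored K and V.
k_cache = np.empty((0, D))
v_cache = np.empty((0, D))
with_cache = []
for t in range(T):
    k_cache = np.vstack([k_cache, xs[t] @ Wk])
    v_cache = np.vstack([v_cache, xs[t] @ Wv])
    with_cache.append(attend(xs[t] @ Wq, k_cache, v_cache))

assert np.allclose(no_cache, with_cache)
```

In this toy the uncached loop performs on the order of T² row projections and the cached loop T; in a multi-layer model the uncached path would also rerun every layer for every prefix token, which is where the per-step saving comes from.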

q-33-04 — Which health-check endpoint should the load balancer poll?

  • /healthz (liveness)
  • /readyz (readiness) ← correct
  • /metrics
  • /correct

Why: /readyz signals "ready to take traffic" and returns 503 under backpressure so the LB shifts traffic to other replicas. /healthz is for orchestrator-level restart decisions.
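The split between the two probes can be sketched as two small status functions. Everything named here is hypothetical: `MAX_QUEUE_DEPTH`, the `model_loaded` flag, and the function names are assumptions for illustration, not the course's handlers. In a real service these return values would be wired into the web framework's route handlers.

```python
# Hypothetical names throughout; the threshold is an assumed value.
MAX_QUEUE_DEPTH = 32   # assumed backpressure threshold, not from the course

def healthz_status() -> int:
    """Liveness: if this handler runs at all, the process is alive."""
    return 200

def readyz_status(model_loaded: bool, queue_depth: int) -> int:
    """Readiness: 503 until the model is loaded or while the queue is saturated."""
    if not model_loaded or queue_depth >= MAX_QUEUE_DEPTH:
        return 503
    return 200

# A saturated replica stays alive (no restart) but is taken out of rotation.
print(healthz_status(), readyz_status(model_loaded=True, queue_depth=40))
```

Keeping the liveness check trivial is deliberate in this sketch: if it also failed under load, the orchestrator would restart a replica that is merely busy.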


See theory/05-latency-budget-i5-8250u.md and the break/ exercises for the practical grounding.