
Phase 21: Quizzes (mirror)

The canonical questions live in data/quizzes/phase-21-inference-sampling.yaml.

q-21-01: Top-p cutoff arithmetic

Prompt (EN): Given probabilities (sorted descending): [0.50, 0.20, 0.15, 0.10, 0.05], what is the smallest top-p set for p = 0.8?

A. {first 1 token}
B. {first 2 tokens}
C. {first 3 tokens}
D. {first 4 tokens}

Correct: C. Cumulative: 0.50 → 0.70 → 0.85 → 0.95. The first prefix with cumulative mass ≥ 0.8 is {first 3}.
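The cutoff arithmetic can be checked with a few lines of Python. This is a minimal sketch, not the course's reference implementation; the function name `smallest_top_p_set` and the floating-point tolerance are assumptions made for illustration.

```python
from itertools import accumulate

def smallest_top_p_set(probs, p):
    """Return how many top tokens form the smallest set with cumulative mass >= p."""
    ordered = sorted(probs, reverse=True)
    for count, total in enumerate(accumulate(ordered), start=1):
        # Small tolerance (an assumption) so float rounding does not push the cutoff one token late.
        if total >= p - 1e-12:
            return count
    return len(ordered)

probs = [0.50, 0.20, 0.15, 0.10, 0.05]
print(smallest_top_p_set(probs, 0.8))  # 3
```

The running totals are 0.50, 0.70, 0.85, so the loop stops at the third token, matching answer C.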

q-21-02: Why p=1.0 admits garbage

Prompt (EN): Why does top_p = 1.0 produce occasional garbage outputs even when the model assigns very low probability to those tokens?

A. The model's softmax is broken.
B. No truncation; the tail of the distribution is occasionally sampled despite low probability.
C. Temperature is implicitly 0.
D. The tokenizer is producing invalid tokens.

Correct: B. With no filter, sampling occasionally lands on tail tokens; even \(p = 10^{-4}\) becomes visible over 1000 generations.
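A rough calculation shows why such a small probability still surfaces. The sketch below assumes each decoding step independently has a 10^-4 chance of drawing a tail token and that a generation is 100 tokens long; both the independence and the length are illustrative assumptions, not figures from the quiz.

```python
def prob_at_least_one_tail_token(tail_mass, steps):
    """Probability that at least one of `steps` independent draws lands in the tail."""
    return 1.0 - (1.0 - tail_mass) ** steps

tail_mass = 1e-4    # per-step tail probability, taken from the answer above
steps = 100         # assumed tokens per generation (hypothetical)
generations = 1000

per_generation = prob_at_least_one_tail_token(tail_mass, steps)
expected_affected = generations * per_generation
print(f"P(tail token in one generation) = {per_generation:.4f}")
print(f"Expected affected generations out of {generations}: {expected_affected:.1f}")
```

Under these assumptions about 1% of generations contain a tail token, so roughly ten of the 1000 outputs would be affected.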

q-21-03: Temperature versus top-p

Prompt (EN): In one or two sentences, explain when temperature scaling and top-p sampling produce different outputs, and which is more appropriate when you want to admit "creative but plausible" continuations.

Free response. Expected mentions: temperature flattens or sharpens the distribution but keeps the full support; top-p truncates and renormalizes. Top-p is usually preferred when plausibility matters, because temperature alone never zeroes out tail tokens.
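The difference in support can be seen on a small example. This is a sketch under stated assumptions: the logits are made up, and `softmax_with_temperature` and `top_p_filter` are hypothetical helper names rather than functions from any particular library.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Zero out tokens outside the smallest set with cumulative mass >= p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, running = set(), 0.0
    for i in order:
        kept.add(i)
        running += probs[i]
        if running >= p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

logits = [4.0, 3.0, 2.0, -2.0, -4.0]  # hypothetical; the last two play the role of tail tokens

sharp = softmax_with_temperature(logits, 0.5)   # sharper, but every token keeps nonzero mass
flat = softmax_with_temperature(logits, 2.0)    # flatter, tail tokens gain mass
truncated = top_p_filter(softmax_with_temperature(logits, 1.0), 0.95)

print("T=0.5 :", [round(x, 6) for x in sharp])
print("T=2.0 :", [round(x, 6) for x in flat])
print("p=0.95:", [round(x, 6) for x in truncated])
```

At any finite positive temperature the tail entries stay strictly positive, while the top-p result assigns them exactly zero and rescales the survivors to sum to one.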

q-21-04: Adaptive truncation versus fixed size

Prompt (EN): Select every statement that correctly characterizes top-p sampling versus top-k sampling.

A. Top-p keeps an adaptive number of tokens based on the distribution's entropy.
B. Top-k keeps a fixed number of tokens regardless of entropy.
C. On a confident (peaked) distribution, top-p with p = 0.95 keeps fewer tokens than top-k with k = 50.
D. Top-p is always faster than top-k.

Correct: A, B, C. Top-p adds a sorting step, but the cost is comparable; D is false.
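Statements A through C can be illustrated by counting how many tokens each method keeps. The sketch below uses two made-up distributions over a hypothetical 1000-token vocabulary; the helper names are assumptions for illustration.

```python
def top_p_count(probs, p):
    """Number of tokens in the smallest set with cumulative mass >= p."""
    running = 0.0
    for count, prob in enumerate(sorted(probs, reverse=True), start=1):
        running += prob
        if running >= p:
            return count
    return len(probs)

def top_k_count(probs, k):
    """Top-k keeps k tokens (or the whole vocabulary if it is smaller)."""
    return min(k, len(probs))

vocab = 1000  # hypothetical vocabulary size
# Peaked: one token holds 0.96 of the mass, the rest is spread uniformly.
peaked = [0.96] + [0.04 / (vocab - 1)] * (vocab - 1)
# Flat: uniform over the vocabulary (maximum entropy).
flat = [1.0 / vocab] * vocab

for name, dist in (("peaked", peaked), ("flat", flat)):
    print(name, "top-p(0.95):", top_p_count(dist, 0.95), "top-k(50):", top_k_count(dist, 50))
```

On the peaked distribution top-p keeps a single token while top-k still keeps 50; on the flat one top-p keeps about 950 tokens and top-k is unchanged.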

q-21-05: Beam search versus sampling

Prompt (EN): For the §A13 grammar tutor (which proposes corrections to a learner's sentence and benefits from showing multiple plausible alternatives), is beam search or top-p sampling the better choice?

A. Beam search — it produces the highest-likelihood outputs.
B. Top-p sampling — it produces diverse outputs whose distribution matches the model's assigned plausibility.
C. Greedy — it is deterministic.
D. Either works equally well.

Correct: B. Beam search returns the N highest-likelihood beams, which tend to be near-duplicates of one another (small variations on the same completion). Top-p draws diverse samples whose mix reflects the model's uncertainty.
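The near-duplicate effect shows up even in a toy setting. The sketch below is illustrative only: `STEPS` is a hypothetical model whose next-token distribution depends only on position, and the sampler is plain ancestral sampling without top-p truncation, kept that way for brevity.

```python
import math
import random

# Hypothetical toy model: one next-token distribution per position.
STEPS = [
    {"I": 0.7, "We": 0.3},
    {"went": 0.6, "walked": 0.4},
    {"to": 0.9, "into": 0.1},
    {"school": 0.55, "class": 0.45},
]

def beam_search(steps, width):
    """Keep the `width` highest log-probability prefixes at every step."""
    beams = [((), 0.0)]  # (tokens, log-probability)
    for dist in steps:
        candidates = [
            (tokens + (tok,), logp + math.log(p))
            for tokens, logp in beams
            for tok, p in dist.items()
        ]
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:width]
    return [tokens for tokens, _ in beams]

def sample(steps, rng):
    """Draw one sequence by sampling each position from its distribution."""
    return tuple(
        rng.choices(list(dist), weights=list(dist.values()))[0] for dist in steps
    )

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

beams = beam_search(STEPS, width=3)
rng = random.Random(0)
samples = [sample(STEPS, rng) for _ in range(200)]

print("beams:")
for b in beams:
    print("  ", " ".join(b), "| differs from top beam in", hamming(b, beams[0]), "token(s)")
print("distinct sampled sentences:", len(set(samples)))
```

Here every beam differs from the top beam in at most one token, whereas the sampled set also contains sentences that differ in two or more positions, in proportion to the probability the toy model assigns them.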