
Phase 36 — Quizzes

Human-readable mirror of the question bank; the canonical source is data/quizzes/phase-36-frontier-architectures.yaml.

q-36-01 — Why does Switch-style MoE add an auxiliary loss?

Free response. Acceptable answers contain collapse.

Router collapse: without the auxiliary loss, the router can learn to send nearly every token to one expert, leaving the other experts as dead weight.
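
As a rough illustration of how the auxiliary loss penalizes collapse, here is a minimal pure-Python sketch of the Switch-style load-balancing term, alpha * N * sum_i(f_i * P_i), where f_i is the fraction of tokens routed to expert i and P_i is the mean router probability for expert i. The function names and the default alpha = 0.01 are illustrative assumptions, not taken from the question bank.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def switch_aux_loss(router_logits, alpha=0.01):
    """Load-balancing loss for top-1 routing (hypothetical helper).

    router_logits: list of per-token lists, each of length num_experts.
    alpha: loss coefficient; 0.01 is an assumed default for this sketch.
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]

    # f[i]: fraction of tokens whose top-1 expert is i.
    f = [0.0] * num_experts
    for p in probs:
        f[p.index(max(p))] += 1.0 / num_tokens

    # P[i]: mean router probability assigned to expert i.
    P = [sum(p[i] for p in probs) / num_tokens for i in range(num_experts)]

    return alpha * num_experts * sum(fi * Pi for fi, Pi in zip(f, P))

# Balanced routing: each of 4 tokens prefers a different expert.
balanced = [[10.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Collapsed routing: every token prefers expert 0.
collapsed = [[10.0, 0.0, 0.0, 0.0] for _ in range(4)]

balanced_loss = switch_aux_loss(balanced)
collapsed_loss = switch_aux_loss(collapsed)
```

With perfectly uniform top-1 assignments the term equals alpha, which is its minimum; as routing concentrates on one expert it grows toward alpha * N, so the collapsed case is penalized more heavily than the balanced one.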

Free response. Acceptable answers contain expert.

The single active expert still learns a reasonable FFN, so the main training loss keeps falling. Only validation loss or per-expert token counts surface the collapse.
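
One way to surface this is to log per-expert token counts during training. The following is a small sketch under the assumption that top-1 expert indices per token are available as a flat list; the helper names are hypothetical.

```python
from collections import Counter

def expert_token_counts(top1_assignments, num_experts):
    """Count how many tokens each expert received (hypothetical helper)."""
    counts = Counter(top1_assignments)
    return [counts.get(i, 0) for i in range(num_experts)]

def dead_experts(top1_assignments, num_experts):
    """Return indices of experts that received no tokens in this batch."""
    counts = expert_token_counts(top1_assignments, num_experts)
    return [i for i, c in enumerate(counts) if c == 0]

def max_load_fraction(top1_assignments, num_experts):
    """Fraction of tokens handled by the busiest expert (1.0 = full collapse)."""
    counts = expert_token_counts(top1_assignments, num_experts)
    total = sum(counts)
    return max(counts) / total if total else 0.0

# Example batch: 8 tokens, 4 experts, routing heavily skewed to expert 0.
assignments = [0, 0, 0, 0, 0, 0, 0, 1]
counts = expert_token_counts(assignments, 4)
dead = dead_experts(assignments, 4)
load = max_load_fraction(assignments, 4)
```

A maximum load fraction that drifts toward 1.0, or a growing list of dead experts, is the kind of signal that the main training loss alone would not show.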