Theory 05 — Spaced repetition¶

A failed exam question is not discarded; it becomes a review card. The SM-2 algorithm (Wozniak, 1990) determines when it is shown again, based on how well it was recalled. The intuition: forgetting is roughly exponential, so the optimal spacing grows roughly exponentially too. A card answered correctly today is seen again tomorrow, then 6 days later, then after about 15, 38, and 95 days, always just before it would be forgotten. The retention curve is maximized at minimal time cost.

Why spaced repetition for exam failures¶

A failed exam question is high-signal data: it identifies a precise gap in the learner's mastery. Re-presenting it the next day, and then at increasing intervals as memory consolidates, yields a retention curve far above that of re-reading the source material. The literature on this is decades old: Ebbinghaus (1885) characterized the forgetting curve, Wozniak (1985, 1990) developed the early SuperMemo algorithms up to SM-2 to optimize against it, and Ye et al. (2024) continued the line with FSRS-5. The grammar tutor curriculum exercises a discrete set of conjugations, exactly the kind of material for which spaced repetition's gains are largest.

The portal makes the choice automatic: any exam question answered incorrectly (exam_responses.correct_at_first_try = FALSE) seeds a review_cards row. The learner sees the card scheduled on the "Today's reviews" widget; grading it 0–5 advances or resets its interval. No explicit "add to my deck" action is needed.

The SM-2 algorithm¶

SM-2's state per card is one real number, one integer, and one date:

Field	Type	Initial value	Meaning
`ease`	REAL	2.5	Easiness factor; multiplier on next interval.
`interval_days`	INT	1	Days until next review.
`due_on`	DATE	today + 1	When the card next appears.

The grading input is q ∈ {0, 1, 2, 3, 4, 5} per the original paper:

0 — blackout; no memory.
1 — incorrect; the right answer felt familiar.
2 — incorrect; the right answer was on the tip of the tongue.
3 — correct, but with serious difficulty.
4 — correct, after hesitation.
5 — perfect recall.

The update rule, written for our table:

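from datetime import date, datetime, timedelta

def sm2_update(card, q, today=None):
    """Apply one review with grade q (0-5) to card, in place."""
    today = today or date.today()
    if q < 3:
        # Failure: restart the interval and penalize ease (floor 1.3).
        card.interval_days = 1
        card.ease = max(1.3, card.ease - 0.20)
    else:
        # Success: the first two intervals are fixed, later ones multiply.
        # The repetition count is inferred from last_reviewed_at and
        # interval_days rather than stored.
        if card.last_reviewed_at is None:      # count == 0
            card.interval_days = 1
        elif card.interval_days == 1:          # count == 1
            card.interval_days = 6
        else:                                  # count >= 2
            card.interval_days = round(card.interval_days * card.ease)
        # Easiness drifts with quality, bounded below at 1.3.
        delta = 0.10 - (5 - q) * (0.08 + (5 - q) * 0.02)
        card.ease = max(1.3, card.ease + delta)

    card.due_on = today + timedelta(days=card.interval_days)
    card.last_reviewed_at = datetime.now()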

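The portal does not track repetition_count as a separate column. It is inferred: last_reviewed_at IS NULL means count = 0; last_reviewed_at IS NOT NULL AND interval_days = 1 means count = 1 (one success so far, or freshly reset by a failure); anything else means count ≥ 2. Those three cases are all that SM-2's branching needs. One consequence: a card that fails and then succeeds goes straight to a 6-day interval.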
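The inference can be sketched as a small helper. This is a minimal illustration, not the portal's code; the function name `inferred_repetition_count` is hypothetical, and the return value 2 stands for "two or more".

```python
from datetime import datetime
from typing import Optional


def inferred_repetition_count(last_reviewed_at: Optional[datetime],
                              interval_days: int) -> int:
    """Return 0, 1, or 2 (meaning "two or more") for SM-2's interval branch."""
    if last_reviewed_at is None:
        return 0   # never reviewed: the next success keeps the interval at 1
    if interval_days == 1:
        return 1   # one success so far, or freshly reset by a failure
    return 2       # any longer interval: multiply by ease


print(inferred_repetition_count(None, 1))
print(inferred_repetition_count(datetime(2026, 5, 24, 9, 0), 1))
print(inferred_repetition_count(datetime(2026, 5, 30, 9, 0), 6))
```

The inference is safe because a multiplied interval can never round back down to 1: the smallest multiplied value is 6 × 1.3.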

The ease drift formula¶

The line:

delta = 0.10 - (5 - q) * (0.08 + (5 - q) * 0.02)

unpacks to:

q	(5 − q)	delta	ease change
5	0	+0.10	grows
4	1	0.00	flat
3	2	−0.14	shrinks
2	3	−0.32	(failure branch fires; ease −= 0.20 instead)
1	4	−0.54	(failure branch fires)
0	5	−0.80	(failure branch fires)

The ease tracks how difficult the learner finds each card. A card consistently graded 4 keeps its ease (2.5 for a new card); a card graded 5 each time creeps up by 0.10 per review; a card graded 3 each time drops by 0.14 per review. Once at the 1.3 floor, the interval still grows, but only by a factor of 1.3 per successful review, so the card stays on a comparatively tight cadence. This is by design; some cards are just hard.
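To see the drift over several reviews, the following sketch replays a sequence of grades through the update rule described above. The helper names `ease_delta` and `ease_after` are illustrative assumptions, not portal functions.

```python
def ease_delta(q: int) -> float:
    """Ease adjustment for a grade q in 0..5 (success branch formula)."""
    return 0.10 - (5 - q) * (0.08 + (5 - q) * 0.02)


def ease_after(grades, ease=2.5, floor=1.3):
    """Replay a sequence of grades and return the resulting ease."""
    for q in grades:
        if q < 3:
            ease = max(floor, ease - 0.20)   # failure branch: flat penalty
        else:
            ease = max(floor, ease + ease_delta(q))
    return ease


print(round(ease_after([4] * 10), 2))   # unchanged
print(round(ease_after([5] * 3), 2))    # three perfect recalls
print(round(ease_after([3] * 10), 2))   # pinned at the floor
```

Ten grades of 3 would subtract 1.4 in total, so the floor is reached on the ninth review and holds from then on.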

Interval geometry¶

A card that's graded 4 each time, with ease = 2.5, follows:

Review	Interval (days)	Date (if started 2026-05-23)
1	1	2026-05-24
2	6	2026-05-30
3	15 (6 × 2.5)	2026-06-14
4	38 (15 × 2.5)	2026-07-22
5	95 (38 × 2.5)	2026-10-25
6	238 (95 × 2.5)	2027-06-20

Each interval is measured from the previous review, so six successful reviews carry the card about 13 months past its start date. The retention probability at each review is empirically ~90% (Wozniak's calibration; replicated by Anki users at scale).
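The table can be reproduced with a short simulation. This is a sketch under the same assumptions as the table (every review graded 4, so ease stays at 2.5); the function name `schedule` is hypothetical.

```python
from datetime import date, timedelta


def schedule(start: date, ease: float = 2.5, reviews: int = 6):
    """Return (interval_days, review_date) pairs for a card graded 4 each time.

    A grade of 4 leaves ease unchanged, so the multiplier stays constant.
    """
    rows, interval, when = [], 0, start
    for n in range(reviews):
        if n == 0:
            interval = 1
        elif n == 1:
            interval = 6
        else:
            interval = round(interval * ease)  # Python rounds halves to even
        when += timedelta(days=interval)
        rows.append((interval, when))
    return rows


for interval, when in schedule(date(2026, 5, 23)):
    print(interval, when.isoformat())
```

Python's `round` uses round-half-to-even; 37.5 and 237.5 both land on the even neighbour above them, which is why the table shows 38 and 238.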

Card retirement¶

The portal retires cards when both:

interval_days > 180
AND ease > 2.8

A retired card disappears from the daily review queue. The grammar tutor's question pool is small enough that retiring cards is desirable — otherwise the learner accumulates an ever-growing tail of "yes, I still know eat → ate" reviews. Retired cards live in the table for audit; a retired_at column is added in the alembic migration phase-41-add-card-retirement.py.
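As a minimal sketch, the retirement rule is a two-threshold predicate. The function name `is_retired` is an assumption for illustration.

```python
def is_retired(interval_days: int, ease: float) -> bool:
    """True when both retirement thresholds are strictly exceeded."""
    return interval_days > 180 and ease > 2.8


print(is_retired(238, 2.5))   # long interval, default ease
print(is_retired(238, 2.9))   # long interval, high ease
print(is_retired(6, 2.9))     # high ease, short interval
```

Because both comparisons are strict and a grade of 4 leaves ease unchanged, a card that starts at 2.5 and is only ever graded 4 never retires under this rule; it needs enough grades of 5 to lift its ease above 2.8.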

FSRS-5 as the future option¶

The Free Spaced Repetition Scheduler (FSRS-5, Ye et al. 2024; successor to FSRS-4.5) replaces SM-2's heuristic update with a three-component memory model (difficulty, stability, retrievability) whose weights can be fitted per user from review history. Its forgetting curve is a power function:

\[R(t, S) = \left(1 + \frac{19}{81} \cdot \frac{t}{S}\right)^{-0.5}\]

where \(t\) is days since last review and \(S\) is stability, the interval at which predicted recall falls to 90% (so \(R(S, S) = 0.9\)); it plays the role that interval_days plays in SM-2. The fitted weights \(w_0, \dots, w_{18}\) control how stability and difficulty change after each review. The scheduler chooses next-review timing to hit a target retention probability (default 90%), which is a more direct objective than SM-2's interval-multiplier-based scheme.
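To make "schedule to a target retention" concrete, here is a minimal sketch. It assumes the power-law forgetting curve used by FSRS-4.5 and FSRS-5, with constants 19/81 and -0.5. The function names are illustrative, and the stability and difficulty updates that make up most of FSRS are not shown.

```python
FACTOR = 19 / 81   # chosen so that retrievability(S, S) == 0.9
DECAY = -0.5


def retrievability(t_days: float, stability: float) -> float:
    """Predicted recall probability t_days after the last review."""
    return (1 + FACTOR * t_days / stability) ** DECAY


def next_interval(stability: float, target_retention: float = 0.9) -> float:
    """Days until predicted recall falls to target_retention."""
    return stability / FACTOR * (target_retention ** (1 / DECAY) - 1)


print(round(retrievability(10, 10), 3))   # recall after exactly S days
print(round(next_interval(10), 3))        # interval for the default 90% target
print(round(next_interval(10, 0.8), 3))   # a lower target allows a longer gap
```

With the default target the interval equals the stability itself; lowering the target trades retention for fewer reviews.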

The Phase 41 decision: SM-2 for the MVP, FSRS-5 deferred to a future amendment. Three reasons:

Cold-start. FSRS-5 needs ~50 historical reviews per learner to fit its parameters reliably. The portal launches with zero reviews; SM-2's heuristic is calibrated for zero-history scenarios.
Implementation cost. SM-2 is 20 lines of Python. FSRS-5 is ~300 lines plus a per-user fitting loop. The MVP budget doesn't justify the latter without evidence the former underperforms.
Validation cost. Comparing the two requires shipping both and A/B-testing per learner over months. The grammar tutor's per-learner data volume makes that infeasible at single-learner scale.

If/when a second or third learner joins (per §A3 of the addendum), the aggregate review history may justify the FSRS-5 upgrade. The migration path is documented in BLUEPRINT.md §6: FSRS-5 reads the same review_cards table, adds stability and difficulty columns, and dual-runs against SM-2 for one phase before cutover.

Card lifecycle¶

The complete lifecycle of a review card, as state machine:

[seed]
   │
   ▼
[new]               ← exam_response with correct_at_first_try = FALSE inserts
   │                  with ease=2.5, interval_days=1, due_on=today+1
   ▼
[due]               ← due_on <= today; appears on the Today's reviews widget
   │
   ├──q≥3──► [reviewed-success]   → interval grows, due_on advances
   │           │
   │           └──interval>180 AND ease>2.8──► [retired]
   │
   └──q<3──► [reviewed-failure]   → interval=1, ease shrinks, due_on=today+1
               │
               └──► [due] again tomorrow

The state is computed from (due_on, last_reviewed_at, ease, interval_days); no status column is needed. The retired_at column only records when retirement happened, for audit.
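A sketch of that computation follows. The function name `card_state` is hypothetical, and the two transient reviewed states are collapsed into one label, "scheduled", since they differ only in how the card got there.

```python
from datetime import date, datetime
from typing import Optional


def card_state(due_on: date, last_reviewed_at: Optional[datetime],
               ease: float, interval_days: int, today: date) -> str:
    """Classify a card from its stored columns alone."""
    if interval_days > 180 and ease > 2.8:
        return "retired"
    if due_on <= today:
        return "due"
    if last_reviewed_at is None:
        return "new"          # seeded but never reviewed, not yet due
    return "scheduled"        # reviewed (success or failure), waiting for due_on


print(card_state(date(2026, 5, 24), None, 2.5, 1, today=date(2026, 5, 23)))
print(card_state(date(2026, 5, 24), None, 2.5, 1, today=date(2026, 5, 24)))
```

The retirement check comes first so that a retired card whose due_on has passed is not reported as due.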

The "Today's reviews" widget, shown on the right rail of theory pages (and on a dedicated review page), displays:

Due today: N
[ Start review ]

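where N = SELECT COUNT(*) FROM review_cards WHERE student_id = ? AND due_on <= CURRENT_DATE AND NOT (interval_days > 180 AND ease > 2.8). The last condition excludes retired cards, which must exceed both thresholds.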

The widget refreshes on each page load. Starting a review session moves through cards in due_on ASC order (oldest-due first), one at a time, with the per-card UI from theory 04 screen 4.
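The count and the session order can be tried against an in-memory SQLite database. This is a sketch: the table definition is a guess at the relevant columns, the dates are fixed for reproducibility, and retired cards are excluded with the two-threshold retirement rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE review_cards (
        student_id INTEGER, exam_question_id INTEGER,
        ease REAL, interval_days INTEGER, due_on TEXT,
        PRIMARY KEY (student_id, exam_question_id)
    )""")
conn.executemany(
    "INSERT INTO review_cards VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 2.5, 1, "2026-05-20"),    # overdue
        (1, 11, 2.9, 6, "2026-05-23"),    # due today; high ease, short interval
        (1, 12, 2.9, 238, "2026-05-22"),  # exceeds both retirement thresholds
        (1, 13, 2.5, 6, "2026-05-30"),    # not due yet
    ],
)

DUE = """
    FROM review_cards
    WHERE student_id = ? AND due_on <= ?
      AND NOT (interval_days > 180 AND ease > 2.8)
"""
today = "2026-05-23"  # stands in for CURRENT_DATE
due_count = conn.execute("SELECT COUNT(*)" + DUE, (1, today)).fetchone()[0]
queue = [row[0] for row in conn.execute(
    "SELECT exam_question_id" + DUE + " ORDER BY due_on ASC", (1, today))]
print(due_count, queue)
```

Card 11 stays in the queue even though its ease is above 2.8, because its interval is short; only card 12 meets both retirement conditions.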

A review streak counter appears next to the widget when the user has cleared the daily queue: "Today's reviews complete · 12-day streak." The streak is a side-table (review_streaks) computed nightly; not in the data-model scope of this phase (it's a Phase 41+1 enhancement).

How exam failures seed cards¶

The integration point: when an exam_responses row is inserted with correct_at_first_try = FALSE, a trigger or application-level hook ensures a review_cards row exists for (student_id, exam_question_id):

INSERT INTO review_cards (
    student_id, exam_question_id, ease, interval_days, due_on
)
VALUES (?, ?, 2.5, 1, DATE('now', '+1 day'))
ON CONFLICT (student_id, exam_question_id) DO NOTHING;

The ON CONFLICT DO NOTHING is critical: if the learner has already failed this question once before, the existing card's state is preserved. The new failure is captured in exam_responses but doesn't reset the SR state — the card is already in the system.
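The idempotence can be checked with Python's sqlite3 module. This is a minimal sketch with an assumed table definition; the upsert clause relies on a UNIQUE constraint over (student_id, exam_question_id) and on SQLite 3.24 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE review_cards (
        student_id INTEGER NOT NULL,
        exam_question_id INTEGER NOT NULL,
        ease REAL NOT NULL,
        interval_days INTEGER NOT NULL,
        due_on TEXT NOT NULL,
        UNIQUE (student_id, exam_question_id)
    )""")

SEED = """
    INSERT INTO review_cards (student_id, exam_question_id, ease, interval_days, due_on)
    VALUES (?, ?, 2.5, 1, DATE('now', '+1 day'))
    ON CONFLICT (student_id, exam_question_id) DO NOTHING
"""

conn.execute(SEED, (1, 42))   # first failure creates the card
# Simulate a few reviews having changed the card's state.
conn.execute("UPDATE review_cards SET ease = 2.1, interval_days = 6")
conn.execute(SEED, (1, 42))   # second failure leaves the card untouched

rows = conn.execute("SELECT ease, interval_days FROM review_cards").fetchall()
print(rows)
```

The single surviving row still carries the reviewed state (ease 2.1, interval 6), not the seed defaults.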

The Phase 30 (structured exam answers) rubric format includes a difficulty hint that, in the FSRS-5 future, will become the initial difficulty value. SM-2 ignores it.

Per-card analytics¶

The MVP shows none, but the data is collected. A future analytics view computes:

Retention by phase. What fraction of cards from phase 32 are recalled at q ≥ 4 vs phase 11?
Time-to-retire. Median number of reviews before a card retires.
Ease distribution. Histogram across the deck — heavy left tail suggests the question set is too hard.

These views are useful at multi-learner scale. The single-learner MVP can compute them ad hoc against the SQLite file via a Marimo notebook (Phase 4's tool).
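As one example of such an ad hoc query, the ease distribution can be bucketed directly in SQL. The sketch below uses made-up sample rows and a reduced two-column table; against the real file only the connection path would change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review_cards (exam_question_id INTEGER, ease REAL)")
conn.executemany(
    "INSERT INTO review_cards VALUES (?, ?)",
    [(1, 1.3), (2, 1.3), (3, 1.44), (4, 2.5), (5, 2.6)],  # sample data only
)

# Bucket ease to one decimal place and count cards per bucket.
histogram = conn.execute("""
    SELECT ROUND(ease, 1) AS bucket, COUNT(*) AS cards
    FROM review_cards
    GROUP BY bucket
    ORDER BY bucket
""").fetchall()

for bucket, cards in histogram:
    print(f"{bucket:.1f} {'#' * cards}")
```

A pile-up in the lowest bucket, as in this sample, is the "heavy left tail" pattern mentioned above.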

What this design does NOT include¶

No "leech" detection. Anki's leech rule (a card lapsed N times triggers manual review) is a future enhancement.
No undo. Once a card is graded, the grade is committed. Adding undo means a write-ahead log; out of scope.
No filtered decks. Anki's filtered/temporary decks (e.g., "all cards I failed last week") have no analog in the MVP.
No cross-learner stats. Even when N > 1, the data is per-learner; aggregate stats are deliberately not surfaced (privacy).
No mobile push notification. "You have 5 reviews due" emails / pushes — anti-goal at the deployment layer.

One-paragraph recap¶

A failed exam question becomes a review_cards row in SM-2 state. Each subsequent review applies the SM-2 update rule: q ≥ 3 multiplies the interval by the current ease factor (which drifts up or down), q < 3 resets the interval to one day and shrinks ease. Cards retire when both interval > 180 and ease > 2.8. FSRS-5 is the modern alternative, deferred until the learner population justifies the implementation cost. The daily review widget on the right rail is the only surface; the algorithm itself is twenty lines.

Next: theory/06-bilingual-policy-in-the-portal.md — how §0.6 plays out in code paths.