English · Español

02 — Decision survival: which architectural choices held up?¶

🇪🇸 Cuarenta fases significan cientos de pequeñas decisiones. ¿Cuáles aguantaron sin revisión? ¿Cuáles se revisaron una vez? ¿Cuáles murieron? Es una medida objetiva de la calidad del juicio temprano.

The hypothesis¶

A project of 40 phases generates roughly 50-200 architectural decisions — file structure, library choice, algorithm pick, abstraction boundary. Some held without revision; some were quietly revised when the next phase needed them to be; some were explicitly torn down.

Survival rate = (decisions held without revision) / (total decisions).

A high survival rate is suspicious — either you weren't taking enough risks, or you weren't being honest about revising things. A low survival rate is fine if you can articulate why — it means you were learning. The shape of the survival curve over time tells you about your judgment trajectory.

The data source¶

PROPOSAL_REVIEW.md at the project root is the ledger. Every architectural proposal that needed a "Claude should review this" tag landed there. So did every "Borja decided to revise X" entry.

Plus: the Revisions log (§8) at the bottom of every PHASE_NN_PLAN.md.

Plus: the journal entries in learners/borja/journal/ where the format includes a decisions: field.

The four states¶

Classify each decision into one of four states:

Held (H) — was made, never revised, still in force at Phase 40.
Revised once (R1) — was made, revised exactly once at a later phase.
Revised many (RN) — revised 2+ times across the project.
Discarded (D) — explicitly rolled back; the system at Phase 40 looks as if this decision was never made.

For each decision, also note:

When made — the phase.
Scope — file-level, module-level, project-level.
What forced the revision (if applicable) — a later constraint, a bug, a learning, a corpus change.

Example classifications¶

Decision	Made in	State	Why
Use `uv` instead of `pip`	Phase 00	H	Stayed in force; no friction.
9 C-string functions as corpus	Phase 12 (orig)	D	Discarded at A13 amendment for the verb-grammar corpus.
Pre-LN over Post-LN transformer	Phase 17	H	Held; gradient stability matched expectations.
Hand-built autograd in Phase 07-08 before PyTorch	Phase 00	H	The decision that defined the whole curriculum.
Sampling without KV cache in Phase 21	Phase 21	R1	Replaced in Phase 22; the gap was intentional (pedagogy).
Single-process FastAPI for serving	Phase 33	H	Multi-process was discussed and rejected; held.
Use Pydantic v1 (early phases)	Phase 06	R1	Migrated to Pydantic v2 in Phase 30 (for JSONSchema features).
In-memory learner memory only	Phase 32	H	Held; persistence punted to Phase 39 capstone (out of scope).

What the survival curve might look like¶

If you plot survival rate vs phase of original decision:

Decisions made early (Phases 0-5) often have higher survival because they're foundational (tooling, language choice).
Decisions made in the middle (Phases 10-25) have the lowest survival — this is when the curriculum bends to fit reality.
Decisions made late (Phases 30-39) tend to have artificially high survival because there hasn't been time to revise them.

A "healthy" pattern: ~70-90% survival on tooling, ~40-60% on architecture, ~20-50% on algorithm choices.

Reading the contributing factors¶

For each revised decision, write 1-2 sentences:

What new information forced the revision? ("Phase 22's KV cache made the Phase 21 sampler obsolete as designed.")
Was the revision in the original plan? ("Yes — Phase 21 explicitly noted 'no cache here; Phase 22 fixes it.'")
Surprise level (1-5): 1 = expected; 5 = jarring.

The surprise distribution is more useful than the survival rate. High-surprise revisions are where your model of the system was wrong. Those are the lessons to extract for future projects.

The bias to watch¶

People tend to over-classify decisions as Held because:

They forget the small revisions.
They prefer to think they were right.
They confuse "the spirit of the decision held" with "the literal decision held."

Counter-bias: scan the git history. If a file has been changed more than 5 times across phases, the decisions that file embodies are not Held.

The opposite bias to watch¶

People tend to over-classify decisions as Revised because:

Every tiny tweak feels like a revision in the moment.
Hindsight makes early decisions look naive.

Counter-bias: only count revisions that changed the contract of a module or the interface of an API. Renaming a variable doesn't count.

The output¶

For Phase 40's lab 02, this analysis goes in the postmortem (§3 contributing factors) as a small table or waterfall chart. Don't overdo it — 10-20 decisions classified is enough. The point is to see the pattern, not to be exhaustive.

What this analysis is not good for¶

Predicting what the next project will need to revise. Each project has its own pattern.
Judging whether the curriculum was well-designed. A high revision rate could mean the design was bad or that the journey was honest about learning.
Comparing Borja's decision quality to anyone else's. It's a within-project measure only.

What this file does NOT cover¶

The specific decisions to classify. That's the lab's content.
The shape of the residual risk acceptance. Next file.

Next: 03-residual-risk-and-offramps.md