Motivation — why hygiene before AI

The trap

Every ML curriculum that skips Phase 0 ends the same way: someone gets a result, can't reproduce it the next morning, and spends two days re-doing the previous week. Or worse — produces a result no one can verify, declares it "good enough," and ships a model with a silent regression. The discipline of reproducibility, hygiene, and threat-aware tooling is the only thing that lets us trust the next 40 phases.

We are not learning Python tooling for its own sake. We are learning it so that the questions a future-us asks — "why did the loss spike on epoch 7?", "did this run use bf16 or fp32?", "what was the seed?" — have answers we can read off disk, not guess at.

Phase 0 is not about Python. It is about being able to answer, the next morning, exactly what I ran yesterday. If I can't answer that, I can't iterate.

What this phase buys you

By the end of Phase 0, every script we ever run in this repo will:

  1. Seed all RNGs at entry (seed_everything(seed) from src/utils/seeding.py).
  2. Persist {seed, versions, config, git sha, hardware} to experiments/<date>-<topic>/manifest.json.
  3. Be type-checked (mypy --strict on src/).
  4. Be linted (ruff check + ruff format).
  5. Be testable (pytest, deterministic, with the autouse seed fixture).
  6. Have a vetted dependency tree (uv.lock, scanned by pip-audit, and the source scanned by bandit).
  7. Have its notebook outputs stripped and its notebook code linted and type-checked (nbstripout + nbqa-ruff + nbqa-mypy via pre-commit).
  8. Run inside a customized Claude Code session that loads CLAUDE.md and uses the project's slash commands + subagents.

If any one of these breaks, the per-phase ritual stops until it's fixed. That's the point.
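Items 1 and 2 can be sketched with the standard library alone. This is a minimal illustration, not the repo's actual src/utils/seeding.py: only the name seed_everything(seed) and the experiments/<date>-<topic>/manifest.json path come from the list above. The helpers write_manifest and git_sha, the manifest key names, and the optional NumPy seeding are assumptions made for the example.

```python
from __future__ import annotations

import json
import os
import platform
import random
import subprocess
from datetime import date
from pathlib import Path
from typing import Any


def seed_everything(seed: int) -> None:
    """Seed the RNGs available in this process (illustrative stand-in)."""
    random.seed(seed)
    # PYTHONHASHSEED only affects interpreters started after this point
    # (for example subprocesses); it does not change the current process.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # optional; seeded only if it is installed
        np.random.seed(seed)
    except ImportError:
        pass


def git_sha() -> str:
    """Return the current commit hash, or 'unknown' outside a git checkout."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def write_manifest(root: Path, topic: str, seed: int, config: dict[str, Any]) -> Path:
    """Write <root>/<date>-<topic>/manifest.json and return its path."""
    run_dir = root / f"{date.today().isoformat()}-{topic}"
    run_dir.mkdir(parents=True, exist_ok=True)
    manifest = {
        "seed": seed,
        "versions": {"python": platform.python_version()},
        "config": config,
        "git_sha": git_sha(),
        "hardware": {
            "system": platform.system(),
            "machine": platform.machine(),
            "cpu_count": os.cpu_count(),
        },
    }
    path = run_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return path
```

A script would call seed_everything(seed) first and write_manifest(Path("experiments"), topic, seed, config) right after, so that "what was the seed?" and "bf16 or fp32?" are answered by a file on disk rather than by memory.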

What this phase does not cover

  • No NumPy. No math. No models. No tokenizers. No code outside src/utils/.
  • No "let's also pin MLflow / DVC / marimo / quarto". They're in pyproject.toml as opt-in groups, installed only when their driving phase arrives (per PROPOSAL_REVIEW.md §6).
  • No web UI for the learner workspace. Directory layout only: learners/<name>/ works for one learner and N learners alike. No auth, no DB, no hosted service.

The pedagogical contract you're agreeing to

Phase 0 is the first time you sign the contract from LYNX_CORTEX.md §0.1 with actions, not just words. The contract has six clauses; the one most often violated is clause 2: build before abstracting. You will be tempted in Phase 4 to "just import SciPy for the SVD" — and you'll be right that it would be faster. But the contract says: write the Jacobi rotations by hand first, then compare against SciPy. Phase 0 is where you wire up the gates that make that comparison auditable.

Why each tool below is non-negotiable

| Tool | What it prevents |
| --- | --- |
| uv + uv.lock | "Works on my machine": different transitive deps produce different numerics |
| seed_everything | Non-reproducible loss curves, flaky tests, "phantom" regressions |
| ruff (lint + format) | Style debates in commits; subtle bugs (unused imports, mutable defaults) |
| mypy --strict | Tensor-shape bugs, Optional mishandling, silent Any propagation |
| pytest + autouse seed | Tests that pass once and fail tomorrow |
| pre-commit | "I'll fix that lint later": later never comes |
| bandit | pickle.load(untrusted), subprocess(shell=True, user_input), weak crypto |
| pip-audit | Known CVEs in transitive deps shipping in our lockfile |
| nbstripout | Notebook diffs that are 99% output noise; leaked secrets in cell outputs |
| CLAUDE.md + .claude/ | Every future Claude Code session forgets the rules without this |
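To make the bandit row concrete, here is a hedged sketch of the safer counterparts to the two patterns it names: parsing untrusted bytes as JSON instead of unpickling them, and passing user input as a single argv element instead of through a shell. The function names load_config and echo are hypothetical and exist only for this illustration.

```python
import json
import subprocess
import sys


def load_config(raw: bytes) -> dict:
    """Parse untrusted bytes as JSON; unlike pickle, this cannot run code."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


def echo(user_input: str) -> str:
    """Hand user input to a child process as one argv element.

    No shell is involved, so characters like ';' or '$(...)' are never
    interpreted; the child receives them as plain text.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")
```

With this shape, echo("hi; rm -rf /") returns the string unchanged because nothing ever parses it as a command. bandit may still emit low-severity notices for any use of subprocess, which is a prompt to review the call, not proof of a bug.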

If a Phase 0 reviewer can't audit your scripts and reproduce them tomorrow, Phase 0 is not done.

01-reproducibility.md — the mechanics of seeds, lockfiles, and manifests.