Phase 39 — Capstone: The Miniature Production System

Requires: 38 — Cost, Capacity, Operations, MLOps

Teaches: integration · end-to-end-demo · architecture-diagrams · cost-reporting

Jump to any chapter from the phase reference index.

Phase 39 adds no modules: it composes. Here the repo's 38 fragments are assembled into a single HTTP service, the grammar tutor, which cold-starts with just demo, corrects an English sentence in under five seconds, emits traces, metrics, and cost, and demonstrates three rows of the threat model. If Borja can get a visitor to run git clone && just demo and see all of that in 90 seconds, the curriculum is finished.

Where this phase lives in the curriculum

  • Spec anchor: LYNX_CORTEX.md §4 / PHASE 39 (lines 816–833).
  • Topic anchor: LYNX_CORTEX_ADDENDUM.md §A13 — grammar tutor over 20 verbs × 5 tenses × 3 persons with paired Spanish translations. Phase 39 is the deployable form of the tutor.
  • Method anchor: §A12 — this phase is pre-written: the plan, theory, and lab statements exist before the phase opens; solutions are written just in time.
  • Plan: PHASE_39_PLAN.md at repo root.

What the capstone produces

A single command — just demo — runs a 90-second scripted scenario against a cold-started stack:

  1. Brings up miniserve + Prometheus + Grafana + Tempo + the MLflow tracking server via docker-compose.
  2. Loads the Phase 28 LoRA grammar tutor (whichever canonical SHA was promoted in Phase 38 lab 04).
  3. Sends a sequence of English sentences ("Yesterday I goed to the store", "She have eaten", "I will written it") through POST /v1/grammar/correct.
  4. Streams the corrections back with per-token reasoning, populates the Grafana dashboard, and emits the cost per request.
  5. Replays three rows of security/THREATS.md: a prompt-injection attempt, an oversized body, and a malicious tool-argument payload dispatched through the MCP sandbox (Phase 31). Each is caught; each is annotated in the threat model with Phase 39 demo: verified.
  6. Emits a dated evaluation report under experiments/39-end-to-end/eval-YYYY-MM-DD.json.
  7. Tears the stack down cleanly.
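
The seven steps above imply a runner that narrates each step and always tears the stack down, even when a step fails midway. A minimal Python sketch of that shape follows. The names run_demo, report_path, and the step callables are hypothetical and only illustrate the control flow; this is not the contents of scripts/demo/run.py. Only the report directory and the eval-YYYY-MM-DD.json pattern come from step 6.

```python
import datetime as dt
from pathlib import PurePosixPath
from typing import Callable, Iterable, Tuple

# A step is a (narration label, zero-argument action) pair. Hypothetical shape.
Step = Tuple[str, Callable[[], None]]


def report_path(day: dt.date) -> PurePosixPath:
    """Build the dated report location described in step 6."""
    return PurePosixPath("experiments/39-end-to-end") / f"eval-{day.isoformat()}.json"


def run_demo(
    steps: Iterable[Step],
    teardown: Callable[[], None],
    narrate: Callable[[str], None] = print,
) -> bool:
    """Run steps in order, stop at the first failure, always run teardown."""
    ok = True
    try:
        for name, action in steps:
            narrate(f"[demo] {name}")
            action()
    except Exception as exc:  # surface the error instead of hiding it
        narrate(f"[demo] FAILED: {exc!r}")
        ok = False
    finally:
        teardown()  # step 7 happens on success and on failure
    return ok
```

Putting teardown in a finally block is one way to keep the demo idempotent: a failed run should not leave containers behind that break the next cold start.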

Plus, committed to the repo:

  • docs/ARCHITECTURE.md — C4 (context + container) + sequence diagram.
  • docs/DONE_ENOUGH.md — the ≤ 20 binary capstone checks.
  • docs/README.md — the visitor-facing quickstart.
  • scripts/demo/run.py — the narrated demo runner.
  • infra/compose/full-stack.yml — the composed observability stack.
  • infra/grafana/dashboards/capstone.json — the single Grafana dashboard.
  • PHASE_39_REPORT.md — the capstone reflection with the per-phase mapping table.

Theory chain (read in order)

  1. theory/00-integration-and-done-enough.md — what "integration" actually means; how to write a closed DoD checklist.
  2. theory/01-architecture-of-the-tutor.md — C4 model walk-through; how the eight building-block modules compose; the contracts between them.
  3. theory/02-end-to-end-data-flow.md — one request, every layer, with byte counts; the latency budget; the percentile-addition fallacy.
  4. theory/03-cost-and-observability-stitching.md — how Phase 34's per-request cost tracker, Phase 38's CpQU table, Prometheus, Grafana, and Tempo all join into one dashboard.
  5. theory/04-security-and-threat-model-closeout.md — which threat-model rows the demo exercises and why those three; what's left for Phase 40.
  6. theory/05-demo-script-and-acceptance.md — anatomy of a good demo script; narration, idempotency, determinism, error surfacing; recording with asciinema.
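
The percentile-addition fallacy named in theory/02 can be seen in a small simulation: summing each layer's p95 does not give the p95 of the end-to-end latency. The sketch below assumes three independent layers with exponentially distributed latencies; the layer means are invented for illustration and are not measurements from the tutor. With correlated layers the gap would look different, so treat this as a demonstration of the effect, not a model of the real stack.

```python
import math
import random
from typing import List, Sequence, Tuple


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 1]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def simulate(
    n_requests: int = 20_000,
    layer_means_ms: Tuple[float, ...] = (5.0, 40.0, 15.0),  # made-up values
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (sum of per-layer p95, p95 of per-request totals) in ms."""
    rng = random.Random(seed)
    # One latency sample per layer per request, drawn independently.
    layers: List[List[float]] = [
        [rng.expovariate(1.0 / mean) for _ in range(n_requests)]
        for mean in layer_means_ms
    ]
    # End-to-end latency of request i is the sum of its per-layer latencies.
    totals = [sum(parts) for parts in zip(*layers)]
    sum_of_p95 = sum(percentile(layer, 0.95) for layer in layers)
    p95_of_sum = percentile(totals, 0.95)
    return sum_of_p95, p95_of_sum


sum_of_p95, p95_of_sum = simulate()
print(f"sum of per-layer p95: {sum_of_p95:.1f} ms")
print(f"p95 of end-to-end:    {p95_of_sum:.1f} ms")
```

Under these independence assumptions the summed p95 overstates the end-to-end p95, because the slow tail of one layer rarely coincides with the slow tail of every other layer on the same request. A latency budget is therefore better checked against the measured end-to-end distribution than against a sum of per-layer percentiles.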

Lab chain (do in order)

  1. lab/00-cold-start-bringup.md — first cold start from a fresh checkout; resolve every missing-config error.
  2. lab/01-end-to-end-grammar-tutor-request.md — single request walked through every layer; trace tree captured; cost identity verified.
  3. lab/02-load-and-shadow.md — 10-concurrent load test with the Phase 38 shadow LoRA variant running alongside the baseline.
  4. lab/03-security-runthrough.md — three threat-model rows replayed and annotated.
  5. lab/04-demo-script.md — finalize scripts/demo/run.py, record the asciinema cast.
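
For the 10-concurrent load test in lab/02, one common way to cap in-flight requests is an asyncio semaphore. The sketch below is an assumption about how such a driver could look, not the lab's solution: run_load and FakeTutor are hypothetical, and FakeTutor stands in for a real HTTP call to POST /v1/grammar/correct so the example runs without the stack. The shadow LoRA variant is not modeled here.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_load(
    sentences: Sequence[str],
    send: Callable[[str], Awaitable[str]],
    concurrency: int = 10,
) -> List[str]:
    """Send every sentence, keeping at most `concurrency` requests in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def one(sentence: str) -> str:
        async with gate:  # blocks while `concurrency` requests are active
            return await send(sentence)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(one(s) for s in sentences))


class FakeTutor:
    """Stand-in for the HTTP client; records peak concurrency."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0

    async def __call__(self, sentence: str) -> str:
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        await asyncio.sleep(0.001)  # simulated service time
        self.in_flight -= 1
        return sentence.upper()  # placeholder response, not a real correction
```

A run such as asyncio.run(run_load(sentences, FakeTutor())) exercises the gate; swapping FakeTutor for a real async HTTP client would leave run_load unchanged.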

Definition of Done (binary)

Stated in full in PHASE_39_PLAN.md §7. Eight checks, all binary, all automated.
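
A binary, automated checklist can be as small as a mapping from check names to zero-argument functions that return True or False. The sketch below shows that shape only; failed_checks and any check names passed to it are hypothetical, and the actual eight checks are the ones listed in PHASE_39_PLAN.md §7.

```python
from typing import Callable, List, Mapping


def failed_checks(checks: Mapping[str, Callable[[], bool]]) -> List[str]:
    """Return the names of checks that did not return exactly True.

    A check that raises counts as failed, so the result stays binary.
    """
    failed: List[str] = []
    for name, check in checks.items():
        try:
            passed = check() is True
        except Exception:
            passed = False
        if not passed:
            failed.append(name)
    return failed
```

An automated gate would then exit non-zero whenever failed_checks returns a non-empty list, which keeps "done" from depending on anyone's judgment.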

What this phase does NOT cover

  • No new src/<module>/. Composition only.
  • No GPU. Phase 35's GPU vocabulary is documented; the demo runs CPU-only on Borja's i5-8250U.
  • No new ML technique. No new training, no new fine-tune, no new sampler.
  • No multi-user auth. Single-user local demo. Auth is deferred to Phase 40 or later, as a reading-list item.
  • No multi-region, no load balancer, no service mesh. One node, one process.
  • No "polish" refactors. Phase 40 hardening picks those up.

What to do when you finish

Write PHASE_39_REPORT.md per LYNX_CORTEX.md §7.6. Then open Phase 40 (the postmortem).

Further reading

Optional — enrichment, not required to pass the phase.