Lab 04 — The demo script: finalize and record
The last lab of the curriculum. Here scripts/demo/run.py is finalized with its seven blocks, the asciinema cast is recorded, and the final "report card" is published: the DE-001…DE-020 table with every check green. If that table ends up green, the curriculum is finished.
Goal
Finalize scripts/demo/run.py (Theory 05 §anatomy). Record an asciinema cast of one clean run. Commit docs/demo-recording.md pointing at the cast. Verify the final acceptance table: all 20 DE checks (DE-001…DE-020) green.
Why this lab is last
Labs 00–03 built and verified each piece. Lab 04 stitches the narration together, records the artifact a stranger sees, and signs off the phase.
This is the lab where the curriculum becomes showable. After this lab, Borja can hand someone the repo URL, say "run `just demo`," and walk away.
Deliverables
- `scripts/demo/run.py`: the seven-block script from Theory 05, complete and tested.
- `scripts/demo/recorder.py`: wraps the run with asciinema; emits `experiments/39-demo-script/cast-YYYY-MM-DD.cast`.
- `experiments/39-demo-script/transcript.jsonl`: structured transcript of the recorded run (each [Phase NN] line as a JSON event).
- `experiments/39-demo-script/cast-YYYY-MM-DD.cast`: the asciinema recording.
- `docs/demo-recording.md`: markdown page with the embedded asciinema player + a textual summary for accessibility.
- `docs/README.md`: the quickstart, refreshed (`git clone && uv sync --frozen && just demo`), plus one screenshot of the final acceptance table.
- `tests/integration/test_demo_end_to_end.py`: final integration test; runs `just demo-cold`, asserts exit 0, all DE-checks pass, transcript well-formed.
Step 1 — Finalize scripts/demo/run.py
The seven-block skeleton from Theory 05 §anatomy is fleshed out. Each block has a contract:
| Block | Function | Contract |
|---|---|---|
| 1 — Preflight | `assert_environment_ready()` | Aborts with clear message if uv, docker, ports, or lockfile aren't ready |
| 2 — Stack bring-up | `bring_up_stack()` | Delegates to `just demo-cold-up`; polls health for 30 s |
| 3 — Narration | `narrate_loaded_components()` | Prints the [Phase NN] startup lines from a config table |
| 4 — Happy path | `send_and_verify()` × 3 | Three canonical requests; each prints stage-by-stage timings |
| 5 — Security | `replay_injection()`, `replay_oversized_body()`, `replay_mcp_sandbox()` | Three replays from Lab 03 |
| 6 — Acceptance | `run_acceptance_checks()` | Reads docs/DONE_ENOUGH.md, runs each check, prints the table |
| 7 — Wrap-up | `print_summary()`, `emit_eval_report()` | Total cost, p95, accuracy summary; writes eval-YYYY-MM-DD.json |
The whole script is < 400 lines. Most of the work is delegated to helpers from previous labs; this script is the conductor.
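The conductor role can be sketched as a small runner that calls the blocks in order and turns the first abort into a non-zero exit code. This is only an illustration: `run_blocks` and `DemoAbort` are hypothetical names, and the real block functions from the table would be passed in as callables.

```python
import sys
from typing import Callable, Sequence


class DemoAbort(Exception):
    """Hypothetical exception a block raises to stop the demo with a clear message."""


def run_blocks(blocks: Sequence[tuple[str, Callable[[], None]]]) -> int:
    """Run each (name, callable) block in order and return a process exit code."""
    for name, block in blocks:
        try:
            block()
        except DemoAbort as exc:
            # Later blocks depend on earlier ones, so stop at the first abort.
            print(f"ABORT in block '{name}': {exc}", file=sys.stderr)
            return 1
    return 0
```

A call such as `sys.exit(run_blocks([("preflight", assert_environment_ready), ("stack bring-up", bring_up_stack)]))` would then wire the table's functions together, assuming they signal failure by raising `DemoAbort`.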
Narration as a structured event stream
Every printed line is also a JSON event written to experiments/39-demo-script/transcript.jsonl. Format:
{"t": 0.5, "phase": 12, "event": "load_corpus", "msg": "Loading verb corpus from DVC..."}
{"t": 1.1, "phase": 11, "event": "tokenizer_ready", "msg": "BPE tokenizer ready (vocab=2048).", "vocab_size": 2048}
{"t": 2.3, "phase": 28, "event": "model_loaded", "msg": "Loading Mini-GPT base + LoRA grammar adapter (rev=sha:a1b2c3).", "lora_rev": "sha:a1b2c3"}
...
Two outputs from one print call: the human-readable text on the terminal and the machine-readable event in the JSONL. This lets the CI test parse the transcript without scraping the terminal.
A small narrate() helper does both:
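import time

def narrate(phase: int, event: str, msg: str, **fields):
    """Print one narration line and append the same event to the JSONL transcript."""
    # DEMO_START_T and transcript are assumed to be module-level state set at demo start.
    t = time.time() - DEMO_START_T
    print(f"[t={t:.1f}s] [Phase {phase:02d}] {msg}")
    transcript.write_event({"t": t, "phase": phase, "event": event, "msg": msg, **fields})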
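The `transcript` object that `narrate()` writes to is not shown in this lab. A minimal sketch of what it might look like follows, assuming an append-only JSONL file; the `Transcript` class name, the truncate-on-start behaviour, and the rounding of `t` to one decimal are illustrative assumptions, with only `write_event` taken from the helper above.

```python
import json
from pathlib import Path


class Transcript:
    """Append-only JSONL writer: one JSON object per line (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Start each run with an empty file so old events never leak in.
        self.path.write_text("", encoding="utf-8")

    def write_event(self, event: dict) -> None:
        # Round t to one decimal, matching the style of the example events.
        record = {**event, "t": round(event["t"], 1)}
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Opening the file per event keeps every line flushed to disk, so a crash mid-demo still leaves a parseable transcript up to the last event.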
Step 2 — Acceptance table renderer
Block 6 reads docs/DONE_ENOUGH.md's 20 rows, runs each automation, prints the table:
=================================================================
Phase 39 — Acceptance Checks
=================================================================
| ID | Check | Pass |
|--------|-------------------------------------------------|------|
| DE-001 | Stack starts within 30 s | ✓ |
| DE-002 | miniserve responds on :8080 within 5 s | ✓ |
| DE-003 | First request completes within 10 s | ✓ |
| DE-004 | p95 latency over 3-sentence battery < 5 s | ✓ |
| ... | ... | ... |
| DE-020 | Demo exits with status 0 | ✓ |
=================================================================
Result: 20/20 passed.
=================================================================
If any check fails, the row marker is ✗, the table is followed by a # Failures: section enumerating each, and the script exits 1.
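The pass/fail behaviour described above can be sketched as a pure function that returns the table text together with the exit code. `CheckResult` and `render_acceptance` are hypothetical names, and the sketch skips the banner lines and column alignment of the real output.

```python
from typing import NamedTuple


class CheckResult(NamedTuple):
    check_id: str      # e.g. "DE-001"
    description: str
    passed: bool


def render_acceptance(results: list[CheckResult]) -> tuple[str, int]:
    """Return (table text, exit code) for a list of check results."""
    lines = ["| ID | Check | Pass |", "|---|---|---|"]
    for r in results:
        lines.append(f"| {r.check_id} | {r.description} | {'✓' if r.passed else '✗'} |")
    failures = [r for r in results if not r.passed]
    lines.append(f"Result: {len(results) - len(failures)}/{len(results)} passed.")
    if failures:
        # Enumerate each failing check after the table.
        lines.append("# Failures:")
        lines.extend(f"- {r.check_id}: {r.description}" for r in failures)
    return "\n".join(lines), (1 if failures else 0)
```

Returning the exit code instead of calling `sys.exit` inside the renderer keeps the function easy to test; the caller decides when to exit.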
Step 3 — Recording with asciinema
scripts/demo/recorder.py:
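import datetime
import subprocess
from pathlib import Path

def record_demo():
    today = datetime.date.today().isoformat()
    out_dir = Path("experiments/39-demo-script")
    out_dir.mkdir(parents=True, exist_ok=True)
    cast = out_dir / f"cast-{today}.cast"
    cmd = [
        "asciinema", "rec",
        "--command", "just demo-cold",  # record the cold-start demo, not an interactive shell
        "--idle-time-limit", "2",       # cap recorded pauses at 2 s
        "--title", f"Lynx Cortex — capstone demo {today}",
        str(cast),
    ]
    # check=True raises CalledProcessError if asciinema exits non-zero.
    subprocess.run(cmd, check=True)
    print(f"Recorded: {cast}")

# Step 7 runs this file directly, so record when executed as a script.
if __name__ == "__main__":
    record_demo()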
The idle-time limit of 2 s caps every recorded pause at 2 s, so long waits (model loading, etc.) are not replayed in real time and the playback feels paced.
Run (the same command Step 7 uses):
uv run python scripts/demo/recorder.py
The cast file is < 100 KB for a 90-second recording, small enough to commit to the repo.
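The size claim can be checked before committing. The sketch below is an assumption-laden example, not part of the lab's scripts: `check_cast` is a hypothetical helper, 100 KB is taken as 100,000 bytes, and the only format knowledge it relies on is that an asciicast file starts with a JSON header object carrying a `version` field.

```python
import json
from pathlib import Path


def check_cast(path: Path, max_bytes: int = 100_000) -> dict:
    """Return the cast header if the file is small enough and looks like an asciicast."""
    size = path.stat().st_size
    if size >= max_bytes:
        raise ValueError(f"{path} is {size} bytes; expected < {max_bytes}")
    with path.open(encoding="utf-8") as fh:
        # The first line of an asciicast file is a JSON header object.
        header = json.loads(fh.readline())
    if not isinstance(header, dict) or "version" not in header:
        raise ValueError(f"{path} does not start with an asciicast header")
    return header
```

Calling `check_cast(Path("experiments/39-demo-script/cast-YYYY-MM-DD.cast"))` with the real date filled in would fail loudly on an oversized or truncated recording.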
Fallback: plain transcript
If asciinema isn't available, the textual transcript.jsonl is the fallback. The docs/demo-recording.md page links both: the asciinema player (preferred) and a markdown rendering of the JSONL (accessible).
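One way to decide between the two paths is to look for the asciinema binary on PATH. This is a sketch under that assumption; `pick_recording_mode` and its return values are hypothetical and do not appear in the lab's scripts.

```python
import shutil
from typing import Callable, Optional


def pick_recording_mode(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return "cast" when the asciinema binary is on PATH, else "transcript"."""
    # `which` is injectable so the decision can be tested without asciinema installed.
    return "cast" if which("asciinema") else "transcript"
```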
Step 4 — docs/demo-recording.md
# Demo recording
This is the canonical recording of the Phase 39 capstone demo.
## Watch (asciinema)
<script id="asciicast-2026-06-XX" src="../experiments/39-demo-script/cast-2026-06-XX.cast" async></script>
## Read (transcript)
(Below is the human-readable rendering of `transcript.jsonl`. Times are wall-clock from the start of the recording.)
| t (s) | Phase | Event |
|---|---|---|
| 0.0 | 39 | demo_start |
| 0.5 | 12 | load_corpus |
| 1.1 | 11 | tokenizer_ready |
| ... | ... | ... |
| 87.3 | 39 | acceptance_table_printed (20/20 passed) |
| 88.1 | 39 | eval_report_emitted (experiments/39-end-to-end/eval-2026-06-XX.json) |
| 89.4 | 39 | demo_complete (exit=0) |
## Reproduce
The transcript table is auto-generated by scripts/demo/render_transcript.py from the JSONL; the rendered table is committed to docs alongside the cast.
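The contents of scripts/demo/render_transcript.py are not shown here. A minimal sketch of the JSONL-to-table conversion might look like the following, assuming each event carries the `t`, `phase`, and `event` keys seen in the examples; it omits the parenthesised details that the sample table adds to some rows.

```python
import json


def render_transcript(jsonl_text: str) -> str:
    """Render JSONL events as a markdown table with t, phase, and event columns."""
    rows = ["| t (s) | Phase | Event |", "|---|---|---|"]
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        e = json.loads(line)
        rows.append(f"| {e['t']:.1f} | {e['phase']} | {e['event']} |")
    return "\n".join(rows)
```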
Step 5 — Refresh docs/README.md
The visitor-facing quickstart. Four sections follow the title:
# Lynx Cortex
A first-principles AI systems curriculum: 40 phases from a transistor to a deployed
grammar tutor that corrects English verb conjugations and provides Spanish
translations for 20 verbs × 5 tenses × 3 persons.
## 90-second demo
```
git clone https://github.com/borjatarraso/lynx-cortex
cd lynx-cortex
uv sync --frozen
just demo
```
You will see: the stack come up, three grammar corrections, three security defenses
fire, and a 20-row acceptance table all green. Total: ~90 seconds.
Recording: [docs/demo-recording.md](demo-recording.md)
## Curriculum
[40-phase roadmap](../ROADMAP.md). Each phase has a `theory/` and `lab/` directory
under `docs/phase-NN-*/`.
## Architecture
[docs/ARCHITECTURE.md](ARCHITECTURE.md) — C4 context + container + sequence diagrams.
## Definition of "done enough"
[docs/DONE_ENOUGH.md](DONE_ENOUGH.md) — the 20 binary checks the demo verifies on
every run.
The README's first action is `just demo`: the demo is the entrypoint.
Step 6 — Final integration test
tests/integration/test_demo_end_to_end.py:
import json
import subprocess
from pathlib import Path

def test_demo_runs_clean():
    """`just demo-cold` exits 0; all DE checks pass; transcript well-formed."""
    result = subprocess.run(["just", "demo-cold"], capture_output=True, text=True, timeout=180)
    assert result.returncode == 0, f"demo failed: {result.stdout[-2000:]}"

    transcript = Path("experiments/39-demo-script/transcript.jsonl")
    assert transcript.exists(), "transcript not written"
    # One JSON object per non-blank line.
    events = [json.loads(line) for line in transcript.read_text().splitlines() if line.strip()]
    assert events, "transcript is empty"
    assert events[-1]["event"] == "demo_complete", "demo didn't reach completion event"
    assert events[-1]["exit"] == 0, "demo exited non-zero"

    # Verify acceptance table.
    acceptance = next(e for e in events if e["event"] == "acceptance_table_printed")
    assert acceptance["passed"] == acceptance["total"], \
        f"acceptance failed: {acceptance['passed']}/{acceptance['total']}"
This is the final guarantor. CI runs it, and every PR must pass it.
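The test's docstring promises a well-formed transcript, which could be made explicit with a small validator. The sketch below assumes that "well-formed" means every event has the four keys `narrate()` writes and that `t` never decreases; `transcript_problems` and `REQUIRED_KEYS` are hypothetical names.

```python
REQUIRED_KEYS = {"t", "phase", "event", "msg"}


def transcript_problems(events: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means well-formed."""
    problems = []
    last_t = float("-inf")
    for i, e in enumerate(events):
        missing = REQUIRED_KEYS - e.keys()
        if missing:
            problems.append(f"event {i}: missing {sorted(missing)}")
            continue
        if e["t"] < last_t:
            # Timestamps are elapsed seconds, so they should never decrease.
            problems.append(f"event {i}: t={e['t']} goes backwards")
        last_t = e["t"]
    return problems
```

Inside the integration test this would reduce to `assert transcript_problems(events) == []`.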
Step 7 — One clean recording
The recording for docs/demo-recording.md must be from a clean run:
- Bring down any prior stack: `just demo-cold-down`.
- Clear cached artifacts that would falsely accelerate the recorded run: `rm -rf ~/.cache/lynx-cortex-warmup/`.
- Record: `uv run python scripts/demo/recorder.py`.
- Verify the recording: `asciinema play experiments/39-demo-script/cast-YYYY-MM-DD.cast`.
- If a glitch is visible (terminal resize, accidental keypress), re-record. The cast is a product; ship a clean one.
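The first four steps above can be chained in a short script so none is skipped. This is a hedged sketch: `record_clean` and `_run` are hypothetical helpers, the commands are the ones listed in the steps, and judging whether the cast has a visible glitch stays a manual task.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> None:
    subprocess.run(list(cmd), check=True)  # raise on the first failing step


def record_clean(cast: str, cache_dir: Path,
                 run: Callable[[Sequence[str]], None] = _run) -> None:
    """Tear down, clear the warm-up cache, record, then play back for review."""
    run(["just", "demo-cold-down"])
    # Equivalent of `rm -rf` on the warm-up cache; a missing directory is fine.
    shutil.rmtree(cache_dir, ignore_errors=True)
    run(["uv", "run", "python", "scripts/demo/recorder.py"])
    run(["asciinema", "play", cast])
```

A typical call would pass `Path.home() / ".cache" / "lynx-cortex-warmup"` as `cache_dir`, matching the path in the second step.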
What "done" looks like
- `scripts/demo/run.py` complete; seven blocks present and contract-tested.
- `scripts/demo/recorder.py` complete; produces a cast on demand.
- `experiments/39-demo-script/transcript.jsonl` written cleanly during demo runs.
- `experiments/39-demo-script/cast-YYYY-MM-DD.cast` committed.
- `docs/demo-recording.md` written with both the cast embed and the textual transcript.
- `docs/README.md` refreshed with the 90-second quickstart and the recording link.
- `tests/integration/test_demo_end_to_end.py` passes in CI.
- The acceptance table prints 20/20 green on `just demo`.
Common pitfalls
- "It works locally; the recording was clean." The recording must be from `just demo-cold`, not `just demo` (which leaves prior state). Clean state = repeatable result.
- Embedding the cast inline in README.md. README is rendered by many tools; only mkdocs handles `<script>` tags well. Put the embed in `docs/demo-recording.md` (which is markdown rendered by mkdocs); link from README.
- Forgetting the textual transcript. The cast is great for visual learners; some viewers (screen readers, search indexers, future-you grepping git log) need text. Both.
- Letting `narrate()` print without writing to transcript. A debug `print` in the middle of the demo doesn't appear in the JSONL; the CI parses JSONL only and may not catch a regression. All narration goes through the helper.
- Skipping the "loud failure" verification from Lab 03 in the recording. Optional: the clean demo is for the show; the loud-failure verification is captured in Lab 03's experiment log. Keep them separate.
End of Phase 39 lab sequence. Next:
- Open Phase 39 via `/phase-start 39`.
- Walk Labs 00 → 04 in order.
- Write `PHASE_39_REPORT.md` (the capstone reflection; structured per `LYNX_CORTEX.md` §7.6).
- Then `/phase-start 40` (the postmortem).