How it all fits together
Lynx Cortex looks like several products — a documentation site, an interactive learner portal, and a set of offline books — but it is built from one content tree with three renderers and shipped down three independent deploy lanes. This page is the map: where the single sources of truth live, how the docs pipeline turns them into a static site, how the FastAPI portal reuses the same content, how credentials work, and how the three lanes deploy without stepping on each other.
Where this page sits
This is the published, bilingual version of the engineering reference kept
at ARCHITECTURE.md in the repository root. The root file is the terse
source of truth for contributors; this page is the readable tour. When the
two disagree, the root file wins — but they are meant to stay in sync.
The big picture
One tree of Markdown and YAML feeds every output. The generators are deterministic and idempotent: the same inputs always produce the same site, portal data and books.
flowchart TB
subgraph SOT["Single sources of truth"]
DOCS["docs/phase-NN-*/<br/>README · theory · lab · break · quizzes<br/>(EN + ES mirrors)"]
META["data/curriculum/*.yaml<br/>phase_meta · phase_study_meta · phase_references"]
QYAML["data/quizzes/*.yaml<br/>data/exams/*.yaml"]
GLOSS["GLOSSARY.md / .es.md"]
end
subgraph GEN["Generators — just docs-gen"]
G["build_phase_meta · build_phase_extras<br/>build_study_plan · build_glossary_data<br/>build_lang_pairs"]
GB["build_books.py"]
end
DOCS --> G & GB
META --> G
GLOSS --> G
G --> SITE["mkdocs build --strict → site/"]
GB --> BOOKS["dist/books/*.{pdf,epub}"]
SITE -->|deploy-docs.yml| CF["Cloudflare Pages (static)"]
BOOKS -->|release-books.yml| GHREL["GitHub Release · tag=books"]
GHREL -->|gh release download| SITE
subgraph PORTAL["Portal — FastAPI"]
CURR["curriculum.py reads docs/ + phase_meta.yaml"]
DB["portal.db"]
VAULT["vault.db"]
SRS["*_review.sqlite"]
end
DOCS --> CURR
META --> CURR
QYAML -->|portal_seed_quizzes.py| DB
PORTAL -->|deploy-portal.yml| FLY["Fly.io + persistent volume<br/>+ Cloudflare front"]
Seven facts hold this whole system together:
- One content tree, three renderers — site, portal, books.
- data/curriculum/phase_meta.yaml is the shared nav spine.
- Quizzes flow YAML → seeder → DB; marks are computed, never authored.
- Three separate SQLite databases, deliberately kept apart.
- Books are built once, consumed everywhere.
- Credentials are stateless signed tokens — nothing stored server-side.
- Three independent, path-filtered deploy lanes.
1. The single sources of truth
Everything downstream is a projection of a small set of authored inputs. Edit these; never edit the generated artifacts.
| Source | What it owns | Consumed by |
|---|---|---|
| docs/phase-NN-*/ | All phase prose: README, theory, lab, break, quizzes (EN + ES mirrors) | Site, portal, books |
| data/curriculum/phase_meta.yaml | Cross-phase nav spine: slug, titles, summary, requires (informational), teaches | Site generators + portal |
| data/curriculum/phase_study_meta.yaml | Effort model for the study planner | build_study_plan.py |
| data/curriculum/phase_references.yaml | "Further reading" per phase | build_phase_extras.py |
| data/quizzes/*.yaml, data/exams/*.yaml | Quiz and exam content | portal_seed_quizzes.py → portal DB |
| GLOSSARY.md / .es.md | Concept terms and definitions | build_glossary_data.py |
Bilingual policy (§A17)
Every document is mirrored as X.md (English) and X.es.md (Spanish), and
both are equally authoritative. When an English source is edited, its
Spanish mirror is updated in the same commit. The runtime EN/ES toggle
in the header is pure client-side JavaScript driven by a generated URL map
(window.LYNX_LANG_PAIRS) — there is no server round-trip. Code
identifiers, file paths, shell commands and commit messages stay English.
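To make the mirror convention concrete, here is a minimal sketch of how an EN↔ES map could be derived from the X.md / X.es.md naming rule. It is not the project's build_lang_pairs script: the function name collect_lang_pairs, the returned shape and the sample file names are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def collect_lang_pairs(docs_dir: Path) -> tuple[dict[str, str], list[str]]:
    """Pair every X.md with its X.es.md mirror.

    Returns (pairs, missing): pairs maps the English path to the Spanish one,
    missing lists English pages that have no mirror yet.
    """
    pairs: dict[str, str] = {}
    missing: list[str] = []
    for en_path in sorted(docs_dir.rglob("*.md")):
        if en_path.name.endswith(".es.md"):
            continue  # mirrors are looked up from their English twin
        es_path = en_path.with_name(f"{en_path.stem}.es.md")
        en_rel = en_path.relative_to(docs_dir).as_posix()
        if es_path.exists():
            pairs[en_rel] = es_path.relative_to(docs_dir).as_posix()
        else:
            missing.append(en_rel)
    return pairs, missing


# Demo on a throwaway tree (file names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "phase-01-demo").mkdir()
    for name in ("index.md", "index.es.md", "phase-01-demo/theory.md"):
        (root / name).write_text("# stub\n", encoding="utf-8")
    demo_pairs, demo_missing = collect_lang_pairs(root)

print(demo_pairs)
print(demo_missing)
```

A non-empty missing list is the kind of signal a local check could use to enforce the same-commit rule before a change is pushed.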
2. The docs-site pipeline
The site is plain mkdocs-material:
docs_dir: docs, site_dir: site, theme overrides in overrides/, and only the
built-in search plugin enabled. All interactivity is generated static
JavaScript — there is no application server behind the docs.
just docs runs the whole pipeline:
flowchart LR
subgraph INPUTS["Authored inputs"]
D["docs/*.md (EN + ES)"]
M["phase_meta.yaml"]
GL["GLOSSARY.md"]
end
subgraph DOCSGEN["just docs-gen — deterministic"]
LP["build_lang_pairs<br/>→ lang-pairs.js"]
PM["build_phase_meta<br/>→ Requires/Teaches + reference.md<br/>VALIDATES slugs · coverage · acyclicity"]
PX["build_phase_extras<br/>→ concept map + further reading"]
SP["build_study_plan<br/>→ window.LYNX_STUDY + planner"]
GD["build_glossary_data<br/>→ window.LYNX_GLOSSARY tooltips"]
end
D --> LP & PM & PX & SP & GD
M --> PM & SP
GL --> GD
LP & PM & PX & SP & GD --> BUILD["mkdocs build --strict"]
BUILD --> S["site/ (static)"]
The five generators (all in scripts/, all deterministic) inject data bundles
and projected Markdown blocks into the tree before MkDocs runs:
- build_phase_meta: projects the Requires / Teaches blocks into each README and builds the reference index. It is also the curriculum integrity gate: it validates slugs, coverage and acyclicity of the prerequisite graph and fails the build if they break.
- build_phase_extras: the per-phase concept-map widget and the "further reading" blocks.
- build_study_plan: the window.LYNX_STUDY bundle behind the interactive study planner (pace selector + Gantt + cards).
- build_glossary_data: the window.LYNX_GLOSSARY bundle that powers the hover-to-explain concept tooltips.
- build_lang_pairs: the window.LYNX_LANG_PAIRS EN↔ES URL map used by the header language toggle.
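Injecting a projected block into an authored README is only safe if running the generator twice changes nothing. One common way to get that property is to write between a pair of markers and replace whatever sits between them. The sketch below assumes HTML-comment markers and a helper called inject_block; both are hypothetical and may not match what the real scripts do.

```python
import re

# Hypothetical marker format; the real generators may use something else.
BEGIN = "<!-- BEGIN generated:{name} -->"
END = "<!-- END generated:{name} -->"


def inject_block(text: str, name: str, body: str) -> str:
    """Insert or replace the generated block called name inside text."""
    begin = BEGIN.format(name=name)
    end = END.format(name=name)
    block = f"{begin}\n{body.strip()}\n{end}"
    pattern = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    if pattern.search(text):
        # Replace the old block in place; a callable avoids escape surprises.
        return pattern.sub(lambda _match: block, text, count=1)
    # First run: append the block after the authored prose.
    return text.rstrip("\n") + "\n\n" + block + "\n"


readme = "# Phase 01\n\nIntro text.\n"
once = inject_block(readme, "requires", "- phase-00-setup")
twice = inject_block(once, "requires", "- phase-00-setup")
updated = inject_block(twice, "requires", "- phase-02-other")

print(once == twice)  # running the generator again is a no-op
```

Because the authored prose outside the markers is never touched, the same inputs always yield the same file, which is the property the pipeline depends on.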
Strict by design
The build runs with --strict, so a broken internal link or an orphaned nav
entry is a hard failure, not a warning. That is why the prerequisite
graph is validated up front: a typo in a slug fails fast, locally, before it
can reach the deploy.
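The slug and acyclicity checks can be illustrated with the standard library's graphlib. This is a sketch of the idea, not the code in build_phase_meta: the input shape (a dict from slug to its requires list), the function names and the sample slugs are all assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def validate_prerequisites(requires: dict[str, list[str]]) -> list[str]:
    """Return the phases in a valid study order, or raise ValueError."""
    unknown = sorted(
        {dep for deps in requires.values() for dep in deps if dep not in requires}
    )
    if unknown:
        # A typo in a slug shows up here as a reference to a missing phase.
        raise ValueError(f"unknown slug(s) in requires: {', '.join(unknown)}")
    try:
        # static_order yields every prerequisite before the phases that need it.
        return list(TopologicalSorter(requires).static_order())
    except CycleError as exc:
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"prerequisite cycle: {cycle}") from exc


def is_valid(requires: dict[str, list[str]]) -> bool:
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_prerequisites(requires)
    except ValueError:
        return False
    return True


good = {
    "phase-01": [],
    "phase-02": ["phase-01"],
    "phase-03": ["phase-01", "phase-02"],
}
study_order = validate_prerequisites(good)
print(study_order)
print(is_valid({"phase-01": ["phase-02"], "phase-02": ["phase-01"]}))  # cycle
print(is_valid({"phase-01": ["phase-0l"]}))  # typo in a slug
```

Raising instead of warning mirrors the strict build: the graph either passes in full or the run stops.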
3. The offline books
The same Markdown becomes downloadable PDF and EPUB books — one
per language. The generator (scripts/build_books.py) deliberately uses no
pandoc, no LaTeX, no Node:
- WeasyPrint renders HTML → PDF with a cover, table of contents and page breaks.
- ebooklib writes the EPUB.
- matplotlib mathtext turns every $…$ expression into a cached SVG (in dist/books/_mathcache) so equations render identically on any reader with no fonts to install.
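The caching step in the last bullet can be sketched as a content-addressed lookup: hash the expression, and render only when no file with that name exists. The hashing scheme, the file naming and the cached_svg helper are assumptions, and the renderer is a stub standing in for the real mathtext call, so this shows the cache logic only.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_svg(expr: str, cache_dir: Path, render: Callable[[str], str]) -> Path:
    """Return the SVG path for expr, rendering only on a cache miss."""
    key = hashlib.sha256(expr.encode("utf-8")).hexdigest()[:16]
    path = cache_dir / f"{key}.svg"
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(render(expr), encoding="utf-8")
    return path


render_calls: list[str] = []


def fake_render(expr: str) -> str:
    """Stand-in for the real mathtext renderer; records each call."""
    render_calls.append(expr)
    return f"<svg><!-- {expr} --></svg>"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "_mathcache"
    first = cached_svg(r"\alpha + \beta", cache, fake_render)
    second = cached_svg(r"\alpha + \beta", cache, fake_render)
    same_file = first == second
    cached_body = first.read_text(encoding="utf-8")

print(same_file, len(render_calls))
```

Keying the file on the expression text means a repeated equation costs one render for the whole book, and an unchanged book rebuilds without rendering anything.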
Each phase is a chapter; theory, lab, break and quiz files are sections.
Crucially, the books are built once by the release-books.yml workflow and
published to the books GitHub Release — then downloaded by the docs deploy.
They are never rebuilt in the fast docs lane, which keeps that lane quick.
Why books are a release artifact, not committed
Books are regenerable from the content tree, so committing them would bloat
git history with binaries that drift out of sync. Instead CI builds them in
their own slow lane and the docs lane pulls the finished files via
gh release download. Locally, just docs builds them once if absent.
4. The learner portal
The portal (src/miniportal/) is a server-rendered FastAPI + SQLModel +
Jinja2 + HTMX application — no single-page app. create_app(config) is the
sole entrypoint, usable both in production (env-driven config) and in tests
(an explicit config against an in-memory database).
It reuses the curriculum rather than copying it: curriculum.py reads the
same docs/ tree and phase_meta.yaml the site is built from, read-only
— the portal never writes back into docs/. Prerequisites surface as
informational badges, never as locks.
Request path and middleware
Every request passes through a fixed middleware stack. Starlette runs middleware
outermost-first and treats the most recently added middleware as the outermost
layer, so the order of the add calls in create_app is the inverse of the execution order:
flowchart TB
REQ([Incoming request]) --> SEC["SecurityHeaders<br/>(outermost)"]
SEC --> OBS["RequestObservability<br/>(/metrics)"]
OBS --> RL["RateLimiter"]
RL --> BSL["BodySizeLimit"]
BSL --> INJ["InjectionFilter<br/>(innermost)"]
INJ --> ROUTERS
subgraph ROUTERS["Routers — each a build_router factory"]
direction LR
A["auth · dashboard · academic · locale"]
B["notes · quiz (+SRS) · downloads"]
C["admin · admin_overrides · exam_engine"]
D["lab_tracker · capstone_tracker · grading · obs_extended"]
end
ROUTERS --> DBS
subgraph DBS["Three separate SQLite stores"]
direction LR
P[("portal.db<br/>SQLModel main store")]
V[("vault.db<br/>AES-256-GCM · minivault")]
R[("*_review.sqlite<br/>SM-2 SRS · minireview")]
end
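The add-order versus execution-order relationship is easy to get backwards, so here is a toy model of it in plain Python. It does not import Starlette and the layers do no real work; wrap, build_stack and router are hypothetical names, while the five layer names come from the diagram.

```python
from typing import Callable

Handler = Callable[[list[str]], list[str]]


def wrap(name: str, inner: Handler) -> Handler:
    """Return a handler that records its name, then calls the layer inside it."""
    def handler(trace: list[str]) -> list[str]:
        trace.append(name)
        return inner(trace)
    return handler


def router(trace: list[str]) -> list[str]:
    """Innermost callable, standing in for the routers."""
    trace.append("router")
    return trace


def build_stack(added_in_order: list[str]) -> Handler:
    app: Handler = router
    for name in added_in_order:
        # Each newly added layer wraps everything added before it,
        # so the last name in the list ends up outermost.
        app = wrap(name, app)
    return app


# Add order is the reverse of the execution order in the diagram.
ADD_ORDER = [
    "InjectionFilter",
    "BodySizeLimit",
    "RateLimiter",
    "RequestObservability",
    "SecurityHeaders",
]

execution_order = build_stack(ADD_ORDER)([])
print(" -> ".join(execution_order))
```

Reading ADD_ORDER bottom to top gives the path a request takes, which is why SecurityHeaders is added last.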
Three databases, kept apart
The three SQLite stores are intentionally separate so a compromise or corruption in one does not reach the others:
- portal.db: the main SQLModel store (users, attempts, marks, notes).
- vault.db: an AES-256-GCM encrypted vault (minivault).
- *_review.sqlite: the SM-2 spaced-repetition store (minireview), raw SQLite.
Authentication and i18n
Auth is Argon2id (t=3, 64 MiB, p=2) with a server pepper; sessions, CSRF and
invite tokens are itsdangerous-signed from cfg.session_secret; cookies are
HttpOnly / Secure / SameSite. Authorization climbs a dependency ladder —
current_student → require_teacher_or_admin → require_admin — and mutating
routes carry a double-submit CSRF check. Interface translation is a plain Python
dictionary lookup behind t() (EN + ES), not gettext.
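A dictionary-backed t() needs very little code, which is much of its appeal over gettext for a two-language interface. The sketch below is an assumption about its general shape: the keys, the strings, the signature and the fall-back-to-English behaviour are all invented for illustration.

```python
# Hypothetical message catalogue; the real keys and strings are not shown here.
TRANSLATIONS: dict[str, dict[str, str]] = {
    "en": {"nav.dashboard": "Dashboard", "quiz.submit": "Submit answers"},
    "es": {"nav.dashboard": "Panel", "quiz.submit": "Enviar respuestas"},
}


def t(key: str, lang: str = "en") -> str:
    """Look up key for lang, falling back to English, then to the key itself."""
    table = TRANSLATIONS.get(lang, TRANSLATIONS["en"])
    return table.get(key, TRANSLATIONS["en"].get(key, key))


print(t("nav.dashboard", "es"))
print(t("quiz.submit", "fr"))   # unknown language falls back to English
print(t("missing.key", "es"))   # unknown key is returned unchanged
```

Returning the key for a missing entry keeps a template renderable while making the gap obvious on the page.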
5. Content flow and where marks come from
The cardinal rule: marks are computed, never authored. No one types a grade
into a file; grading/service.py derives the report from the learner's actual
attempt rows.
flowchart LR
PROSE["Phase prose / theory / labs"] --> DOCS["docs/ (one SoT)"]
DOCS --> SITE2["Site"]
DOCS --> PORTAL2["Portal"]
DOCS --> BOOKS2["Books"]
QY["data/quizzes · data/exams (*.yaml)"] -->|portal_seed_quizzes.py| PDB[("portal.db")]
LEARN([Learner attempts]) --> ATT["attempt rows in portal.db"]
ATT -->|grading/service.py| REPORT["compute_report → 5-tier band<br/>PASS_MARK = 50"]
REPORT --> CRED["Credentials"]
- Phase prose, theory and labs live once in docs/ and render to all three surfaces.
- Cross-phase metadata lives once in phase_meta.yaml, shared by the site generators and the portal.
- Quizzes and exams are authored as YAML, seeded into portal.db by portal_seed_quizzes.py.
- Grading reads attempt rows and produces a 5-tier band with PASS_MARK = 50.
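"Computed, never authored" means the mark is a pure function of the attempt rows. The sketch below shows that idea with an assumed aggregation rule (best attempt per quiz, weighted by available points) and a hypothetical Attempt record. Only PASS_MARK = 50 comes from this page; the 5-tier band is left out because its cut-offs are not stated here.

```python
from dataclasses import dataclass

PASS_MARK = 50  # stated on this page


@dataclass(frozen=True)
class Attempt:
    """Hypothetical stand-in for an attempt row in portal.db."""
    quiz_id: str
    score: float      # points earned
    max_score: float  # points available


def overall_mark(attempts: list[Attempt]) -> float:
    """Percentage across quizzes, counting the best attempt for each quiz."""
    best: dict[str, Attempt] = {}
    for attempt in attempts:
        if attempt.max_score <= 0:
            continue  # ignore malformed rows rather than divide by zero
        previous = best.get(attempt.quiz_id)
        if previous is None or (
            attempt.score / attempt.max_score > previous.score / previous.max_score
        ):
            best[attempt.quiz_id] = attempt
    available = sum(a.max_score for a in best.values())
    if available == 0:
        return 0.0
    return 100.0 * sum(a.score for a in best.values()) / available


rows = [
    Attempt("quiz-1", 4, 10),
    Attempt("quiz-1", 8, 10),  # a retake that improves on the first try
    Attempt("quiz-2", 3, 10),
]
mark = overall_mark(rows)
passed = mark >= PASS_MARK
print(mark, passed)
```

Since the function takes rows and returns a number, the same attempts always produce the same report, and there is no stored grade that could drift away from the evidence.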
6. Credentials: stateless and verifiable
Certificates, transcripts and ID cards carry stateless verifiable tokens. Nothing is stored server-side; verification is a pure HMAC check.
flowchart LR
G["grading.compute_report"] --> B["band + payload"]
B --> T["credentials.make_token<br/>base64url(json) + HMAC-SHA256(session secret)"]
T --> DOC["Certificate / transcript / id-card HTML<br/>+ fingerprint code + /verify?token= URL"]
DOC --> PUB["Public /verify"]
PUB --> CHK{"HMAC valid?<br/>(constant-time)"}
CHK -->|yes| OK["Authentic — render payload"]
CHK -->|no| NO["Reject"]
The token is base64url-encoded JSON plus an HMAC-SHA256 tag keyed with the session secret, embedded
in the credential HTML alongside a human-readable fingerprint code and a
/verify?token= URL. The public /verify endpoint checks the HMAC in constant
time; because nothing is persisted, there is no database to tamper with. A
certificate is gated on legal name + accepted terms version + overall mark ≥
50.
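The sign-then-verify flow can be sketched with the standard library alone. This is an illustration of the scheme, not the project's credentials module: the dot separator, the unpadded base64url encoding, the verify_token name and both signatures are assumptions, and the payload fields are made up.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _tag(body: str, secret: bytes) -> str:
    return _b64(hmac.new(secret, body.encode("utf-8"), hashlib.sha256).digest())


def make_token(payload: dict, secret: bytes) -> str:
    """Encode payload as base64url JSON and append an HMAC-SHA256 tag."""
    raw = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    body = _b64(raw.encode("utf-8"))
    return f"{body}.{_tag(body, secret)}"


def verify_token(token: str, secret: bytes) -> Optional[dict]:
    """Return the payload if the tag matches, otherwise None."""
    body, sep, tag = token.partition(".")
    if not sep:
        return None
    expected = _tag(body, secret)
    # compare_digest runs in constant time with respect to the contents.
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(_unb64(body))


SECRET = b"demo-only-secret"
demo_payload = {"name": "Ada Example", "mark": 72}
demo_token = make_token(demo_payload, SECRET)
tampered = ("A" if demo_token[0] != "A" else "B") + demo_token[1:]

print(verify_token(demo_token, SECRET))
print(verify_token(tampered, SECRET))
```

One consequence of keying the tag on the session secret is that rotating the secret invalidates every credential issued under the old one, since there is no stored record to fall back on.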
7. Deployment: three lanes, five workflows
Five GitHub Actions workflows live in .github/workflows/, split into three
independent, path-filtered deploy lanes plus two pure gates. A change to the
docs never triggers a portal deploy, and vice versa.
flowchart TB
subgraph GATES["Gates — no deploy"]
CI["ci.yml<br/>ruff + mypy(src) + pytest"]
PT["portal-tests.yml<br/>portal pytest + Dockerfile/compose checks"]
end
subgraph LANES["Three deploy lanes"]
DD["deploy-docs.yml<br/>generators → gh release download books<br/>→ mkdocs build --strict → wrangler pages deploy"]
RB["release-books.yml<br/>WeasyPrint + mathcache → books Release"]
DP["deploy-portal.yml<br/>flyctl deploy --remote-only"]
end
DD --> CFP["Cloudflare Pages (static site)"]
RB --> REL["GitHub Release · tag=books"]
REL -.consumed by.-> DD
DP --> FLYIO["Fly.io + persistent volume + Cloudflare front"]
| Workflow | Role |
|---|---|
| ci.yml | Lint (ruff) + types (mypy src) + tests (pytest). A gate, no deploy. |
| deploy-docs.yml | Generators → gh release download books → mkdocs build --strict → wrangler pages deploy site. "GitHub builds, Cloudflare publishes." |
| release-books.yml | Builds the four books (WeasyPrint, mathcache) and publishes the books Release. |
| deploy-portal.yml | flyctl deploy --remote-only to Fly.io (needs FLY_API_TOKEN). |
| portal-tests.yml | Portal pytest plus Dockerfile / compose checks. A gate, no deploy. |
The portal container and the single-writer invariant
The portal ships from docker/portal.Dockerfile: a two-stage build running as a
non-root portal user (uid 10001), venv at /opt/venv, started with
python scripts/portal_run.py. fly.toml deploys app lynx-cortex-portal in
region cdg (Paris) with a persistent volume lynx_data mounted at
/var/lib/lynx-cortex, scale-to-zero when idle, and secrets injected via
scripts/fly-secrets.example.sh.
Why a single machine is a feature, not a limit
A Fly volume binds to exactly one machine at a time, which physically
enforces the single-SQLite-writer invariant. There can never be two
processes writing the same database, because there can never be two machines
mounting the same volume. Backups are handled by portal_backup.py (SQLite
online-backup API), with _snapshot_rotate and a guarded _restore. Do not
raise the machine count past one without first moving off SQLite.
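Python exposes SQLite's online-backup API as sqlite3.Connection.backup, which copies a live database without stopping the writer. The sketch below shows that call in isolation; backup_database and the file names are hypothetical, and it does not reproduce the rotation or guarded restore that portal_backup.py is described as having.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_database(source: Path, target: Path) -> None:
    """Copy source into target using SQLite's online-backup API."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # copies all pages in one pass by default
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "portal.db"
    snapshot = Path(tmp) / "portal.snapshot.db"

    writer = sqlite3.connect(live)
    writer.execute("CREATE TABLE notes (body TEXT)")
    writer.execute("INSERT INTO notes VALUES ('hello')")
    writer.commit()

    # The writer connection is still open while the snapshot is taken.
    backup_database(live, snapshot)
    writer.close()

    reader = sqlite3.connect(snapshot)
    restored_rows = reader.execute("SELECT body FROM notes").fetchall()
    reader.close()

print(restored_rows)
```

Copying through the backup API rather than copying the file avoids capturing a half-written page, which matters because the portal can be mid-transaction when a snapshot runs.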
Where to go next
- Getting started: set up the environment and run the site locally.
- Study any chapter: the reference index that build_phase_meta generates, so you can jump anywhere in the curriculum.
- Download (PDF & EPUB): the offline books described in §3.