Theory 05 — Portal threat model (forward to Phase 41)
Phase 37 teaches threats against the grammar tutor as an adversarial system (injections, poisoned RAG, tool fuzzing). Phase 41 inherits those threats and adds the ones specific to a multi-user portal: the authn/authz boundary between students and admin, signed cookies vs server-side sessions, the passwordless-by-default policy, and the choice of Argon2id parameters on an i5-8250U.
Why this chapter exists
Phase 37's first four theory chapters threat-model the agent: what happens when a user types "ignore previous instructions", when the RAG index gets poisoned, or when a tool argument escapes a sandbox. Those threats apply inside the model boundary. The Phase 41 portal sits outside that boundary: it is a multi-tenant HTTP application that delivers the curriculum to many students with a teacher/admin role. Its attack surface is the HTTP surface (credentials, sessions, CSRF, replay, key-in-memory), not prompts.
This chapter teaches the four portal-relevant threat-model questions Phase 37 owes Phase 41. Each is a small lesson on its own; the portal composes them in docs/phase-41-learner-portal/theory/03-auth-and-vault.md. The split is deliberate: Phase 37 is threat-modelling vocabulary, Phase 41 is threat-modelling applied to one specific multi-user app.
The cross-link discipline mirrors theory/04-portal-building-blocks.md over in Phase 33: every section names the portal use site and the architecture chapter that composes the pieces.
§1 — The authn/authz boundary: student journal vs admin view
Authentication answers "who are you" (the session cookie resolves to a student_id). Authorization answers "what may you do" (a role column says student or admin). Beginners merge the two and end up with the canonical bug — a student endpoint that trusts the URL parameter (/journal/{student_id}) instead of the session-resolved id, allowing student A to fetch student B's journal by changing the path. The portal's defence is a single source of truth: the session id, and only the session id, identifies the requesting student; the URL identifies the target, and the authz layer compares the two.
For Phase 41 there are exactly two assets and two actor roles. The assets are the student journal/notes/exam attempts (each scoped to one student) and the admin progress view (cross-cuts all students). The actors are student and teacher/admin. The legal matrix is small: student → own asset (allow), student → other student's asset (deny, 404 to avoid existence leak), student → admin view (deny, 403), admin → any asset (allow, audited). Every protected route binds either Depends(require_student) or Depends(require_admin); the latter additionally writes an AuditEvent row for every admin-mediated read. That audit edge is the defence in depth — even an authorized admin read is recorded so a compromised admin account leaves a trail.
@app.get("/journal/{ymd}")
def journal_read(
    ymd: str,
    student: Student = Depends(require_student),  # session = source of truth
    db: Session = Depends(get_db_session),
):
    # NEVER trust a {student_id} URL segment; always use student.id from session
    entry = db.exec(
        select(JournalEntry).where(
            JournalEntry.student_id == student.id,
            JournalEntry.on_date == parse(ymd),
        )
    ).first()
    if entry is None:
        raise HTTPException(404)  # not 403: avoid existence leak
    return render(entry)
Forward reference: the portal's per-student isolation invariants live in src/miniportal/BLUEPRINT.md §6 and are validated by tests/test_miniportal_isolation.py; see also docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"The three concerns".
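The legal matrix in this section is small enough to write down as one pure function. The following is a minimal sketch, not the portal's actual `require_student` / `require_admin` code: `Actor`, `authorize`, and the convention that `owner_id=None` stands for the admin progress view are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Actor:
    """Hypothetical stand-in for the session-resolved user."""
    id: int
    role: str  # "student" or "admin"


def authorize(actor: Actor, owner_id: Optional[int]) -> Tuple[int, bool]:
    """Return (http_status, write_audit_event) for a read request.

    owner_id is the student who owns the target asset; None means the
    cross-student admin progress view (an assumption of this sketch).
    """
    if actor.role == "admin":
        return 200, True    # allowed, but every admin read is audited
    if owner_id is None:
        return 403, False   # student asking for the admin view
    if owner_id == actor.id:
        return 200, False   # student reading their own asset
    return 404, False       # someone else's asset: hide its existence


if __name__ == "__main__":
    alice = Actor(id=1, role="student")
    teacher = Actor(id=99, role="admin")
    print(authorize(alice, 1))     # own journal
    print(authorize(alice, 2))     # another student's journal
    print(authorize(alice, None))  # admin view
    print(authorize(teacher, 2))   # admin read
```

Keeping the decision in one function means the 404-versus-403 distinction and the audit flag are decided in a single place, which is easier to cover with a table-driven test than checks scattered across routes.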
§2 — Signed cookies vs server-side sessions
A session can live in two places: signed on the client (a self-contained cookie that the server verifies with an HMAC key) or stored on the server (a row in a sessions table, with the cookie carrying only an opaque id). The trade is a tight one. Signed cookies are stateless — no DB lookup per request, no cache, no replication concern. The downside is revocation: invalidating a signed cookie either means rotating the server key (logs everyone out) or maintaining a deny-list (re-introducing server state, defeating the point). Server-side sessions cost one DB lookup per request, but revocation is one DELETE FROM sessions WHERE id = ?.
For a single-process, ≤ 50-student portal on SQLite, the DB lookup is effectively free (< 100 µs per request) and the operational story is far simpler. The portal therefore uses both: cookies are signed (so a tampered cookie is rejected without a DB hit) and a sessions table exists for revocation. The cookie attributes are non-negotiable: HttpOnly (scripts cannot read the cookie, which blocks XSS-driven cookie theft), Secure (the cookie is never sent over plaintext HTTP), SameSite=Strict (a second, parallel CSRF defence). The session HMAC key is loaded at startup from PORTAL_SESSION_SECRET; rotating it is the nuclear option (logs everyone out), and the rotation procedure is documented in the auth/vault chapter.
import json
import time

import itsdangerous

def issue_session(student: Student, signer: itsdangerous.TimestampSigner) -> str:
    payload = json.dumps({"sid": student.id, "exp": int(time.time()) + 3600})
    return signer.sign(payload).decode()  # cookie body

def verify_session(token: str, signer: itsdangerous.TimestampSigner) -> int:
    # max_age requires TimestampSigner; plain Signer.unsign() does not accept it
    payload = signer.unsign(token, max_age=3600)  # raises on tamper / expiry
    return json.loads(payload)["sid"]
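The cookie attributes named in this section can be illustrated with the standard library alone. This is a sketch under assumptions: the cookie name `session`, the helper name `session_cookie_header`, and the one-hour `Max-Age` are hypothetical, and a real portal would normally set these through its web framework's response API instead.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str, secure: bool = True, max_age: int = 3600) -> str:
    """Build the value of a Set-Cookie header for a signed session token."""
    cookie = SimpleCookie()
    cookie["session"] = token          # hypothetical cookie name
    morsel = cookie["session"]
    morsel["httponly"] = True          # not readable from JavaScript
    morsel["secure"] = secure          # False only for a local dev override
    morsel["samesite"] = "Strict"      # never sent on cross-site requests
    morsel["max-age"] = max_age
    morsel["path"] = "/"
    return morsel.OutputString()


if __name__ == "__main__":
    print(session_cookie_header("abc.def"))
```

Passing `secure=False` drops the `Secure` flag from the header, which is only appropriate when serving over plain HTTP on a development machine.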
Forward reference: the portal's session attribute choices, including the Secure=False dev override and 30-day rolling expiry policy, are explained in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"Sessions".
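The "use both" design (signature check first, sessions table second) can be sketched end to end with only the standard library. Everything here is an assumption made for illustration: `hmac` stands in for itsdangerous, `SECRET` stands in for the value loaded from PORTAL_SESSION_SECRET, and the table layout and function names are hypothetical.

```python
import hashlib
import hmac
import secrets
import sqlite3
from typing import Optional

SECRET = b"dev-only-secret"  # hypothetical; never hard-code a real key


def sign(session_id: str) -> str:
    mac = hmac.new(SECRET, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


def unsign(cookie: str) -> Optional[str]:
    """Return the session id if the MAC matches, else None."""
    session_id, _, mac = cookie.rpartition(".")
    expected = hmac.new(SECRET, session_id.encode(), hashlib.sha256).hexdigest()
    ok = hmac.compare_digest(mac.encode(), expected.encode())
    return session_id if ok else None


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, student_id INTEGER NOT NULL)")
    return db


def login(db: sqlite3.Connection, student_id: int) -> str:
    session_id = secrets.token_urlsafe(16)  # opaque id stored server-side
    db.execute("INSERT INTO sessions VALUES (?, ?)", (session_id, student_id))
    return sign(session_id)


def resolve(db: sqlite3.Connection, cookie: str) -> Optional[int]:
    session_id = unsign(cookie)
    if session_id is None:
        return None  # tampered cookie: rejected without touching the DB
    row = db.execute(
        "SELECT student_id FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else None  # missing row means the session was revoked


def revoke(db: sqlite3.Connection, cookie: str) -> None:
    session_id = unsign(cookie)
    if session_id is not None:
        db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
```

After `revoke`, the cookie still carries a valid signature, but `resolve` returns `None` because the row is gone. That is the revocation property a purely signed cookie cannot offer without rotating the key.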
§3 — The passwordless-by-default policy
The portal's onboarding flow is non-standard and security-relevant. When an admin creates a student, the credentials row is inserted with password_hash = NULL and a one-time invite token is generated. The student visits /invite/{token}, the server checks the token is valid and unredeemed, and renders a set-password form. POSTing the form writes the Argon2id hash and marks the token consumed. There is no temporary password ever transmitted; the secret a student must possess to set up their account is the signed, time-limited token in the invite URL.
The attack surface this exposes is narrow but real, and Phase 41 must defend each path explicitly. (a) Token replay: an attacker who sees the invite URL once must not be able to use it twice; the token's used_at column is set under a UNIQUE constraint, and the second redemption returns 410 Gone. (b) Token brute-force: the token is 32 random bytes signed with itsdangerous; the verifier checks the signature before the DB lookup, so brute-force never reaches the database. (c) Token expiry: tokens carry an expires_at (default 24 h) checked by the verifier; expired tokens render an "ask your teacher for a new invite" page. (d) Token leak via referrer: the invite page sets Referrer-Policy: no-referrer so following an external link from the set-password form does not leak the token. The full mitigation is encoded in lab/05-portal-replay.md of Phase 41: Borja runs the three demo attacks (replay, expired, tampered) and confirms each returns the right status code.
@app.post("/invite/{token}")
def redeem_invite(
    token: str,
    new_password: str = Form(...),
    db: Session = Depends(get_db_session),
):
    try:
        # signer must be an itsdangerous.TimestampSigner: plain Signer.unsign()
        # has no max_age. SignatureExpired subclasses BadSignature, so an
        # expired token also lands in the except branch.
        nonce = signer.unsign(token, max_age=86400).decode()
    except BadSignature:
        raise HTTPException(403)
    invite = db.exec(
        select(InviteToken).where(InviteToken.nonce == nonce, InviteToken.used_at.is_(None))
    ).first()
    if invite is None:
        raise HTTPException(410)  # already used or revoked
    invite.used_at = now()
    student = db.get(Student, invite.student_id)
    student.password_hash = argon2_hash(new_password, pepper)
    db.commit()
Forward reference: the policy is described in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"No-password-by-default"; the lab that exercises it is docs/phase-41-learner-portal/lab/01-passwordless-first-login.md.
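The replay, expiry, and tamper paths described in this section can be exercised in a few lines. The sketch below is not the portal's implementation: it uses stdlib `hmac` in place of itsdangerous, a hypothetical `invite_tokens` table, and a single conditional `UPDATE` so that "check unused" and "mark used" happen atomically. Returning 410 for an expired token is an assumption of this sketch; the text only says such tokens render a "new invite" page.

```python
import hashlib
import hmac
import secrets
import sqlite3
import time
from typing import Optional

INVITE_KEY = b"dev-only-invite-key"  # hypothetical signing key


def _mac(nonce: str) -> str:
    return hmac.new(INVITE_KEY, nonce.encode(), hashlib.sha256).hexdigest()


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE invite_tokens ("
        " nonce TEXT PRIMARY KEY,"
        " expires_at REAL NOT NULL,"
        " used_at REAL)"
    )
    return db


def create_invite(db: sqlite3.Connection, ttl: float = 86400.0,
                  now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    nonce = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoded
    db.execute(
        "INSERT INTO invite_tokens (nonce, expires_at) VALUES (?, ?)",
        (nonce, now + ttl),
    )
    return f"{nonce}.{_mac(nonce)}"


def redeem(db: sqlite3.Connection, token: str, now: Optional[float] = None) -> int:
    """Return the HTTP status a redemption attempt would produce."""
    now = time.time() if now is None else now
    nonce, _, mac = token.rpartition(".")
    if not hmac.compare_digest(mac.encode(), _mac(nonce).encode()):
        return 403  # tampered: rejected before any DB access
    # One statement both checks and consumes the token, so two concurrent
    # redemptions cannot both succeed.
    cur = db.execute(
        "UPDATE invite_tokens SET used_at = ?"
        " WHERE nonce = ? AND used_at IS NULL AND expires_at > ?",
        (now, nonce, now),
    )
    return 200 if cur.rowcount == 1 else 410  # used, expired, or unknown
```

The `now` parameter exists only so the expiry path can be tested without sleeping; production code would read the clock directly.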
§4 — Argon2id parameter choice on the i5-8250U
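The Argon2id parameters (memory_cost, time_cost, parallelism, hash_len) trade verifier latency against attacker cost. The portal's starting point is 64 MiB memory / 3 iterations / 2 threads. That sits well above the minimum configuration in OWASP's Password Storage Cheat Sheet (19 MiB, 2 iterations, 1 thread) and matches the memory and iteration counts of RFC 9106's low-memory profile (64 MiB, 3 passes, 4 lanes). The portal must still calibrate against the actual hardware (Borja's i5-8250U, Kaby Lake R, 4C/8T, 2018), because generic recommendations assume server-class hardware. The calibration target is a verify time in [35, 80] ms: fast enough that login feels interactive, slow enough that online guessing, even without a separate rate limit, is capped at roughly 20 attempts per second per worker (1 / 0.05 s at a ~50 ms verify).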
Of the four parameters, memory_cost dominates threat-model decisions, because GPU-accelerated attackers (the realistic offline-guess threat) are bounded by GPU memory capacity and memory bandwidth, not by CPU clock. A 64 MiB working set per guess is far larger than a consumer GPU's L2 cache, so every guess has to go out to device memory; doubling to 128 MiB doubles attacker hardware cost but only adds ~30 ms to a 4 GiB-host verify. time_cost is linear in both attacker and defender cost (no asymmetry). parallelism is mostly cosmetic on a 4-core laptop. So the calibration script fixes time_cost=3 and parallelism=2, then sweeps memory_cost over {16, 32, 64, 128} MiB until the verify-time band is hit. On Borja's i5-8250U the empirical answer is 64 MiB → ~50 ms; the curve is plotted in experiments/41-argon2-calibration/curve.png. The CI test tests/integration/test_argon2_calibration.py asserts the verify time stays in band on every commit, failing if a kernel upgrade or an argon2-cffi release changes performance enough to leave the window.
from argon2 import PasswordHasher, Type
from argon2.exceptions import InvalidHashError, VerifyMismatchError

PH = PasswordHasher(
    type=Type.ID,
    memory_cost=65536,  # in KiB: 64 MiB, the load-bearing parameter
    time_cost=3,
    parallelism=2,
    hash_len=32,
    salt_len=16,
)

def hash_password(plaintext: str, pepper: bytes) -> str:
    return PH.hash(pepper + plaintext.encode("utf-8"))

def verify_password(plaintext: str, stored: str, pepper: bytes) -> bool:
    try:
        return PH.verify(stored, pepper + plaintext.encode("utf-8"))
    except (VerifyMismatchError, InvalidHashError):
        return False
Forward reference: the full calibration discussion, including the tests/integration/test_argon2_calibration.py band assertion and the rotation path when CI flags a drift, lives in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"Password hashing: Argon2id".
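The sweep described in this section (fix time_cost and parallelism, raise memory_cost until the verify time lands in band) might look like the sketch below. It is not the portal's calibration script: `measure_argon2_verify_ms` and `pick_memory_cost` are hypothetical names, the median-of-five timing is an assumption, and the timing function is passed in as a parameter so the selection logic can be tested without argon2-cffi installed.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_argon2_verify_ms(memory_mib: int, repeats: int = 5) -> float:
    """Median verify time in ms for one memory_cost (needs argon2-cffi)."""
    from argon2 import PasswordHasher, Type  # imported lazily on purpose

    ph = PasswordHasher(
        type=Type.ID,
        memory_cost=memory_mib * 1024,  # argon2-cffi expects KiB
        time_cost=3,
        parallelism=2,
    )
    stored = ph.hash("calibration-password")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        ph.verify(stored, "calibration-password")
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]


def pick_memory_cost(
    measure: Callable[[int], float],
    candidates: Iterable[int] = (16, 32, 64, 128),
    band: Tuple[float, float] = (35.0, 80.0),
) -> Optional[int]:
    """Return the first candidate (in MiB, ascending) whose time is in band."""
    low, high = band
    for mib in sorted(candidates):
        if low <= measure(mib) <= high:
            return mib
    return None  # nothing fits: widen the sweep or revisit time_cost


if __name__ == "__main__":
    print(pick_memory_cost(measure_argon2_verify_ms))
```

A `None` result is a signal to change the sweep, not to silently fall back to a weaker setting.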
Forward reference to Phase 41
| Phase 37 vocabulary (this chapter) | Phase 41 use site |
|---|---|
| Authn vs authz, session as source-of-truth (§1) | src/miniportal/auth.py::require_student, tests/test_miniportal_isolation.py |
| Signed cookies + DB sessions for revocation (§2) | src/miniportal/auth.py::issue_session, Session model |
| Passwordless-by-default + invite token (§3) | src/miniportal/routes/auth.py::redeem_invite, lab 01 / lab 05 |
| Argon2id memory-cost calibration (§4) | src/miniportal/auth.py::hash_password, experiments/41-argon2-calibration/ |
Each portal-specific threat in security/THREATS.md (T-rows for invite-token replay, CSRF on note widget, password-set abuse, weak-password defaults, vault key-in-memory, audit-log tampering) maps back to one of these four sections. The portal's lab/05-security-replay.md exercises the runtime-checked ones; the design-level ones are verified by code review.
Next: lab/00-prompt-injection-direct.md returns to Phase 37's driving concern — the pirate-payload attack against the grammar tutor.