Theory 05 — Portal threat model (forward to Phase 41)

Phase 37 teaches threats against the grammar tutor as an adversarial system (injections, poisoned RAG, tool fuzzing). Phase 41 inherits those threats and adds the ones specific to a multi-user portal: the authn/authz boundary between students and admin, signed cookies vs server-side sessions, the passwordless-by-default policy, and the choice of Argon2id parameters on an i5-8250U.

Why this chapter exists

Phase 37's first four theory chapters threat-model the agent — what happens when a user types ignore previous instructions, or when the RAG index gets poisoned, or when a tool argument escapes a sandbox. Those threats apply inside the model boundary. The Phase 41 portal sits outside that boundary: it is a multi-tenant HTTP application that delivers the curriculum to many students with a teacher/admin role. Its attack surface is the HTTP surface — credentials, sessions, CSRF, replay, key-in-memory — not prompts.

This chapter teaches the four portal-relevant threat-model questions Phase 37 owes Phase 41. Each is a small lesson on its own; the portal composes them in docs/phase-41-learner-portal/theory/03-auth-and-vault.md. The split is deliberate: Phase 37 is threat-modelling vocabulary, Phase 41 is threat-modelling applied to one specific multi-user app.

The cross-link discipline mirrors theory/04-portal-building-blocks.md over in Phase 33: every section names the portal use site and the architecture chapter that composes the pieces.

§1 — The authn/authz boundary: student journal vs admin view

Authentication answers "who are you" (the session cookie resolves to a student_id). Authorization answers "what may you do" (a role column says student or admin). Beginners merge the two and end up with the canonical bug — a student endpoint that trusts the URL parameter (/journal/{student_id}) instead of the session-resolved id, allowing student A to fetch student B's journal by changing the path. The portal's defence is a single source of truth: the session id, and only the session id, identifies the requesting student; the URL identifies the target, and the authz layer compares the two.

For Phase 41 there are exactly two assets and two actor roles. The assets are the student journal/notes/exam attempts (each scoped to one student) and the admin progress view (cross-cuts all students). The actors are student and teacher/admin. The legal matrix is small: student → own asset (allow), student → other student's asset (deny, 404 to avoid existence leak), student → admin view (deny, 403), admin → any asset (allow, audited). Every protected route binds either Depends(require_student) or Depends(require_admin); the latter additionally writes an AuditEvent row for every admin-mediated read. That audit edge is the defence in depth — even an authorized admin read is recorded so a compromised admin account leaves a trail.

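# Assumes FastAPI and SQLModel, as the Depends / db.exec idioms suggest.
from fastapi import Depends, HTTPException
from sqlmodel import Session, select

# app, Student, JournalEntry, require_student, get_db_session, parse and render
# are portal internals defined elsewhere.
@app.get("/journal/{ymd}")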
def journal_read(
    ymd: str,
    student: Student = Depends(require_student),     # session = source of truth
    db: Session = Depends(get_db_session),
):
    # NEVER trust a {student_id} URL segment; always use student.id from session
    entry = db.exec(
        select(JournalEntry).where(
            JournalEntry.student_id == student.id,
            JournalEntry.on_date == parse(ymd),
        )
    ).first()
    if entry is None:
        raise HTTPException(404)   # not 403 — avoid existence leak
    return render(entry)
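
The four-row matrix described above can also be written as one pure function, which makes it easy to unit-test separately from the route handlers. The sketch below is illustrative only: `Decision`, `authorize`, and the role and target strings are hypothetical names, not the portal's actual API, and the `audit` flag merely signals that the caller should write an AuditEvent row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    status: int          # HTTP status the route should return
    audit: bool = False  # True when the caller should record an AuditEvent


def authorize(
    role: str, actor_id: int, target: str, owner_id: Optional[int] = None
) -> Decision:
    """Hypothetical encoding of the student/admin matrix.

    target is "student_asset" (journal, notes, exam attempts) or "admin_view".
    owner_id is the student who owns the asset; it is ignored for the admin view.
    """
    if role == "admin":
        return Decision(200, audit=True)   # allowed, but always audited
    if role == "student" and target == "student_asset":
        if owner_id == actor_id:
            return Decision(200)           # own asset
        return Decision(404)               # hide the existence of others' data
    # student -> admin view, or any unknown role/target: deny
    return Decision(403)
```

Keeping the decision separate from the handler means the 404-versus-403 distinction lives in one place instead of being repeated per route.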

Forward reference: the portal's per-student isolation invariants live in src/miniportal/BLUEPRINT.md §6 and are validated by tests/test_miniportal_isolation.py; see also docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"The three concerns".

§2 — Signed cookies vs server-side sessions

A session can live in two places: signed on the client (a self-contained cookie that the server verifies with an HMAC key) or stored on the server (a row in a sessions table, with the cookie carrying only an opaque id). The trade is a tight one. Signed cookies are stateless — no DB lookup per request, no cache, no replication concern. The downside is revocation: invalidating a signed cookie either means rotating the server key (logs everyone out) or maintaining a deny-list (re-introducing server state, defeating the point). Server-side sessions cost one DB lookup per request, but revocation is one DELETE FROM sessions WHERE id = ?.

For a single-process, ≤ 50-student portal on SQLite, the DB lookup is free (< 100 µs per request) and the operational story is far simpler. The portal therefore uses both: cookies are signed (so a tampered cookie is rejected without a DB hit) and a sessions table exists for revocation. The cookie attributes are non-negotiable: HttpOnly (defeats XSS-driven theft), Secure (no plaintext leak), SameSite=Strict (parallel CSRF defence). The session HMAC key is loaded at startup from PORTAL_SESSION_SECRET; rotating it is the nuclear option (logs everyone out), and the rotation procedure is documented in the auth/vault chapter.

import json
import time

from itsdangerous import TimestampSigner

def issue_session(student: Student, signer: TimestampSigner) -> str:
    payload = json.dumps({"sid": student.id, "exp": int(time.time()) + 3600})
    return signer.sign(payload).decode()   # cookie body

def verify_session(token: str, signer: TimestampSigner) -> int:
    # max_age needs TimestampSigner; plain Signer.unsign() does not accept it
    payload = signer.unsign(token, max_age=3600)  # raises on tamper / expiry
    return json.loads(payload)["sid"]
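
To make the revocation trade-off concrete, here is a minimal sketch of the "signed cookie plus sessions table" combination. It is not the portal's code: it uses the standard library's `hmac` as a stand-in for itsdangerous, an in-memory SQLite database, and hypothetical names (`SECRET`, `sign`, `verify`, `login`, `current_student`, `revoke`, and a `sessions(id, student_id)` table).

```python
import hashlib
import hmac
import secrets
import sqlite3
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: stands in for PORTAL_SESSION_SECRET

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, student_id INTEGER NOT NULL)")


def sign(session_id: str) -> str:
    mac = hmac.new(SECRET, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


def verify(cookie: str) -> Optional[str]:
    """Return the session id if the MAC is valid, else None (no DB hit)."""
    session_id, _, mac = cookie.rpartition(".")
    expected = hmac.new(SECRET, session_id.encode(), hashlib.sha256).hexdigest()
    ok = hmac.compare_digest(mac.encode(), expected.encode())
    return session_id if ok else None


def login(student_id: int) -> str:
    session_id = secrets.token_urlsafe(16)
    db.execute("INSERT INTO sessions VALUES (?, ?)", (session_id, student_id))
    return sign(session_id)


def current_student(cookie: str) -> Optional[int]:
    session_id = verify(cookie)
    if session_id is None:
        return None                    # tampered: rejected before the lookup
    row = db.execute(
        "SELECT student_id FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else None     # a missing row means the session was revoked


def revoke(cookie: str) -> None:
    session_id = verify(cookie)
    if session_id is not None:
        db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
```

After `revoke(cookie)`, the cookie still carries a valid signature, yet `current_student(cookie)` returns `None` because the row is gone. That is the property a purely stateless signed cookie cannot offer without a deny-list.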

Forward reference: the portal's session attribute choices, including the Secure=False dev override and 30-day rolling expiry policy, are explained in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"Sessions".
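
The cookie attributes themselves can be illustrated with the standard library alone. The following is a sketch, not the portal's implementation: the cookie name `session`, the helper `session_cookie_header`, and the `secure` flag (modelling the dev override mentioned above) are assumptions. The `samesite` attribute requires Python 3.8 or later.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str, secure: bool = True) -> str:
    """Build the value of a Set-Cookie header for a session token."""
    cookie = SimpleCookie()
    cookie["session"] = token
    morsel = cookie["session"]
    morsel["httponly"] = True        # not readable from JavaScript
    morsel["secure"] = secure        # HTTPS only; False only for local dev
    morsel["samesite"] = "Strict"    # never sent on cross-site requests
    morsel["path"] = "/"
    return morsel.OutputString()
```

In a real deployment the web framework usually sets these attributes through its own response API; the point of the sketch is only to show which flags end up on the wire.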

§3 — The passwordless-by-default policy

The portal's onboarding flow is non-standard and security-relevant. When an admin creates a student, the credentials row is inserted with password_hash = NULL and a one-time invite token is generated. The student visits /invite/{token}, the server checks the token is valid and unredeemed, and renders a set-password form. POSTing the form writes the Argon2id hash and marks the token consumed. There is no temporary password ever transmitted; the secret a student must possess to set up their account is the signed, time-limited token in the invite URL.

The attack surface this exposes is narrow but real, and Phase 41 must defend each path explicitly. (a) Token replay: an attacker who sees the invite URL once must not be able to use it twice; the token's used_at column is set under a UNIQUE constraint, and the second redemption returns 410 Gone. (b) Token brute-force: the token is 32 random bytes signed with itsdangerous; the verifier checks the signature before the DB lookup, so brute-force never reaches the database. (c) Token expiry: tokens carry an expires_at (default 24 h) checked by the verifier; expired tokens render an "ask your teacher for a new invite" page. (d) Token leak via referrer: the invite page sets Referrer-Policy: no-referrer so following an external link from the set-password form does not leak the token. The full mitigation is encoded in lab/05-portal-replay.md of Phase 41: Borja runs the three demo attacks (replay, expired, tampered) and confirms each returns the right status code.

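from fastapi import Depends, Form, HTTPException
from itsdangerous import BadSignature
from sqlmodel import Session, select

@app.post("/invite/{token}")
def redeem_invite(
    token: str,
    new_password: str = Form(...),
    db: Session = Depends(get_db_session),
):
    try:
        # signer must be an itsdangerous TimestampSigner: only it accepts max_age.
        # SignatureExpired subclasses BadSignature, so expiry is caught here too.
        nonce = signer.unsign(token, max_age=86400).decode()
    except BadSignature:
        raise HTTPException(403)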
    invite = db.exec(
        select(InviteToken).where(InviteToken.nonce == nonce, InviteToken.used_at.is_(None))
    ).first()
    if invite is None:
        raise HTTPException(410)   # already used or revoked
    invite.used_at = now()
    student = db.get(Student, invite.student_id)
    student.password_hash = argon2_hash(new_password, pepper)
    db.commit()
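
A select-then-update sequence can leave a small window in which two concurrent requests both see `used_at IS NULL`. One common way to close it, shown here as a sketch with plain `sqlite3` rather than the portal's ORM, is to claim the token with a single conditional `UPDATE` and check how many rows changed. The table layout and the `claim_invite` helper are assumptions made for illustration.

```python
import sqlite3
import time
from typing import Optional

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE invite_tokens ("
    " nonce TEXT PRIMARY KEY,"
    " student_id INTEGER NOT NULL,"
    " expires_at REAL NOT NULL,"
    " used_at REAL)"
)


def claim_invite(nonce: str, now: Optional[float] = None) -> int:
    """Return an HTTP-style status: 200 if claimed, 410 otherwise.

    For brevity this sketch answers 410 for used, expired and unknown
    tokens alike; a real handler may distinguish them.
    """
    now = time.time() if now is None else now
    cur = db.execute(
        "UPDATE invite_tokens SET used_at = ? "
        "WHERE nonce = ? AND used_at IS NULL AND expires_at > ?",
        (now, nonce, now),
    )
    db.commit()
    # rowcount is 1 only for the first valid redemption; a replay changes no rows.
    return 200 if cur.rowcount == 1 else 410
```

Because the check and the write happen in one statement, only one of two racing redemptions can observe `rowcount == 1`.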

Forward reference: the policy is described in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"No-password-by-default"; the lab that exercises it is docs/phase-41-learner-portal/lab/01-passwordless-first-login.md.

§4 — Argon2id parameter choice on the i5-8250U

The Argon2id parameters (memory_cost, time_cost, parallelism, hash_len) trade verifier latency against attacker cost. OWASP's 2026 recommendation is 64 MiB memory / 3 iterations / 2 threads as the starting point; the portal must calibrate against the actual hardware (Borja's i5-8250U, Kaby Lake R, 4C/8T, 2018) because the recommendation assumes a server-class CPU. The calibration target is verify time in [35, 80] ms — fast enough that login feels interactive, slow enough that online guessing (without a separate rate limit) costs ~20 attempts/second/IP.
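
The "~20 attempts/second" figure follows directly from the verify time: a verifier that spends about 50 ms per check can complete at most 1000 / 50 = 20 checks per second. The helper below is a back-of-the-envelope sketch only; it assumes verifications run one at a time on each worker, and `guesses_per_second` is a hypothetical name.

```python
def guesses_per_second(verify_ms: float, workers: int = 1) -> float:
    """Upper bound on password checks per second when each check costs verify_ms."""
    if verify_ms <= 0 or workers < 1:
        raise ValueError("verify_ms must be positive and workers at least 1")
    return workers * 1000.0 / verify_ms


# Edges of the [35, 80] ms calibration band and the ~50 ms midpoint.
for ms in (35, 50, 80):
    print(f"{ms} ms -> {guesses_per_second(ms):.1f} checks/s")
```

Across the band this gives roughly 28.6 checks per second at 35 ms, 20.0 at 50 ms, and 12.5 at 80 ms per worker.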

Of the four parameters, memory_cost dominates threat-model decisions. Why? Because GPU-accelerated attackers (the realistic offline-guess threat) are bounded by GPU RAM and PCIe bandwidth, not by CPU clock. A 64 MiB working set saturates consumer GPU L2 caches; doubling to 128 MiB doubles attacker hardware cost but only adds ~30 ms to a 4 GiB-host verify. time_cost is linear in both attacker and defender cost (no asymmetry). parallelism is mostly cosmetic on a 4-core laptop. So the calibration script fixes time_cost=3 and parallelism=2, then sweeps memory_cost over {16, 32, 64, 128} MiB until the verify-time band is hit. On Borja's i5-8250U the empirical answer is 64 MiB → ~50 ms; the curve is plotted in experiments/41-argon2-calibration/curve.png. The CI test tests/integration/test_argon2_calibration.py asserts the verify time stays in band on every commit, failing if a kernel upgrade or an argon2-cffi release changes performance enough to leave the window.

from argon2 import PasswordHasher, Type
from argon2.exceptions import InvalidHashError, VerifyMismatchError

PH = PasswordHasher(
    type=Type.ID,
    memory_cost=65536,   # 64 MiB — the load-bearing parameter
    time_cost=3,
    parallelism=2,
    hash_len=32,
    salt_len=16,
)

def hash_password(plaintext: str, pepper: bytes) -> str:
    return PH.hash(pepper + plaintext.encode("utf-8"))

def verify_password(plaintext: str, stored: str, pepper: bytes) -> bool:
    try:
        return PH.verify(stored, pepper + plaintext.encode("utf-8"))
    except (VerifyMismatchError, InvalidHashError):
        return False
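
The sweep described above (fix time_cost and parallelism, walk memory_cost upward until the verify time lands in band) might look roughly like the sketch below. It is not the portal's calibration script: `pick_memory_cost` and `argon2_verify_ms` are hypothetical helpers, the timing function is injected so the selection logic can be tested without real hashing, and the argon2-cffi import is deferred until a real measurement is requested.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def pick_memory_cost(
    candidates_mib: Iterable[int],
    measure_ms: Callable[[int], float],
    band_ms: Tuple[float, float] = (35.0, 80.0),
) -> Optional[int]:
    """Return the first memory cost (MiB) whose verify time falls inside band_ms."""
    low, high = band_ms
    for mib in candidates_mib:
        if low <= measure_ms(mib) <= high:
            return mib
    return None  # nothing in band: revisit time_cost or the candidate list


def argon2_verify_ms(mib: int, repeats: int = 5) -> float:
    """Median verify time in ms for argon2-cffi at the given memory cost."""
    from argon2 import PasswordHasher  # imported lazily; requires argon2-cffi

    # memory_cost is expressed in KiB, so 64 MiB is 64 * 1024 = 65536.
    ph = PasswordHasher(memory_cost=mib * 1024, time_cost=3, parallelism=2)
    stored = ph.hash("calibration-password")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        ph.verify(stored, "calibration-password")
        samples.append((time.perf_counter() - start) * 1000.0)
    return sorted(samples)[len(samples) // 2]
```

A real run would call `pick_memory_cost([16, 32, 64, 128], argon2_verify_ms)` on the target machine. Taking the median of several samples is one way to damp scheduler noise; the result still depends on the hardware and on whatever else is running.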

Forward reference: the full calibration discussion, including the tests/integration/test_argon2_calibration.py band assertion and the rotation path when CI flags a drift, lives in docs/phase-41-learner-portal/theory/03-auth-and-vault.md §"Password hashing: Argon2id".

Forward reference to Phase 41

| Phase 37 vocabulary (this chapter) | Phase 41 use site |
|---|---|
| Authn vs authz, session as source-of-truth (§1) | `src/miniportal/auth.py::require_student`, `tests/test_miniportal_isolation.py` |
| Signed cookies + DB sessions for revocation (§2) | `src/miniportal/auth.py::issue_session`, `Session` model |
| Passwordless-by-default + invite token (§3) | `src/miniportal/routes/auth.py::redeem_invite`, lab 01 / lab 05 |
| Argon2id memory-cost calibration (§4) | `src/miniportal/auth.py::hash_password`, `experiments/41-argon2-calibration/` |

Each portal-specific threat in security/THREATS.md (T-rows for invite-token replay, CSRF on note widget, password-set abuse, weak-password defaults, vault key-in-memory, audit-log tampering) maps back to one of these four sections. The portal's lab/05-security-replay.md exercises the runtime-checked ones; the design-level ones are verified by code review.

Next: lab/00-prompt-injection-direct.md returns to Phase 37's driving concern — the pirate-payload attack against the grammar tutor.