Lab 03 — Inline notes anchored to paragraphs, plus the daily journal
The theory page becomes interactive: every paragraph can take a personal note without leaving the reading flow. At the end of the day, the learner adds a journal entry: a single append-only file per day. The rule: user content always passes through the Phase 37 sanitizer before it appears on screen.
Goal
Render every theory/lab page with stable per-paragraph anchors. Beside each anchored paragraph, expose an inline "note" widget that opens a SimpleMDE editor in-page, lets the user write Markdown, and persists the note keyed to (student_id, page_id, anchor_id). A separate "Notes" view lists all notes, sortable by tag, phase, and date. The daily Journal is a free-form, day-keyed entry (one row per day, appended to within the day). All Markdown is rendered server-side via markdown-it-py plus the Phase 37 sanitizer.
Why this lab exists
The CLAUDE.md three-layer log (§A4) lives in learners/<u>/journal/. Lab 03 makes the daily journal a portal-native artifact instead of a file the learner has to remember to open. The inline-note feature is the "in-context capture" that Borja's profile (learners/borja/profile.md) flags as the single most-missed feature when reading dense theory: by the time you switch to a separate notes file you've lost the train of thought.
Stored XSS is the lurking risk. Notes and journal entries are user-generated, stored in the DB, and rendered to other viewers, including admins. Bypassing the Phase 37 sanitizer here would let a learner inject a script that fires on the teacher's admin dashboard. The lab's tests pin this down explicitly.
Prerequisites
- Labs 00–02 done.
- Phase 37 sanitizer module at src/sanitizer/ exposing sanitize_html(rendered_html) -> str.
- markdown-it-py already in the lockfile from Phase 37; if not, uv add markdown-it-py.
Deliverables
- Alembic migration creating notes table: (id PK, student_id, page_id, anchor_id, body_md, tags TEXT[], created_at, updated_at).
- Alembic migration creating journal_entries table: (student_id, day DATE, body_md, updated_at), PK (student_id, day).
- src/miniportal/markdown.py: render(md: str) -> str calling markdown-it + the sanitizer.
- src/miniportal/anchors.py: anchor_id(page_id: str, heading: str, paragraph_index: int) -> str (deterministic, slug-stable).
- src/miniportal/routes/notes.py: REST endpoints (GET/POST/PUT/DELETE /notes/...) + GET /notes list view.
- src/miniportal/routes/journal.py: GET /journal, GET /journal/{date}, POST /journal/{date}.
- src/miniportal/static/portal.js: minimal vanilla JS to open the inline editor (no framework).
- src/miniportal/templates/notes_list.html.jinja, journal_entry.html.jinja, inline_note_widget.html.jinja.
- tests/portal/test_note_anchor_persistence.py.
- tests/portal/test_journal_append_idempotency.py.
- tests/portal/test_markdown_xss_sanitizer.py.
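Before writing the Alembic migrations, the two tables can be prototyped with the standard library. The sketch below is only a stand-in under stated assumptions: it uses an in-memory SQLite database, stores tags as JSON text (the SQLite fallback mentioned in Step 3), adds the deleted_at column that Step 3's soft-delete implies, and guesses at column types. The helper names make_db and second_insert_same_day_fails are hypothetical.

```python
import sqlite3

# Assumed column types; the real migration targets the project's own database.
SCHEMA = """
CREATE TABLE notes (
    id          INTEGER PRIMARY KEY,
    student_id  INTEGER NOT NULL,
    page_id     TEXT    NOT NULL,
    anchor_id   TEXT    NOT NULL,
    body_md     TEXT    NOT NULL,
    tags        TEXT    NOT NULL DEFAULT '[]',  -- JSON array here; TEXT[] on PostgreSQL
    created_at  TEXT    NOT NULL,
    updated_at  TEXT    NOT NULL,
    deleted_at  TEXT                            -- assumption: set by the soft-delete
);
CREATE TABLE journal_entries (
    student_id  INTEGER NOT NULL,
    day         TEXT    NOT NULL,               -- ISO date string
    body_md     TEXT    NOT NULL,
    updated_at  TEXT    NOT NULL,
    PRIMARY KEY (student_id, day)               -- one row per student per day
);
"""


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def second_insert_same_day_fails(conn: sqlite3.Connection) -> bool:
    """Show that the composite key rejects a second row for the same day."""
    row = (1, "2024-05-01", "first thought", "2024-05-01T09:00:00")
    conn.execute("INSERT INTO journal_entries VALUES (?, ?, ?, ?)", row)
    try:
        conn.execute("INSERT INTO journal_entries VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False
```

The composite primary key is what forces the journal route to append to an existing row instead of inserting a second one for the same day.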
Step 1 — Stable paragraph anchors
# src/miniportal/anchors.py
import hashlib
import re

_SLUG_RE = re.compile(r"[^a-z0-9]+")


def slugify(text: str) -> str:
    raise NotImplementedError("Lab 03 step 1 — lowercase, collapse runs of non-alnum to '-', strip leading/trailing '-'.")


def anchor_id(page_id: str, heading: str, paragraph_index: int) -> str:
    """Return a deterministic anchor id of the form:

        <page_id>::<heading_slug>::p<paragraph_index>

    Properties required:
    1. Pure function — same inputs → same output, always.
    2. Stable across heading renames *of unrelated* sections.
    3. Renaming THIS heading invalidates the anchor (notes need re-anchoring).
    4. Short enough to embed in an HTML id (<=128 chars).
    """
    raise NotImplementedError("Lab 03 step 1 — compose the three parts; assert length <= 128.")
Property 3 is intentional: if the curriculum renames a section heading, the notes attached to it should be flagged (not silently follow the rename) so the learner reviews whether each note still applies. Step 6 of this lab adds the "orphaned notes" report that surfaces them.
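One possible shape for the two functions, offered as a sketch and not as the reference solution. The stub only asks for an assertion on length; the digest-based shortening for over-long headings is an assumption added here (it would also explain the hashlib import in the stub), and _MAX_LEN is a hypothetical constant name.

```python
import hashlib
import re

_SLUG_RE = re.compile(r"[^a-z0-9]+")
_MAX_LEN = 128  # property 4: must fit in an HTML id


def slugify(text: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics to '-', trim the ends."""
    return _SLUG_RE.sub("-", text.lower()).strip("-")


def anchor_id(page_id: str, heading: str, paragraph_index: int) -> str:
    """Compose <page_id>::<heading_slug>::p<paragraph_index>."""
    slug = slugify(heading)
    candidate = f"{page_id}::{slug}::p{paragraph_index}"
    if len(candidate) <= _MAX_LEN:
        return candidate
    # Assumption: for over-long headings, keep a truncated slug plus a short
    # digest of the full slug so distinct long headings stay distinct.
    digest = hashlib.sha256(slug.encode("utf-8")).hexdigest()[:12]
    overhead = len(f"{page_id}::-{digest}::p{paragraph_index}")
    room = _MAX_LEN - overhead
    if room < 0:
        raise ValueError("page_id too long to build an anchor id")
    return f"{page_id}::{slug[:room]}-{digest}::p{paragraph_index}"
```

With these definitions, anchor_id("p/theory/01", "Why caches", 2) returns "p/theory/01::why-caches::p2". Only the heading of the paragraph's own section enters the id, which is what gives properties 2 and 3.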
Step 2 — Markdown render → sanitize
# src/miniportal/markdown.py
from markdown_it import MarkdownIt

from sanitizer import sanitize_html  # Phase 37 module

_MD = MarkdownIt("commonmark", {"linkify": True, "html": False})


def render(body_md: str) -> str:
    """Markdown -> HTML -> Phase 37 sanitizer -> safe HTML.

    `html: False` is the FIRST line of defense (no raw HTML in input);
    the sanitizer is the second. Both must be on.
    """
    raise NotImplementedError("Lab 03 step 2 — pass body through markdown-it then sanitize_html.")
The two-defense rule (html: False AND sanitizer) is a Phase 37 principle re-applied. Removing either one fails the XSS test.
Step 3 — notes endpoints
# src/miniportal/routes/notes.py
from fastapi import APIRouter, Depends, Form

# NOTE: current_student (the auth dependency used below) is not defined in
# this file; import it from wherever your project defines it.

router = APIRouter(prefix="/notes")


@router.get("")
async def list_notes(
    student = Depends(current_student),
    tag: str | None = None,
    phase: int | None = None,
):
    """Render notes_list.html, sortable by tag/phase/date. Default sort: date desc."""
    raise NotImplementedError("Lab 03 step 3 — query notes, render template.")


@router.post("")
async def create_note(
    page_id: str = Form(...),
    anchor_id: str = Form(...),
    body_md: str = Form(...),
    tags: str = Form(""),
    student = Depends(current_student),
):
    """`tags` is comma-separated; persisted as TEXT[] (or JSON if SQLite).

    Validation:
    - page_id matches /^[a-z0-9-/]+$/
    - anchor_id matches the format from step 1
    - body_md length <= 32 KB
    - tags: each tag matches /^[a-z][a-z0-9-]{0,30}$/, max 8 tags

    Sanitize tags before persistence."""
    raise NotImplementedError("Lab 03 step 3 — validate, insert, return 201 with the rendered HTML.")


@router.put("/{note_id}")
async def update_note(note_id: int, body_md: str = Form(...), student = Depends(current_student)):
    raise NotImplementedError("Lab 03 step 3 — owner check, update, return rendered HTML.")


@router.delete("/{note_id}")
async def delete_note(note_id: int, student = Depends(current_student)):
    raise NotImplementedError("Lab 03 step 3 — owner check, soft-delete (set deleted_at), return 204.")
Notes are owned by exactly one student. Even admins do not edit other learners' notes through this endpoint — they only read them in the admin view (lab 05). The CRUD is single-owner.
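The validation rules in the create_note docstring can be kept out of the route as small pure functions, which makes them easy to unit-test. The following is a sketch under assumptions: the names parse_tags and validate_note_fields are hypothetical, de-duplicating tags is an added choice, and the 32 KB limit is measured in UTF-8 bytes.

```python
import re

# fullmatch() is used below, so no ^...$ anchors are needed; unlike `$`,
# it also rejects a trailing newline.
_PAGE_ID_RE = re.compile(r"[a-z0-9/-]+")
_TAG_RE = re.compile(r"[a-z][a-z0-9-]{0,30}")
MAX_TAGS = 8
MAX_BODY_BYTES = 32 * 1024


def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string and validate every tag."""
    tags = [part.strip() for part in raw.split(",") if part.strip()]
    tags = list(dict.fromkeys(tags))  # assumption: drop duplicates, keep order
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags allowed")
    for tag in tags:
        if not _TAG_RE.fullmatch(tag):
            raise ValueError(f"invalid tag: {tag!r}")
    return tags


def validate_note_fields(page_id: str, body_md: str) -> None:
    """Raise ValueError when page_id or body_md breaks the documented limits."""
    if not _PAGE_ID_RE.fullmatch(page_id):
        raise ValueError("invalid page_id")
    if len(body_md.encode("utf-8")) > MAX_BODY_BYTES:
        raise ValueError("body_md exceeds 32 KB")
```

In the route, a ValueError from either function would typically be translated into a 4xx response; which status code to use is left to the lab.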
Step 4 — Inline widget (vanilla JS)
src/miniportal/static/portal.js (≤ 100 lines target):
// Locate every <p data-anchor-id="..."> ; on hover, show a "+ note" affordance.
// On click, open a SimpleMDE textarea in a small dialog overlay.
// On save, POST to /notes; on success, swap in the rendered HTML inline.
// All requests carry the CSRF token from the meta tag.
(function () {
  // raise NotImplementedError equivalent: leave a `throw` so the missing impl is loud at runtime.
  throw new Error("Lab 03 step 4 — implement the inline note widget (no framework).");
})();
Constraint: no framework dependency. SimpleMDE is pulled in via a single <link> + <script> from static/vendor/ (vendored once, committed to the repo — not loaded from a CDN; supply-chain rule from security/supply-chain.md).
Step 5 — The journal
# src/miniportal/routes/journal.py
from datetime import date as date_t

from fastapi import APIRouter, Depends, Form, HTTPException

# NOTE: current_student (the auth dependency used below) is not defined in
# this file; import it from wherever your project defines it.

router = APIRouter(prefix="/journal")


@router.get("")
async def journal_index(student = Depends(current_student)):
    raise NotImplementedError("Lab 03 step 5 — render the list of (date, snippet) for this student.")


@router.get("/{day}")
async def journal_show(day: date_t, student = Depends(current_student)):
    raise NotImplementedError("Lab 03 step 5 — render the day's entry; 404 if none.")


@router.post("/{day}")
async def journal_upsert(day: date_t, body_md: str = Form(...), student = Depends(current_student)):
    """Append-only WITHIN a day:

    - If no row exists for (student, day): INSERT.
    - If a row exists: APPEND body_md to the existing body with a separator.
          --- 14:32 ---
          ...new content...
    - Idempotency: if the submitted body_md matches the last appended block byte-for-byte
      (modulo trailing whitespace) within the last 60 seconds, treat as no-op and return 200.

    Editing PAST days is forbidden: day < today returns 409 Conflict.
    """
    raise NotImplementedError("Lab 03 step 5 — upsert with append + idempotency window + past-day guard.")
The "append-only within a day, no editing past days" rule mirrors CLAUDE.md §3. The portal enforces it because Borja explicitly wanted the journal to feel like a logbook, not a wiki.
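The upsert rules are easier to test when the decision logic lives in a pure function that the route calls with the current row and the current time. The sketch below makes several assumptions: Entry, PastDayError, and upsert are hypothetical names; every block, including the first, gets a "--- HH:MM ---" header (the Step 7 test expects two timestamps after two posts); "today" is taken from the supplied now; and the last block is recovered by splitting the stored body on header lines, which would misfire if a learner typed a line that looks exactly like a header.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Optional

IDEMPOTENCY_WINDOW = timedelta(seconds=60)
_HEADER_RE = re.compile(r"^--- \d{2}:\d{2} ---\n", re.MULTILINE)


class PastDayError(Exception):
    """day < today; the route would map this to 409 Conflict."""


@dataclass(frozen=True)
class Entry:
    body_md: str
    updated_at: datetime


def _block(now: datetime, text: str) -> str:
    """Prefix one appended block with its time-of-day header."""
    return f"--- {now:%H:%M} ---\n{text}"


def upsert(entry: Optional[Entry], day: date, body_md: str, now: datetime) -> Entry:
    """Return the row to store; returns `entry` unchanged for a no-op."""
    if day < now.date():
        raise PastDayError(day.isoformat())
    text = body_md.rstrip()  # "modulo trailing whitespace"
    if entry is None:
        return Entry(_block(now, text), now)
    last_text = _HEADER_RE.split(entry.body_md)[-1].rstrip()
    if last_text == text and now - entry.updated_at <= IDEMPOTENCY_WINDOW:
        return entry  # duplicate submit inside the window
    return Entry(f"{entry.body_md}\n\n{_block(now, text)}", now)
```

Passing now in explicitly means the idempotency window and the past-day guard can be tested without sleeping or patching the clock.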
Step 6 — Orphaned notes report
A note becomes orphaned when its anchor_id no longer exists on the referenced page_id (the heading was renamed or the paragraph removed). A daily background sweep flags such notes:
# src/miniportal/services/notes_sweeper.py
def find_orphans(db, anchors_by_page: dict[str, set[str]]) -> list[int]:
    raise NotImplementedError("Lab 03 step 6 — return ids of notes whose anchor_id is not in anchors_by_page[page_id].")
Lab 05 surfaces these on the admin dashboard; for lab 03 we just write the sweeper + a test.
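The core of the sweep is a set-membership check. The sketch below deliberately differs from the stub's signature: instead of a db handle it takes an iterable of note references, so it runs without a database. NoteRef is a hypothetical type, and treating a page that is missing from anchors_by_page as having no anchors (so all its notes are orphans) is an assumption.

```python
from typing import Iterable, NamedTuple


class NoteRef(NamedTuple):
    id: int
    page_id: str
    anchor_id: str


def find_orphans(
    notes: Iterable[NoteRef],
    anchors_by_page: dict[str, set[str]],
) -> list[int]:
    """Return ids of notes whose anchor no longer exists on their page."""
    return [
        note.id
        for note in notes
        if note.anchor_id not in anchors_by_page.get(note.page_id, set())
    ]
```

A seeded fixture for the unit test could look like this:

```python
notes = [
    NoteRef(1, "p/theory/01", "p/theory/01::why-caches::p2"),
    NoteRef(2, "p/theory/01", "p/theory/01::old-heading::p0"),
    NoteRef(3, "p/theory/99", "p/theory/99::intro::p0"),
]
anchors = {"p/theory/01": {"p/theory/01::why-caches::p2"}}
orphan_ids = find_orphans(notes, anchors)  # notes 2 and 3
```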
Step 7 — Tests
# tests/portal/test_note_anchor_persistence.py
def test_note_round_trip():
    raise NotImplementedError("Create a note via POST /notes; GET it back; assert body_md and rendered HTML match.")

def test_anchor_id_is_deterministic():
    raise NotImplementedError("anchor_id('p/theory/01', 'Why caches', 2) called twice -> equal.")

def test_anchor_changes_when_heading_renamed():
    raise NotImplementedError("Different heading text -> different anchor_id (notes 'detach' on rename).")

# tests/portal/test_journal_append_idempotency.py
def test_first_entry_inserts():
    raise NotImplementedError("POST /journal/<today> creates one row.")

def test_second_entry_same_day_appends_with_separator():
    raise NotImplementedError("Second POST appends; body contains '---' separator + two timestamps.")

def test_same_content_within_window_is_noop():
    raise NotImplementedError("POST identical body_md twice within 60s -> only one append.")

def test_past_day_edit_rejected():
    raise NotImplementedError("POST /journal/<yesterday> -> 409.")

# tests/portal/test_markdown_xss_sanitizer.py
def test_script_tag_stripped():
    raise NotImplementedError("Submit body_md='<script>alert(1)</script>'; assert rendered HTML contains no <script>.")

def test_javascript_url_stripped():
    raise NotImplementedError("Submit a Markdown link [click](javascript:alert(1)); rendered href does not contain 'javascript:'.")

def test_onerror_attribute_stripped():
    raise NotImplementedError("Submit ' <img src=x onerror=alert(1)>'; rendered HTML has no onerror.")

def test_safe_markdown_preserved():
    raise NotImplementedError("Submit normal **bold** and `code`; rendered HTML contains <strong> and <code>.")
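For the XSS tests, plain substring checks can mislead: with html: False, raw HTML in the input is expected to come out as escaped text, so a word like "onerror" may still appear in the output without being an attribute. A test helper that parses the rendered HTML and reports only real tags and attributes avoids that. The sketch below uses the standard library; find_unsafe and the tag list are hypothetical, and it is a test aid only, not a replacement for the Phase 37 sanitizer.

```python
from html.parser import HTMLParser

# Illustrative list only; the Phase 37 sanitizer's own policy is authoritative.
_FORBIDDEN_TAGS = {"script", "iframe", "object", "embed", "style"}
_URL_ATTRS = {"href", "src"}


class _UnsafeFinder(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names and decodes
        # character references in attribute values.
        if tag in _FORBIDDEN_TAGS:
            self.problems.append(f"tag:{tag}")
        for name, value in attrs:
            if name.startswith("on"):
                self.problems.append(f"attr:{name}")
            url = (value or "").strip().lower()
            if name in _URL_ATTRS and url.startswith("javascript:"):
                self.problems.append(f"url:{name}")


def find_unsafe(rendered_html: str) -> list[str]:
    """List forbidden tags, event-handler attributes, and javascript: URLs."""
    finder = _UnsafeFinder()
    finder.feed(rendered_html)
    finder.close()
    return finder.problems
```

A test can then assert find_unsafe(render(payload)) == [] for each attack payload, while the safe-Markdown test keeps checking for <strong> and <code> directly.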
What "done" looks like
- Migrations apply.
- anchor_id is deterministic and length-bounded.
- Notes CRUD round-trips; owner check enforced; soft-delete works.
- Journal append-only-within-day works; past-day edits rejected; idempotency window verified.
- All three XSS tests pass; sanitizer is invoked from a single helper, not duplicated.
- mypy --strict and bandit clean for the new modules.
- SimpleMDE vendored under src/miniportal/static/vendor/ (no CDN loading).
- Orphaned-notes sweeper has a unit test; the result list is correct on a seeded fixture.
Common pitfalls
- CDN-loading SimpleMDE. A compromised CDN means a script injection on every page render. Vendor it; pin the hash.
- html: True on markdown-it. A single character flip and your sanitizer is now the only defense, and the first time it has a bug, you ship a stored XSS. Both defenses on, both tested.
- Computing anchor_id at render time on the client. The server must compute and emit the anchor in the rendered HTML. Computing it in JS allows the client to fabricate anchors and create notes against pages they never saw.
- Letting tags be free-form. Unbounded tags become a UI nightmare and a stored-XSS vector if rendered raw. Constrain shape, max count, sanitize.
- Allowing past-day journal edits because "it's just an off-by-one." It is not. The append-only-within-day rule is the contract; if Borja relaxes it, the journal stops being a logbook and becomes a wiki, and his audit trail rots.
- Idempotency window too long. A 5-minute window means a legitimate "I had two distinct thoughts in three minutes" gets collapsed. 60 s is the lab default; tune per usage feedback.
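For the "pin the hash" advice in the first pitfall, one option is to record a digest of each vendored file and check it in a test or CI step. The sketch below is an assumption about how that could be done, not a rule from security/supply-chain.md: it produces the "sha384-<base64>" form used by the HTML integrity attribute, and sri_digest and verify_vendored are hypothetical names.

```python
import base64
import hashlib
from pathlib import Path


def sri_digest(data: bytes) -> str:
    """Return a Subresource-Integrity style digest: 'sha384-' + base64."""
    digest = hashlib.sha384(data).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def verify_vendored(path: Path, expected: str) -> bool:
    """Check a vendored file against the digest recorded when it was vendored."""
    return sri_digest(path.read_bytes()) == expected
```

The same string can also be placed in the integrity attribute of the script and link tags, so the browser rejects a file that was modified after vendoring.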
Next: lab/04-quizzes-exams-and-replay.md — quiz YAML, grading, SM-2 review.