Skip to content

English · Español

Break 00 — Emit logs without request-id correlation, debug a multi-tenant trace

🇪🇸 Sin request-id, los logs de dos peticiones concurrentes se interleavan y no puedes separar "qué le pasó al estudiante A" de "qué le pasó al estudiante B". Este /break quita ese campo, dispara una carga de 8 clientes y muestra que la traza es ilegible.


What you'll do

Remove the request_id field from the structured log emitter so concurrent requests interleave their log lines indistinguishably. Drive the server with a multi-tenant load script and watch the audit log become useless.

Step 1 — Locate the logger

src/miniobs/logging.py            # the structured logger (Phase 34 lab 01)
src/miniserve/middleware.py       # the request-id middleware that injects it

Step 2 — Introduce the bug

In src/miniobs/logging.py, the JSON formatter currently includes request_id from the context-local. Drop it:

# OLD
extra = {
    "request_id": ctx.get("request_id"),
    "student_id": ctx.get("student_id"),
    "ts": ts,
    ...
}

# NEW (the broken version)
extra = {
    # "request_id": ctx.get("request_id"),   # commented out
    "student_id": ctx.get("student_id"),
    "ts": ts,
    ...
}

One line commented out. The middleware still generates the id; the formatter just stops including it.

Step 3 — Record the break

learners/borja/phase-34/notes/breaks.md:

- bug-id: 34-01
  concept: log correlation / multi-tenant trace
  symptom: under 8 concurrent students, the audit log shows interleaved lines
           with no way to attribute "which log line belongs to which request".
  hidden_cause: logging.py's JSON formatter no longer emits the request_id field.
                The middleware still generates the id, but it never reaches stdout.
  hint_1: "Tail the audit log under load. Can you trace a single request end-to-end?"
  hint_2: "What field would you `jq` on to filter for one request?"
  hint_3: "diff src/miniobs/logging.py against last week's commit. What disappeared?"
  fix_diff: uncomment the request_id line in logging.py's formatter.

Step 4 — Verify it's observable

Run lab 02's traced load test with 8 concurrent students. Expected before the break:

$ jq 'select(.request_id == "req-abc123")' audit.log
{"ts":"2026-05-23T10:00:01Z","request_id":"req-abc123","student_id":"alice","event":"tutor.received"}
{"ts":"2026-05-23T10:00:01Z","request_id":"req-abc123","student_id":"alice","event":"tutor.tokenized"}
{"ts":"2026-05-23T10:00:01Z","request_id":"req-abc123","student_id":"alice","event":"tutor.completed","latency_ms":148}

After the break:

$ jq 'select(.request_id == "req-abc123")' audit.log
# (empty — the field doesn't exist)

$ tail -n 8 audit.log
{"ts":"...","student_id":"alice","event":"tutor.received"}
{"ts":"...","student_id":"bob","event":"tutor.received"}
{"ts":"...","student_id":"alice","event":"tutor.tokenized"}
{"ts":"...","student_id":"carol","event":"tutor.received"}
{"ts":"...","student_id":"bob","event":"tutor.tokenized"}
{"ts":"...","student_id":"alice","event":"tutor.completed","latency_ms":148}
{"ts":"...","student_id":"carol","event":"tutor.completed","latency_ms":1320}
{"ts":"...","student_id":"bob","event":"tutor.completed","latency_ms":910}

Lines belong to the same student in some cases (alice) but you can't tell which of two alice-requests ended at 148 ms, especially when alice retries.

The tests tests/phase34/test_logging.py::test_request_id_present go red.

Step 5 — The teaching moment

Request-id correlation is the single most important field in a multi-tenant system. Without it:

  • Distributed tracing (Phase 34 theory 03) is impossible.
  • Customer-support ("what happened to student X at 10:03?") requires guessing.
  • Debugging a single latency outlier from a flame graph is non-deterministic.

The fix is one uncommented line; the lesson is that the log schema is an interface, just like the API schema, and removing a field from it is a breaking change.

Hard rules respected

  • Single, instructive bug.
  • Reversible in 1 line.
  • Observable: the test assertion goes red, the jq filter returns empty.
  • No security impact.
  • No test modified.

Next: when green, read ../theory/03-tracing-and-logging.md for the canonical schema.