Lab 01 - Sync vs async: the blocking-handler pitfall
Summary: load the lab 00 service with 50 concurrent clients and measure latencies. Change def to async def without offloading and measure again: catastrophe. Apply to_thread and re-measure: fixed.
Objective
Demonstrate the async def + blocking-call pitfall described in theory/01-async-and-the-event-loop.md. Load-test three handler variants and produce a side-by-side latency CDF.
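The pitfall can be reproduced without any server. The sketch below is an illustration, not part of the lab code: blocking_work is a hypothetical stand-in for a blocking call such as agent.correct, and it uses the standard library's asyncio.to_thread rather than the anyio call used later in the lab, so that it runs with no extra dependencies.

```python
import asyncio
import time


def blocking_work(delay: float = 0.05) -> None:
    """Hypothetical stand-in for a blocking call such as agent.correct."""
    time.sleep(delay)


async def handler_blocking() -> None:
    # Blocking call inside a coroutine: the event loop cannot run
    # any other coroutine until this returns.
    blocking_work()


async def handler_offloaded() -> None:
    # Same work, but executed in a worker thread; the loop stays free.
    await asyncio.to_thread(blocking_work)


async def timed(handler, n: int = 10) -> float:
    """Run n handler calls concurrently and return the wall-clock time."""
    t0 = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - t0


def main() -> tuple[float, float]:
    blocked = asyncio.run(timed(handler_blocking))
    offloaded = asyncio.run(timed(handler_offloaded))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = main()
    print(f"blocking: {blocked:.3f}s  offloaded: {offloaded:.3f}s")
```

The blocking variant takes at least the sum of all delays because the calls run one after another, while the offloaded variant overlaps them across worker threads.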
Setup
- Lab 00's src/miniserve/app.py.
- uv add httpx (or aiohttp) for the load generator.
- A list of 100 verb-correction prompts from the §A13 corpus.
Tasks
- Write the load generator at scripts/loadtest.py:

    import asyncio
    import time

    import httpx


    async def one_request(client, payload, results):
        # Record (latency in seconds, HTTP status) for a single POST.
        t0 = time.perf_counter()
        r = await client.post("/correct", json=payload)
        results.append((time.perf_counter() - t0, r.status_code))


    async def run_load(concurrency: int, total: int, payloads: list[dict]):
        results = []
        async with httpx.AsyncClient(base_url="http://127.0.0.1:8000", timeout=30) as c:
            # The semaphore caps the number of in-flight requests at `concurrency`.
            sem = asyncio.Semaphore(concurrency)

            async def bounded(p):
                async with sem:
                    await one_request(c, p, results)

            # Cycle through the payloads until `total` requests have been sent.
            await asyncio.gather(*(bounded(payloads[i % len(payloads)]) for i in range(total)))
        return results

- Variant A: sync handler (lab 00 baseline). Keep def correct(req): .... Start the server with uv run uvicorn miniserve.app:app --workers 1. Run the load test with concurrency=50, total=200.
- Variant B: async handler with a blocking call. Change the handler to:

    @app.post("/correct")
    async def correct(req: CorrectRequest) -> CorrectResponse:
        # Blocking call inside a coroutine: the event loop stalls until it returns.
        result = agent.correct(req.sentence, learner_id=req.learner_id)
        return ...

  Restart the server and re-run the load test. Expect catastrophe: p95 should be roughly 5-10× worse than variant A.
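- Variant C: async handler with to_thread. Change the handler to:

    import anyio

    @app.post("/correct")
    async def correct(req: CorrectRequest) -> CorrectResponse:
        # Offload the blocking call to a worker thread so the event loop stays free.
        # run_sync forwards positional arguments only; wrap the call in
        # functools.partial if agent.correct needs learner_id as a keyword.
        result = await anyio.to_thread.run_sync(
            agent.correct, req.sentence, req.learner_id
        )
        return ...

  Restart the server and re-run the load test. Expect recovery: latencies similar to variant A.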
- Plot a latency CDF with all three variants on the same axes (scripts/plot_cdf.py). x-axis: latency (ms), log scale; y-axis: cumulative fraction. Annotate p50, p95, p99 on each curve.
- Write a short note (5-10 lines, in lab notes) explaining:
  - Why variant B is so much worse than A and C.
  - Why A and C are roughly equivalent.
  - When you'd prefer C over A (hint: when the handler also does await on async I/O for other reasons, e.g., logging to a remote sink or calling a database).
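One possible starting point for scripts/plot_cdf.py is sketched below. It assumes latencies are recorded in seconds (as the load generator does), uses the nearest-rank definition of a percentile, and assumes matplotlib is installed; the function names are illustrative, and percentiles are marked with vertical dotted lines rather than text annotations.

```python
import math


def percentile(sorted_ms: list[float], q: float) -> float:
    """Nearest-rank percentile of an ascending list, with q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def empirical_cdf(latencies_s: list[float]) -> tuple[list[float], list[float]]:
    """Return (latencies in ms, ascending) and the cumulative fraction at each."""
    xs = sorted(s * 1000 for s in latencies_s)
    ys = [(i + 1) / len(xs) for i in range(len(xs))]
    return xs, ys


def plot_cdfs(variants: dict[str, list[float]], out_path: str) -> None:
    """Draw one CDF per variant on shared axes (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, latencies_s in variants.items():
        xs, ys = empirical_cdf(latencies_s)
        (line,) = ax.step(xs, ys, where="post", label=name)
        # Mark p50, p95 and p99 in the same colour as the curve.
        for q in (50, 95, 99):
            ax.axvline(percentile(xs, q), color=line.get_color(), linestyle=":", alpha=0.5)
    ax.set_xscale("log")
    ax.set_xlabel("latency (ms)")
    ax.set_ylabel("cumulative fraction")
    ax.legend()
    fig.savefig(out_path, dpi=150)
```

Calling plot_cdfs with a dict that maps each variant name to its list of latencies produces a single figure with all three curves.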
Measurements
Save to experiments/<date>-phase-33-lab-01/:
- latencies_sync.json, latencies_async_blocking.json, latencies_async_tothread.json: arrays of (latency, status_code).
- latency_cdf.png: the side-by-side CDF.
- summary.md: your written observations.
- manifest.json: seeds, versions, concurrency, total requests.
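A minimal sketch of how the JSON artifacts could be written follows. The helper names, the manifest keys, and the list of packages whose versions are recorded are all assumptions; adapt them to whatever the course manifest actually requires.

```python
import json
import sys
from importlib import metadata
from pathlib import Path


def package_version(name: str) -> str:
    """Installed version of a package, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def save_run(out_dir: Path, variant: str, results: list[tuple[float, int]],
             concurrency: int, total: int, seed=None) -> Path:
    """Write latencies_<variant>.json and update manifest.json in out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    latencies_path = out_dir / f"latencies_{variant}.json"
    # JSON has no tuples, so each (latency, status_code) pair becomes a 2-item array.
    latencies_path.write_text(json.dumps([[lat, status] for lat, status in results]))

    manifest_path = out_dir / "manifest.json"
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    manifest["python"] = sys.version.split()[0]
    # Assumed package list; record whichever libraries your server actually uses.
    manifest["versions"] = {
        pkg: package_version(pkg) for pkg in ("httpx", "fastapi", "uvicorn", "anyio")
    }
    manifest["seed"] = seed
    manifest["concurrency"] = concurrency
    manifest["total"] = total
    manifest["variants"] = sorted(set(manifest.get("variants", [])) | {variant})
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return latencies_path
```

Calling save_run once per variant with the same out_dir accumulates all three latency files next to a single manifest.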
Acceptance
- All three variants achieve ≥ 99% HTTP 200 under the load test (no timeouts).
- Variant B's p95 is at least 3× worse than variants A and C.
- Variants A and C have p95 within 20% of each other.
- The CDF plot clearly shows three distinct curves.
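The three numeric criteria can be checked mechanically; the fourth (distinct curves) still needs a look at the plot. The sketch below is one interpretation, not an official grader: the variant keys are assumed names, p95 uses the nearest-rank definition, and "within 20%" is read as relative to the smaller of the two p95 values.

```python
import math


def p95(latencies: list[float]) -> float:
    """Nearest-rank 95th percentile."""
    xs = sorted(latencies)
    return xs[max(1, math.ceil(95 * len(xs) / 100)) - 1]


def check_acceptance(runs: dict[str, list[tuple[float, int]]]) -> dict[str, bool]:
    """runs maps 'sync', 'async_blocking', 'async_tothread' to (latency, status) rows."""
    ok_rate = {
        name: sum(1 for _, status in rows if status == 200) / len(rows)
        for name, rows in runs.items()
    }
    p = {name: p95([lat for lat, _ in rows]) for name, rows in runs.items()}
    a, b, c = p["sync"], p["async_blocking"], p["async_tothread"]
    return {
        # At least 99% HTTP 200 in every variant.
        "success_rate_ok": all(rate >= 0.99 for rate in ok_rate.values()),
        # Variant B's p95 is at least 3x the worse of A and C.
        "b_at_least_3x_worse": b >= 3 * max(a, c),
        # Assumed reading of "within 20%": relative to the smaller p95.
        "a_and_c_within_20pct": abs(a - c) <= 0.2 * min(a, c),
    }
```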
Pitfalls
- Forgetting --workers 1. With multiple uvicorn workers, the blocking-handler issue is partially masked (each worker has its own event loop). We're studying the per-process behavior; pin workers to 1.
- Cold start contaminating the measurement. Send 10 warmup requests before recording. The first request is typically slow (lazy imports, JIT, cache misses).
- Network overhead. Run the loadtest on 127.0.0.1 to avoid network jitter. We're measuring server behavior, not TCP.
- httpx default timeout is 5s, which is too short. Set it to 30s, otherwise variant B will report timeout errors that look like server errors.
- Not enough samples. With total=200 and concurrency=50, each of the 50 slots sends only ~4 requests, and p99 rests on the 2 slowest samples. For p99 stability, push total to 500+.
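The warmup step from the cold-start pitfall might look like the sketch below. The send argument is a hypothetical async callable that posts one payload (for example a thin wrapper around client.post from the load generator); it is injected here so the helper stays independent of any particular HTTP client.

```python
import asyncio


async def warm_up(send, payloads: list[dict], n: int = 10) -> int:
    """Send n requests sequentially and discard the results.

    `send` is a hypothetical async callable taking one payload; nothing it
    returns is recorded, so warmup latencies never reach the measurements.
    """
    for i in range(n):
        await send(payloads[i % len(payloads)])
    return n
```

Run it against the freshly restarted server before the recorded load run, so that lazy imports and cache fills happen outside the measured window.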
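Stretch
- Repeat the experiment with --workers 4. How does the picture change?
- Add a time.sleep(0.05) inside the handler (to simulate an additional I/O delay) and re-run. In variant C, put the sleep inside the offloaded call, since a bare time.sleep in an async def body blocks the event loop again. The stated expectation is that the sync threadpool variant degrades more than async + to_thread because the threadpool size is bounded; check this against your data, since anyio.to_thread.run_sync is also capped by a thread limiter (40 threads by default).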
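The effect of a bounded pool can be seen in isolation with the standard library. This sketch uses concurrent.futures.ThreadPoolExecutor as a stand-in, not the server's actual thread limiter, and the task counts are arbitrary illustration values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_batch(n_tasks: int, max_workers: int, delay: float = 0.05) -> float:
    """Wall-clock time to finish n_tasks sleeps of `delay` seconds on a bounded pool."""
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Tasks beyond max_workers wait in the pool's queue for a free thread.
        list(pool.map(time.sleep, [delay] * n_tasks))
    return time.perf_counter() - t0


if __name__ == "__main__":
    narrow = run_batch(n_tasks=8, max_workers=2)  # tasks queue behind 2 threads
    wide = run_batch(n_tasks=8, max_workers=8)    # every task gets its own thread
    print(f"2 workers: {narrow:.3f}s  8 workers: {wide:.3f}s")
```

Once the number of concurrent requests exceeds the pool size, the extra requests queue, and every added millisecond of blocking time is multiplied by the queue depth.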
Next: 02-static-batching.md