00 — The 2026 AI-Lab Interview Landscape
Map of the 2026 AI-lab interview loop: phone screen, ML systems, coding, paper read, design, behavioral. Each lynx-cortex lab maps to one or more round types.
The standard loop (Anthropic / OpenAI / DeepMind / Google Brain / xAI / Cohere / Mistral)
A 2026 AI-engineer / research-engineer loop is 5 to 7 hours, typically split across two days:
| Round | Duration | What is graded |
|---|---|---|
| 0. Recruiter screen | 30 min | Resume signal, role/team fit, comp expectations |
| 1. Phone / video screen | 60 min | One ML concept question + one coding question (LeetCode-medium or implement-a-primitive) |
| 2. ML systems | 60 min | Open-ended: "design a chatbot for 10M DAU", capacity math, failure modes |
| 3. Coding (implementation) | 60 min | Implement attention / BPE / LoRA / DPO loss in NumPy or PyTorch from scratch |
| 4. Paper / depth | 45-60 min | Read a recent paper (sometimes pre-shared, sometimes cold), discuss strengths/weaknesses |
| 5. Behavioral / values | 45 min | STAR-format anecdotes; for Anthropic, expect "what is your view on AI safety" |
| 6. Hiring manager / bar raiser | 30-45 min | Calibration, sells you on the team, surfaces concerns |
For research scientist roles, swap round 3 for a research-project deep dive (45 minutes of you presenting prior work, 45 minutes of them probing it).
What each round actually screens for
Phone screen — baseline competence
- Not a hard filter on cleverness; it filters out candidates who don't know what attention is.
- Coding portion is "can you write code that compiles" not "can you golf this in 5 lines".
- Failure mode: spending 45 minutes on the coding half and 5 on the concept. They are equal-weight.
ML systems — can you reason under partial information
- They will not give you full requirements. You must elicit them.
- The graded skill: capacity math (Little's law, tokens/s, GPU memory budget), failure-mode enumeration (OOM, NCCL hang, thermal throttle), cost discipline (CpQU — cost per quality unit).
- See `theory/02-systems-design-for-llms.md` for the 5 canonical prompts.
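The capacity math named in the list above can be rehearsed with a few lines of arithmetic. The sketch below applies Little's law (in-flight requests = arrival rate × mean latency) and a per-token KV-cache estimate to the "10M DAU" prompt. Every number in it (requests per user, peak-to-mean ratio, latency, model shape, context length) is a hypothetical assumption chosen for illustration, not a figure from any lab or from this module.

```python
SECONDS_PER_DAY = 86_400


def concurrent_requests(arrival_rate_rps: float, mean_latency_s: float) -> float:
    """Little's law: L = lambda * W (mean in-flight requests)."""
    return arrival_rate_rps * mean_latency_s


def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                             bytes_per_elem: int = 2) -> int:
    """KV-cache footprint of one token: keys + values across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


# --- Assumed workload (all values hypothetical) ---
dau = 10_000_000                 # from the "10M DAU" prompt
requests_per_user_per_day = 10   # assumption
peak_to_mean = 3.0               # assumption: peak traffic is 3x the daily mean
mean_latency_s = 8.0             # assumption: end-to-end time per request

mean_rps = dau * requests_per_user_per_day / SECONDS_PER_DAY
peak_rps = mean_rps * peak_to_mean
in_flight = concurrent_requests(peak_rps, mean_latency_s)

# --- Assumed model shape (hypothetical, fp16 cache) ---
per_token = kv_cache_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
context_tokens = 2048            # assumption: average live context per request
kv_gib = in_flight * context_tokens * per_token / 2**30

print(f"peak RPS            ~ {peak_rps:,.0f}")
print(f"in-flight requests  ~ {in_flight:,.0f}")
print(f"KV cache per token  = {per_token // 1024} KiB")
print(f"KV cache at peak    ~ {kv_gib:,.0f} GiB")
```

Dividing the last figure by the memory left on each GPU after weights gives a first-order GPU count, which is the kind of number the round asks you to defend and then stress with failure modes.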
Coding — implementation muscle
- The "depth filter". Many candidates know attention conceptually but cannot write `softmax(Q @ K.T / sqrt(d_k)) @ V` from a blank file in 20 minutes.
- Whether they let you use PyTorch varies. Anthropic phone screens often demand NumPy; on-sites allow PyTorch.
- See `theory/04-coding-drills.md` for the 12 staple drills.
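For reference, here is one way the attention expression above might look when written out from a blank file in NumPy. It is a minimal single-head sketch, not the module's drill solution: the function names, the optional `causal` flag, and the example shapes are assumptions made for illustration.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax: subtract the row max before exponentiating."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray,
              causal: bool = False) -> np.ndarray:
    """Scaled dot-product attention: softmax(Q @ K.T / sqrt(d_k)) @ V.

    Q: (..., T_q, d_k), K: (..., T_k, d_k), V: (..., T_k, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ np.swapaxes(K, -1, -2) / np.sqrt(d_k)   # (..., T_q, T_k)
    if causal:
        t_q, t_k = scores.shape[-2:]
        # True above the diagonal: position i may not attend to j > i.
        future = np.triu(np.ones((t_q, t_k), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    return softmax(scores, axis=-1) @ V                  # (..., T_q, d_v)


# Example with hypothetical shapes: 4 tokens, d_k = 8, d_v = 16.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 16))
out = attention(Q, K, V)
out_causal = attention(Q, K, V, causal=True)
```

Two checks worth saying out loud in the interview: the output shape is `(T_q, d_v)`, and with a causal mask the first output row equals `V[0]`, because the first token can only attend to itself.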
Paper read — can you read like a researcher
- The graded skill: in 20 minutes, identify (a) the claim, (b) the method, (c) what is not tested.
- They want to hear "the ablation on §5.2 doesn't control for tokenizer differences", not "the paper is impressive".
- See `theory/03-paper-read-drill.md` for the 20-minute protocol on 4 canonical papers.
Behavioral — taste, judgment, and ownership
- At Anthropic specifically: expect "tell me about a time a model behaved badly and you investigated".
- STAR is the format, but the substance is which tradeoffs you made. "I shipped X" is half the answer; "I shipped X instead of Y because Z" is the full one.
- See `theory/05-behavioral-and-storytelling.md` for 10 pre-written anecdotes from the lynx-cortex journey.
How lynx-cortex labs map to interview rounds
| Lab / Phase artifact | Interview round it prepares |
|---|---|
| Phase 04 — Calculus & Optimization | Phone screen (gradient questions); paper round (DPO derivation) |
| Phase 07 — Scalar autograd | Coding round (implement backward) |
| Phase 08 — Tensor autograd | Coding round (broadcasting, shape bugs) |
| Phase 11 — Tokenization BPE | Coding round (drill 02) |
| Phase 15 — Attention | Coding round (drill 01), ML-systems (attention cost math) |
| Phase 16 — Positional encodings | Coding round (RoPE drill 10) |
| Phase 17 — Mini-GPT | Paper round (Vaswani 2017) |
| Phase 19 — Training dynamics | Behavioral ("hard debug") |
| Phase 20 — Evaluation harness | ML-systems (offline vs online eval) |
| Phase 21 — Inference / sampling | Coding round (top-p drill 07) |
| Phase 22 — KV cache | Coding round (drill 04), ML-systems (memory budget) |
| Phase 26 — Quantization | ML-systems (cost per token) |
| Phase 27 — Modern attention | Coding round (FlashAttention conceptual) |
| Phase 28 — LoRA / QLoRA | Coding round (drill 05) |
| Phase 29 — RAG | ML-systems prompt 3 |
| Phase 32 — Agents | ML-systems prompt 2 |
| Phase 33 — Inference serving | ML-systems prompt 1, coding drill 12 (continuous batcher) |
| Phase 34 — Observability & cost | ML-systems (CpQU) |
| Phase 35 — Distributed | ML-systems (NCCL deadlock) |
| Phase 37 — Security & safety | Anthropic behavioral round |
| Phase 38 — MLOps | ML-systems (multi-tenant fine-tuning) |
| X3 — RLHF / DPO | Coding drill 06 (DPO loss), paper round (Rafailov 2023) |
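Several rows above point at implement-from-scratch drills. As one illustration, the DPO loss named in the X3 row might be sketched in NumPy as follows. This is a minimal sketch following the loss in Rafailov et al. (2023), assuming the inputs are already summed per-sequence log-probabilities; the function and argument names are hypothetical, and `beta=0.1` is only a commonly used default, not a value taken from this module.

```python
import numpy as np


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1) -> float:
    """Mean DPO loss over a batch of (chosen, rejected) pairs.

    Each argument is an array of per-sequence log-probabilities, shape (B,).
    loss = -log sigmoid(beta * [(pi_c - ref_c) - (pi_r - ref_r)])
    """
    chosen_logratio = np.asarray(policy_chosen_logps) - np.asarray(ref_chosen_logps)
    rejected_logratio = np.asarray(policy_rejected_logps) - np.asarray(ref_rejected_logps)
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) = log(1 + exp(-x)) = logaddexp(0, -x), stable for large |x|.
    return float(np.mean(np.logaddexp(0.0, -logits)))


# Sanity check: when the policy equals the reference, the loss is log(2).
ref_c = np.array([-12.0, -9.5])
ref_r = np.array([-11.0, -10.0])
baseline = dpo_loss(ref_c, ref_r, ref_c, ref_r)

# Raising the chosen log-probs relative to the reference lowers the loss.
improved = dpo_loss(ref_c + 2.0, ref_r, ref_c, ref_r)
```

The `log(2)` baseline is a useful thing to state before running anything: it shows you know what the loss looks like at initialization, when the policy is a copy of the reference model.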
What this module does not cover
- LeetCode grinding. Use NeetCode 150 separately if your role demands DSA.
- Comp negotiation. Use `levels.fyi` and a recruiter friend.
- Visa / immigration. Out of scope.
Next file
→ 01-whiteboard-ml-questions.md: 25 questions, 3-paragraph answers, 3-level follow-up trees.