06 — Company-Specific Prep: Signals by Lab
What each lab prioritizes in interviews: Anthropic (alignment, constitutional AI, "explain why models fail"); OpenAI (scale + product judgment); DeepMind (research depth + math); Google Brain (papers + production); xAI (engineering pragmatism); Cohere / Mistral (multilingual + retrieval).
How to read this file
Each section has three blocks:
1. What they value — culture and research priorities, distilled from public statements, papers, and engineering blog posts (2024-2026).
2. What to expect in the loop — the specific questions / formats their interviewers tend to favor.
3. The lynx-cortex prep — which phase / module / drill is best leverage for that company.
These are heuristics from public information, not contracts. Calibrate against the recruiter's brief.
Anthropic
What they value
- Alignment-first. The published research output is dominated by alignment work: Constitutional AI, RLHF, scalable oversight, interpretability (circuits, sparse autoencoders), red-teaming, model behavior reasoning. If you do not have an opinion on alignment, you will struggle.
- Honest uncertainty. Anthropic's writing style is deliberately measured — "we don't know yet"-style epistemic humility. Engineers / researchers who project false confidence get filtered.
- Long-context, careful reasoning. Their published model releases (Claude 3.5, Claude 4, Claude Opus) emphasize reasoning quality over benchmark-chasing. Interviewers care about your reasoning process, not just the answer.
- Safety as engineering, not theatre. They expect you to think about failure modes concretely — not "AI might be unsafe" but "this specific prompt class causes this specific failure".
What to expect in the loop
- Phone screen. Standard coding (attention, BPE, or similar) + one ML concept.
- ML systems round. Often includes a constraint like "design a system that won't accidentally output harmful content" — capacity plus safety.
- Paper round. Likely a recent Anthropic paper (CAI, Constitutional Classifiers, Influence Functions, Scaling Monosemanticity, Sleeper Agents). Read these.
- Behavioral round — the Anthropic-distinctive one. Expect questions like:
- "Tell me about a time a model behaved badly. What did you investigate?"
- "What is your view on AI safety?"
- "When have you been wrong about something important?"
- "Explain why language models hallucinate."
- Bar raiser / hiring manager. Often probes calibration: "How confident are you in this answer? What would change your mind?"
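The phone-screen coding prompt mentions BPE. The block below is a minimal sketch of a BPE trainer, assuming whitespace-split words and single characters as the starting symbols. The function names (`most_frequent_pair`, `merge_pair`, `train_bpe`) are illustrative, not taken from any library, and ties are broken by first occurrence, which a production tokenizer would likely handle differently.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, weighted by word count."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    # Ties resolve to the pair that was seen first (dict insertion order).
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


merges, vocab = train_bpe("low low low lower lowest", num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```

In an interview, be ready to say what this sketch leaves out: byte-level fallback, pre-tokenization rules, and an encoder that applies the learned merges to new text.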
Specific topics to be fluent on
| Topic | Source |
|---|---|
| Constitutional AI | Bai et al. 2022 |
| RLHF / RLAIF | Christiano 2017, Lee 2023 |
| Scalable oversight (debate, weak-to-strong) | Bowman 2022, Burns 2023 |
| Mechanistic interpretability | Olah / Conmy / Bricken (circuits, SAEs) |
| Sleeper Agents / deceptive alignment | Hubinger 2024 |
| Sycophancy and reward hacking | Sharma 2023 |
| Model Spec / behavior policies | Anthropic's published behaviors |
lynx-cortex prep
- X3 RLHF/DPO module is the highest-leverage prep. Be able to derive DPO from RLHF on a whiteboard (see whiteboard Q14, drill 06).
- Phase 37 (Security & Safety) + X3 theory/05 (Constitutional AI) for the behavioral round.
- Behavioral anecdote 8 ("model behaved badly") is the highest-leverage story.
- Whiteboard Q24 (Constitutional AI) — be able to explain the SL-CAI / RL-CAI distinction.
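The "derive DPO from RLHF" item above can be rehearsed as the following chain. This is a sketch of the standard derivation under the usual assumptions (a KL-regularized reward objective and a Bradley-Terry preference model); the notation ($\pi_{\mathrm{ref}}$, $\beta$, $Z(x)$, $y_w$, $y_l$) is chosen here and is not taken from the lynx-cortex drills.

```latex
\begin{align}
\max_{\pi}\; & \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi(\cdot \mid x)}\big[r(x, y)\big]
  - \beta\, \mathrm{KL}\big(\pi(\cdot \mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\big) \\
\pi^{*}(y \mid x) &= \frac{1}{Z(x)}\, \pi_{\mathrm{ref}}(y \mid x)\, \exp\!\big(r(x, y)/\beta\big) \\
r(x, y) &= \beta \log \frac{\pi^{*}(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)} + \beta \log Z(x) \\
p(y_w \succ y_l \mid x) &= \sigma\big(r(x, y_w) - r(x, y_l)\big) \\
\mathcal{L}_{\mathrm{DPO}}(\theta) &= -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\Big[\log \sigma\Big(
  \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
  - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\Big)\Big]
\end{align}
```

The step worth saying out loud is that $\beta \log Z(x)$ depends only on the prompt, so it cancels in the reward difference. That cancellation is what removes the need for an explicit reward model.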
OpenAI
What they value
- Scale and engineering velocity. OpenAI ships fast and at scale; engineers are expected to be product-savvy as well as technically deep.
- Product judgment. "What should we ship?" is a real interview question. They want builders who think about user value, not just elegant code.
- Practical scaling laws intuition. Compute, data, parameter budgets, inference economics. OpenAI published the scaling-law work that preceded Chinchilla (Kaplan 2020) and operates at the frontier of inference scaling.
- Less alignment-public, more capabilities-public than Anthropic. Their public research mix is heavier on capabilities and product (GPT-4 system card, o1 reasoning, Sora, Realtime API).
What to expect in the loop
- Phone screen. Coding-heavy. Implement something. Speed matters.
- ML systems round. Often "design an inference service at our scale"; the capacity math from prompt 1 of 02-systems-design-for-llms.md is the bullseye.
- Coding round. Implement attention / sampling / a primitive in PyTorch. Speed is graded.
- Product / judgment round. "If you were the PM for ChatGPT, what would you prioritize?" — they want to see opinions backed by data.
- Research / paper round. May ask about scaling laws, mixture-of-experts, multimodal training.
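Capacity math for an inference service usually starts with KV-cache size. The block below is a back-of-envelope sketch: the helper names are hypothetical, the model shape (32 layers, 32 KV heads, head dimension 128, fp16) is an illustrative 7B-class dense configuration, and the 40 GiB of free memory is an assumed budget, not a measured one.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache one token occupies across all layers."""
    # Factor of 2: one key vector and one value vector per layer per KV head.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_concurrent_sequences(free_bytes, seq_len, bytes_per_token):
    """How many full-length sequences fit in the memory left after weights."""
    return free_bytes // (seq_len * bytes_per_token)


GIB = 1024 ** 3
per_token = kv_cache_bytes_per_token(n_layers=32, n_kv_heads=32, head_dim=128)
print(per_token)  # 524288 bytes, i.e. 0.5 MiB per token
print(max_concurrent_sequences(40 * GIB, seq_len=4096, bytes_per_token=per_token))  # 20
```

This ignores activation memory, fragmentation, and paged allocation, so treat the result as an upper bound to refine during the discussion.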
Specific topics
| Topic | Source |
|---|---|
| Scaling laws | Kaplan 2020, Hoffmann 2022, Hoffmann 2024 update |
| RLHF (InstructGPT) | Ouyang 2022 |
| Mixture of experts | Switch Transformer, GShard, GPT-4 rumored architecture |
| Multimodal | GPT-4V system card, Sora technical report |
| Reasoning (o1, o3) | OpenAI o1 / o3 system cards |
| Inference / batching | Continuous batching, speculative decoding |
lynx-cortex prep
- Phase 33 (inference serving) + drill 12 (continuous batcher) for the systems round.
- Phase 38 (MLOps) for cost-discipline answers (CpQU framing).
- Paper-pitch cards 4 (GPT-3), 10 (Chinchilla), 13 (InstructGPT) memorized.
- Have a product opinion: pick one feature of an OpenAI product, argue why it works or doesn't.
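For the continuous-batcher drill, it helps to have a toy model of why continuous batching beats static batching. The sketch below assumes every active request produces exactly one token per decode step and ignores prefill cost entirely. Both function names are hypothetical and this is not how any particular serving engine schedules work.

```python
from collections import deque


def static_batching_steps(lengths, max_batch):
    """Each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), max_batch):
        steps += max(lengths[i:i + max_batch])
    return steps


def continuous_batching_steps(lengths, max_batch):
    """Freed slots are refilled from the queue at every decode step."""
    queue = deque(lengths)
    active = []  # remaining tokens for each in-flight request
    steps = 0
    while queue or active:
        # Admit waiting requests into any free slots.
        while queue and len(active) < max_batch:
            active.append(queue.popleft())
        # One decode step: every active request emits one token.
        active = [remaining - 1 for remaining in active]
        # Retire finished requests so their slots free up.
        active = [remaining for remaining in active if remaining > 0]
        steps += 1
    return steps


lengths = [8, 2, 2, 2]  # output tokens per request, illustrative
print(static_batching_steps(lengths, max_batch=2))      # 10
print(continuous_batching_steps(lengths, max_batch=2))  # 8
```

The gap grows with the variance of output lengths, which is a useful point to make when asked why the technique matters for chat workloads.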
Google DeepMind
What they value
- Research depth. DeepMind has the strongest research-publication culture of any major lab. Even engineering roles are expected to read papers fluently.
- Math. RL, theory, optimization, information theory. The bar is the highest for math fluency among the labs.
- Long-horizon problems. AlphaFold, AlphaProof, AlphaGeometry — DeepMind chases hard, long-running scientific goals.
- Production rigor. Now merged with Google Brain — production scale is also part of the bar.
What to expect in the loop
- Phone screen. Coding + research-paper short discussion.
- Research round (for RS, RE roles). Present your prior work; expect deep probing on methodology, ablations, and "what would you do differently".
- Math / theory round. Optimization, information theory, RL fundamentals. Be ready to derive policy gradient on a whiteboard.
- Coding round. Standard.
- Paper round. Likely a DeepMind paper — Chinchilla, Flamingo, Gemini, Gato, AlphaFold, AlphaProof, Gemma.
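The policy-gradient derivation mentioned for the math round can be rehearsed as the chain below. This is a sketch of the standard score-function (REINFORCE) derivation for a finite-horizon trajectory $\tau = (s_0, a_0, \dots, s_T)$ with total return $R(\tau)$; the notation is chosen here for illustration.

```latex
\begin{align}
J(\theta) &= \mathbb{E}_{\tau \sim p_\theta}\big[R(\tau)\big]
           = \int p_\theta(\tau)\, R(\tau)\, d\tau \\
\nabla_\theta J(\theta) &= \int p_\theta(\tau)\, \nabla_\theta \log p_\theta(\tau)\, R(\tau)\, d\tau \\
\log p_\theta(\tau) &= \log p(s_0) + \sum_{t=0}^{T-1} \log \pi_\theta(a_t \mid s_t)
                     + \sum_{t=0}^{T-1} \log p(s_{t+1} \mid s_t, a_t) \\
\nabla_\theta J(\theta) &= \mathbb{E}_{\tau \sim p_\theta}\Big[
  \sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau)\Big]
\end{align}
```

The second line uses $\nabla_\theta p_\theta = p_\theta \nabla_\theta \log p_\theta$, and the last line follows because the initial-state and transition terms do not depend on $\theta$. Expect follow-ups on variance reduction (baselines, reward-to-go).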
Specific topics
| Topic | Source |
|---|---|
| Chinchilla scaling | Hoffmann 2022 |
| RL fundamentals | Sutton & Barto; PPO; SAC; DQN |
| Distributed training | Megatron, JAX/Flax, DeepSpeed |
| AlphaGo lineage | Silver 2016, AlphaZero, MuZero |
| Gemini architecture | Gemini technical report |
lynx-cortex prep
- Phase 04 (calculus & optimization) is the highest leverage. Derive everything.
- Phase 19 (training dynamics) — DeepMind cares about training stability.
- X3 module — RL fundamentals (theory/01).
- Paper-pitch card 12 (Chinchilla) memorized; be able to derive the scaling law fit.
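Before attempting the full scaling-law fit, it is worth being fluent in the back-of-envelope version. The sketch below assumes two common approximations: training compute C ≈ 6·N·D FLOPs, and roughly 20 training tokens per parameter at the compute-optimal point. Both are rules of thumb rather than the fitted parametric form from the paper, and `compute_optimal` is a hypothetical helper name.

```python
import math


def compute_optimal(flops, tokens_per_param=20.0):
    """Split a FLOP budget into (parameters, tokens) under C = 6 * N * D."""
    # Substituting D = tokens_per_param * N gives C = 6 * tokens_per_param * N^2.
    n_params = math.sqrt(flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


n, d = compute_optimal(5.88e23)
print(f"{n:.2e} params, {d:.2e} tokens")  # 7.00e+10 params, 1.40e+12 tokens
```

The example budget was chosen so the result lands on about 70B parameters and 1.4T tokens, which matches the published Chinchilla configuration.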
Google Brain (within Google DeepMind)
What they value
- Paper-prolific — historically the most-publishing single team in AI.
- Production scale — they ship to Google Search, Workspace, Pixel.
- Eclectic — covers vision, NLP, robotics, healthcare AI, hardware (TPU). Bring your specialty.
- Since the April 2023 merger with DeepMind, the two share one organization; some signals overlap.
What to expect in the loop
- Similar to DeepMind but with more emphasis on production infrastructure (TPU, JAX, GShard).
- More likely than DeepMind to ask about MLOps, multi-tenant serving, latency tuning.
Specific topics
| Topic | Source |
|---|---|
| Transformer (original) | Vaswani 2017 |
| BERT / T5 / PaLM | Devlin 2018, Raffel 2019, Chowdhery 2022 |
| Mixture of experts | Shazeer 2017, Switch Transformer |
| Pathways / JAX | Pathways paper, JAX docs |
| TPU architecture | TPU papers, MLPerf submissions |
lynx-cortex prep
- Same as DeepMind, plus Phase 33 (inference serving) and Phase 35 (distributed).
- Paper-pitch card 1 (Attention) memorized.
xAI
What they value
- Engineering pragmatism. Less paper-prolific than DeepMind, more focused on training & deploying at scale fast.
- Hardware sense. Grok-3 was trained on Colossus, xAI's Memphis cluster (100k+ H100s). They want engineers who understand GPU networking, NCCL, RDMA.
- Iteration speed. xAI shipped Grok, Grok-1.5, Grok-2, Grok-3 quickly. They reward people who ship.
- Pragmatic alignment. Less focused on theoretical alignment than Anthropic; more "we shipped a product, here are the guardrails".
What to expect in the loop
- Phone screen. Coding-heavy.
- Systems round. Likely "how would you train a frontier model on N GPUs" — capacity math, distributed strategies (FSDP, ZeRO, tensor / pipeline / sequence parallelism), failure handling.
- Coding round. Implementation under time pressure.
- Culture round. Less STAR, more "tell us what excites you about Grok".
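For the "train on N GPUs" systems prompt, a quick model-state memory estimate is a good opening move. The sketch below assumes mixed-precision Adam accounting of 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 master weights plus two Adam moments) and the ZeRO stage definitions from the ZeRO paper. The function name and the 7.5B-parameter, 64-GPU example are illustrative, and activation memory is deliberately excluded.

```python
def zero_bytes_per_gpu(n_params, n_gpus, stage):
    """Model-state bytes per GPU under ZeRO stage 0 (plain DP) through 3."""
    params = 2 * n_params   # fp16 weights
    grads = 2 * n_params    # fp16 gradients
    optim = 12 * n_params   # fp32 master weights + Adam momentum + variance
    if stage >= 1:
        optim /= n_gpus     # stage 1 shards optimizer state
    if stage >= 2:
        grads /= n_gpus     # stage 2 also shards gradients
    if stage >= 3:
        params /= n_gpus    # stage 3 also shards the weights
    return params + grads + optim


for stage in range(4):
    gb = zero_bytes_per_gpu(7.5e9, 64, stage) / 1e9
    print(f"stage {stage}: {gb:.1f} GB")
```

A natural follow-up is to add activation memory and ask which stage, combined with tensor or pipeline parallelism, fits the per-GPU budget.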
Specific topics
| Topic | Source |
|---|---|
| Distributed training | Megatron, FSDP, ZeRO, DeepSpeed |
| NCCL / RDMA / Infiniband | NVIDIA networking docs |
| FlashAttention | Dao 2022, FA-2, FA-3 |
| Llama-style architectures | Llama 2, Llama 3 technical reports |
| Grok system cards | xAI published model docs |
lynx-cortex prep
- Phase 35 (distributed) is the highest leverage.
- Phase 23 + 24 (GPU + CUDA / Triton) — be able to write a Triton kernel sketch.
- Drill 03 (gradient checkpointing) and drill 12 (continuous batcher).
Cohere
What they value
- Enterprise multilingual. Cohere's positioning is API-first, business-facing, multilingual-strong.
- Retrieval excellence. Their Embed and Rerank models are best-in-class for many production RAG use cases.
- Practical / deployable. Less "frontier model race", more "make production retrieval work".
What to expect in the loop
- Strong emphasis on RAG / retrieval systems design.
- Multilingual tokenization and evaluation questions.
- Enterprise integration — authn/authz, multi-tenancy, data residency.
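Retrieval design questions often come down to how lexical and dense results are combined. One common option is reciprocal rank fusion (RRF), sketched below. This is an illustration of one fusion technique and not a description of how Cohere's products work: the function name is hypothetical, `k=60` is the conventional default constant, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["a", "b", "c"]   # hypothetical lexical results
dense_ranking = ["b", "c", "d"]  # hypothetical embedding results
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))  # ['b', 'c', 'a', 'd']
```

RRF needs only ranks, so it sidesteps the problem of BM25 and cosine scores living on different scales. Be ready to contrast it with weighted score fusion and with a cross-encoder reranking stage on top.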
Specific topics
| Topic | Source |
|---|---|
| Dense retrieval | DPR, BGE, E5, Cohere Embed v3 |
| Reranking | Cross-encoder, Cohere Rerank |
| Hybrid retrieval | BM25 + dense fusion |
| Multilingual tokenization | XLM-R, Aya |
| Long-context retrieval | Recursive retrieval, hierarchical RAG |
lynx-cortex prep
- Phase 29 (RAG) is the highest leverage.
- Phase 11 (tokenization) with attention to multilingual ratios.
- Whiteboard Q19 (RAG) and Q20 (vocab) memorized.
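One way to build intuition for multilingual tokenization ratios is to look at UTF-8 byte counts, since byte-level tokenizers start from bytes before any merges are learned. The sketch below is an illustration with made-up sample strings and a hypothetical helper name. It measures bytes per character, not tokens per word, so it is only a lower-level proxy for the ratios a trained tokenizer produces.

```python
def utf8_bytes_per_char(text):
    """Average number of UTF-8 bytes needed per character of `text`."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(text.encode("utf-8")) / len(text)


samples = [("English", "hello"), ("Russian", "привет"), ("Japanese", "こんにちは")]
for label, sample in samples:
    print(label, utf8_bytes_per_char(sample))
# English 1.0
# Russian 2.0
# Japanese 3.0
```

Scripts that need more bytes per character start with longer base sequences, which is one reason a vocabulary trained mostly on English text tends to spend more tokens on other languages.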
Mistral
What they value
- Open-weight pragmatism. Mistral has shipped strong open weights (Mistral 7B, Mixtral, Mistral Large). They blend French research culture with European startup intensity.
- Efficient architectures. Sliding-window attention, mixture of experts, grouped-query attention. Mistral's reputation is "smart architecture choices, not just bigger".
- Multilingual. Especially European languages.
What to expect in the loop
- Strong focus on architecture choices and ablations.
- "Why GQA vs MHA vs MQA" is a typical question.
- Open-weight ecosystem fluency (Hugging Face, llama.cpp, vLLM).
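For the GQA vs MHA vs MQA question, the core idea is how query heads share key/value heads. The sketch below assumes contiguous grouping (query heads 0..g-1 share KV head 0, and so on), which is one common layout; both function names are hypothetical and the 32-head shapes are illustrative.

```python
def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Index of the KV head that a given query head reads from."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def kv_cache_ratio(n_q_heads, n_kv_heads):
    """KV-cache size relative to full multi-head attention."""
    return n_kv_heads / n_q_heads


# MHA: one KV head per query head. MQA: a single shared KV head. GQA: in between.
for name, n_kv in [("MHA", 32), ("GQA", 8), ("MQA", 1)]:
    mapping = [kv_head_for_query_head(q, 32, n_kv) for q in range(32)]
    print(name, "cache ratio:", kv_cache_ratio(32, n_kv), "last query head ->", mapping[-1])
```

The trade-off to articulate is that fewer KV heads shrink the cache and memory bandwidth at decode time, at some cost in modeling quality, with GQA as the middle ground.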
Specific topics
| Topic | Source |
|---|---|
| Sliding-window attention | Mistral 7B paper |
| Mixture of experts | Mixtral paper |
| Grouped-query attention | Ainslie 2023 |
| Mistral Large | Mistral technical reports |
| Open-weight tooling | Hugging Face, vLLM, llama.cpp |
lynx-cortex prep
- Phase 27 (modern attention) for sliding-window and FlashAttention.
- Phase 36 (frontier architectures) for MoE.
- Paper-pitch card 15 (Mistral 7B) memorized.
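Sliding-window attention is easiest to explain by drawing its mask. The sketch below builds a causal mask in which each position attends to itself and at most `window - 1` earlier positions. Whether the window counts the current token is a convention that differs between write-ups, so treat that choice, and the function name, as assumptions.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j."""
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Print the mask as a grid: 'x' marks an allowed (query, key) pair.
for row in sliding_window_mask(seq_len=5, window=3):
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# .xxx.
# ..xxx
```

Because each layer can look back one window, stacking layers lets information propagate further than the window itself, while the per-token KV cache stays bounded by the window size.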
Cross-company quick-reference matrix
| Topic | Anthropic | OpenAI | DeepMind | Brain | xAI | Cohere | Mistral |
|---|---|---|---|---|---|---|---|
| Alignment depth | ★★★ | ★★ | ★★ | ★★ | ★ | ★ | ★ |
| Math depth | ★★ | ★★ | ★★★ | ★★ | ★★ | ★★ | ★★ |
| Distributed training | ★★ | ★★★ | ★★★ | ★★★ | ★★★ | ★ | ★★ |
| Production scale | ★★ | ★★★ | ★★ | ★★★ | ★★★ | ★★★ | ★★ |
| Retrieval | ★ | ★ | ★ | ★ | ★ | ★★★ | ★ |
| Open-weights / ecosystem | ★ | ★ | ★ | ★★ | ★ | ★ | ★★★ |
| Constitutional AI / Safety | ★★★ | ★★ | ★★ | ★★ | ★ | ★ | ★ |
A note on tone
Each lab has a tone you should match. Read 3-5 of their published blog posts before your interview. Anthropic: measured and uncertainty-aware. OpenAI: confident and product-forward. DeepMind: scholarly. xAI: irreverent. Cohere/Mistral: pragmatic. Mirroring the tone signals fit.
→ Move on to the lab files: ../lab/00-mock-interview-checklist.md, ../lab/01-paper-pitch-cards.md.