Break — Remove PYTHONHASHSEED from seed_everything

Small break, big lesson: if PYTHONHASHSEED is not set (or is set too late), hash(str) changes between processes, and dataloaders that index by hash produce a different order on every run. Reproducibility dies silently.

This /break exercise targets reproducibility hygiene. It is intentionally subtle: the test suite may still pass; the failure shows up only across process restarts.

Hypothesis

The learner predicts: "If I remove the os.environ['PYTHONHASHSEED'] line from seed_everything, my unit tests will still pass because random and numpy are still seeded — but any code that depends on hash(str) ordering between fresh Python processes will silently diverge."

The break

In src/utils/seeding.py, comment out the line that sets PYTHONHASHSEED:

 def seed_everything(seed: int) -> None:
-    os.environ["PYTHONHASHSEED"] = str(seed)
+    # os.environ["PYTHONHASHSEED"] = str(seed)  # /break: removed
     random.seed(seed)
     np.random.seed(seed)
     ...

Run procedure

Run this twice as separate Python processes (not two calls in the same REPL):

uv run python -c "
from src.utils.seeding import seed_everything
seed_everything(0)
print(sorted({'work','play','walk','study','listen'}, key=hash))
"
# Run the same command again. Compare the two outputs.

As a control, set the env var at the launcher level, so it is present before the interpreter starts:

PYTHONHASHSEED=0 uv run python -c "
print(sorted({'work','play','walk','study','listen'}, key=hash))
"
# Repeat. Outputs should now agree.

Expected failure mode

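  • Without the launcher-level env var: the two process invocations almost always produce different orderings of the verb set. This holds whether or not the os.environ line is present in seed_everything, because that line runs after the interpreter has already fixed its hash seed. The random and numpy outputs still match (because those are seeded inside the process), so a naive unit test would pass.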
  • With the launcher-level env var: outputs match across processes.
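The two cases above can be compared automatically. The following is a minimal sketch, not part of the original exercise: run_child and CHILD are hypothetical names, and it calls the interpreter directly through sys.executable instead of uv run. It launches ten fresh interpreters with PYTHONHASHSEED removed from the environment and ten with it pinned to 0, then counts the distinct orderings seen in each group.

```python
import os
import subprocess
import sys

# Child program: print the set's elements ordered by their string hash.
CHILD = "print(sorted({'work','play','walk','study','listen'}, key=hash))"


def run_child(hash_seed=None):
    """Run CHILD in a fresh interpreter, optionally pinning PYTHONHASHSEED."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # start from an environment without the variable
    if hash_seed is not None:
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Collect the distinct orderings seen across ten fresh processes per group.
unpinned = {run_child() for _ in range(10)}
pinned = {run_child(hash_seed=0) for _ in range(10)}

print(f"distinct orderings, unpinned: {len(unpinned)}")
print(f"distinct orderings, pinned:   {len(pinned)}")
```

The pinned group should collapse to a single ordering. The unpinned group will normally contain several, although the exact count varies from run to run.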

Quantitative signature: if each fresh Python startup yields an independent, uniformly random hash ordering of the 5 strings, two startups agree with probability 1/5! = 1/120 ≈ 0.83%. Across 10 trials you should therefore expect to see disagreement within the first 1–2 trials.
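The arithmetic can be checked in a few lines. This sketch assumes, as the estimate itself does, that every fresh process produces an independent and uniformly random ordering; p_some_disagreement is a hypothetical helper name.

```python
from math import factorial

n_strings = 5

# Assumption: each fresh process yields an independent, uniformly random ordering,
# so two processes agree with probability 1 / n_strings!.
p_match = 1 / factorial(n_strings)


def p_some_disagreement(k):
    """Chance that at least one of k independent pairwise comparisons disagrees."""
    return 1 - p_match ** k


print(f"{p_match:.4%}")                 # 0.8333%
print(f"{p_some_disagreement(1):.4%}")  # 99.1667%
```

Under that assumption a single comparison already exposes the problem about 99% of the time, which is why one or two trials are usually enough.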

Diagnostic

From logs alone, the symptom is "my dataloader produces a different sample order on each restart, even though seed_everything(0) is the first thing I call." The smoking-gun check:

echo $PYTHONHASHSEED         # empty when broken
uv run python -c "import os; print(os.environ.get('PYTHONHASHSEED', 'unset'))"

If the env var is unset at process start and os.environ['PYTHONHASHSEED'] is assigned later inside the process, the assignment does not retroactively affect hash(str): the C-level hash randomizer is initialized before any Python code runs.
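A short check makes the "too late" behavior visible. This is a sketch, not part of the original exercise: it hashes a string before and after assigning the variable in-process, then shows that child interpreters, which inherit the modified environment, do pick it up.

```python
import os
import subprocess
import sys

before = hash("work")
os.environ["PYTHONHASHSEED"] = "0"  # too late for the interpreter that is already running
after = hash("work")
print(before == after)  # True: this process keeps the hash seed it started with

# Child interpreters inherit the modified environment, so they start with seed 0.
child = [sys.executable, "-c", "print(hash('work'))"]
first = subprocess.run(child, capture_output=True, text=True, check=True).stdout
second = subprocess.run(child, capture_output=True, text=True, check=True).stdout
print(first == second)  # True: both children were started with PYTHONHASHSEED=0
```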

Lesson

PYTHONHASHSEED is special: it must already be in the process environment when the interpreter starts (for example, exported by the launching shell). Setting it inside the script is too late for that process and reaches only the child processes it spawns. The just-recipe wrapper and the test harness both export it; calling seed_everything inside the script is hygienic but not sufficient. Documenting this caveat in the function's docstring (and in the daily journal entry) is the actual deliverable; the broken state taught the reason.
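One possible shape for that docstring is sketched below. It is an illustration under assumptions, not the project's actual function: it uses only the standard library, leaves the numpy and remaining seeding from the original as a comment, and adds a warning that the original does not have.

```python
import os
import random
import warnings


def seed_everything(seed: int) -> None:
    """Seed the in-process random number generators.

    Caveat: PYTHONHASHSEED is read once, at interpreter startup. Assigning it
    here affects only child processes; hash(str) in the current process is
    unchanged. For reproducible hash ordering, export PYTHONHASHSEED in the
    launcher (shell, task runner, or test harness).
    """
    # Assumption (not in the original): warn when the variable does not
    # already hold the expected value at call time.
    if os.environ.get("PYTHONHASHSEED") != str(seed):
        warnings.warn(
            "PYTHONHASHSEED was not set to the requested seed before startup; "
            "hash(str) ordering in this process is not reproducible.",
            RuntimeWarning,
            stacklevel=2,
        )
    os.environ["PYTHONHASHSEED"] = str(seed)  # inherited by child processes only
    random.seed(seed)
    # np.random.seed(seed) and the rest of the original seeding would follow here.
```

The check is only a heuristic: a second call in the same process sees the value written by the first call and stays silent, so the launcher-level export remains the real fix.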

Reference

  • CPython docs, PYTHONHASHSEED and sys.flags.hash_randomization.
  • PEP 456 — Secure and interchangeable hash algorithm (background on why hash randomization exists in the first place).