Lab 00: Environment and utilities
Goal: ship `src/utils/seeding.py` and `src/utils/logging.py`, two tiny modules that every later phase imports.

Estimated time: 60–90 minutes.

Prereqs: Phase 0 environment is set up (`uv`, `just`, `ruff`, `mypy`, `pytest`).
What you produce

- `src/utils/seeding.py`: `seed_everything(seed: int) -> int`. Seeds Python's `random`, sets `PYTHONHASHSEED`, and returns the seed so callers can log it and pass it to NumPy's `default_rng`.
- `src/utils/logging.py`: `get_logger(name: str)`. Returns a `structlog` logger configured to emit JSON to stdout, with a `phase` context field that callers can set.
- `tests/test_seeding.py`: verifies determinism. Two generators built with `np.random.default_rng(seed_everything(42))` produce identical `random(5)` arrays.
- `tests/test_logging.py`: verifies the logger emits a JSON line with the expected fields.
- `experiments/06-environment-check/manifest.json` + `README.md`: a tiny smoke test that imports both utilities, logs one message, and seeds an RNG.
TODOs

Block A: seed_everything

- Function signature: `def seed_everything(seed: int) -> int`. Type-annotated. Docstring.
- Seed Python's `random` module with `random.seed(seed)`.
- Set `os.environ["PYTHONHASHSEED"] = str(seed)`. Note that this only affects child processes; document the caveat.
- Do NOT call `np.random.seed(seed)` (global state; see theory/01). Instead, return the seed and let callers do `rng = np.random.default_rng(seed_everything(42))`.
- Decide and document: should `torch` be seeded too? Phase 6 doesn't import torch in `src/`; defer to the Phase 8 tests, which need it. Keep `seeding.py` torch-free.
- Log a `structlog` event when called: `log.info("seed_set", seed=seed)`.
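The Block A checklist can be sketched roughly as follows. This is a minimal illustration rather than the reference solution: the `log.info("seed_set", seed=seed)` call is deliberately left out so the snippet runs with the standard library alone, and the docstring wording is only one way to record the caveat.

```python
import os
import random


def seed_everything(seed: int) -> int:
    """Seed Python's `random`, export PYTHONHASHSEED, and return the seed.

    Caveat: PYTHONHASHSEED is read at interpreter startup, so setting it here
    only affects child processes, not hashing in the current process.
    NumPy is not seeded globally; build a generator from the returned value.
    """
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # The lab also asks for log.info("seed_set", seed=seed) at this point;
    # it is omitted so this sketch has no third-party imports.
    return seed


seed = seed_everything(42)
first_draw = random.random()  # reproducible for a given seed
```

In the real module, callers would go on to build their generator with `np.random.default_rng(seed)`.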
Block B: get_logger

- Function signature: `def get_logger(name: str)`.
- Configure `structlog` once at module import time (idempotent).
- Processors: timestamp (ISO 8601, UTC), log level, the message itself, `JSONRenderer`.
- Provide a `bind_phase(phase: str)` helper that returns a logger pre-bound with a `phase` field.
- Test that calling `get_logger("foo").info("bar", x=1)` emits valid JSON on stdout containing `event="bar"`, `x=1`, a timestamp, and `logger="foo"`. The processors listed above do not add a `logger` field on their own, so `get_logger` has to supply it, for example by binding `logger=name`.
Block C: tests

- `tests/test_seeding.py`:
    - Test 1: `seed_everything(42)` returns 42.
    - Test 2: After two calls to `seed_everything(42)`, two `np.random.default_rng(42).random(5)` calls produce identical arrays.
    - Test 3: After `seed_everything(42)`, `random.random()` is deterministic.
- `tests/test_logging.py`:
    - Test 1: `get_logger` returns an object with `info`, `warning`, `error`, `debug` methods.
    - Test 2: Capturing stdout, calling `log.info("event_name", key="value")` emits JSON containing `"event": "event_name"` and `"key": "value"`. Use the `capsys` fixture.
    - Test 3: `bind_phase("phase-06").info("foo")` includes `"phase": "phase-06"` in the JSON.
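For the logging tests, the captured stdout has to be decoded line by line before fields can be asserted. The helper below is a hypothetical convenience (the name `parse_json_lines` is not part of the lab), and `sample_out` is a hand-written stand-in for `capsys.readouterr().out`, not real logger output.

```python
import json
from typing import Any


def parse_json_lines(captured: str) -> list[dict[str, Any]]:
    """Decode captured stdout into one dict per non-empty line."""
    return [json.loads(line) for line in captured.strip().split("\n") if line]


# Hand-written stand-in for capsys.readouterr().out.
sample_out = (
    '{"event": "event_name", "key": "value", "level": "info"}\n'
    '{"event": "foo", "phase": "phase-06", "level": "info"}\n'
)
records = parse_json_lines(sample_out)
has_key = records[0]["key"] == "value"
has_phase = records[1]["phase"] == "phase-06"
```

Inside a pytest test, the same helper would be called as `parse_json_lines(capsys.readouterr().out)` after the log call.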
Block D: smoke test

- Create `experiments/06-environment-check/` with:
    - `check.py`: imports both utilities, calls `seed_everything(42)`, logs one info message with the seed.
    - `manifest.json`: `{seed, versions, config, hardware}` per `LYNX_CORTEX.md` §5.
    - `README.md` (1 paragraph): what this experiment verifies.
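A manifest with the four top-level fields named above could be assembled along these lines. Only the key names `seed`, `versions`, `config`, and `hardware` come from the lab; the nested contents and the helper name `build_manifest` are assumptions, and the authoritative schema is whatever `LYNX_CORTEX.md` §5 specifies.

```python
import json
import platform
from typing import Any


def build_manifest(seed: int, config: dict[str, Any]) -> dict[str, Any]:
    """Collect the four top-level manifest fields; nested keys are illustrative."""
    return {
        "seed": seed,
        "versions": {"python": platform.python_version()},
        "config": config,
        "hardware": {"machine": platform.machine(), "system": platform.system()},
    }


manifest = build_manifest(42, {"experiment": "06-environment-check"})
manifest_text = json.dumps(manifest, indent=2, sort_keys=True)
```

Writing `manifest_text` to `experiments/06-environment-check/manifest.json` is then a single `Path.write_text` call in `check.py`.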
Constraints

- `mypy --strict` must pass. All functions typed, no implicit `Any`.
- `ruff` must pass. Line length 100 (the repo default; confirm in `pyproject.toml`).
- `pytest` must pass. All tests green.
- No `print`. Use `log.info` even in the smoke test.
- No `np.random.seed`. Use `np.random.default_rng(seed)` only.
- Idempotency. `seed_everything(42); seed_everything(42)` must produce the same downstream behavior as a single call. Same for `get_logger("foo"); get_logger("foo")`.
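The idempotency constraint for seeding can be checked with a comparison like the one below. The `seed_everything` defined here is a reduced stand-in that only seeds `random` and returns the seed, so the snippet stays self-contained; it assumes NumPy is installed.

```python
import random

import numpy as np


def seed_everything(seed: int) -> int:
    """Reduced stand-in: seeds `random` only and returns the seed."""
    random.seed(seed)
    return seed


# One call, then draw from both sources.
rng_once = np.random.default_rng(seed_everything(42))
once = (random.random(), rng_once.random(5))

# Two calls in a row, then the same draws.
seed_everything(42)
rng_twice = np.random.default_rng(seed_everything(42))
twice = (random.random(), rng_twice.random(5))

same_python = once[0] == twice[0]
same_numpy = bool(np.array_equal(once[1], twice[1]))
```

Because each generator is built from the returned seed rather than from global NumPy state, the extra call cannot change what the generator produces.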
Stop conditions

Done when:

- Both utility files exist, and both pass `mypy --strict` and `ruff`.
- Both test files exist, and all tests pass.
- The smoke experiment runs without error and emits a JSON log line.
- `git diff` shows no `print` statements anywhere in your new files.
Pitfalls

- `structlog` not configured before first use. If you call `log.info(...)` before `structlog.configure(...)` runs, you get the default processor chain, not yours. Configure at module-import time of `logging.py`, and make sure tests import `logging.py` before exercising loggers.
- `PYTHONHASHSEED` set after Python startup. Setting `os.environ["PYTHONHASHSEED"]` from within Python does not retroactively change the current process's hash seed (it was decided at interpreter startup). It only affects child processes spawned via `subprocess`. Document this in the docstring; do not pretend it makes the parent process deterministic.
- Test stdout capture. `capsys.readouterr().out` returns a string. For JSON lines, you may need to `out.strip().split("\n")` and `json.loads` each line.
- Forgetting `__init__.py`. `src/utils/` should already have `__init__.py` from Phase 0. If not, create an empty one.
- Circular imports. `seeding.py` calls `get_logger`; `logging.py` doesn't import seeding. Keep it one-directional.
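The `PYTHONHASHSEED` pitfall is easy to observe directly. The sketch below, using only the standard library, sets the variable in the parent and then asks fresh interpreters for a string hash; the helper name `child_hash` and the probe string `"lab-00"` are arbitrary choices for illustration.

```python
import os
import subprocess
import sys


def child_hash(seed: int) -> str:
    """Return hash("lab-00") as printed by a fresh interpreter with PYTHONHASHSEED=seed."""
    env = {**os.environ, "PYTHONHASHSEED": str(seed)}
    result = subprocess.run(
        [sys.executable, "-c", 'print(hash("lab-00"))'],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


parent_before = hash("lab-00")
os.environ["PYTHONHASHSEED"] = "42"  # too late to affect this process
parent_after = hash("lab-00")

parent_unchanged = parent_before == parent_after
children_agree = child_hash(42) == child_hash(42)
```

The parent's hash is fixed for the life of the process, while two children started with the same seed report the same value.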
When to consult solutions/

After you have:

- Committed both utility files.
- Turned both test files green.
- Run the smoke experiment and can paste the JSON output into your `README.md`.

Then read `solutions/00-environment-and-utilities-ref.md` (written at phase open) to compare structure choices.
Next lab: lab/01-strides-and-views.md.