Lab 00 — Entropy by hand
Read `theory/02-entropy-and-kl.md` before starting. Do not consult `solutions/`.
Objective
Hand-compute entropy on a small categorical distribution; verify the upper-bound proof works; implement entropy(p) in NumPy with the correct handling of \(p_i = 0\).
Setup
Use the §A13 verb-tense alphabet: {infinitive, present, past, past_participle, future} (\(V = 5\)).
Tasks
Task 1 — entropy on paper
For each of the following distributions over the 5 tenses, compute \(H(p)\) in nats by hand. Show your arithmetic.
| Distribution | \(p\) |
|---|---|
| A | \((1, 0, 0, 0, 0)\) |
| B | \((0.5, 0.5, 0, 0, 0)\) |
| C | \((0.25, 0.25, 0.25, 0.25, 0)\) |
| D | \((0.2, 0.2, 0.2, 0.2, 0.2)\) (uniform) |
| E | \((0.6, 0.1, 0.1, 0.1, 0.1)\) |
Predict which has the largest entropy before you compute. Then check.
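As a warm-up on a distribution that is not in the table, here is the arithmetic for the hypothetical \(p = (0.5, 0.25, 0.25, 0, 0)\), using natural logarithms and the convention \(0 \log 0 = 0\). The zero entries drop out of the sum:

```latex
\begin{aligned}
H(p) &= -\sum_{i=1}^{5} p_i \log p_i \\
     &= -\left(0.5 \log 0.5 + 2 \cdot 0.25 \log 0.25\right) \\
     &= 0.5 \log 2 + 0.5 \log 4 \\
     &= 1.5 \log 2 \approx 1.0397 \text{ nats}
\end{aligned}
```

The same pattern (drop the zeros, group equal probabilities, simplify the logs) applies to each of A through E.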
Task 2 — implement entropy(p) in NumPy
Constraints:
- Pure NumPy; no `scipy.stats.entropy`, no PyTorch.
- Must handle \(p_i = 0\) correctly (convention: \(0 \log 0 = 0\)). Hint: there is a one-line idiom involving `np.where` or `xlogy`.
- Must validate that \(p\) is a valid probability vector: shape, non-negative, sums to 1 within tolerance.
- Module: `src/phase05/probability.py` (this is a Phase 05 scratch module; it does NOT graduate into `src/utils/`, that's a later phase).
Signature suggestion:
    import numpy as np
    from numpy.typing import NDArray
    def entropy(p: NDArray[np.float64]) -> float:
        """Return H(p) in nats. Raises ValueError if p is not a valid distribution."""
Task 3 — verify the upper bound numerically
For \(V \in \{2, 5, 10, 100, 600\}\):
- Sample 1000 random distributions on \(V\) outcomes (e.g., from a Dirichlet(1, ..., 1)).
- For each, compute \(H\).
- Verify \(H \le \log V\) for every sample.
- Plot the histogram of \(H / \log V\). For Dirichlet(1, ..., 1) the ratio is spread widely for small \(V\) and tightens toward 1 as \(V\) grows, so expect most of the mass near 1 only for the larger values of \(V\).
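The steps above can be sketched as follows. This is a minimal sketch, not the lab's reference solution: the helper names `batch_entropy` and `entropy_ratios` and the `seed` parameter are illustrative assumptions, and the plotting step is left out so the block depends only on NumPy.

```python
import numpy as np


def batch_entropy(P: np.ndarray) -> np.ndarray:
    """Row-wise entropy in nats for a 2-D array whose rows are distributions."""
    logP = np.zeros_like(P)
    # Take the log only where P > 0; untouched slots stay 0.0 (0 log 0 = 0).
    np.log(P, out=logP, where=P > 0)
    return -np.sum(P * logP, axis=1)


def entropy_ratios(V: int, n_samples: int = 1000, seed: int = 0) -> np.ndarray:
    """Sample Dirichlet(1, ..., 1) distributions and return H / log V per sample."""
    rng = np.random.default_rng(seed)  # hypothetical seed; use the project's seeding helper
    P = rng.dirichlet(np.ones(V), size=n_samples)  # shape (n_samples, V)
    return batch_entropy(P) / np.log(V)


for V in (2, 5, 10, 100, 600):
    ratios = entropy_ratios(V)
    # A small slack absorbs floating-point rounding in the sums.
    assert np.all(ratios <= 1.0 + 1e-12), f"bound violated for V={V}"
    print(f"V={V:>3}  min={ratios.min():.3f}  mean={ratios.mean():.3f}  max={ratios.max():.3f}")
```

The `ratios` array for each \(V\) is what would be passed to a histogram call such as `matplotlib.pyplot.hist(ratios, bins=50)`.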
Task 4 — reproduce the Jensen proof
In your lab notes (or a notebook cell), write out the Jensen proof of the upper bound \(H(p) \le \log V\).
Justify each step. Identify exactly where Jensen's inequality is used and verify the concavity claim.
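For orientation, one standard way to lay out the argument is sketched below. It assumes the theory note takes the usual route through the concavity of \(\log\); the support set \(S = \{i : p_i > 0\}\) is notation introduced here, not taken from the lab.

```latex
\begin{aligned}
H(p) &= \sum_{i \in S} p_i \log \frac{1}{p_i}
      && \text{(definition, with } 0 \log 0 = 0\text{)} \\
     &\le \log\!\left( \sum_{i \in S} p_i \cdot \frac{1}{p_i} \right)
      && \text{(Jensen, since } \log \text{ is concave)} \\
     &= \log |S| \\
     &\le \log V
      && \text{(} |S| \le V \text{ and } \log \text{ is increasing)}
\end{aligned}
```

The concavity claim can be checked from the second derivative, \(\frac{d^2}{dx^2} \log x = -1/x^2 < 0\) for \(x > 0\). Because \(\log\) is strictly concave, both inequalities are tight only when \(1/p_i\) is constant on \(S\) and \(|S| = V\), which is the uniform distribution.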
Measurements to capture
- Wall-clock to compute `entropy(p)` on \(V = 600\), 100k samples (should be ≲ 10 ms; it's a tiny op).
- Sample manifest under `experiments/<date>-phase-05-entropy/manifest.json` per `src/utils/seeding.py`.
- The histogram from Task 3 saved as `experiments/<date>-phase-05-entropy/histogram.png`.
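A minimal timing sketch for the first measurement, assuming it refers to one batched pass over all samples (the lab text does not say whether the figure is per call or per batch). The function name `time_batch_entropy` and the `seed` parameter are illustrative, and sampling is excluded from the timed region.

```python
import time

import numpy as np


def time_batch_entropy(V: int = 600, n_samples: int = 100_000, seed: int = 0) -> float:
    """Return wall-clock seconds to compute row-wise entropy for n_samples distributions."""
    rng = np.random.default_rng(seed)
    P = rng.dirichlet(np.ones(V), size=n_samples)  # sampling is not timed

    start = time.perf_counter()
    logP = np.zeros_like(P)
    np.log(P, out=logP, where=P > 0)  # 0 log 0 = 0 convention
    H = -np.sum(P * logP, axis=1)
    elapsed = time.perf_counter() - start

    assert H.shape == (n_samples,)
    return elapsed
```

At the default sizes the sample matrix alone holds 60 million float64 values (about 480 MB), so a smaller `n_samples` is a sensible first run. Calling the function a few times and recording the minimum reduces noise from warm-up and other processes.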
Acceptance
- All 5 distributions A-E have correct entropies computed on paper.
- `entropy(p)` handles \(p_i = 0\) without `NaN`.
- `entropy(p)` raises `ValueError` on invalid inputs (non-normalised, negative values, wrong shape).
- Property tests pass: `entropy(uniform_V) ≈ log(V)`, `entropy(point_mass) == 0`.
- Histogram plot exists; visual check that all samples respect the bound.
- Jensen proof reproduced in your notes.
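The code-related checks above can be exercised with a sketch like the one below. The `entropy` function here is only a stand-in so the block runs on its own, and it is not the reference solution; the checker name `check_entropy_properties` and the choice to treat any non-1-D input as "wrong shape" are assumptions.

```python
import numpy as np


def entropy(p) -> float:
    """Stand-in H(p) in nats; replace with your own implementation."""
    p = np.asarray(p, dtype=np.float64)
    if p.ndim != 1 or p.size == 0:
        raise ValueError(f"expected a non-empty 1-D vector, got shape {p.shape}")
    if not np.all(p >= 0):  # also rejects NaN entries
        raise ValueError("probabilities must be non-negative")
    if not np.isclose(p.sum(), 1.0):
        raise ValueError(f"probabilities must sum to 1, got {p.sum()}")
    # Take the log only where p > 0; untouched slots stay 0.0 (0 log 0 = 0).
    logp = np.zeros_like(p)
    np.log(p, out=logp, where=p > 0)
    return float(-np.sum(p * logp))


def check_entropy_properties(entropy_fn, sizes=(2, 5, 10, 100, 600)) -> None:
    """Run the acceptance checks against any entropy(p) -> float callable."""
    for V in sizes:
        uniform = np.full(V, 1.0 / V)
        assert np.isclose(entropy_fn(uniform), np.log(V)), f"uniform failed for V={V}"

        point_mass = np.zeros(V)
        point_mass[0] = 1.0
        h = entropy_fn(point_mass)
        assert not np.isnan(h) and h == 0, f"point mass failed for V={V}"

    invalid_inputs = [
        np.array([0.5, 0.6]),    # does not sum to 1
        np.array([-0.5, 1.5]),   # negative entry
        np.array([[0.5, 0.5]]),  # wrong shape (2-D)
    ]
    for bad in invalid_inputs:
        try:
            entropy_fn(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")


check_entropy_properties(entropy)
print("all acceptance checks passed")
```

Passing the function in as an argument keeps the checker independent of where the implementation lives, so the same call works once `entropy` is imported from `src/phase05/probability.py`.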
Pitfalls to expect
- `np.log(p)` on a vector with zeros returns `-inf` (with only a `RuntimeWarning`); multiplying by 0 then gives `NaN` (the IEEE-754 `0 * inf` rule). Use `np.where(p > 0, p * np.log(p), 0.0)` or `scipy.special.xlogy(p, p)` to compute the \(p_i \log p_i\) terms.
- Dirichlet samples may not be exactly normalised (rounding); your validator should allow `np.isclose(p.sum(), 1.0)` with default tolerance.
- Confusing nats and bits: \(H\) in nats uses `np.log`; in bits use `np.log2`. The convention for this project is nats (matches PyTorch's `F.cross_entropy`).
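The first and third pitfalls can be reproduced in a few lines. This is a small demonstration on a made-up vector, not part of the lab's required code; the warnings are silenced with `np.errstate` only to keep the output clean.

```python
import numpy as np

p = np.array([0.5, 0.5, 0.0])

# Naive version: log(0) is -inf and 0 * -inf is NaN, which poisons the sum.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = -np.sum(p * np.log(p))

# Masked version: np.where discards the NaN slot. Both branches are still
# evaluated, so the warnings are suppressed explicitly here as well.
with np.errstate(divide="ignore", invalid="ignore"):
    h_nats = -np.sum(np.where(p > 0, p * np.log(p), 0.0))

# Nats to bits: divide by log(2).
h_bits = h_nats / np.log(2)

print(np.isnan(naive), h_nats, h_bits)
```

Because `np.where` evaluates both of its value arguments before selecting, the `np.where` idiom gives the right number but does not by itself stop the `RuntimeWarning`.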