English · Español

Lab 03 — Norms by experiment: verifying `||Ax|| ≤ ||A||·||x||`¶

Goal: verify the submultiplicative inequality empirically and find the equality case (top singular vector).

Estimated time: 30–45 minutes.

Prereq: theory/04-norms-and-conditioning.md read.

What you produce¶

A directory experiments/03-norms-by-experiment/ containing:

norms.py — script (Borja writes).
results.json — measurements.
equality_case.png — visualisation of when ||Ax|| / ||A||·||x|| approaches 1.
manifest.json.
README.md — 1-2 paragraphs.

And implementations in src/minigrad/linalg.py: frobenius_norm, l1_norm, l2_norm, linf_norm, op_norm.

Part A — Implement norms in `src/minigrad/linalg.py`¶

l1_norm(x), l2_norm(x), linf_norm(x) for 1-D vectors. Pure NumPy.
frobenius_norm(A) for matrices.
op_norm(A) returning the L2 operator norm = σ_max(A). Use np.linalg.svd(A, full_matrices=False, compute_uv=False).

Constraints for the implementations¶

Numerical stability. For very large or very small entries, the naive sqrt(sum(x**2)) can overflow / underflow. Use np.linalg.norm under the hood is fine — but for one of them, implement the stable form max(|x|) * sqrt(sum((x / max(|x|))**2)) by hand and test that it survives entries with magnitude 1e30.
No scipy.linalg.norm. Pure NumPy.
Type annotations + docstrings. Each docstring states the norm formula and links to theory/04.

Tests in `tests/test_linalg.py`¶

Add:

test_norms_match_numpy — verify against np.linalg.norm for random inputs.
test_frobenius_equals_l2_of_singular_values — for random A, verify frobenius_norm(A)² ≈ sum(σ²).
test_op_norm_is_sigma_max — for random A, verify op_norm(A) == max(np.linalg.svd(A, compute_uv=False)).
test_submultiplicative_inequality — for random A, x, verify l2_norm(A @ x) ≤ op_norm(A) * l2_norm(x) with atol=1e-5.
test_stable_l2_handles_huge_values — x = [1e30, 1e30], expect sqrt(2) * 1e30, not overflow.

Part B — The experiment¶

In norms.py:

Generate A = np.random.randn(M, N) with M=64, N=64. Compute its SVD: U, S, V = np.linalg.svd(A, full_matrices=False).
Sample 1000 random unit vectors x_i ∈ R^N (Gaussian then normalize). Compute r_i = ||A x_i||_2 / (||A||_2 · ||x_i||_2).
Plot a histogram of r_i. All values should be ≤ 1.
Equality case: set x_top = V.T[:, 0] (top right singular vector). Compute r_top = ||A x_top||_2 / (||A||_2 · ||x_top||_2). Should be ≈ 1 (within fp32 ulp).
Plot the histogram with r_top marked. Annotate.

Block C — what to record¶

In results.json:

{
  "matrix_shape": [64, 64],
  "matrix_seed": 42,
  "n_random_samples": 1000,
  "sigma_max": ...,
  "ratios": {
    "min": ...,
    "max": ...,
    "mean": ...,
    "ratio_at_top_singular_vector": ...
  }
}

Block D — interpret¶

In README.md, answer:

What's the max of the ratios? Should be very close to 1 (the top-singular-vector case).
What's the mean? For a random unit vector, E[r] ≈ ||A||_F / (sqrt(N) · ||A||_2). Compute this and compare.
Verify SVD orthogonality: check that U.T @ U ≈ I and V @ V.T ≈ I to fp32 tolerance.
What happens if you replace ||·||_2 with ||·||_F? Does ||Ax||_F ≤ ||A||_F · ||x||_F hold? (Hint: yes, but it's a weaker bound.)

Block E — manifest¶

Standard manifest. Document the matrix seed for reproducibility.

Constraints¶

fp32 throughout. Matches the rest of the curriculum.
No torch.linalg.norm. Pure NumPy.
Single-threaded. Add OPENBLAS_NUM_THREADS=1 to the manifest.

Pitfalls¶

r_top slightly > 1. Possible due to fp32 in np.linalg.svd returning a slightly-too-small S[0]. If it exceeds 1 by more than 1e-5, your op_norm is wrong. If by less, that's fp32 noise.
Random unit vectors: np.random.randn(N) / np.linalg.norm(...) is the standard way. Don't use a uniform [0, 1] vector — biased toward the all-positive corner.
Histogram with too few bins. Use ~30 bins for 1000 samples.

Stop conditions¶

All norms implemented; all tests pass.
Histogram shows ratios in [0, 1] with the equality case at ≈ 1.
README answers all four interpretation questions.

When to consult `solutions/`¶

After committing the histogram + tests. solutions/03-norms-by-experiment-ref.md (at phase open) discusses concentration of measure (why most random ratios are clustered well below 1 in high dimensions) and links forward to Phase 4's optimization-on-anisotropic-loss intuition.

End of Phase 3 lab. Next phase: docs/phase-04-calculus-optimization/.

Lab 03 — Norms by experiment: verifying ||Ax|| ≤ ||A||·||x||¶