Skip to content

English · Español

Lab 03 — Norms by experiment: verifying ||Ax|| ≤ ||A||·||x||

Goal: verify the submultiplicative inequality empirically and find the equality case (top singular vector).

Estimated time: 30–45 minutes.

Prereq: theory/04-norms-and-conditioning.md read.


What you produce

A directory experiments/03-norms-by-experiment/ containing:

  • norms.py — script (Borja writes).
  • results.json — measurements.
  • equality_case.png — visualisation of when ||Ax|| / ||A||·||x|| approaches 1.
  • manifest.json.
  • README.md — 1-2 paragraphs.

And implementations in src/minigrad/linalg.py: frobenius_norm, l1_norm, l2_norm, linf_norm, op_norm.

Part A — Implement norms in src/minigrad/linalg.py

  • l1_norm(x), l2_norm(x), linf_norm(x) for 1-D vectors. Pure NumPy.
  • frobenius_norm(A) for matrices.
  • op_norm(A) returning the L2 operator norm = σ_max(A). Use np.linalg.svd(A, full_matrices=False, compute_uv=False).

Constraints for the implementations

  • Numerical stability. For very large or very small entries, the naive sqrt(sum(x**2)) can overflow / underflow. Use np.linalg.norm under the hood is fine — but for one of them, implement the stable form max(|x|) * sqrt(sum((x / max(|x|))**2)) by hand and test that it survives entries with magnitude 1e30.
  • No scipy.linalg.norm. Pure NumPy.
  • Type annotations + docstrings. Each docstring states the norm formula and links to theory/04.

Tests in tests/test_linalg.py

Add:

  • test_norms_match_numpy — verify against np.linalg.norm for random inputs.
  • test_frobenius_equals_l2_of_singular_values — for random A, verify frobenius_norm(A)² ≈ sum(σ²).
  • test_op_norm_is_sigma_max — for random A, verify op_norm(A) == max(np.linalg.svd(A, compute_uv=False)).
  • test_submultiplicative_inequality — for random A, x, verify l2_norm(A @ x) ≤ op_norm(A) * l2_norm(x) with atol=1e-5.
  • test_stable_l2_handles_huge_valuesx = [1e30, 1e30], expect sqrt(2) * 1e30, not overflow.

Part B — The experiment

In norms.py:

  • Generate A = np.random.randn(M, N) with M=64, N=64. Compute its SVD: U, S, V = np.linalg.svd(A, full_matrices=False).
  • Sample 1000 random unit vectors x_i ∈ R^N (Gaussian then normalize). Compute r_i = ||A x_i||_2 / (||A||_2 · ||x_i||_2).
  • Plot a histogram of r_i. All values should be ≤ 1.
  • Equality case: set x_top = V.T[:, 0] (top right singular vector). Compute r_top = ||A x_top||_2 / (||A||_2 · ||x_top||_2). Should be ≈ 1 (within fp32 ulp).
  • Plot the histogram with r_top marked. Annotate.

Block C — what to record

In results.json:

{
  "matrix_shape": [64, 64],
  "matrix_seed": 42,
  "n_random_samples": 1000,
  "sigma_max": ...,
  "ratios": {
    "min": ...,
    "max": ...,
    "mean": ...,
    "ratio_at_top_singular_vector": ...
  }
}

Block D — interpret

In README.md, answer:

  1. What's the max of the ratios? Should be very close to 1 (the top-singular-vector case).
  2. What's the mean? For a random unit vector, E[r] ≈ ||A||_F / (sqrt(N) · ||A||_2). Compute this and compare.
  3. Verify SVD orthogonality: check that U.T @ U ≈ I and V @ V.T ≈ I to fp32 tolerance.
  4. What happens if you replace ||·||_2 with ||·||_F? Does ||Ax||_F ≤ ||A||_F · ||x||_F hold? (Hint: yes, but it's a weaker bound.)

Block E — manifest

Standard manifest. Document the matrix seed for reproducibility.

Constraints

  • fp32 throughout. Matches the rest of the curriculum.
  • No torch.linalg.norm. Pure NumPy.
  • Single-threaded. Add OPENBLAS_NUM_THREADS=1 to the manifest.

Pitfalls

  • r_top slightly > 1. Possible due to fp32 in np.linalg.svd returning a slightly-too-small S[0]. If it exceeds 1 by more than 1e-5, your op_norm is wrong. If by less, that's fp32 noise.
  • Random unit vectors: np.random.randn(N) / np.linalg.norm(...) is the standard way. Don't use a uniform [0, 1] vector — biased toward the all-positive corner.
  • Histogram with too few bins. Use ~30 bins for 1000 samples.

Stop conditions

  • All norms implemented; all tests pass.
  • Histogram shows ratios in [0, 1] with the equality case at ≈ 1.
  • README answers all four interpretation questions.

When to consult solutions/

After committing the histogram + tests. solutions/03-norms-by-experiment-ref.md (at phase open) discusses concentration of measure (why most random ratios are clustered well below 1 in high dimensions) and links forward to Phase 4's optimization-on-anisotropic-loss intuition.


End of Phase 3 lab. Next phase: docs/phase-04-calculus-optimization/.