Skip to content

English · Español

Lab 01 — Strides and Views

Goal: see that arr.T is O(1) and that np.ascontiguousarray(arr.T) is O(n), by measurement. Produce, intentionally, an aliasing bug. Build muscle memory for flags.OWNDATA.

Estimated time: 45–60 minutes.

Prereqs: lab 00.


What you produce

A directory experiments/06-strides-and-views/ containing:

  • measure.py — your benchmark script.
  • aliasing.py — a 20-line script that produces the aliasing bug intentionally and prints proof.
  • results.json — measured times for arr.T vs np.ascontiguousarray(arr.T) vs arr.copy() across sizes.
  • transpose-cost.png — log-log plot of time vs array size for all three operations.
  • manifest.json{seed, versions, config, hardware}.
  • README.md (3–4 paragraphs) — what you measured, what the aliasing bug demonstrated, three sentences of interpretation.

TODOs

Block A — measure transpose vs contiguous-transpose

  • Allocate a = np.random.default_rng(42).standard_normal((N, N), dtype=np.float32) for N ∈ {64, 128, 256, 512, 1024, 2048, 4096} (7 sizes).
  • For each N, time three operations (repeat each ≥10× after one warm-up):
  • b = a.T (just relabel)
  • c = np.ascontiguousarray(a.T) (copy with new layout)
  • d = a.copy() (copy without transposing — should match (2) asymptotically)
  • Record N, n_bytes (= 4·N²), op, time_ns_per_call. Save to results.json.
  • Assert in your script: b.base is a, c.base is not a, c.flags.C_CONTIGUOUS is True, b.flags.C_CONTIGUOUS is False.

Block B — plot

  • matplotlib. x-axis: n_bytes in MiB (log scale). y-axis: time in ms (log scale).
  • Three lines: a.T, ascontiguousarray(a.T), a.copy(). Markers + lines.
  • a.T should be flat near 1 μs (independent of N). The other two should grow linearly with n_bytes.
  • Save as transpose-cost.png.

Block C — aliasing bug, on purpose

In aliasing.py:

  • Allocate a = np.arange(20, dtype=np.int32).
  • Take b = a[::2] (every other element).
  • Assert b.flags.OWNDATA is False and b.base is a.
  • Mutate: b[0] = -999.
  • Print a — show that a[0] is now -999.
  • Then show the safe pattern: c = a[::2].copy(). Mutate c. Show that a is unaffected.
  • Write a 1-sentence comment at the top of aliasing.py describing the bug.

Block D — as_strided demo (optional, exploratory)

If you have time, use numpy.lib.stride_tricks.sliding_window_view to create a length-3 rolling-window view of a 10-element array. Print it. Confirm via np.shares_memory that it shares the base buffer. Do not write into it.

Block E — manifest

Standard schema (see lab 00 for the template). Include in config:

"sizes_N": [64, 128, 256, 512, 1024, 2048, 4096],
"ops": ["transpose", "ascontiguousarray_T", "copy"],
"warmup_iters": 1,
"min_repeats": 10

Constraints

  • fp32, not fp64. We want to match later phases' tensor dtype.
  • Same RNG seed for all sizes. Reproducibility.
  • CPU governor performance. As in Phase 1 lab.
  • No threading. Single-threaded measurement.

Expected results (rough, for the i5-8250U)

Op N=1024 N=4096
a.T ~1 μs ~1 μs
np.ascontiguousarray(a.T) ~3 ms ~70 ms
a.copy() ~2 ms ~30 ms

The first row should be flat. The other two should scale roughly with . ascontiguousarray(a.T) is slower than plain copy() because the transpose walks the buffer in non-unit-stride order (poor cache behavior).

Stop conditions

Done when:

  1. transpose-cost.png shows the flat-vs-linear contrast clearly.
  2. aliasing.py runs end-to-end and the printed a array reflects the write through b.
  3. README.md interprets the curve shape (one sentence per line: "a.T is flat because…", "ascontiguousarray(a.T) scales linearly because…", "copy() is slightly faster because…").

Pitfalls

  • JIT warmup. First call to np.ascontiguousarray may include library load time. Run one warm-up before the timing loop.
  • GC during timing. Wrap timed blocks with gc.disable() / gc.enable() for repeatability.
  • Misleading large-N numbers from swap. A 4096×4096 fp32 is 64 MiB — fine on 62 GiB RAM. But if you push to N=16384 (1 GiB), make sure you're not hitting swap. Watch free -h.
  • b = a.T; b[0, 0] = 99 should mutate a. If it doesn't, you accidentally took a copy somewhere. Check b.base.
  • Forgetting OWNDATA proof. The whole point of the aliasing block is to make the bug visible. Print b.flags before the mutation.

When to consult solutions/

After your transpose-cost.png exists, your aliasing.py runs, and your README.md has the three interpretation sentences. Then solutions/01-strides-and-views-ref.md (written at phase open) shows the reference timings and the canonical aliasing-bug walkthrough.


Next lab: lab/02-broadcasting-trap.md.