
Lab 00 — `Parameter` and `Module` skeleton

Goal: implement Parameter (a 2-line subclass of Tensor) and Module (the ~25-line base class with __setattr__ registration, recursive parameters(), zero_grad, state_dict, load_state_dict, and __repr__). Get mypy --strict clean. Write the structural tests that prove the registration mechanic works.

Estimated time: 90–120 minutes.

Prereqs: Phase 8 closed (minitorch.Tensor available). Theory 00 + 01 read.

What you produce

A new package src/minimodel/nn/ with:

__init__.py re-exporting Parameter, Module.
module.py containing both classes.

And tests/test_module_skeleton.py with the registration mechanic tests.

All of Phase 9 rests on this lab. If Module does not discover parameters correctly, neither Linear nor Sequential nor the optimizers will work. Set aside 90 minutes to get it passing cleanly before moving on to Lab 01.

TODOs

Block A — `Parameter`

In src/minimodel/nn/module.py:

Define Parameter(Tensor):
__init__(self, data, requires_grad: bool = True) — call super().__init__(data, requires_grad=requires_grad).
Nothing else.

Block B — `Module` base class

Define Module:
__init__(self) -> None:
- Use object.__setattr__(self, "_parameters", {}) and object.__setattr__(self, "_modules", {}).
- Also object.__setattr__(self, "training", True).
__setattr__(self, name, value) -> None:
- Pop name from both _parameters and _modules (handle the re-assignment case).
- If isinstance(value, Parameter): register in _parameters[name].
- Elif isinstance(value, Module): register in _modules[name].
- In all cases, also object.__setattr__(self, name, value).
parameters(self) -> Iterator[Parameter]:
- Yield from _parameters.values().
- For each submodule in _modules.values(), recursively yield from submodule.parameters().
zero_grad(self) -> None:
- For each p in self.parameters(), set p.grad = None.
state_dict(self, prefix: str = "") -> dict[str, np.ndarray]:
- Flat dict; keys are dot-separated paths.
- Recursively walk _modules.
load_state_dict(self, state, prefix: str = "") -> None:
- In-place copy state[key] into the matching Parameter.data.
- Recursively walk _modules.
train(self, mode: bool = True) -> None and eval(self) -> None: set self.training (eval sets it to False). In Phase 9 this is a stub: nothing reads the flag yet, but the attribute must exist.
forward(self, *args, **kwargs) -> Any: raise NotImplementedError.
__call__(self, *args, **kwargs): dispatch to self.forward(*args, **kwargs).
__repr__(self) -> str: emit a tree-style print of submodules and parameter shapes.
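The registration mechanic from Blocks A and B can be sketched as follows. This is one possible shape, not the reference solution: the `Tensor` class here is a stand-in (an assumption) for `minitorch.Tensor`, reduced to the three attributes this lab touches, and only `__init__`, `__setattr__`, and `parameters()` are shown.

```python
from typing import Any, Iterator

import numpy as np


class Tensor:
    """Stand-in for minitorch.Tensor (assumed to expose data, grad, requires_grad)."""

    def __init__(self, data: Any, requires_grad: bool = False) -> None:
        self.data = np.asarray(data, dtype=float)
        self.requires_grad = requires_grad
        self.grad = None


class Parameter(Tensor):
    def __init__(self, data: Any, requires_grad: bool = True) -> None:
        super().__init__(data, requires_grad=requires_grad)


class Module:
    def __init__(self) -> None:
        # Bypass Module.__setattr__: it reads the registries, so they must exist first.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})
        object.__setattr__(self, "training", True)

    def __setattr__(self, name: str, value: Any) -> None:
        # Drop any stale registration so re-assignment is handled.
        self._parameters.pop(name, None)
        self._modules.pop(name, None)
        if isinstance(value, Parameter):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        # Store the attribute without re-entering this method.
        object.__setattr__(self, name, value)

    def parameters(self) -> Iterator[Parameter]:
        yield from self._parameters.values()      # direct parameters first
        for sub in self._modules.values():        # then each submodule, recursively
            yield from sub.parameters()


class Inner(Module):
    def __init__(self) -> None:
        super().__init__()
        self.w = Parameter(np.zeros(3))


class Outer(Module):
    def __init__(self) -> None:
        super().__init__()
        self.inner = Inner()
        self.b = Parameter(np.zeros(1))


m = Outer()
first_pass = list(m.parameters())    # [b, inner.w]
m.b = None                           # a non-Parameter value unregisters "b"
second_pass = list(m.parameters())   # [inner.w]
```

Popping the name from both registries before the `isinstance` checks is what makes the overwrite cases (Parameter to Module, Parameter to None) work without extra branches.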

Tests

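In tests/test_module_skeleton.py (the snippets below assume import numpy as np, from minitorch import Tensor, and from minimodel.nn import Module, Parameter):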

Block C — basic registration

test_parameter_is_tensor:

p = Parameter(np.array([1.0, 2.0, 3.0]))
assert isinstance(p, Tensor)
assert p.requires_grad is True

test_module_init_no_parameters:

m = Module()
assert list(m.parameters()) == []

test_single_parameter_registered:

class M(Module):
    def __init__(self):
        super().__init__()
        self.w = Parameter(np.zeros(3))
m = M()
params = list(m.parameters())
assert len(params) == 1
assert params[0] is m.w

test_submodule_registered:

class Inner(Module):
    def __init__(self):
        super().__init__()
        self.w = Parameter(np.zeros(3))

class Outer(Module):
    def __init__(self):
        super().__init__()
        self.inner = Inner()
        self.b = Parameter(np.zeros(1))

m = Outer()
params = list(m.parameters())
assert len(params) == 2
assert params[0] is m.b           # direct Parameter first
assert params[1] is m.inner.w     # then submodule's parameter

(Note: yield order is _parameters then _modules. Lock this convention.)

Block D — re-assignment / overwrite

test_parameter_reassign_replaces_registration:

class M(Module):
    def __init__(self):
        super().__init__()
        self.w = Parameter(np.zeros(3))
m = M()
new_p = Parameter(np.ones(5))
m.w = new_p
params = list(m.parameters())
assert len(params) == 1
assert params[0] is new_p

test_parameter_overwritten_with_module: Set self.w = Parameter(...), then self.w = Linear(2, 3) (you can use a temp stub class for Linear here — Lab 01 builds the real one). Verify _parameters no longer has "w" and _modules does.
test_parameter_overwritten_with_none: Set self.w = Parameter(...), then self.w = None. Verify _parameters no longer has "w", and self.w is None.

Block E — `zero_grad`, `state_dict`, `load_state_dict`

test_zero_grad_clears_grads: Construct a module with 2 parameters. Manually assign p.grad = np.ones_like(p.data). Call m.zero_grad(). Assert all parameter grads are None.

test_state_dict_keys:

m = Outer()
state = m.state_dict()
assert set(state.keys()) == {"b", "inner.w"}

test_load_state_dict_roundtrip:
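m1 = Outer(). Overwrite its parameter data with distinct non-zero values (Outer initializes everything with np.zeros, so an untouched round trip would pass trivially). Capture state = m1.state_dict().
m2 = Outer() (fresh instance, still all zeros).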
m2.load_state_dict(state).
Assert m1.b.data is not m2.b.data (different objects).
Assert np.array_equal(m1.b.data, m2.b.data) (same values).
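The prefix walk behind `state_dict` and `load_state_dict` can be sketched in isolation. `Param` and `Node` below are hypothetical stand-ins for `Parameter` and `Module`, with the registries filled by hand so that only the key-building and in-place copy logic is on display. Returning copies from `state_dict` is a choice of this sketch; the lab text does not specify it.

```python
from typing import Dict, Optional

import numpy as np


class Param:
    """Hypothetical stand-in for Parameter: only the .data array matters here."""

    def __init__(self, data) -> None:
        self.data = np.asarray(data, dtype=float)


class Node:
    """Hypothetical stripped-down Module with hand-filled registries."""

    def __init__(self, params: Optional[dict] = None, modules: Optional[dict] = None) -> None:
        self._parameters = dict(params or {})
        self._modules = dict(modules or {})

    def state_dict(self, prefix: str = "") -> Dict[str, np.ndarray]:
        # Own parameters get "prefix + name"; children extend the prefix with "name.".
        out = {prefix + name: p.data.copy() for name, p in self._parameters.items()}
        for name, sub in self._modules.items():
            out.update(sub.state_dict(prefix + name + "."))
        return out

    def load_state_dict(self, state: Dict[str, np.ndarray], prefix: str = "") -> None:
        for name, p in self._parameters.items():
            p.data[...] = state[prefix + name]   # in-place: the array object is kept
        for name, sub in self._modules.items():
            sub.load_state_dict(state, prefix + name + ".")


def make() -> Node:
    inner = Node({"w": Param(np.zeros(3))})
    return Node({"b": Param(np.zeros(1))}, {"inner": inner})


m1, m2 = make(), make()
m1._parameters["b"].data[...] = 7.0
m1._modules["inner"]._parameters["w"].data[...] = [1.0, 2.0, 3.0]

state = m1.state_dict()
m2.load_state_dict(state)
```

Writing through `p.data[...]` rather than rebinding `p.data` is what keeps `m1.b.data is not m2.b.data` true after the load while the values match.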

Block F — edge cases

test_shared_parameter_yields_twice: Two attributes pointing at the same Parameter:

class M(Module):
    def __init__(self):
        super().__init__()
        shared = Parameter(np.zeros(3))
        self.in_emb = shared
        self.out_emb = shared

m = M()
params = list(m.parameters())
assert len(params) == 2
assert params[0] is params[1]

Document this as expected behavior (tied embeddings). Lab 02 reflection asks whether to deduplicate.
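For the reflection on deduplication, it may help to see what an identity-based filter would look like. The helper below is hypothetical (it is not part of the lab's required API) and works on any iterable, so it could wrap `m.parameters()` without changing `Module`. It compares by `id()` rather than `==`, on the assumption that a tensor type may overload equality.

```python
from typing import Iterable, Iterator, Set, TypeVar

T = TypeVar("T")


def unique_by_identity(items: Iterable[T]) -> Iterator[T]:
    """Yield each distinct object once, keeping first-seen order."""
    seen: Set[int] = set()
    for item in items:
        if id(item) not in seen:
            seen.add(id(item))
            yield item


shared = object()
other = object()
items = [shared, shared, other]   # mimics in_emb and out_emb pointing at one object
unique = list(unique_by_identity(items))
```

The `id()` values are only reliable while the objects stay alive, which holds here because the module's registries keep every parameter referenced.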

test_repr_does_not_crash: repr(M()) returns a string and doesn't recurse forever.
test_call_dispatches_to_forward: Subclass Module, override forward(self, x) to return x * 2. Verify m(5) == 10.
test_forward_not_implemented_raises: Module()(any_arg) should raise NotImplementedError.
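The last two tests exercise the `__call__` to `forward` dispatch. A minimal sketch of that contract, with a reduced `Module` that omits registration entirely and a hypothetical `Double` subclass:

```python
from typing import Any


class Module:
    """Reduced Module: only the call/forward contract, no registration."""

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Calling the instance forwards every argument unchanged.
        return self.forward(*args, **kwargs)


class Double(Module):
    def forward(self, x: Any) -> Any:
        return x * 2


result = Double()(5)

try:
    Module()(5)
    raised = False
except NotImplementedError:
    raised = True
```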

Block G — `mypy --strict` clean

No Any types except where they appear in the upstream Tensor API and in the forward / __call__ placeholder signatures.
parameters is typed Iterator[Parameter].
state_dict is typed dict[str, np.ndarray].
forward is typed Any (placeholder; subclasses narrow).

Constraints

No Linear, no Sequential, no activations. Those are Lab 01.
Module is the only abstract base. Don't introduce Layer, Container, LossModule etc. KISS — Phase 10+ may refactor if needed.
No PyTorch in src/minimodel/. PyTorch lives in tests/ only (some integration tests in Lab 02 use it).
__init_subclass__ is forbidden. The framework cleverness is limited to __setattr__. No metaclasses, no class decorators.

Pitfalls

__setattr__ recursion. If Module.__setattr__ writes via self.X = value (regular assignment), it triggers itself. Always use object.__setattr__(self, name, value) for the actual storage.
super().__init__() forgotten. Test it by constructing a subclass that forgets to call super, then assigning a Parameter — should raise AttributeError because _parameters doesn't exist. Document the expected error.
parameters() returns a generator, not a list. Consumers must convert (list(model.parameters())) if they need to iterate twice. The optimizer base class does the conversion internally.
state_dict key collisions. If two submodules have the same name (shouldn't happen but could via weird subclassing), the second overwrites the first. Add a test (test_state_dict_no_collision) that asserts unique keys.
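The first three pitfalls are easy to reproduce in a few lines. The classes below are deliberately reduced illustrations (hypothetical names, not the lab's `Module`), and `error_name` is a small helper introduced only to capture which exception each case raises.

```python
def error_name(fn):
    """Run fn and return the name of the exception it raises, or None."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__
    return None


class Recursive:
    def __setattr__(self, name, value):
        # Regular assignment path: re-enters this same method until the stack overflows.
        setattr(self, name, value)


class MiniModule:
    def __init__(self):
        object.__setattr__(self, "_parameters", {})

    def __setattr__(self, name, value):
        self._parameters.pop(name, None)       # needs _parameters to exist already
        object.__setattr__(self, name, value)


class Forgetful(MiniModule):
    def __init__(self):
        self.w = 1.0                            # no super().__init__() call


recursion_error = error_name(lambda: setattr(Recursive(), "x", 1))
missing_super = error_name(Forgetful)

# A generator is single-pass: the second list() sees nothing.
gen = (p for p in ["w", "b"])
first = list(gen)
second = list(gen)
```

In this sketch the forgotten `super().__init__()` surfaces as an `AttributeError` on `_parameters`, which matches the error the pitfall asks you to document.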

Stop conditions

Done when:

Parameter is a 2-line subclass.
Module is ≤ 60 lines including type hints and the repr.
All 12+ tests above are green.
mypy --strict src/minimodel/nn/module.py clean.
ruff check src/minimodel/nn/module.py clean.
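You can explain why __init__ must use object.__setattr__ for _parameters, _modules, and training instead of plain self.x = ... assignment, and why the body of __setattr__ uses it for the final store.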

When to consult `solutions/`

After all tests pass. solutions/00-parameter-and-module-skeleton-ref.md (at phase open) compares your Module against the canonical implementation and points out where the cleverness can be even shorter (or longer, with better error messages).

Next lab: lab/01-linear-and-activations.md.
