Lab 00 — Parameter and Module skeleton

Goal: implement Parameter (a 2-line subclass of Tensor) and Module (the ~25-line base class with __setattr__ registration, recursive parameters(), zero_grad, state_dict, load_state_dict, and __repr__). Get mypy --strict clean. Write the structural tests that prove the registration mechanic works.

Estimated time: 90–120 minutes.

Prereqs: Phase 8 closed (minitorch.Tensor available). Theory 00 + 01 read.


What you produce

A new package src/minimodel/nn/ with:

  • __init__.py re-exporting Parameter, Module.
  • module.py containing both classes.

And tests/test_module_skeleton.py with the registration mechanic tests.

Note: all of Phase 9 rests on this lab. If Module does not discover parameters correctly, then Linear, Sequential, and the optimizers will not work. Set aside 90 minutes to get this lab passing cleanly before moving on to Lab 01.

TODOs

Block A — Parameter

In src/minimodel/nn/module.py:

  • Define Parameter(Tensor):
  • __init__(self, data, requires_grad: bool = True) — call super().__init__(data, requires_grad=requires_grad).
  • Nothing else.
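
As a rough illustration of Block A, the sketch below uses a hypothetical stand-in for minitorch.Tensor, since the real class comes from Phase 8. The stand-in's constructor signature and attributes (data, requires_grad, grad) are assumptions chosen to match the TODO above, not the actual Tensor API.

```python
import numpy as np


class Tensor:
    """Hypothetical stand-in for minitorch.Tensor (assumed constructor signature)."""

    def __init__(self, data, requires_grad: bool = False) -> None:
        self.data = np.asarray(data, dtype=np.float64)
        self.requires_grad = requires_grad
        self.grad = None


class Parameter(Tensor):
    """A Tensor that defaults to requires_grad=True; it adds no other behavior."""

    def __init__(self, data, requires_grad: bool = True) -> None:
        super().__init__(data, requires_grad=requires_grad)


p = Parameter([1.0, 2.0, 3.0])
```

The subclass exists mainly as a type tag: Module.__setattr__ can then tell trainable tensors apart from ordinary ones with a single isinstance check.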

Block B — Module base class

  • Define Module:
  • __init__(self) -> None:
    • Use object.__setattr__(self, "_parameters", {}) and object.__setattr__(self, "_modules", {}).
    • Also object.__setattr__(self, "training", True).
  • __setattr__(self, name, value) -> None:
    • Pop name from both _parameters and _modules (handle the re-assignment case).
    • If isinstance(value, Parameter): register in _parameters[name].
    • Elif isinstance(value, Module): register in _modules[name].
    • In all cases, also object.__setattr__(self, name, value).
  • parameters(self) -> Iterator[Parameter]:
    • Yield from _parameters.values().
    • For each submodule in _modules.values(), recursively yield from submodule.parameters().
  • zero_grad(self) -> None:
    • For each p in self.parameters(), set p.grad = None.
  • state_dict(self, prefix: str = "") -> dict[str, np.ndarray]:
    • Flat dict; keys are dot-separated paths.
    • Recursively walk _modules.
  • load_state_dict(self, state, prefix: str = "") -> None:
    • In-place copy state[key] into the matching Parameter.data.
    • Recursively walk _modules.
  • train(self, mode: bool = True) -> None and eval(self) -> None: set self.training; stub for Phase 9 (no-op semantically, but the attribute must exist).
  • forward(self, *args, **kwargs) -> Any: raise NotImplementedError.
  • __call__(self, *args, **kwargs): dispatch to self.forward(*args, **kwargs).
  • __repr__(self) -> str: emit a tree-style print of submodules and parameter shapes.
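
One possible shape for the core of Block B is sketched below. It is not the reference solution: Parameter is a hypothetical stand-in (the real one subclasses Tensor), train, eval, forward, __call__, and __repr__ are left out, and returning references rather than copies from state_dict is an assumption you may want to revisit.

```python
from __future__ import annotations

from collections.abc import Iterator

import numpy as np


class Parameter:
    """Hypothetical stand-in; the real Parameter subclasses minitorch.Tensor."""

    def __init__(self, data: np.ndarray) -> None:
        self.data = np.asarray(data, dtype=np.float64)
        self.grad: np.ndarray | None = None


class Module:
    # Declared for type checkers only; the values are created in __init__.
    _parameters: dict[str, Parameter]
    _modules: dict[str, Module]
    training: bool

    def __init__(self) -> None:
        # Bypass Module.__setattr__, which reads dicts that do not exist yet.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})
        object.__setattr__(self, "training", True)

    def __setattr__(self, name: str, value: object) -> None:
        # Drop any stale registration so re-assignment leaves no ghost entry.
        self._parameters.pop(name, None)
        self._modules.pop(name, None)
        if isinstance(value, Parameter):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        object.__setattr__(self, name, value)

    def parameters(self) -> Iterator[Parameter]:
        yield from self._parameters.values()  # direct parameters first
        for sub in self._modules.values():    # then each submodule, depth-first
            yield from sub.parameters()

    def zero_grad(self) -> None:
        for p in self.parameters():
            p.grad = None

    def state_dict(self, prefix: str = "") -> dict[str, np.ndarray]:
        # Assumption: values are references to p.data, not copies.
        state = {prefix + name: p.data for name, p in self._parameters.items()}
        for name, sub in self._modules.items():
            state.update(sub.state_dict(prefix + name + "."))
        return state

    def load_state_dict(self, state: dict[str, np.ndarray], prefix: str = "") -> None:
        for name, p in self._parameters.items():
            p.data[...] = state[prefix + name]  # in-place copy into the existing array
        for name, sub in self._modules.items():
            sub.load_state_dict(state, prefix + name + ".")


class Inner(Module):
    def __init__(self) -> None:
        super().__init__()
        self.w = Parameter(np.zeros(3))


class Outer(Module):
    def __init__(self) -> None:
        super().__init__()
        self.inner = Inner()
        self.b = Parameter(np.zeros(1))


m1, m2 = Outer(), Outer()
m1.b.data[...] = 7.0                 # make m1 differ from a fresh instance
m2.load_state_dict(m1.state_dict())  # m2 now holds equal values in its own arrays
```

The prefix argument is what turns the nested module tree into flat, dot-separated keys such as "inner.w".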

Tests

In tests/test_module_skeleton.py:

Block C — basic registration

  • test_parameter_is_tensor:

    p = Parameter(np.array([1.0, 2.0, 3.0]))
    assert isinstance(p, Tensor)
    assert p.requires_grad is True
    

  • test_module_init_no_parameters:

    m = Module()
    assert list(m.parameters()) == []
    

  • test_single_parameter_registered:

    class M(Module):
        def __init__(self):
            super().__init__()
            self.w = Parameter(np.zeros(3))
    m = M()
    params = list(m.parameters())
    assert len(params) == 1
    assert params[0] is m.w
    

  • test_submodule_registered:

    class Inner(Module):
        def __init__(self):
            super().__init__()
            self.w = Parameter(np.zeros(3))
    
    class Outer(Module):
        def __init__(self):
            super().__init__()
            self.inner = Inner()
            self.b = Parameter(np.zeros(1))
    
    m = Outer()
    params = list(m.parameters())
    assert len(params) == 2
    assert params[0] is m.b           # direct Parameter first
    assert params[1] is m.inner.w     # then submodule's parameter
    
    (Note: yield order is _parameters then _modules. Lock this convention.)

Block D — re-assignment / overwrite

  • test_parameter_reassign_replaces_registration:

    class M(Module):
        def __init__(self):
            super().__init__()
            self.w = Parameter(np.zeros(3))
    m = M()
    new_p = Parameter(np.ones(5))
    m.w = new_p
    params = list(m.parameters())
    assert len(params) == 1
    assert params[0] is new_p
    

  • test_parameter_overwritten_with_module: Set self.w = Parameter(...), then self.w = Linear(2, 3) (you can use a temp stub class for Linear here — Lab 01 builds the real one). Verify _parameters no longer has "w" and _modules does.

  • test_parameter_overwritten_with_none: Set self.w = Parameter(...), then self.w = None. Verify _parameters no longer has "w", and self.w is None.
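
The two prose-only tests above could look roughly like this. To keep the snippet runnable on its own, it defines compact stand-ins for Parameter and Module that implement only the registration rules; in your test file you would import the real classes from minimodel.nn instead. StubLinear is a hypothetical temporary class, as the test description suggests.

```python
import numpy as np


class Parameter:
    """Hypothetical stand-in for minimodel.nn.Parameter."""

    def __init__(self, data):
        self.data = np.asarray(data)


class Module:
    """Compact stand-in implementing only the registration rules from Block B."""

    def __init__(self):
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_modules", {})

    def __setattr__(self, name, value):
        self._parameters.pop(name, None)
        self._modules.pop(name, None)
        if isinstance(value, Parameter):
            self._parameters[name] = value
        elif isinstance(value, Module):
            self._modules[name] = value
        object.__setattr__(self, name, value)


class StubLinear(Module):
    """Temporary stub; Lab 01 builds the real Linear."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = Parameter(np.zeros((out_features, in_features)))


class M(Module):
    def __init__(self):
        super().__init__()
        self.w = Parameter(np.zeros(3))


def test_parameter_overwritten_with_module():
    m = M()
    m.w = StubLinear(2, 3)
    assert "w" not in m._parameters   # old registration removed
    assert m._modules["w"] is m.w     # new registration added


def test_parameter_overwritten_with_none():
    m = M()
    m.w = None
    assert "w" not in m._parameters
    assert "w" not in m._modules
    assert m.w is None                # the plain attribute is still stored
```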

Block E — zero_grad, state_dict, load_state_dict

  • test_zero_grad_clears_grads: Construct a module with 2 parameters. Manually assign p.grad = np.ones_like(p.data). Call m.zero_grad(). Assert all parameter grads are None.

  • test_state_dict_keys:

    m = Outer()
    state = m.state_dict()
    assert set(state.keys()) == {"b", "inner.w"}
    

  • test_load_state_dict_roundtrip:

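    • m1 = Outer(). Overwrite m1.b.data and m1.inner.w.data with non-zero values (Outer initializes everything to zeros, so an untouched round trip would pass trivially). Capture state = m1.state_dict().
    • m2 = Outer() (a fresh instance, still all zeros).
    • m2.load_state_dict(state).
    • Assert m1.b.data is not m2.b.data (different objects).
    • Assert np.array_equal(m1.b.data, m2.b.data) (same values).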

Block F — edge cases

  • test_shared_parameter_yields_twice: Two attributes pointing at the same Parameter:

    class M(Module):
        def __init__(self):
            super().__init__()
            shared = Parameter(np.zeros(3))
            self.in_emb = shared
            self.out_emb = shared
    
    m = M()
    params = list(m.parameters())
    assert len(params) == 2
    assert params[0] is params[1]
    
    Document this as expected behavior (tied embeddings). Lab 02 reflection asks whether to deduplicate.

  • test_repr_does_not_crash: repr(M()) returns a string and doesn't recurse forever.

  • test_call_dispatches_to_forward: Subclass Module, override forward(self, x) to return x * 2. Verify m(5) == 10.

  • test_forward_not_implemented_raises: Module()(any_arg) should raise NotImplementedError.
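
The last two tests only exercise the forward/__call__ pair, so they can be sketched against a stand-in Module that has nothing else. The Double class is a hypothetical example, and the try/except stands in for pytest.raises so the snippet has no dependencies.

```python
from typing import Any


class Module:
    """Stand-in showing only forward and __call__; registration is omitted."""

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.forward(*args, **kwargs)


class Double(Module):
    def forward(self, x: int) -> int:
        return x * 2


def test_call_dispatches_to_forward() -> None:
    assert Double()(5) == 10


def test_forward_not_implemented_raises() -> None:
    try:
        Module()(1)
    except NotImplementedError:
        return
    raise AssertionError("expected NotImplementedError")
```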

Block G — mypy --strict clean

  • No Any types except in the forward and __call__ signatures and where they appear in the upstream Tensor API.
  • parameters is typed Iterator[Parameter].
  • state_dict is typed dict[str, np.ndarray].
  • forward is typed Any (placeholder; subclasses narrow).

Constraints

  • No Linear, no Sequential, no activations. Those are Lab 01.
  • Module is the only abstract base. Don't introduce Layer, Container, LossModule etc. KISS — Phase 10+ may refactor if needed.
  • No PyTorch in src/minimodel/. PyTorch lives in tests/ only (some integration tests in Lab 02 use it).
  • __init_subclass__ is forbidden. The framework cleverness is limited to __setattr__. No metaclasses, no class decorators.

Pitfalls

  • __setattr__ recursion. If Module.__setattr__ writes via self.X = value (regular assignment), it triggers itself. Always use object.__setattr__(self, name, value) for the actual storage.
  • super().__init__() forgotten. Test it by constructing a subclass that forgets to call super, then assigning a Parameter — should raise AttributeError because _parameters doesn't exist. Document the expected error.
  • parameters() returns a generator, not a list. Consumers must convert (list(model.parameters())) if they need to iterate twice. The optimizer base class does the conversion internally.
  • state_dict key collisions. If two submodules have the same name (shouldn't happen but could via weird subclassing), the second overwrites the first. Add a test (test_state_dict_no_collision) that asserts unique keys.
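
Two of these pitfalls are easy to reproduce in isolation. The sketch below uses a deliberately reduced stand-in Module and a hypothetical empty Parameter class; it shows the AttributeError from a forgotten super().__init__() and the single-pass nature of the parameters() generator.

```python
class Parameter:
    """Hypothetical stand-in; the real one subclasses Tensor."""


class Module:
    def __init__(self):
        object.__setattr__(self, "_parameters", {})

    def __setattr__(self, name, value):
        self._parameters.pop(name, None)  # fails if __init__ never ran
        if isinstance(value, Parameter):
            self._parameters[name] = value
        object.__setattr__(self, name, value)

    def parameters(self):
        yield from self._parameters.values()


class ForgotSuper(Module):
    def __init__(self):
        self.w = Parameter()  # super().__init__() was never called


class Good(Module):
    def __init__(self):
        super().__init__()
        self.w = Parameter()


# Pitfall: forgotten super().__init__() surfaces as an AttributeError.
try:
    ForgotSuper()
    error = None
except AttributeError as exc:
    error = exc

# Pitfall: a generator can be consumed only once.
gen = Good().parameters()
first_pass = list(gen)   # one Parameter
second_pass = list(gen)  # empty: the generator is already exhausted
```

The error message names the missing _parameters attribute, which is the detail worth recording when you document the expected failure.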

Stop conditions

Done when:

  1. Parameter is a 2-line subclass.
  2. Module is ≤ 60 lines including type hints and the repr.
  3. All 12+ tests above are green.
  4. mypy --strict src/minimodel/nn/module.py clean.
  5. ruff check src/minimodel/nn/module.py clean.
  6. You can explain why __init__ itself must use object.__setattr__, not only the __setattr__ body.

When to consult solutions/

After all tests pass. solutions/00-parameter-and-module-skeleton-ref.md (at phase open) compares your Module against the canonical implementation and points out where the cleverness can be even shorter (or longer, with better error messages).


Next lab: lab/01-linear-and-activations.md.