Skip to content

English · Español

Lab 02 — MCP Client and Round-Trip

Goal: write src/miniagent/mcp_client.py that spawns the server from Lab 01 as a subprocess, performs the full handshake, lists tools, calls each, and returns Python values. Then exercise the loop in an experiment that captures the wire transcript.

Estimated time: 2–3 hours.

Prereq: Lab 01 done — mcp_server.py exists and round-trips a hand-crafted byte stream.


What you produce

  • src/miniagent/mcp_client.py — the client library.
  • experiments/31-mcp-roundtrip/demo.py — uses the client, prints a transcript, writes results.json.
  • experiments/31-mcp-roundtrip/transcript.txt — the exact JSON-RPC messages exchanged, one per line, prefixed > (client→server) or < (server→client). This is the artifact pasted into PHASE_31_REPORT.md.

TODOs

Block A — MCPClient class skeleton

In src/miniagent/mcp_client.py:

  • Reuse read_message / write_message from mcp_server.py (move both to miniagent/_framing.py to share — they're identical).
  • Define:
    class MCPClient:
        def __init__(self, server_cmd: list[str]):
            self.server_cmd = server_cmd
            self.proc: subprocess.Popen | None = None
            self._next_id = 0
            self._tool_schemas: dict[str, dict] = {}
        def __enter__(self) -> "MCPClient": ...
        def __exit__(self, *exc) -> None: ...
        def initialize(self) -> dict: ...
        def list_tools(self) -> list[dict]: ...
        def call_tool(self, name: str, **arguments) -> Any: ...
        def _send(self, method: str, params: dict, *, notify=False) -> dict | None: ...
    
  • Use as a context manager so the subprocess is always cleaned up:
    with MCPClient(["python", "-m", "miniagent.mcp_server"]) as c:
        c.initialize()
        tools = c.list_tools()
        result = c.call_tool("conjugate", verb="eat", tense="past_simple", person="3sg")
    

Block B — subprocess lifecycle

  • __enter__: self.proc = subprocess.Popen(self.server_cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE, bufsize=0). bufsize=0 matters — buffered stderr can swallow logs you need.
  • __exit__:
  • Close stdin to signal EOF to the server.
  • self.proc.wait(timeout=5.0); on timeout, self.proc.kill() and log.
  • If server exited non-zero, dump stderr content to the client's stderr for debugging.
  • Surface stderr lines in real time: spawn a small daemon thread that reads self.proc.stderr line-by-line and prefixes them [server] before forwarding to logging. Without this, server crashes are invisible.

Block C — request/response correlation

  • _send(method, params, notify=False):
  • If notify, build {"jsonrpc": "2.0", "method": method, "params": params} (no id).
  • Else, id = self._next_id; self._next_id += 1; build with id field.
  • write_message(self.proc.stdin, msg).
  • If notify, return None.
  • Else, read one response with read_message(self.proc.stdout). Assert resp["id"] == id; on mismatch, raise ProtocolError.
  • No concurrent requests. Phase 31's client is strictly request → wait → response. Multiplexing is a Phase 33 concern.

Block D — initialize

  • initialize():
  • Send initialize with our clientInfo and supported capabilities (we advertise {} for now — Phase 32 might add sampling).
  • Receive server's capabilities; store them on self.
  • Send notifications/initialized (notification, no response).
  • If the server returns capabilities that don't include "tools", raise — every tool call would fail anyway.

Block E — list_tools and tool-schema caching

  • list_tools():
  • Send tools/list.
  • Parse response. Store each tool's inputSchema (rename to input_schema internally) in self._tool_schemas[name].
  • Return the raw list (the agent in Phase 32 will display it).

Block F — call_tool

  • call_tool(name, **arguments):
  • Look up schema = self._tool_schemas[name] (raise KeyError if list_tools hasn't been called yet).
  • Client-side validation. jsonschema.Draft7Validator(schema).validate(arguments) before sending. This catches argument errors locally; the server would catch them too, but a local failure is faster and gives a better stack trace.
  • Send tools/call with {"name": name, "arguments": arguments}.
  • Parse response:
    • If result.isError == True → raise ToolError(result.content[0].text).
    • If result.isError == False → return the unwrapped value:
      raw = result["content"][0]["text"]
      try:
          return json.loads(raw)   # dict-returning tools
      except json.JSONDecodeError:
          return raw               # string-returning tools
      
    • If "error" key in the response (JSON-RPC level error) → raise ProtocolError(error["message"], error["code"]).
  • Don't conflate ToolError and ProtocolError at the client. They mean different things to the caller (recoverable vs unrecoverable). The Phase 32 agent reasons differently about each.

Block G — transcript capture

In experiments/31-mcp-roundtrip/demo.py:

  • Monkey-patch write_message and read_message (or use a hook in _framing.py) so every message is also appended to a transcript file, prefixed with direction.
  • Run the canonical flow:
    with MCPClient(["python", "-m", "miniagent.mcp_server"]) as c:
        c.initialize()
        tools = c.list_tools()
        assert len(tools) == 4
        print(c.call_tool("conjugate", verb="eat", tense="past_simple", person="3sg"))  # → "ate"
        print(c.call_tool("lookup_spanish", english_form="ate"))                          # → "comió"
        print(c.call_tool("lookup_irregular_verb", verb="go"))                            # → {...}
        print(c.call_tool("check_subject_verb_agreement", subject="he", verb_form="go"))  # → {...}
    
  • Capture transcript to transcript.txt. Format:
    > initialize {"protocolVersion": "2024-11-05", ...}
    < {"protocolVersion": "2024-11-05", "capabilities": ..., ...}
    > notifications/initialized {}
    > tools/list {}
    < {"tools": [...]}
    ...
    
  • Write results.json: {"tools_discovered": 4, "calls_made": 4, "calls_failed": 0, "transcript_path": "transcript.txt"}.
  • Write manifest.json per §5.

Block H — tests

In tests/test_mcp_roundtrip.py:

  • test_initialize_returns_tools_capability.
  • test_list_tools_returns_four — server exposes exactly the four §A13 tools.
  • test_call_conjugate_returns_ate.
  • test_call_with_invalid_args_raises_protocol_errorc.call_tool("conjugate", verb="run", tense="past_simple", person="3sg") raises ToolError (not ProtocolError, since "run" is a valid string failing the tool's own scope check, not the schema's). NOTE: this requires the input_schema enum to include "run" — confirm in Lab 00 that the enum is exactly the §A13 verbs; out-of-enum strings would be a ProtocolError instead.
  • test_unknown_tool_raises_protocol_errorc.call_tool("not_a_tool") raises ProtocolError.
  • test_client_cleans_up_on_exception — subprocess exits even if the with block raises.

Constraints

  • stdlib subprocess only. No pexpect, no subprocess32.
  • No network. stdio only. (Phase 33 is HTTP; not here.)
  • Synchronous. No asyncio.
  • One request in flight at a time. No pipelining.

Stop conditions

Done when:

  1. All Block H tests pass.
  2. demo.py writes a transcript.txt containing a complete handshake + tools/list + 4 tools/call round-trips.
  3. The server process exits with code 0 when the with MCPClient(...) block exits cleanly.
  4. pytest -q tests/test_mcp_roundtrip.py shows no lingering subprocesses (use psutil or just ps aux | grep miniagent after the run to verify).

Pitfalls

  • Deadlock on full pipe buffers. If the server writes a lot to stderr and the client doesn't drain it, the OS pipe buffer fills and the server blocks on the write, which means it stops reading stdin, which means the client blocks on its next write. The stderr-draining thread in Block B prevents this.
  • Race between EOF and final response. When closing the client, send any final request → read its response → THEN close stdin. Closing stdin first races with the server still trying to respond.
  • Subprocess inheriting wrong Python. subprocess.Popen(["python", ...]) runs whatever python is on PATH, which may not be the project's uv-managed interpreter. Use [sys.executable, "-m", "miniagent.mcp_server"] to guarantee version alignment.
  • PYTHONPATH / module resolution. If you run the test from a directory that doesn't have src/ on sys.path, the subprocess fails with ModuleNotFoundError: miniagent. The fix: env={**os.environ, "PYTHONPATH": str(Path(__file__).parent.parent / "src")} on Popen, or rely on uv run setting this up.
  • json.loads on a non-JSON string. lookup_spanish returns "comió". The wire shape wraps that into content[0].text == "comió". json.loads("comió") raises — the bare-string fallback path is required.

When to consult solutions/

After tests are green. The solution at solutions/02-mcp-roundtrip-ref.md provides a reference transcript.txt to compare byte-by-byte (modulo whitespace) against yours.


Next lab: lab/03-mask-driven-toolcall.md — connect Phase 30's JSONSchemaMask to the tool-call argument generator. End-to-end.