
02 — MCP Architecture: Server, Client, Transport¶

MCP is JSON-RPC 2.0 with four verbs that matter, two main transports (stdio and SSE/HTTP), and an initial capability negotiation. That is all. The apparent complexity of the SDK is plumbing around that tiny idea.

This page derives MCP from first principles. By the end, Borja can read the official MCP spec and the Anthropic SDK source without flinching.

Three actors¶

+----------+        +-----------+         +------------+
|  Agent   |  <-->  |  Client   |  <-->   |   Server   |
| (Phase   |        | (Phase 31 |         | (Phase 31  |
|  32)     |        |  mcp_     |         |  mcp_      |
|          |        |  client)  |         |  server)   |
+----------+        +-----------+         +------------+
                       ^                      |
                       |    JSON-RPC 2.0      |
                       +----- over stdio -----+

Agent (Phase 32): the thing that makes decisions. Uses the client as an API.
Client (Phase 31): translates "I want to call tool X" into JSON-RPC requests; sends/receives over a transport; returns results.
Server (Phase 31): exposes a registry of tools (and optionally resources, prompts); answers JSON-RPC requests.

For Phase 31, the agent does not exist yet. Lab 02's client is a script that pretends to be the agent: it lists tools, calls one, prints the result, exits. Phase 32 replaces the script with a real agent loop using the same client interface.

The transport layer¶

MCP defines two main transports:

stdio. The server is a subprocess of the client. The client writes JSON-RPC messages to the server's stdin; the server writes responses to stdout. Logs and errors go to stderr. This is the simplest, most secure transport: there is no network, no auth, no port to leak. The client controls the server process lifecycle. Phase 31 uses this exclusively.
Streamable HTTP / SSE. The server is an HTTP service. The client sends requests via POST; responses come back via streamed Server-Sent Events. This is the transport used when the tool host is remote (different machine, different team's service). Phase 33 uses this for serving the agent itself — but the tool host stays on stdio.

The choice of transport does not affect the message contents. Same JSON-RPC envelopes either way.
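The stdio process model described above can be sketched with Python's standard subprocess module. This is a minimal illustration, not the Phase 31 client: the helper name spawn_stdio_server and the one-line echo child standing in for a real server are both assumptions made for the example.

```python
import subprocess
import sys

# Hypothetical stand-in for a server: echoes each stdin line to stdout
# and writes a log line to stderr, which carries no protocol traffic.
ECHO_SERVER = (
    "import sys\n"
    "print('server started', file=sys.stderr)\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)

def spawn_stdio_server(argv):
    """Start the server as a child process with pipes on stdin and stdout."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # a real client might forward this to a log
    )

proc = spawn_stdio_server([sys.executable, "-c", ECHO_SERVER])

# The client writes to the child's stdin and reads from its stdout.
proc.stdin.write(b"hello\n")
proc.stdin.flush()
reply = proc.stdout.readline()

# Closing stdin is how the client tells the child to shut down.
proc.stdin.close()
exit_code = proc.wait(timeout=5)
proc.stdout.close()
```

The client owns the whole lifecycle here: it starts the child, decides when the conversation ends, and collects the exit code.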

JSON-RPC 2.0 framing (stdio)¶

Phase 31 frames every message as a JSON object preceded by a Content-Length header (LSP-style):

Content-Length: 58\r\n
\r\n
{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}

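The header tells the receiver how many bytes of JSON body follow. Without it, the receiver doesn't know where one message ends and the next begins. Newline-delimited JSON would be simpler, but it only works if every message is serialized on a single line: JSON escapes newlines inside strings, so string contents are safe, but any pretty-printed message would break the framing. Note that the official MCP stdio transport is newline-delimited, so the length-prefixed framing here will not interoperate with SDK-based peers without an adapter.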

Phase 31's lab 01 implements exactly this framing. Borja writes a read_message(stream) and write_message(stream, msg) pair that handles the header + body protocol. This is the most error-prone part of the phase.
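One possible shape for that pair is sketched below, assuming binary streams (for example sys.stdin.buffer and sys.stdout.buffer) and compact JSON serialization. The error-handling choices (returning None on a clean end of stream, raising on a truncated message) are assumptions for illustration, not the lab's required behavior.

```python
import io
import json

def write_message(stream, msg):
    """Serialize msg and write it with a Content-Length header."""
    body = json.dumps(msg, separators=(",", ":")).encode("utf-8")
    # The length is measured after encoding: bytes, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    stream.write(header + body)
    stream.flush()

def read_message(stream):
    """Read one framed message; return None on a clean end of stream."""
    length = None
    started = False
    while True:
        line = stream.readline()
        if not line:
            if not started:
                return None  # EOF between messages: the peer closed cleanly
            raise EOFError("stream closed inside a header block")
        started = True
        if line in (b"\r\n", b"\n"):
            break  # blank line ends the header block
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    body = b""
    while len(body) < length:  # pipes may return short reads
        chunk = stream.read(length - len(body))
        if not chunk:
            raise EOFError("stream closed inside a message body")
        body += chunk
    return json.loads(body.decode("utf-8"))

# Round trip through an in-memory stream.
buf = io.BytesIO()
write_message(buf, {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}})
write_message(buf, {"jsonrpc": "2.0", "method": "notifications/initialized"})
buf.seek(0)
first = read_message(buf)
second = read_message(buf)
third = read_message(buf)  # nothing left to read
```

Looping on short reads matters on real pipes, where a single read call may return fewer bytes than requested.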

The four verbs that matter¶

MCP has many methods. For Phase 31, only four matter: initialize, tools/list, tools/call, and the notifications/initialized notification.

`initialize`¶

The first message a client sends. Negotiates protocol version and advertises capabilities.

// Client → Server
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "clientInfo": {"name": "miniagent-client", "version": "0.1"}
  }
}

// Server → Client
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "serverInfo": {"name": "miniagent-server", "version": "0.1"}
  }
}

After initialize succeeds, the client sends a notifications/initialized notification (no id, no response expected) to signal that it is ready for normal operation.
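The handshake can be sketched on the client side as two message builders plus a check on the server's reply. The helper names (make_request, make_notification, check_initialize_result) are hypothetical, and treating any version mismatch as fatal is a simplifying assumption: the spec lets a server answer with a different version it supports and leaves the decision to the client.

```python
import itertools

_ids = itertools.count(1)  # request ids only need to be unique per session

def make_request(method, params=None):
    """Build a JSON-RPC request; the id ties the response back to it."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params or {}}

def make_notification(method):
    """Build a JSON-RPC notification: no id, so no response will come."""
    return {"jsonrpc": "2.0", "method": method}

def check_initialize_result(response, expected_version="2024-11-05"):
    """Return the server's capabilities, or raise if the handshake failed."""
    if "error" in response:
        raise RuntimeError(f"initialize rejected: {response['error']['message']}")
    result = response["result"]
    if result["protocolVersion"] != expected_version:
        raise RuntimeError(f"unsupported protocol version: {result['protocolVersion']}")
    return result["capabilities"]

init = make_request("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "clientInfo": {"name": "miniagent-client", "version": "0.1"},
})
# The reply shown above, as the client would see it after parsing.
server_reply = {
    "jsonrpc": "2.0",
    "id": init["id"],
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "miniagent-server", "version": "0.1"},
    },
}
caps = check_initialize_result(server_reply)
note = make_notification("notifications/initialized")
```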

`tools/list`¶

The client asks the server "what tools do you expose?".

// Client → Server
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {} }

// Server → Client
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "conjugate",
        "description": "Return the conjugated form of an English verb.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "verb": {"type": "string", "enum": [...20 verbs...]},
            "tense": {"type": "string", "enum": [...5 tenses...]},
            "person": {"type": "string", "enum": [...3 persons...]}
          },
          "required": ["verb", "tense", "person"]
        }
      },
      { "name": "lookup_irregular_verb", ... },
      { "name": "lookup_spanish", ... },
      { "name": "check_subject_verb_agreement", ... }
    ]
  }
}

The client now knows the tool catalog. It will use these schemas to construct calls and (in Phase 32) to feed to the model as tool declarations.
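As an illustration of using a schema to construct calls, the sketch below checks arguments against an inputSchema before sending tools/call. It handles only the subset of JSON Schema that appears on this page (required plus string enums), and the enum lists are abbreviated placeholders, not the real 20-verb catalog. The function name validate_arguments is hypothetical.

```python
def validate_arguments(input_schema, arguments):
    """Return a list of problems; an empty list means the arguments fit."""
    problems = []
    for name in input_schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, spec in input_schema.get("properties", {}).items():
        if name not in arguments:
            continue
        value = arguments[name]
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{name}: expected a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: {value!r} is not one of {spec['enum']}")
    return problems

# Placeholder schema in the shape of the conjugate tool above.
conjugate_schema = {
    "type": "object",
    "properties": {
        "verb": {"type": "string", "enum": ["eat", "go"]},
        "tense": {"type": "string", "enum": ["past_simple", "present_simple"]},
        "person": {"type": "string", "enum": ["1sg", "3sg"]},
    },
    "required": ["verb", "tense", "person"],
}

good = validate_arguments(
    conjugate_schema, {"verb": "eat", "tense": "past_simple", "person": "3sg"}
)
bad = validate_arguments(conjugate_schema, {"verb": "run", "tense": "past_simple"})
```

Checking on the client side is optional, since the server remains the authority, but it catches malformed calls before they cost a round trip.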

`tools/call`¶

The client invokes a specific tool with arguments.

// Client → Server
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "conjugate",
    "arguments": {"verb": "eat", "tense": "past_simple", "person": "3sg"}
  }
}

// Server → Client (success)
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [{"type": "text", "text": "ate"}],
    "isError": false
  }
}

// Server → Client (tool-level error)
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [{"type": "text", "text": "verb 'run' is out of scope (§A13)"}],
    "isError": true
  }
}

// Server → Client (protocol-level error: unknown tool)
{
  "jsonrpc": "2.0",
  "id": 3,
  "error": {
    "code": -32602,
    "message": "Unknown tool: 'unkn_tool'"
  }
}

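The two error shapes matter (theory/01-function-calling-formats.md §"Errors"). A result with isError: true is recoverable: the call reached the tool, and the agent can read the message and try again with different arguments. A top-level error field is not: the request itself was rejected, which points to a bug in the caller rather than a bad input to the tool.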
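A client can keep the two shapes apart by returning tool-level failures as data and raising on protocol-level ones. This is a sketch under that assumption; ProtocolError and unwrap_tool_response are hypothetical names, and only text content parts are handled.

```python
class ProtocolError(Exception):
    """A JSON-RPC error object: the request itself was rejected."""
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code

def unwrap_tool_response(response):
    """Return (text, is_error) for a tools/call response.

    Tool-level failures come back as data so the caller can react to
    them; protocol-level failures raise ProtocolError.
    """
    if "error" in response:
        err = response["error"]
        raise ProtocolError(err["code"], err["message"])
    result = response["result"]
    text = "".join(p["text"] for p in result["content"] if p["type"] == "text")
    return text, bool(result.get("isError", False))

# The three responses shown above.
success = {"jsonrpc": "2.0", "id": 3,
           "result": {"content": [{"type": "text", "text": "ate"}], "isError": False}}
tool_error = {"jsonrpc": "2.0", "id": 3,
              "result": {"content": [{"type": "text",
                                      "text": "verb 'run' is out of scope (§A13)"}],
                         "isError": True}}
protocol_error = {"jsonrpc": "2.0", "id": 3,
                  "error": {"code": -32602, "message": "Unknown tool: 'unkn_tool'"}}

text, is_error = unwrap_tool_response(success)
```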

Notifications¶

Notifications are messages without an id and without an expected response. The two we use:

notifications/initialized — client → server, after initialize is acknowledged.
notifications/tools/list_changed — server → client, if the tool catalog changes at runtime. (Phase 31 doesn't change the catalog at runtime; we mention this for completeness.)
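Because a notification is defined only by the absence of an id, a receiver can sort incoming messages by looking at their keys. The classifier below is a small sketch of that rule; the function name classify is hypothetical.

```python
def classify(msg):
    """Sort a parsed JSON-RPC 2.0 message into one of three kinds."""
    if "method" in msg:
        # Same shape either way; only the id separates the two.
        return "request" if "id" in msg else "notification"
    if "result" in msg or "error" in msg:
        return "response"
    raise ValueError("not a JSON-RPC 2.0 message")

kinds = [
    classify({"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}),
    classify({"jsonrpc": "2.0", "method": "notifications/initialized"}),
    classify({"jsonrpc": "2.0", "id": 2, "result": {"tools": []}}),
]
```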

The wire trace of a typical session¶

1. Client spawns server subprocess.
2. Client → Server:    initialize             (gets back capabilities)
3. Client → Server:    notifications/initialized
4. Client → Server:    tools/list             (gets back the 4 tools)
5. Client → Server:    tools/call conjugate   (gets back "ate")
6. Client → Server:    tools/call lookup_spanish english_form="ate"  (gets back "comió")
7. Client closes server's stdin.
8. Server exits.

This is exactly the transcript Phase 31's lab 02 produces. The phase report includes the literal byte stream as proof.

What a "capability" is¶

In initialize, both sides advertise capabilities. Phase 31's only advertised capability is tools. Other MCP capabilities (resources, prompts, sampling) are not advertised by our server, which means the client should not send those method calls. If it does, the server returns error.code = -32601 (Method not found).
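On the server side, that rule can be sketched as a dispatcher that answers only methods belonging to an advertised capability and returns -32601 for the rest. The handler table and the empty tool catalog are placeholders, and deriving the capability name from the method prefix is an assumption of this sketch rather than something the spec mandates.

```python
SERVER_CAPABILITIES = {"tools": {}}  # the only capability this sketch advertises

# Hypothetical handler table; the real catalog is omitted here.
HANDLERS = {
    "tools/list": lambda params: {"tools": []},
}

def dispatch(msg):
    """Return the response for one incoming message, or None for notifications."""
    if "id" not in msg:
        return None  # notifications are never answered, known or not
    method = msg["method"]
    capability = method.split("/")[0]  # "resources/list" -> "resources"

    if method == "initialize":
        result = {
            "protocolVersion": "2024-11-05",
            "capabilities": SERVER_CAPABILITIES,
            "serverInfo": {"name": "miniagent-server", "version": "0.1"},
        }
    elif capability in SERVER_CAPABILITIES and method in HANDLERS:
        result = HANDLERS[method](msg.get("params", {}))
    else:
        return {
            "jsonrpc": "2.0",
            "id": msg["id"],
            "error": {"code": -32601, "message": f"Method not found: {method}"},
        }
    return {"jsonrpc": "2.0", "id": msg["id"], "result": result}

ok = dispatch({"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}})
rejected = dispatch({"jsonrpc": "2.0", "id": 3, "method": "resources/list", "params": {}})
silent = dispatch({"jsonrpc": "2.0", "method": "notifications/initialized"})
```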

This negotiation is what makes MCP forward-compatible: a newer client talking to an older server only uses capabilities both sides advertised.

Resources and prompts (mentioned, not implemented)¶

Resources. Read-only data the server exposes — files, URIs, etc. Useful for "give me the §A13 truth table as a markdown document". We do not implement this; the truth table lives in code, not as a resource.
Prompts. Reusable prompt templates the server suggests. Useful for tooling that wants pre-baked prompts ("explain the past simple of a regular verb"). We do not implement this; the agent in Phase 32 has its own prompts.

What the SDK does for you¶

Anthropic's mcp Python SDK provides:

Pydantic models for every message type.
A Server class (and the higher-level FastMCP wrapper) that you instantiate and register tools on with decorators.
A stdio_client helper and a ClientSession context manager that together handle spawn, the initialize handshake, and cleanup.
Async I/O (it's built on anyio).

Phase 31's hand-rolled implementation uses none of this. We write every step explicitly, in ~200 lines per process, to see the bytes. Lab 02 has an optional stretch goal: port to the SDK and compare line counts.

Synchronous vs async¶

The SDK is async. Phase 31's hand-rolled implementation is synchronous (blocking sys.stdin.read). This is acceptable for stdio with one client; production servers that handle many clients over HTTP must be async (Phase 33).

What this page does NOT cover¶

Authn/authz. That's theory/03-authn-authz.md.
HTTP / SSE transport details. Phase 33.
The full MCP method list. We covered the four that matter; the spec at https://modelcontextprotocol.io has the rest if curiosity strikes.
Server-initiated requests. MCP allows the server to call back into the client (sampling/createMessage) when the client advertises the sampling capability. We do not implement this; our server is purely reactive.

Next: theory/03-authn-authz.md — permission models and what stdio gives us for free.