03. How It Works
Architecture: hosts, clients, servers
MCP follows a three-role architecture.
MCP Host is the AI application itself. Examples: Claude Desktop, Claude Code, VS Code with Copilot, a custom agent built on the Anthropic API. The host owns the user experience, runs the LLM, and manages one or more MCP clients.
MCP Client is a component inside the host that maintains a dedicated connection to one MCP server. A host creates one client per server it connects to. The client handles the lifecycle of that connection: initialization, capability negotiation, message exchange, and teardown.
MCP Server is a program that exposes tools, resources, or prompts over the protocol. It can run locally on the same machine (launched as a subprocess by the host) or remotely on a network endpoint. A single MCP server can serve many clients simultaneously when using the HTTP transport.
The protocol itself is built on JSON-RPC 2.0, a lightweight request/response/notification format. All message semantics, including tool calls, resource reads, capability negotiation, and real-time notifications, are expressed as JSON-RPC messages.
Transport mechanisms
MCP defines two transport mechanisms:
Stdio transport uses the server process's standard input and output streams for communication. The host launches the server as a local subprocess and exchanges JSON-RPC messages over stdin/stdout. This transport has zero network overhead, is ideal for local integrations (filesystem access, local databases, shell commands), and is the most common transport for developer tools. One client connects to one server process.
Streamable HTTP transport (previously called HTTP/SSE) uses HTTP POST for client-to-server messages and optional Server-Sent Events for streaming responses from server to client. This transport supports remote servers, scales to many simultaneous clients per server, and integrates with standard HTTP authentication: bearer tokens, API keys, and OAuth 2.1. Anthropic recommends OAuth 2.1 with PKCE for production deployments.
Connection lifecycle
Every MCP connection begins with a capability negotiation handshake. The client sends an initialize request declaring its protocol version and the primitives it supports (for example, elicitation). The server responds with its own protocol version and the primitives it can offer (tools, resources, prompts, real-time notifications). After the client sends an initialized notification, the connection is ready for use. This handshake ensures both sides know exactly what the other can do before any real work begins.
What servers expose: the three primitives
Tools are executable functions the AI can invoke to take action. Examples: run a database query, call an external API, read or write a file, execute shell commands. Tools are the most commonly used primitive. The host discovers available tools via tools/list, which returns JSON schema definitions for each tool's name, description, and input parameters. The AI uses these schemas to decide when and how to call a tool. Execution happens via tools/call.
Resources are data sources that provide contextual information. A resource might be a file's contents, a database schema, a set of records, or an API response. Resources are read-only context inputs rather than actions. The AI requests them when it needs background information to reason over.
Prompts are reusable templates that structure interactions with the language model. A server can expose system prompt templates, few-shot example sets, or domain-specific instruction blocks. The AI can retrieve and apply these prompts to ensure consistent behavior across different tasks.
Servers can also declare real-time notification support: for example, a server can notify connected clients when its tool list changes (notifications/tools/list_changed), allowing the host to refresh its tool registry without polling.