The paradox

HTTP is fundamentally stateless: each request is independent, the server doesn’t remember previous requests, and the protocol has no built-in notion of a session. Yet MCP (Model Context Protocol) requires session state so that tools can maintain context across multiple invocations. The question: how do you build a stateful protocol on top of a stateless transport? FastMCP’s answer: a session header plus a server-side session store.

How it works

Initialization (handshake)

Client → Server: POST /mcp/initialize
Headers: {Accept: "application/json, text/event-stream"}
Body: {"jsonrpc": "2.0", "method": "initialize", "params": {...}}

Server → Client: HTTP 200 + mcp-session-id header
Headers: {mcp-session-id: "01ARZ3NDEKTSV4RRFFQ69G5FAV"}
Body: SSE-wrapped response
The server creates a session on first connection and sends back the ID.

Subsequent requests

Client → Server: POST /mcp/call
Headers: {
  mcp-session-id: "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  Content-Type: "application/json"
}
Body: {"jsonrpc": "2.0", "method": "call_tool", "params": {...}}

Server: Looks up session by ID, retrieves context, executes tool
The client must include the session ID in every request. The server validates it and retrieves the session context.
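
The server side of this exchange can be sketched as a lookup table keyed by session ID. This is a minimal in-memory illustration, not FastMCP's actual implementation: `SessionStore`, `create`, and `lookup` are hypothetical names, and a UUID stands in for whatever ID format the server really issues.

```python
import uuid
from typing import Any, Dict, Optional


class SessionStore:
    """Hypothetical in-memory store mapping session IDs to per-session context."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def create(self) -> str:
        # Called while handling `initialize`; the returned ID would be sent
        # back in the mcp-session-id response header.
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {}
        return session_id

    def lookup(self, session_id: Optional[str]) -> Optional[Dict[str, Any]]:
        # Called on every later request. None means "missing or unknown ID",
        # which the server would turn into an error response.
        if session_id is None:
            return None
        return self._sessions.get(session_id)


store = SessionStore()
sid = store.create()
store.lookup(sid)["collection"] = "requirements"  # context written by one call
print(store.lookup(sid))                          # and visible to the next
```

Returning `None` for both a missing and an unknown ID keeps the validation step in one place; a real server would likely distinguish the two cases in its error response.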

Why this design exists

Advantage: Stateless scaling

Each request can hit any FastMCP server instance (load balancer friendly):
Client → LB → Server A (init) → Session ID "xyz"
Client → LB → Server B (tool call) → Session ID "xyz" 
Client → LB → Server C (tool call) → Session ID "xyz"
As long as servers share a session store (Redis, database, etc.), the session ID is portable.
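
A small simulation of that routing, assuming a plain dictionary as a stand-in for Redis or a database. `ServerInstance` and its methods are hypothetical names used only for illustration.

```python
from typing import Any, Dict


class ServerInstance:
    """Hypothetical server process; every instance is handed the same shared store."""

    def __init__(self, name: str, shared_store: Dict[str, Dict[str, Any]]) -> None:
        self.name = name
        self.store = shared_store  # stand-in for Redis or a database

    def initialize(self, session_id: str) -> None:
        self.store[session_id] = {"created_by": self.name, "calls": 0}

    def call_tool(self, session_id: str) -> str:
        session = self.store[session_id]  # raises KeyError if the ID is unknown
        session["calls"] += 1
        return f"{self.name} handled call {session['calls']} for {session_id}"


shared: Dict[str, Dict[str, Any]] = {}
a, b, c = (ServerInstance(n, shared) for n in ("A", "B", "C"))

a.initialize("xyz")        # the load balancer routes init to A
print(b.call_tool("xyz"))  # the next request lands on B
print(c.call_tool("xyz"))  # and the one after that on C
```

If each instance kept its own private dictionary instead, the call on B would fail with an unknown session, which is one of the failure modes a non-shared store produces.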

Advantage: Natural MCP semantics

MCP expects tools to share context. Without sessions:
Tool A: list_requirements() → returns ["req-1", "req-2"]
Tool B: search_requirements(query="test") → ???
Without context, Tool B doesn’t know what collection we’re searching in. Sessions solve this.

Disadvantage: Client complexity

The client must:
  1. Capture the session ID from the first response
  2. Store it somewhere accessible
  3. Include it in every subsequent request
  4. Handle session expiry gracefully

Disadvantage: Difficult to debug

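Session headers are easy to miss because most HTTP tooling hides headers by default:
  • curl prints only the response body unless you add -v (or -i for response headers)
  • Browser dev tools show headers only after you open the individual request in the Network panel
  • Logs that omit headers make session failures look like “random HTTP 401 errors”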

Implementation pattern (PEBBLE’s approach)

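from typing import Optional

class HttpTransport:
    # Assumes two helpers defined elsewhere: self._post(method, payload, headers=None),
    # which returns the HTTP response, and parse_sse(text), which decodes the SSE body.
    def __init__(self, url: str):
        self.url = url
        self._session_id: Optional[str] = None

    async def initialize(self) -> dict:
        # First request: the server creates the session and returns its ID in a header
        response = await self._post("initialize", {...})
        self._session_id = response.headers.get('mcp-session-id')
        if self._session_id is None:
            raise RuntimeError("server did not return an mcp-session-id header")
        return parse_sse(response.text)

    async def call_tool(self, tool_name: str, args: dict) -> dict:
        # All subsequent requests: include the session ID
        if self._session_id is None:
            raise RuntimeError("call initialize() before call_tool()")
        headers = {'mcp-session-id': self._session_id}
        response = await self._post("call_tool", {...}, headers=headers)
        return parse_sse(response.text)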
Key pattern: Capture once, reuse everywhere.
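
The transport relies on a `parse_sse` helper that is not shown here. One possible shape for it, assuming the response body holds a single Server-Sent Events message whose `data:` field carries the JSON-RPC payload. This is a sketch under that assumption, not PEBBLE's actual helper.

```python
import json
from typing import List


def parse_sse(text: str) -> dict:
    """Return the JSON payload of the first SSE event that carries data."""
    data_lines: List[str] = []
    for line in text.splitlines():
        if line.startswith("data:"):
            # In the SSE format, one optional space after the colon is dropped.
            value = line[len("data:"):]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            break  # a blank line ends the event
    if not data_lines:
        raise ValueError("no SSE data field found")
    # Multiple data lines in one event are joined with newlines.
    return json.loads("\n".join(data_lines))


body = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {}}\n\n'
print(parse_sse(body))
```

The sketch ignores the `event:`, `id:`, and `retry:` fields and stops at the first event, which is enough for a single request/response exchange but not for a long-lived stream.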

Real-world consequences

Session expiry

FastMCP sessions typically expire after 15-30 minutes of inactivity. If a tool runs longer than that, say 35 minutes, with no other traffic on the session:
T=0:00  Client initializes, gets session ID
T=35:00 Tool finishes
T=35:05 Client tries to return result → Session expired!
Mitigation: ping the server periodically, increase the session TTL, or re-initialize when the session expires.
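
The re-initialize option can be sketched as a retry wrapper. `SessionExpired`, `FakeTransport`, and `call_with_reinit` are hypothetical names; the fake transport only simulates a server that rejects the first session so the example runs without a network.

```python
import asyncio


class SessionExpired(Exception):
    """Hypothetical error a transport raises when the server rejects the session ID."""


class FakeTransport:
    """Stand-in for HttpTransport: calls fail until initialize() has run twice."""

    def __init__(self) -> None:
        self.init_count = 0

    async def initialize(self) -> None:
        self.init_count += 1

    async def call_tool(self, tool_name: str, args: dict) -> dict:
        if self.init_count < 2:
            raise SessionExpired(tool_name)
        return {"tool": tool_name, "ok": True}


async def call_with_reinit(transport, tool_name: str, args: dict) -> dict:
    try:
        return await transport.call_tool(tool_name, args)
    except SessionExpired:
        # Retry once only: a second failure is probably not an expiry problem.
        await transport.initialize()
        return await transport.call_tool(tool_name, args)


async def main() -> dict:
    transport = FakeTransport()
    await transport.initialize()
    return await call_with_reinit(transport, "search_requirements", {"query": "test"})


result = asyncio.run(main())
print(result)
```

Re-initializing creates a fresh session, so any context stored in the expired one is lost; this approach fits best when tools can rebuild the state they need.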

Concurrent requests

If a client runs multiple tools in parallel:
Tool A call → Uses session ID "xyz"
Tool B call → Uses session ID "xyz" (parallel)
Tool A result → Modifies session state (Tool B sees it)
Tool B result → ??? (state may have changed)
Mitigation: Either serialize tool calls, or ensure tools are stateless / don’t interfere.
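
Serializing calls on the client can be as small as one lock per session. A minimal sketch, with `SerializedSession` and `slow_tool` as hypothetical names and a local dictionary standing in for the server-side session state:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


class SerializedSession:
    """Hypothetical wrapper that lets only one tool call touch session state at a time."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.state: Dict[str, Any] = {}
        self.log: List[str] = []

    async def run(self, name: str, tool: Callable[[Dict[str, Any]], Awaitable[None]]) -> None:
        async with self._lock:
            self.log.append(f"{name} start")
            await tool(self.state)
            self.log.append(f"{name} end")


async def slow_tool(state: Dict[str, Any]) -> None:
    await asyncio.sleep(0.01)  # yields control, so calls would interleave without the lock
    state["counter"] = state.get("counter", 0) + 1


async def main() -> SerializedSession:
    session = SerializedSession()
    await asyncio.gather(session.run("A", slow_tool), session.run("B", slow_tool))
    return session


session = asyncio.run(main())
print(session.log)
```

The trade-off is throughput: parallel tool calls on the same session now run one after another, so this suits tools that genuinely share state rather than every call.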

Debugging session failures

Error message: HTTP 401 Unauthorized
Reality: Could be any of:
  1. Session ID not included in headers
  2. Session ID expired
  3. Session ID is wrong (for example, issued by a server instance that does not share the session store)
  4. Server crashed or restarted and lost its session store
Prevention: Log the session ID + timestamp on every request. When debugging, grep for the session ID in logs.
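
One way to do this with the standard `logging` module, whose `%(asctime)s` field supplies the timestamp. The logger name and the `describe_request` / `log_request` helpers are illustrative choices, not part of FastMCP or PEBBLE.

```python
import logging
from typing import Optional

logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s", level=logging.INFO)
logger = logging.getLogger("mcp.transport")  # logger name is an arbitrary choice


def describe_request(method: str, session_id: Optional[str], status: int) -> str:
    # Print an explicit marker when the ID is absent, so a missing header shows
    # up as "session=<none>" instead of looking like any other failure.
    shown = session_id if session_id else "<none>"
    return f"method={method} session={shown} status={status}"


def log_request(method: str, session_id: Optional[str], status: int) -> None:
    # Failures are logged at WARNING so they stand out when scanning the log.
    level = logging.INFO if status < 400 else logging.WARNING
    logger.log(level, describe_request(method, session_id, status))


log_request("initialize", "01ARZ3NDEKTSV4RRFFQ69G5FAV", 200)
log_request("call_tool", None, 401)
```

With a `key=value` layout, `grep "session=01ARZ3NDEKTSV4RRFFQ69G5FAV"` pulls out the full history of one session.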

How PEBBLE measures it

From benchmark results (Apr 15, 2026):
Phase                               Time      Notes
Initialize (session + handshake)    195ms     One-time cost
Session reuse (tool call)           19-29ms   Per tool, network-bound
Session reuse (subsequent tools)    20-26ms   Consistent, no degradation
Finding: Session reuse has minimal overhead. The 195ms initialization is the network roundtrip, not session setup.
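
Per-phase timings of this kind can be collected with a small wrapper around each call. The sketch below is not the benchmark harness that produced the numbers above; `timed`, `fake_initialize`, and `fake_tool_call` are hypothetical, and the sleeps are placeholders for real network requests.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict


async def timed(label: str, op: Callable[[], Awaitable[object]], results: Dict[str, float]) -> None:
    # perf_counter is a monotonic, high-resolution clock suited to short intervals.
    start = time.perf_counter()
    await op()
    results[label] = (time.perf_counter() - start) * 1000.0  # milliseconds


async def fake_initialize() -> None:
    await asyncio.sleep(0.02)   # placeholder for the real handshake


async def fake_tool_call() -> None:
    await asyncio.sleep(0.005)  # placeholder for a real tool call


async def main() -> Dict[str, float]:
    results: Dict[str, float] = {}
    await timed("initialize", fake_initialize, results)
    await timed("tool_call", fake_tool_call, results)
    return results


timings = asyncio.run(main())
for label, ms in timings.items():
    print(f"{label}: {ms:.1f} ms")
```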

Organizational takeaway

Session-based HTTP state is a pragmatic compromise between MCP’s stateful semantics and HTTP’s stateless nature.

When designing HTTP-based stateful protocols:

  1. Minimize session size — avoid storing large objects in session (send them with each request instead)
  2. Set explicit TTLs — don’t let sessions linger forever
  3. Log session IDs with every request — essential for debugging
  4. Consider alternative transport — if sessions become complex, use WebSocket or gRPC (both have built-in connection state)
  5. Document the session requirement — this isn’t obvious from HTTP alone
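
Item 2 can be illustrated with a store that expires sessions after a fixed idle period. `TtlSessionStore` is a hypothetical class, and the injectable `clock` parameter is an assumption made so the example can run without real waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlSessionStore:
    """Hypothetical store whose sessions expire after `ttl_seconds` of inactivity."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def put(self, session_id: str) -> None:
        self._sessions[session_id] = (self._clock(), {})

    def get(self, session_id: str) -> Optional[Dict[str, Any]]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_seen, context = entry
        if self._clock() - last_seen > self._ttl:
            del self._sessions[session_id]  # expired: drop it and report a miss
            return None
        self._sessions[session_id] = (self._clock(), context)  # activity resets the timer
        return context


now = [0.0]  # fake clock, in seconds
store = TtlSessionStore(ttl_seconds=900, clock=lambda: now[0])
store.put("xyz")
now[0] = 600.0
alive = store.get("xyz")    # 10 minutes idle: still valid, timer resets
now[0] = 1501.0
expired = store.get("xyz")  # 901 s since last activity: gone
print(alive, expired)
```

Expiry here is lazy (checked on access), so idle sessions still occupy memory until someone asks for them; a production store would also need a periodic sweep or a backend with native TTL support.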

For PEBBLE specifically:

  • Initialize once per provider lifecycle (usually once at startup)
  • Reuse the session for all tool calls
  • Re-initialize if tools start failing with 401 errors
  • Monitor session initialization time as a health metric
This pattern will apply to every HTTP-based MCP server integrated into PEBBLE.