Using FastMCP with OpenAI and Avoiding Session Termination Issues

FastMCP is a lightweight Python framework for building MCP servers — the kind of server that exposes tools to an AI agent. OpenAI's Responses API can call them as remote tools, and most of the time the integration is uneventful. But there's one footgun that catches nearly everyone the first time, and it still catches people in 2026.

The symptom

Your tool works on the first call. The second call from the same agent fails with:

```json
{
  "jsonrpc": "2.0",
  "id": "server-error",
  "error": {
    "code": -32600,
    "message": "Not Found: Session has been terminated"
  }
}
```

In the FastMCP server's logs you'll see a DELETE request to the session endpoint between calls. That's OpenAI's client cleaning up after each tool invocation — it doesn't reuse the session, and it doesn't open a fresh one before the next call. FastMCP, which by default tracks state per session, finds the session already torn down and refuses the next request.
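The failure mode is easy to model in isolation. The toy below is an illustration of per-session state tracking in general, not FastMCP's actual internals: once a DELETE tears the session down, any later request that references the dead session ID gets the -32600 error.

```python
# Toy model (not FastMCP internals): a server that tracks state per session,
# tears the session down on DELETE, and rejects requests for a dead session.
class SessionStore:
    def __init__(self):
        self._sessions = {}
        self._next_id = 0

    def open(self) -> str:
        sid = f"session-{self._next_id}"
        self._next_id += 1
        self._sessions[sid] = {}
        return sid

    def handle(self, sid: str, payload: str) -> dict:
        if sid not in self._sessions:
            # The path that produces the error above
            return {"error": {"code": -32600,
                              "message": "Not Found: Session has been terminated"}}
        return {"result": f"ok: {payload}"}

    def delete(self, sid: str) -> None:
        self._sessions.pop(sid, None)

store = SessionStore()
sid = store.open()
first = store.handle(sid, "tool call 1")   # succeeds
store.delete(sid)                          # OpenAI's client tears the session down
second = store.handle(sid, "tool call 2")  # fails: session is gone
```

The second call fails not because anything is wrong with the request, but because the server insists on binding it to a session that no longer exists.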

The fix

Pass stateless_http=True when constructing the server:

```python
from fastmcp import FastMCP

mcp = FastMCP(
    name="MyToolServer",
    stateless_http=True,
)

@mcp.tool()
def greet(name: str) -> str:
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run(
        transport="http",
        host="0.0.0.0",
        port=8000,
    )
```

That's it. Each request is handled independently, no session ID is required, and the DELETE is a no-op instead of a problem.

Why it works

In stateless mode, FastMCP doesn't track session IDs across requests. Every incoming MCP call carries enough information on its own to be served, so the framework doesn't need a long-lived session to bind it to. OpenAI's per-call tear-down stops mattering because there's nothing to tear down.
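The contrast with the stateful toy is stark. Again as an illustration rather than FastMCP's actual code: a stateless handler does no session lookup, so there is nothing a prior DELETE can invalidate.

```python
# Toy illustration (not FastMCP internals): a stateless handler serves each
# request from its payload alone, with no session table to consult.
def handle_stateless(payload: dict) -> dict:
    # Everything needed to serve the call is in the request itself.
    return {"jsonrpc": "2.0", "id": payload["id"], "result": "ok"}

resp = handle_stateless({"jsonrpc": "2.0", "id": 2, "method": "tools/call"})
```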

What you give up

Stateless mode disables the parts of MCP that depend on cross-call state:

  • Sampling — server-initiated requests for the model to generate something on its behalf
  • Resource update notifications — notifications/resources/updated pushes that tell the client to re-fetch a resource
  • Progress notifications — partial progress updates streamed during a long-running tool call
  • Anything keyed on session ID — server-side per-session memory, rate counters, etc.

For most tool servers — pure request/response with no need to push back to the client — none of this matters and stateless is the right default. If you do need any of the above with OpenAI as the client, you'll need to build the state into the tool itself (a database, a cache keyed by some identifier the agent passes in) rather than relying on MCP's session model.
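One way to build that state into the tool itself, as a minimal sketch with hypothetical names: key a store on an identifier the agent passes as a tool parameter, rather than on the MCP session. In a real FastMCP server these functions would be registered with @mcp.tool() and the dict would be a database or cache shared across workers.

```python
# Hypothetical sketch: per-conversation memory keyed on an identifier the
# agent supplies, so it survives OpenAI's per-call session teardown.
# In production, replace the module-level dict with Redis or a database.
_memory: dict[str, list[str]] = {}

def remember(conversation_id: str, note: str) -> str:
    """Store a note under the agent-supplied conversation ID."""
    _memory.setdefault(conversation_id, []).append(note)
    return f"Stored note #{len(_memory[conversation_id])}"

def recall(conversation_id: str) -> list[str]:
    """Return all notes stored for this conversation ID."""
    return _memory.get(conversation_id, [])

remember("conv-42", "user prefers metric units")
remember("conv-42", "user is in Berlin")
notes = recall("conv-42")  # intact across calls, even in stateless mode
```

Because the key travels inside each tool call instead of living in an MCP session, the server can stay fully stateless at the protocol level while still offering per-conversation memory.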

TL;DR

FastMCP server behind OpenAI's Responses API and you see Session has been terminated after the first call? Set stateless_http=True and ship.