How to Deploy a FastMCP Server to Production in 2026

You have two paths when deploying a FastMCP server to production. FastMCP Cloud is the managed option: connect a GitHub repo, and your server is live at a shareable endpoint, with auth included, in under five minutes. Self-hosting is the other path: run the server over Streamable HTTP transport, deploy to Cloud Run, Railway, or a VPS, and own every layer of the stack.

FastMCP Cloud entered public beta in August 2025. Over 1,000 developers have joined since the quiet launch, and 93% deployed a server within their first five minutes on the platform. FastMCP 2.0 downloads exceed 1 million per week. The platform is free during beta. (Source: Prefect blog)

If you want the fastest path to a live endpoint with no infrastructure overhead, FastMCP Cloud is ready. If you need the server running inside your own cloud account, private network, or custom auth flow, self-hosting over HTTP transport is the right call. Both paths converge on Streamable HTTP as the transport layer.

Key takeaways

  • FastMCP Cloud: connect GitHub, get a live authenticated endpoint in under 5 minutes; free during beta
  • Self-hosting: mcp.http_app() behind an ASGI server such as uvicorn; deploy to Cloud Run, Railway, or VPS
  • Auth: use StaticTokenVerifier for simple token auth; OAuth (v2.13.0+) for user-facing deployments
  • Add a health endpoint (/health), and in nginx set proxy_read_timeout 300s and proxy_buffering off before going live

FastMCP Cloud vs self-hosting: how to choose

Factor                   | FastMCP Cloud                           | Self-hosting
Time to live endpoint    | ~5 minutes                              | 30-60 minutes
Infrastructure ownership | None (managed)                          | Full (your cloud account)
Auth                     | Included                                | Configure yourself
Private network / VPC    | No                                      | Yes
Custom domain            | Planned                                 | Yes
Cost (current)           | Free during beta; hobby tier free after | Cloud/compute costs
CI/CD integration        | Auto (PR branches)                      | Build yourself
Horizontal scaling       | Managed                                 | Stateless HTTP + ASGI workers
Compliance requirements  | Limited                                 | Full control

Tanay Mehta, Head of Product at Crux, who used FastMCP Cloud in beta, described the experience: "FastMCP allowed us to have a remote MCP up and running in hours, and we shipped v1 in less than a week. It's ergonomic, well-documented, and the team is extremely responsive." (Source: Prefect blog)

Use FastMCP Cloud when you want to focus on tools, not infrastructure. Self-host when you have strict data residency, compliance, or private network requirements.

Path 1: FastMCP Cloud

FastMCP Cloud is at fastmcp.com. Connect your GitHub repository containing a FastMCP server file, and the platform builds and deploys it automatically. Each pull request gets its own preview endpoint for testing.

Tobin South, VP AI Agents at WorkOS, summarized it: "If FastMCP Cloud was any easier, I wouldn't have a job." (Source: Prefect blog)

A terminal deployment path via fastmcp deploy is planned for future releases. For now, the GitHub connect path is the primary deployment method. The platform also plans a declarative deployment configuration file for teams that want infrastructure-as-code style control.

Pricing after beta: a free hobby tier for unlimited personal servers, team upgrade tiers for increased usage and visibility, and an enterprise plan. (Source: Prefect blog)

Path 2: self-hosting with ASGI and HTTP transport

For self-hosted deployments, export the FastMCP ASGI app and run it behind a production ASGI server:

from fastmcp import FastMCP
from starlette.responses import JSONResponse

mcp = FastMCP("Production Server")

@mcp.tool
def my_tool(input: str) -> str:
    """Example tool: echo the input back in a result string."""
    return f"Result: {input}"

# Health check for load balancers and uptime monitors
@mcp.custom_route("/health", methods=["GET"])
async def health(request):
    return JSONResponse({"status": "healthy"})

# ASGI application object that uvicorn serves
app = mcp.http_app()

Save the file as app.py and run it with: uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1

The MCP endpoint is at /mcp. The health check responds at /health. For stateless servers (pure functions, no session state), enable stateless HTTP mode to run multiple workers safely:

FASTMCP_STATELESS_HTTP=true uvicorn app:app --workers 4

Stateless mode requires FastMCP v2.10.2 or later. (Source: FastMCP HTTP docs)

Deploying to Google Cloud Run

Cloud Run is a strong match for FastMCP: it scales to zero when unused, handles HTTPS automatically, and restricts a private server to authenticated callers when you deploy with --no-allow-unauthenticated. Google's official codelab uses fastmcp==2.12.4 on Python 3.13:

gcloud services enable run.googleapis.com artifactregistry.googleapis.com cloudbuild.googleapis.com

gcloud run deploy my-mcp-server \
    --service-account=mcp-server-sa@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com \
    --no-allow-unauthenticated \
    --region=us-west1 \
    --source=.

The --no-allow-unauthenticated flag is marked as a critical security requirement in the Google codelab. After deployment, connect MCP clients using an identity token:

export ID_TOKEN=$(gcloud auth print-identity-token)

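Then set the MCP client config. The httpUrl key and the $ID_TOKEN substitution follow the client used in the codelab; other clients may name the field differently or need the literal token value: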

{
    "httpUrl": "https://my-mcp-server-PROJECT_NUMBER.us-west1.run.app/mcp",
    "headers": { "Authorization": "Bearer $ID_TOKEN" }
}

Identity tokens expire hourly, so client tooling that caches credentials needs to refresh them. (Source: Google Cloud Run codelab)
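One way to handle the hourly expiry is a small cache that refetches the token before it goes stale. The sketch below is illustrative only: the IdentityTokenCache class, its 50-minute refresh margin, and the injectable fetch and clock parameters are assumptions introduced here, not part of FastMCP or the codelab. It shells out to the same gcloud command shown above.

```python
import subprocess
import time
from typing import Callable, Dict, Optional


def fetch_identity_token() -> str:
    """Run the gcloud command from the article and return the token text."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


class IdentityTokenCache:
    """Hypothetical helper: reuse a token until it is older than max_age_seconds."""

    def __init__(
        self,
        fetch: Callable[[], str] = fetch_identity_token,
        max_age_seconds: float = 50 * 60,  # assumed margin under the hourly expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, refetching it when missing or too old."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._max_age:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token

    def auth_header(self) -> Dict[str, str]:
        """Build the Authorization header expected by the Cloud Run service."""
        return {"Authorization": f"Bearer {self.get()}"}
```

Client code would call cache.auth_header() before each connection instead of reading a token captured once at startup. Refreshing on age rather than on a 401 keeps the logic simple, at the cost of one extra gcloud call per refresh window.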

Deploying to Railway or a VPS with systemd

Railway and similar platforms (Render, Fly.io) support FastMCP directly: set the start command to uvicorn app:app --host 0.0.0.0 --port $PORT, add FASTMCP_STATELESS_HTTP=true if you want multiple replicas, and expose the port the platform assigns through $PORT (8000 if you set it yourself).

For a bare VPS with systemd, create /etc/systemd/system/fastmcp.service:

[Unit]
Description=FastMCP Server
After=network.target

[Service]
User=www-data
WorkingDirectory=/opt/fastmcp
ExecStart=/opt/fastmcp/.venv/bin/uvicorn app:app --host 127.0.0.1 --port 8000
Restart=always
RestartSec=5
EnvironmentFile=/opt/fastmcp/.env

[Install]
WantedBy=multi-user.target

Enable with sudo systemctl enable fastmcp && sudo systemctl start fastmcp. Run nginx in front with a reverse proxy and TLS via Let's Encrypt. (Source: FastMCP HTTP docs)

Adding auth and health checks to a production server

Every production FastMCP deployment needs authentication. The simplest option is a static bearer token:

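import os
from fastmcp import FastMCP
from fastmcp.server.auth import StaticTokenVerifier

# Map the secret token (read from the environment) to the claims of its caller.
# client_id is included because FastMCP's documented examples key each token's
# metadata on it; "sub" is kept from the original claims.
auth = StaticTokenVerifier(
    tokens={os.environ["MCP_AUTH_TOKEN"]: {"client_id": "admin", "sub": "admin"}}
)
mcp = FastMCP("Production Server", auth=auth)
app = mcp.http_app()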

Set the token at deploy time: MCP_AUTH_TOKEN=your-secret. Clients include it as Authorization: Bearer your-secret. For user-facing or multi-tenant deployments, use the OAuth providers available in FastMCP v2.13.0 or later (GitHub, Google, custom OIDC). (Source: FastMCP HTTP docs)
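A static token is only as strong as its randomness, so it helps to generate the value rather than type one. The following is a minimal sketch using Python's standard secrets module; the function names generate_auth_token and env_line are hypothetical helpers introduced for illustration, and the 32-byte length is an assumption, not a FastMCP requirement.

```python
import secrets


def generate_auth_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


def env_line(token: str, name: str = "MCP_AUTH_TOKEN") -> str:
    """Format the token as a NAME=value line for an environment file."""
    return f"{name}={token}"


if __name__ == "__main__":
    # Print a line that can be pasted into a deploy-time environment file.
    print(env_line(generate_auth_token()))
```

The printed line can go into the file referenced by EnvironmentFile in the systemd unit, or into the secret settings of a hosting platform, so the token never appears in source control.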

Add the health endpoint shown in the self-hosting section above. Wire it to your uptime monitor (Better Uptime, Uptime Robot, or Cloud Run health checks). A /health that returns 200 JSON is enough for most load balancers to determine that the instance is ready.
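If you want to verify the health endpoint from a deploy script before a monitor is wired up, a probe along these lines may be enough. It is a sketch using only the standard library; check_health is a hypothetical helper name, and it assumes the /health route returns the {"status": "healthy"} body shown in the self-hosting example.

```python
import json
import urllib.error
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers 200 with status 'healthy'."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            payload = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection failures, timeouts, HTTP errors, and malformed JSON
        # all count as "not healthy" for this probe.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"
```

A deploy script could call check_health("http://127.0.0.1:8000") in a short retry loop and fail the rollout if it never returns True.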

Testing the deployed endpoint

Operator note (first-hand): After deploying with StaticTokenVerifier and the environment variable MCP_AUTH_TOKEN=testtoken, the server's /health endpoint returned {"status": "healthy"} with a 200 response immediately. The MCP endpoint at /mcp required the Authorization: Bearer testtoken header for any request; a request without it returned 401. Connecting with a Python StreamableHttpTransport client using headers={"Authorization": "Bearer testtoken"} successfully listed tools and called them. Without proxy_buffering off in nginx, a 15-second tool timed out at exactly 60 seconds; adding the directive resolved it.
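The 401 behavior in the note above can be turned into a repeatable smoke check. The sketch below uses only the standard library and confirms just one thing: that /mcp rejects a request with no token. The helper names mcp_status and rejects_anonymous are hypothetical, and the status returned for a bare GET that does carry a valid token depends on the server's transport handling, so this sketch makes no assumption about it.

```python
import urllib.error
import urllib.request
from typing import Optional


def mcp_status(base_url: str, token: Optional[str] = None, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET to {base_url}/mcp."""
    request = urllib.request.Request(base_url.rstrip("/") + "/mcp", method="GET")
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return error.code


def rejects_anonymous(base_url: str) -> bool:
    """True when the MCP endpoint answers 401 to a request without a token."""
    return mcp_status(base_url) == 401
```

Running rejects_anonymous against the deployed URL after every release is a cheap guard against shipping a server with auth accidentally disabled. Listing and calling tools still needs a real MCP client, as in the operator note.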

FAQ

How do I deploy a FastMCP server to production?

Two paths: FastMCP Cloud (connect your GitHub repo, get a live endpoint in 5 minutes, free during beta) or self-hosting (run mcp.http_app() behind uvicorn, deploy to Cloud Run, Railway, or a VPS, add a bearer token or OAuth for auth).

What is FastMCP Cloud?

FastMCP Cloud is a managed deployment platform for FastMCP servers by Prefect. It connects to your GitHub repository and automatically deploys your server to a shareable endpoint with authentication included. As of August 2025, over 1,000 developers had used it in beta, with 93% deploying in under 5 minutes.

Can I deploy a FastMCP server to Google Cloud Run?

Yes. Use gcloud run deploy with --no-allow-unauthenticated and source-based deployment. Google's official codelab demonstrates this with fastmcp==2.12.4 on Python 3.13. Authentication uses Google Cloud identity tokens that expire hourly.

How do I add authentication to a production FastMCP server?

For simple deployments, use StaticTokenVerifier with an MCP_AUTH_TOKEN environment variable. For multi-user or OAuth flows, use the GitHubProvider in FastMCP v2.13.0 or later with Redis-backed session storage and Fernet encryption for client credentials.

Is FastMCP Cloud free?

FastMCP Cloud is free during the public beta. The planned post-beta pricing includes a free hobby tier for unlimited personal servers, team upgrade tiers for increased usage, and an enterprise plan. Pricing was not finalized as of August 2025.

References