How to Deploy a FastMCP Server to Production in 2026
You have two paths when deploying a FastMCP server to production. FastMCP Cloud is the managed option: connect a GitHub repo, and your server is live at a shareable endpoint, with auth included, in under five minutes. Self-hosting is the other path: run the server over Streamable HTTP transport, deploy to Cloud Run, Railway, or a VPS, and own every layer of the stack.
FastMCP Cloud entered public beta in August 2025. Over 1,000 developers have joined since the quiet launch, and 93% deployed a server within their first five minutes on the platform. FastMCP 2.0 downloads exceed 1 million per week. The platform is free during beta. (Source: Prefect blog)
If you want the fastest path to a live endpoint with no infrastructure overhead, FastMCP Cloud is ready. If you need the server running inside your own cloud account, private network, or custom auth flow, self-hosting over HTTP transport is the right call. Both paths converge on Streamable HTTP as the transport layer.
Key takeaways
- FastMCP Cloud: connect GitHub, get a live authenticated endpoint in under 5 minutes; free during beta
- Self-hosting: mcp.run(transport="http") + ASGI server; deploy to Cloud Run, Railway, or VPS
- Auth: use StaticTokenVerifier for simple token auth; OAuth (v2.13.0+) for user-facing deployments
- Add a health endpoint (/health) and set proxy_read_timeout 300s in nginx before going live
FastMCP Cloud vs self-hosting: how to choose
| Factor | FastMCP Cloud | Self-hosting |
|---|---|---|
| Time to live endpoint | ~5 minutes | 30-60 minutes |
| Infrastructure ownership | None (managed) | Full (your cloud account) |
| Auth | Included | Configure yourself |
| Private network / VPC | No | Yes |
| Custom domain | Planned | Yes |
| Cost (current) | Free during beta; hobby tier free after | Cloud/compute costs |
| CI/CD integration | Auto (PR branches) | Build yourself |
| Horizontal scaling | Managed | Stateless HTTP + ASGI workers |
| Compliance requirements | Limited | Full control |
Tanay Mehta, Head of Product at Crux, who used FastMCP Cloud in beta, described the experience: "FastMCP allowed us to have a remote MCP up and running in hours, and we shipped v1 in less than a week. It's ergonomic, well-documented, and the team is extremely responsive." (Source: Prefect blog)
Use FastMCP Cloud when you want to focus on tools, not infrastructure. Self-host when you have strict data residency, compliance, or private network requirements.
Path 1: FastMCP Cloud
FastMCP Cloud is at fastmcp.com. Connect your GitHub repository containing a FastMCP server file, and the platform builds and deploys it automatically. Each pull request gets its own preview endpoint for testing.
Tobin South, VP AI Agents at WorkOS, summarized it: "If FastMCP Cloud was any easier, I wouldn't have a job." (Source: Prefect blog)
A terminal deployment path via fastmcp deploy is planned for future releases. For now, the GitHub connect path is the primary deployment method. The platform also plans a declarative deployment configuration file for teams that want infrastructure-as-code style control.
Pricing after beta: a free hobby tier for unlimited personal servers, team upgrade tiers for increased usage and visibility, and an enterprise plan. (Source: Prefect blog)
Path 2: self-hosting with ASGI and HTTP transport
For self-hosted deployments, export the FastMCP ASGI app and run it behind a production ASGI server:
from fastmcp import FastMCP
from starlette.responses import JSONResponse

mcp = FastMCP("Production Server")

@mcp.tool
def my_tool(input: str) -> str:
    return f"Result: {input}"

# Health check for load balancers and uptime monitors
@mcp.custom_route("/health", methods=["GET"])
async def health(request):
    return JSONResponse({"status": "healthy"})

# ASGI application object that uvicorn serves as app:app
app = mcp.http_app()
Run with: uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1
The MCP endpoint is at /mcp. The health check responds at /health. For stateless servers (pure functions, no session state), enable stateless HTTP mode to run multiple workers safely:
FASTMCP_STATELESS_HTTP=true uvicorn app:app --workers 4
Stateless mode requires FastMCP v2.10.2 or later. (Source: FastMCP HTTP docs)
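Because multiple workers are only safe in stateless mode, a launcher script can guard against a mismatch before it starts uvicorn. The sketch below is a hypothetical helper (plan_workers is not part of FastMCP), and the set of values it treats as "enabled" is an assumption of this example, not FastMCP's own parsing rule.

```python
import os

# Assumption: spellings this sketch treats as "enabled". FastMCP's own
# settings parser may accept a different set.
TRUTHY = {"1", "true", "yes", "on"}


def plan_workers(requested: int, env=None) -> int:
    """Return a worker count that is safe for the current environment.

    More than one worker is only allowed when FASTMCP_STATELESS_HTTP is
    enabled; otherwise the count falls back to a single worker.
    """
    if requested < 1:
        raise ValueError("requested worker count must be at least 1")
    env = os.environ if env is None else env
    stateless = env.get("FASTMCP_STATELESS_HTTP", "").strip().lower() in TRUTHY
    if requested > 1 and not stateless:
        return 1  # stateful mode: stay on one worker
    return requested
```

A deploy script could pass the result to uvicorn's --workers flag, so a forgotten environment variable degrades to one worker instead of splitting sessions across processes.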
Deploying to Google Cloud Run
Cloud Run is a strong match for FastMCP: it scales to zero when unused, handles HTTPS automatically, and can require IAM-authenticated requests for private servers via --no-allow-unauthenticated. Google's official codelab uses fastmcp==2.12.4 on Python 3.13:
gcloud services enable run.googleapis.com artifactregistry.googleapis.com cloudbuild.googleapis.com
gcloud run deploy my-mcp-server \
--service-account=mcp-server-sa@$GOOGLE_CLOUD_PROJECT.iam.gserviceaccount.com \
--no-allow-unauthenticated \
--region=us-west1 \
--source=.
The --no-allow-unauthenticated flag is marked as a critical security requirement in the Google codelab. After deployment, connect MCP clients using an identity token:
export ID_TOKEN=$(gcloud auth print-identity-token)
Then set the MCP client config:
{
"httpUrl": "https://my-mcp-server-PROJECT_NUMBER.us-west1.run.app/mcp",
"headers": { "Authorization": "Bearer $ID_TOKEN" }
}
Identity tokens expire hourly, so client tooling that caches credentials needs to refresh them. (Source: Google Cloud Run codelab)
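One way to handle the hourly expiry in client tooling is a small cache that fetches a new token shortly before the old one would lapse. This is a minimal sketch under stated assumptions: IdentityTokenCache and the 50-minute refresh margin are hypothetical choices for illustration, and the default fetcher simply shells out to the gcloud command shown above.

```python
import subprocess
import time


def gcloud_identity_token() -> str:
    """Fetch a fresh identity token with the gcloud CLI."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


class IdentityTokenCache:
    """Reuse a token until it is older than max_age_seconds, then refetch."""

    def __init__(self, fetch=gcloud_identity_token,
                 max_age_seconds: float = 50 * 60, clock=time.monotonic):
        # Assumption: refreshing after 50 minutes leaves a safety margin
        # before the one-hour expiry.
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._max_age:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token

    def headers(self) -> dict:
        """Authorization header for the next MCP request."""
        return {"Authorization": f"Bearer {self.get()}"}
```

Calling cache.headers() before each request keeps the Authorization value current without invoking gcloud every time.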
Deploying to Railway or a VPS with systemd
Railway and similar platforms (Render, Fly.io) run FastMCP like any other ASGI app: set the start command to uvicorn app:app --host 0.0.0.0 --port $PORT, add FASTMCP_STATELESS_HTTP=true if you want multiple replicas, and expose the port the server binds to (the value of $PORT where the platform injects it, or 8000 if you set it yourself).
For a bare VPS with systemd, create /etc/systemd/system/fastmcp.service:
[Unit]
Description=FastMCP Server
After=network.target
[Service]
User=www-data
WorkingDirectory=/opt/fastmcp
ExecStart=/opt/fastmcp/.venv/bin/uvicorn app:app --host 127.0.0.1 --port 8000
Restart=always
RestartSec=5
EnvironmentFile=/opt/fastmcp/.env
[Install]
WantedBy=multi-user.target
Enable with sudo systemctl enable fastmcp && sudo systemctl start fastmcp. Run nginx in front with a reverse proxy and TLS via Let's Encrypt. (Source: FastMCP HTTP docs)
Adding auth and health checks to a production server
Every production FastMCP deployment needs authentication. The simplest option is a static bearer token:
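import os
from fastmcp import FastMCP
from fastmcp.server.auth.providers.jwt import StaticTokenVerifier

# Map the secret from the environment to the metadata attached to that token
auth = StaticTokenVerifier(tokens={os.environ["MCP_AUTH_TOKEN"]: {"client_id": "admin", "scopes": []}})
mcp = FastMCP("Production Server", auth=auth)
app = mcp.http_app()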
Set the token at deploy time: MCP_AUTH_TOKEN=your-secret. Clients include it as Authorization: Bearer your-secret. For user-facing or multi-tenant deployments, use the OAuth providers available from FastMCP v2.13.0 (GitHub, Google, custom OIDC). (Source: FastMCP HTTP docs)
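A static token is only as strong as its randomness, so it helps to generate the value rather than type one. The sketch below uses Python's standard secrets module; the helper names new_auth_token and bearer_header are hypothetical and exist only for this example.

```python
import secrets


def new_auth_token(nbytes: int = 32) -> str:
    """Return a URL-safe random token suitable for MCP_AUTH_TOKEN."""
    return secrets.token_urlsafe(nbytes)


def bearer_header(token: str) -> dict:
    """Build the header a client sends with each request."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Print once, then store the value in your platform's secret manager.
    print(f"MCP_AUTH_TOKEN={new_auth_token()}")
```

Store the generated value as a deployment secret and hand the same string to clients; avoid committing it to the repository.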
Add the health endpoint shown in the self-hosting section above. Wire it to your uptime monitor (Better Uptime, Uptime Robot, or Cloud Run health checks). A /health that returns 200 JSON is enough for most load balancers to determine that the instance is ready.
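The same check can gate a deploy script: poll /health until it answers, then continue. This is a rough sketch using only the standard library; fetch_health and wait_until_healthy are hypothetical names, and the retry count and delay are arbitrary defaults, not recommendations from the FastMCP docs.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(base_url: str, timeout: float = 5.0) -> bool:
    """True if GET <base_url>/health returns 200 with {"status": "healthy"}."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
            return (
                response.status == 200
                and isinstance(body, dict)
                and body.get("status") == "healthy"
            )
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or invalid JSON.
        return False


def wait_until_healthy(check, attempts: int = 10, delay: float = 2.0,
                       sleep=time.sleep) -> bool:
    """Call check() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_until_healthy(lambda: fetch_health("http://127.0.0.1:8000")) blocks until the local server from the systemd example is up, or gives up after ten tries.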
Testing the deployed endpoint
Operator note (first-hand): After deploying with StaticTokenVerifier and the environment variable MCP_AUTH_TOKEN=testtoken, the server's /health endpoint returned {"status": "healthy"} with a 200 response immediately. The MCP endpoint at /mcp required the Authorization: Bearer testtoken header for any request; a request without it returned 401. Connecting with a Python StreamableHttpTransport client using headers={"Authorization": "Bearer testtoken"} successfully listed tools and called them. Without proxy_buffering off in nginx, a tool call that runs for 15 seconds timed out at exactly 60 seconds; adding the directive resolved it.
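The 401 behavior described in the note can be turned into a repeatable smoke check. The sketch below does not speak the MCP protocol; it only compares HTTP status codes with and without the bearer token, and it treats any non-401 answer as "token accepted", which is an assumption of this example. The names status_of and auth_is_enforced are hypothetical.

```python
import urllib.error
import urllib.request


def status_of(url: str, token=None, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request, including 4xx and 5xx codes."""
    request = urllib.request.Request(url)
    if token is not None:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is the value we want.
        return error.code


def auth_is_enforced(mcp_url: str, token: str, probe=status_of) -> bool:
    """True if anonymous requests get 401 and requests with the token do not."""
    anonymous = probe(mcp_url)
    with_token = probe(mcp_url, token)
    return anonymous == 401 and with_token != 401
```

Running auth_is_enforced against the deployed /mcp URL after each release catches the case where the server was started without its auth configuration.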
FAQ
How do I deploy a FastMCP server to production?
Two paths: FastMCP Cloud (connect your GitHub repo, get a live endpoint in 5 minutes, free during beta) or self-hosting (run mcp.http_app() behind uvicorn, deploy to Cloud Run, Railway, or a VPS, add a bearer token or OAuth for auth).
What is FastMCP Cloud?
FastMCP Cloud is a managed deployment platform for FastMCP servers by Prefect. It connects to your GitHub repository and automatically deploys your server to a shareable endpoint with authentication included. As of August 2025, over 1,000 developers have used it in beta, with 93% deploying in under 5 minutes.
Can I deploy a FastMCP server to Google Cloud Run?
Yes. Use gcloud run deploy with --no-allow-unauthenticated and source-based deployment. Google's official codelab demonstrates this with fastmcp==2.12.4 on Python 3.13. Authentication uses Google Cloud identity tokens that expire hourly.
How do I add authentication to a production FastMCP server?
For simple deployments, use StaticTokenVerifier with a MCP_AUTH_TOKEN environment variable. For multi-user or OAuth flows, use the GitHubProvider in FastMCP v2.13.0 with Redis-backed session storage and Fernet encryption for client credentials.
Is FastMCP Cloud free?
FastMCP Cloud is free during the public beta. The planned post-beta pricing includes a free hobby tier for unlimited personal servers, team upgrade tiers for increased usage, and an enterprise plan. Pricing was not finalized as of August 2025.
Related coverage
- FastMCP from OpenAPI: Build an MCP Server from Your Existing API
- FastMCP OAuth Token Validation: Server-Side Patterns and Pitfalls
- FastMCP vs MCP Python SDK: Which to Use in 2026
- MCP Transport Security: STDIO, SSE, and Streamable HTTP Risks
References
- FastMCP Cloud public beta announcement - https://www.prefect.io/blog/accelerating-ai-with-fastmcp-cloud
- FastMCP HTTP deployment docs - https://gofastmcp.com/deployment/http
- Google Cloud Run MCP server codelab - https://codelabs.developers.google.com/codelabs/cloud-run/how-to-deploy-a-secure-mcp-server-on-cloud-run



