Hosting Your Own MCP Server Online: Python, Heroku, and OAuth
The first MCP server most people build runs over stdio: a local subprocess the client launches on your machine. That is fine for desktop tooling, but it does not work for anything you want to reach from the web, share with a team, or connect to from a browser-based client. stdio servers cannot be hit over a URL, cannot be multi-tenant, and force every user to install the server locally.
This guide builds the other kind: a remote MCP server that runs over Streamable HTTP, lives on a public HTTPS URL, and authenticates callers. We will write it in Python with FastMCP, deploy it to Heroku, and connect it from both Claude.ai and Cursor. Auth is covered two ways: a simple API key for private/personal use, and OAuth 2.1 for the full spec-compliant flow.
If you have not built a basic MCP tool before, read the stdio walkthrough first; this article assumes you know what a tool and a schema are.
The shape of a remote server
Three things change when you go from local to remote.
- Transport. Local servers use stdio. Remote servers use Streamable HTTP, the transport introduced in the March 2025 spec that replaced the older two-endpoint HTTP+SSE design with a single bidirectional endpoint. Every current client negotiates it. Build Streamable HTTP only; do not ship SSE for new work.
- An endpoint. Instead of mcp.run() with no arguments, the server exposes an HTTP path, conventionally /mcp. Clients connect to https://your-app.example.com/mcp.
- Authentication. A local stdio server runs as you, so auth is implicit. A server on the public internet needs real auth. The spec builds its authorization flow for HTTP servers on OAuth 2.1, and that is what you should use for anything shared or public; for a private personal tool, a static API key checked in middleware is a pragmatic, lighter option. We will do both.
Prerequisites
- Python 3.10+
- pip (bundled with Python)
- A Heroku account and the Heroku CLI (heroku --version)
- git
Step 1: The project
mkdir mcp-remote && cd mcp-remote
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install "mcp[cli]" uvicorn gunicorn uvicorn-worker
The MCP Streamable HTTP endpoint is an async ASGI application, so it needs an ASGI server, not a plain WSGI one. We install three pieces for the server side: uvicorn (the ASGI server, also handy for local dev), gunicorn (a battle-tested process manager for production), and uvicorn-worker (lets gunicorn run uvicorn as its worker class). In production on Heroku we run gunicorn with uvicorn workers; locally you can just run uvicorn directly.
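To make "ASGI application" concrete, here is a minimal sketch of the interface that uvicorn and gunicorn's uvicorn worker expect, independent of MCP. An ASGI app is just an async callable that takes scope, receive, and send. The names hello_app and call_once are illustrative only; call_once is a tiny in-process driver so the example runs without a real server.

```python
import asyncio


async def hello_app(scope, receive, send):
    """A bare ASGI app: answers every HTTP request with a plain-text body."""
    if scope["type"] != "http":
        return  # this sketch ignores lifespan and websocket events
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello from ASGI"})


async def call_once(app, path="/"):
    """Drive an ASGI app in-process and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    await app(scope, receive, send)
    return sent


messages = asyncio.run(call_once(hello_app))
print(messages[0]["status"], messages[1]["body"])
```

A WSGI server has no way to call a function shaped like this, which is why the Procfile later pairs gunicorn with a uvicorn worker.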
Step 2: The server with an HTTP endpoint
Create server.py:
import os
from enum import Enum
from mcp.server.fastmcp import FastMCP
from pydantic import BaseModel, Field
mcp = FastMCP("remote-demo")
@mcp.tool()
def say_hello(name: str) -> str:
    """Return a friendly greeting for the given name."""
    return f"Hello, {name}! You are talking to a hosted MCP server."


class Units(str, Enum):
    celsius = "celsius"
    fahrenheit = "fahrenheit"


class WeatherReport(BaseModel):
    location: str = Field(description="The location this report is for.")
    temperature: float = Field(description="Temperature in the requested units.")
    units: Units = Field(description="Units the temperature is expressed in.")
    conditions: str = Field(description="Short description, e.g. 'Partly cloudy'.")


@mcp.tool()
def get_weather(
    location: str = Field(description="City name, e.g. 'Toronto, CA'."),
    units: Units = Field(default=Units.celsius, description="Temperature units."),
) -> WeatherReport:
    """Look up current weather for a location."""
    temp_c = 21.0
    temp = temp_c if units == Units.celsius else round(temp_c * 9 / 5 + 32, 1)
    return WeatherReport(
        location=location,
        temperature=temp,
        units=units,
        conditions="Partly cloudy",
    )

# Expose the server as an ASGI app mounted at /mcp.
# uvicorn/gunicorn import this `app` object.
app = mcp.streamable_http_app()

if __name__ == "__main__":
    # Local dev convenience. Heroku does NOT use this branch;
    # it runs gunicorn (with a uvicorn worker) against `app` via the Procfile.
    # In the official SDK, FastMCP.run() does not take host/port arguments;
    # they are read from mcp.settings instead.
    mcp.settings.host = "0.0.0.0"
    mcp.settings.port = int(os.environ.get("PORT", 8000))
    mcp.run(transport="streamable-http")
The key line is app = mcp.streamable_http_app(). That turns your FastMCP server into a standard ASGI application with the MCP endpoint mounted at /mcp, which is what a production web server can serve.
Test locally before going further. For local development, running uvicorn directly is simplest (no need for gunicorn until production):
uvicorn server:app --host 0.0.0.0 --port 8000
Your endpoint is now at http://localhost:8000/mcp. You can point the MCP Inspector at it (mcp dev server.py opens the Inspector; switch its transport to Streamable HTTP and connect to that URL) to confirm both tools list and run.
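If you would rather check the endpoint from a script, the following is a rough smoke-test sketch using only the standard library. It builds the JSON-RPC initialize request that opens an MCP session. The helper name build_initialize_request, the clientInfo values, and the extra_headers parameter are assumptions made for illustration; the protocol version string corresponds to the March 2025 spec revision mentioned earlier.

```python
import json
import urllib.request


def build_initialize_request(url, extra_headers=None):
    """Build the JSON-RPC `initialize` POST that opens an MCP session."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.1"},
        },
    }
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either plain JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    headers.update(extra_headers or {})
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_initialize_request("http://localhost:8000/mcp")

# With the server running, uncomment to send it; a 200 means the endpoint is up.
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.headers.get("mcp-session-id"))
```

The same helper works against a deployed URL later; pass any auth header the server expects through extra_headers.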
Step 3: Add authentication
Pick one of the two approaches below depending on who will use the server.
Option A: API key (simplest, good for private/personal use)
A small piece of ASGI middleware rejects any request that does not carry the right key in a header. The key lives in an environment variable, never in code.
Add to server.py, replacing the app = mcp.streamable_http_app() line:
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse


class APIKeyMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        expected = os.environ.get("MCP_API_KEY")
        # If no key is configured, allow through (useful for local dev).
        if expected:
            provided = request.headers.get("x-api-key")
            if provided != expected:
                return JSONResponse({"error": "unauthorized"}, status_code=401)
        return await call_next(request)


app = mcp.streamable_http_app()
app.add_middleware(APIKeyMiddleware)
The client sends the key as a header (x-api-key: ...). Both Claude.ai and Cursor let you attach custom headers, shown later. This is not OAuth and is not what the spec recommends for public, multi-user servers, but for a tool only you call it is simple and effective. Treat the key like a password and rotate it if it leaks.
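A stricter variant is sketched below, for when "allow through if no key is set" feels too permissive for a deployed server. It is written as plain ASGI middleware (no Starlette base class), compares keys in constant time with hmac.compare_digest, and rejects every request when no key is configured. The class name StrictAPIKeyMiddleware and the helpers _ok_app and status_for are hypothetical; the helpers exist only to exercise the middleware in-process.

```python
import asyncio
import hmac
import json
import os


class StrictAPIKeyMiddleware:
    """Pure-ASGI key check: constant-time compare, fails closed with no key."""

    def __init__(self, app, expected_key=None):
        self.app = app
        key = expected_key or os.environ.get("MCP_API_KEY", "")
        self.expected = key.encode("utf-8")

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan (startup/shutdown) events straight through.
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers") or [])
        provided = headers.get(b"x-api-key", b"")
        ok = bool(self.expected) and hmac.compare_digest(provided, self.expected)
        if not ok:
            body = json.dumps({"error": "unauthorized"}).encode("utf-8")
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"content-type", b"application/json")],
            })
            await send({"type": "http.response.body", "body": body})
            return
        await self.app(scope, receive, send)


async def _ok_app(scope, receive, send):
    """Stand-in for the real MCP app; always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def status_for(app, key=None):
    """Send one fake request through `app` and return the response status."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    headers = [(b"x-api-key", key.encode("utf-8"))] if key else []
    scope = {"type": "http", "method": "POST", "path": "/mcp", "headers": headers}
    await app(scope, receive, send)
    return sent[0]["status"]


guarded = StrictAPIKeyMiddleware(_ok_app, expected_key="s3cret")
print(asyncio.run(status_for(guarded, "s3cret")))
print(asyncio.run(status_for(guarded, "wrong")))
```

In server.py this would presumably wrap the real app, along the lines of app = StrictAPIKeyMiddleware(mcp.streamable_http_app()). The trade-off is that local development then needs MCP_API_KEY set as well.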
Option B: OAuth 2.1 (spec-compliant, for shared or public servers)
The MCP authorization spec is built on OAuth 2.1 with PKCE, and that is the flow to implement for shared or public remote servers. Briefly:
- The client hits your server with no token and gets a 401 whose WWW-Authenticate header points at your authorization server.
- The client discovers the authorization server’s metadata, registers, and opens a browser for the user to log in and consent.
- The user grants scopes; the client receives an access token bound to your server as the audience (RFC 8707), and sends it as Authorization: Bearer ... on every call.
You generally do not hand-roll this. Two sane paths:
- Bring your own identity provider. FastMCP exposes auth through an AuthSettings object and a token verifier. You point it at an existing OAuth 2.1 / OpenID Connect provider (Auth0, Okta, Clerk, WorkOS, your own) that issues and validates tokens. Your server becomes a resource server that only validates incoming tokens; it never issues them.
- Use a gateway. Managed MCP gateways and platforms can terminate OAuth in front of your server so your application code stays auth-free.
A minimal sketch of the resource-server wiring in FastMCP looks like this (provider details depend on your IdP):
from mcp.server.auth.settings import AuthSettings

mcp = FastMCP(
    "remote-demo",
    auth=AuthSettings(
        issuer_url="https://YOUR_TENANT.us.auth0.com",
        required_scopes=["mcp:tools"],
        # plus a token verifier configured for your provider's JWKS
    ),
)
One spec rule worth internalizing: never pass a client’s token through to a downstream API (Slack, GitHub, Stripe). A token minted for your server must not be forwarded to a different service; that breaks audience binding and creates a confused-deputy vulnerability. If your tools call other APIs, use their own separate credentials, stored server-side.
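Audience binding is easiest to understand as a check the resource server performs on every token. The sketch below assumes a JWT library has already verified the signature and expiry and handed back a claims dictionary; it then confirms the token was minted for this server and carries the required scopes. The function name check_token_claims is hypothetical, and the space-delimited scope claim is a common convention rather than something every identity provider follows.

```python
def check_token_claims(claims, expected_audience, required_scopes):
    """Return (ok, reason) for claims whose signature was already verified."""
    aud = claims.get("aud")
    # `aud` may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        return False, "wrong audience"
    granted = set(str(claims.get("scope", "")).split())
    missing = [scope for scope in required_scopes if scope not in granted]
    if missing:
        return False, "missing scopes: " + ", ".join(missing)
    return True, "ok"


ours = {"aud": "https://my-mcp-server.herokuapp.com/mcp", "scope": "mcp:tools"}
theirs = {"aud": "https://api.example.com", "scope": "mcp:tools"}

print(check_token_claims(ours, "https://my-mcp-server.herokuapp.com/mcp", ["mcp:tools"]))
print(check_token_claims(theirs, "https://my-mcp-server.herokuapp.com/mcp", ["mcp:tools"]))
```

A downstream API running the same kind of check is exactly why a forwarded token should fail there: its audience names your server, not theirs.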
For the rest of this guide the Heroku steps are identical either way; only the environment variables differ.
Step 4: Files Heroku needs
Three files in the project root.
Procfile (no extension) tells Heroku how to start the app. We run gunicorn as the process manager with a uvicorn worker class so the async MCP endpoint is served correctly. $PORT is injected by Heroku at runtime:
web: gunicorn server:app -k uvicorn_worker.UvicornWorker --bind 0.0.0.0:$PORT
A note on worker count: gunicorn defaults to a single worker, which is what you want here unless you change the server’s session handling. FastMCP’s HTTP transport keeps session state in memory per process, so adding workers (-w 4) without switching to stateless mode means a client’s requests can land on a worker that does not know its session. Start with one worker; only scale out after addressing sessions (see the pitfalls section).
(The worker class is uvicorn_worker.UvicornWorker from the uvicorn-worker package. Older guides import uvicorn.workers.UvicornWorker from inside uvicorn itself; that path still works but is deprecated, so prefer the standalone package.)
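If the Procfile flags start to pile up, the same settings can live in an optional gunicorn.conf.py instead. This is a sketch of an alternative layout, not a file the steps here require; bind, workers, and worker_class are standard gunicorn settings, and the 8000 fallback is an assumption for local runs.

```python
# gunicorn.conf.py (optional; gunicorn reads it from the working directory)
import os

# Heroku injects PORT at runtime; 8000 is only a local fallback.
bind = "0.0.0.0:" + os.environ.get("PORT", "8000")

# One worker, because MCP session state lives in this process's memory.
workers = 1

# Run uvicorn inside gunicorn so the async ASGI app is served correctly.
worker_class = "uvicorn_worker.UvicornWorker"
```

With this file in the project root, the Procfile line could shrink to web: gunicorn server:app.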
runtime.txt pins the Python version:
python-3.12.7
requirements.txt lists dependencies. Freeze them from your virtual environment:
pip freeze > requirements.txt
Or write it by hand:
mcp[cli]
uvicorn
gunicorn
uvicorn-worker
Add a .gitignore too, so you do not commit your virtual environment or secrets:
.venv/
__pycache__/
.env
Step 5: Deploy to Heroku
git init
git add .
git commit -m "Initial remote MCP server"
heroku login
heroku create my-mcp-server # pick a unique name
# Set your secrets as config vars (never commit these):
heroku config:set MCP_API_KEY=$(openssl rand -hex 32)
# (OAuth users set their IdP vars here instead.)
git push heroku main
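Heroku installs your dependencies, reads the Procfile, and starts gunicorn. When the push finishes, your endpoint is live at the app's web URL plus /mcp (heroku apps:info prints the exact hostname, which on newer apps may carry a generated suffix after the app name):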
https://my-mcp-server.herokuapp.com/mcp
Check it is running:
heroku logs --tail
You want to see gunicorn boot, start its uvicorn worker, and bind to the port. If the app crashes on boot, the logs almost always name the missing dependency or the bad Procfile line.
Note the API key you generated:
heroku config:get MCP_API_KEY
You will paste it into the client in the next step.
Step 6: Connect from Claude.ai
Claude connects to remote servers as custom connectors. Claude reaches your server from Anthropic’s cloud, not from your device, so the URL must be public HTTPS, which Heroku gives you.
- In Claude.ai, go to Settings → Connectors (shortcut: claude.ai/settings/connectors).
- Scroll to the bottom and click Add custom connector.
- Enter a name and your URL: https://my-mcp-server.herokuapp.com/mcp.
- For OAuth, click Advanced settings and supply the OAuth Client ID and Client Secret if your server requires them; Claude runs the OAuth flow and you authorize in the browser.
- For the API key option, add the custom header (x-api-key with your key value) in the connector’s advanced settings.
- Click Add.
Then, in any conversation, click the + button, open Connectors, and toggle your server on. Custom connectors are in beta and available across Free, Pro, Max, Team, and Enterprise, though Free is limited to one. On Team and Enterprise, an Owner has to add the connector at the organization level first before members can enable it.
Step 7: Connect from Cursor
Cursor reads MCP config from a JSON file: ~/.cursor/mcp.json for global use, or .cursor/mcp.json in a project root for team-shared, version-controlled setup.
For a remote server you provide a url and, if needed, headers.
API-key server:
{
  "mcpServers": {
    "remote-demo": {
      "url": "https://my-mcp-server.herokuapp.com/mcp",
      "headers": {
        "x-api-key": "${env:MCP_API_KEY}"
      }
    }
  }
}
The ${env:MCP_API_KEY} syntax pulls the value from an environment variable instead of hard-coding the secret in the file. This matters if you commit .cursor/mcp.json to a repo: never put a plaintext key in version control.
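The idea behind that placeholder is plain string substitution at load time. The sketch below is an illustration of the concept only, not Cursor's implementation: it replaces each ${env:NAME} reference with the value of that environment variable and fails loudly when the variable is missing. The function name expand_env_refs is hypothetical.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_refs(value, environ=None):
    """Replace each ${env:NAME} with environ[NAME]; raise if NAME is unset."""
    environ = os.environ if environ is None else environ

    def substitute(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _ENV_REF.sub(substitute, value)


print(expand_env_refs("${env:MCP_API_KEY}", {"MCP_API_KEY": "abc123"}))
```

The practical consequence is that the variable must be set in the environment Cursor itself starts from, not just in a terminal you opened afterwards.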
OAuth server: just give the URL and omit the headers. Cursor detects that the server requires auth, opens a browser OAuth flow, and stores the credentials for you:
{
  "mcpServers": {
    "remote-demo": {
      "url": "https://my-mcp-server.herokuapp.com/mcp"
    }
  }
}
You can also use Settings → Tools & MCP → New MCP Server to add it through the UI. Restart Cursor fully after editing the file. One ceiling to know about: Cursor caps the total number of tools across all servers (around 40), so keep your server focused.
Common pitfalls
- Binding to the wrong port. On Heroku you must bind $PORT. With 8000 hard-coded, the web process never binds the port Heroku assigned, so the dyno is killed with a boot timeout (R10) and shows as crashed. The Procfile above handles this.
- Forgetting the /mcp path. The base URL alone is not the endpoint. Clients connect to https://.../mcp.
- Shipping SSE. New servers should be Streamable HTTP only. Do not add a legacy SSE endpoint.
- Token passthrough. Do not forward a client’s auth token to a downstream API. Give your tools their own server-side credentials.
- Secrets in code or git. Keys and client secrets belong in Heroku config vars and in client-side env references, never in the repo.
- Dyno sleep. Heroku no longer offers a free tier, but Eco dynos still sleep when idle. The first request after a nap is slow while the dyno wakes; the client may time out on the very first call and succeed on retry. Use a Basic or larger dyno for anything you rely on.
- Stateful sessions across multiple workers or dynos. FastMCP’s HTTP transport keeps session state in memory per process by default. The same problem appears whether you add gunicorn workers (-w) or scale to more than one dyno: a client’s follow-up requests may land on a process that does not know its session. Stay on a single worker and a single dyno, or run in stateless mode / add shared session storage before scaling out.
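The session pitfall is easy to reproduce in miniature. The simulation below is a toy model, not FastMCP code: each Worker stands in for one server process with its own in-memory session table, and requests are spread round-robin the way a router without sticky sessions might spread them. The 404 for an unknown session is an assumption about how the failure surfaces.

```python
import itertools
import uuid


class Worker:
    """Stand-in for one server process with its own in-memory sessions."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def initialize(self):
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {"initialized": True}
        return session_id

    def handle(self, session_id):
        # A process that never saw this session cannot serve it.
        return 200 if session_id in self.sessions else 404


def run_client(workers, calls=4):
    """Open a session, then send follow-up calls round-robin across workers."""
    router = itertools.cycle(workers)
    session_id = next(router).initialize()
    return [next(router).handle(session_id) for _ in range(calls)]


print(run_client([Worker("a")]))               # [200, 200, 200, 200]
print(run_client([Worker("a"), Worker("b")]))  # [404, 200, 404, 200]
```

With two workers, every request that lands on the process that did not create the session fails, which is the intermittent breakage you would see after scaling out.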
Where to go next
You now have a server reachable from any MCP client over the web, with a choice of API-key or OAuth auth, deployed on infrastructure you control. The natural next steps are giving the tools something real to do (call an actual API, query a database) using server-side credentials, tightening OAuth scopes so callers only get the access they need, and adding logging or rate-limiting middleware in the same place the API-key check lives.
PS: Make get_weather return real data
The get_weather tool above returns a hard-coded 21°C so the deployment steps stay the focus. Here is how to make it return live data using a genuinely free API.
We use Open-Meteo. It requires no API key, no signup, and no token management, which makes it ideal for an MCP tool: there is no secret to store in Heroku config vars. It is free for non-commercial use, and the data is licensed CC BY 4.0 (attribute Open-Meteo if you redistribute). Two endpoints do the work, both plain HTTP GET returning JSON:
- Geocoding turns a place name into coordinates: https://geocoding-api.open-meteo.com/v1/search
- Forecast turns coordinates into current weather: https://api.open-meteo.com/v1/forecast
First add an HTTP client to your dependencies:
pip install requests
And add it to requirements.txt (requests) so Heroku installs it too.
Now replace the demo get_weather with this version. It geocodes the location, fetches the current weather, asks Open-Meteo to return the units the caller requested, and maps the numeric weather code to a readable description.
import requests
GEOCODE_URL = "https://geocoding-api.open-meteo.com/v1/search"
FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
# Open-Meteo encodes conditions as WMO weather codes. A small subset:
WMO_CODES = {
    0: "Clear sky",
    1: "Mainly clear",
    2: "Partly cloudy",
    3: "Overcast",
    45: "Fog",
    48: "Depositing rime fog",
    51: "Light drizzle",
    53: "Moderate drizzle",
    55: "Dense drizzle",
    61: "Slight rain",
    63: "Moderate rain",
    65: "Heavy rain",
    71: "Slight snow",
    73: "Moderate snow",
    75: "Heavy snow",
    80: "Rain showers",
    81: "Moderate rain showers",
    82: "Violent rain showers",
    95: "Thunderstorm",
    96: "Thunderstorm with hail",
}
@mcp.tool()
def get_weather(
    location: str = Field(description="City name, e.g. 'Toronto, CA'."),
    units: Units = Field(default=Units.celsius, description="Temperature units."),
) -> WeatherReport:
    """Look up current weather for a location using the Open-Meteo API."""
    # 1. Resolve the place name to coordinates.
    geo = requests.get(
        GEOCODE_URL,
        params={"name": location, "count": 1},
        timeout=10,
    )
    geo.raise_for_status()
    results = geo.json().get("results")
    if not results:
        raise ValueError(f"Could not find a location named '{location}'.")
    place = results[0]
    lat, lon = place["latitude"], place["longitude"]
    resolved = ", ".join(
        part for part in (place.get("name"), place.get("country")) if part
    )

    # 2. Fetch current weather in the requested units.
    unit_param = "fahrenheit" if units == Units.fahrenheit else "celsius"
    forecast = requests.get(
        FORECAST_URL,
        params={
            "latitude": lat,
            "longitude": lon,
            # Send a lowercase string; requests would serialize True as "True".
            "current_weather": "true",
            "temperature_unit": unit_param,
        },
        timeout=10,
    )
    forecast.raise_for_status()
    current = forecast.json()["current_weather"]

    return WeatherReport(
        location=resolved,
        temperature=current["temperature"],
        units=units,
        conditions=WMO_CODES.get(current["weathercode"], "Unknown"),
    )
A few things worth noting:
- Always set a timeout. A tool that hangs on a slow upstream parks the model’s request indefinitely. Ten seconds is a reasonable ceiling.
- Let the API do unit conversion. Passing temperature_unit means you never do the math yourself, so there is no rounding drift between Celsius and Fahrenheit.
- Raise a clear error on a missing location. The ValueError message travels back to the model as a tool error it can act on, which is far more useful than a generic 500.
- No key means no secret. Because Open-Meteo is keyless, this works on Heroku with zero extra config. If you later switch to a keyed provider, store the key in a Heroku config var and read it with os.environ, exactly as we did with MCP_API_KEY, and never hard-code it.
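One optional refinement is to pull the response parsing out of the tool into a pure function, so it can be unit-tested without touching the network. The sketch below assumes the same current_weather fields the tool reads (temperature and weathercode); the function name parse_current_weather, the trimmed WMO_SUBSET table, and the sample payload are made up for the example.

```python
# A trimmed copy of the code table, just for this example.
WMO_SUBSET = {0: "Clear sky", 2: "Partly cloudy", 61: "Slight rain"}


def parse_current_weather(payload, codes=WMO_SUBSET):
    """Extract (temperature, conditions) from a forecast response body."""
    try:
        current = payload["current_weather"]
        temperature = float(current["temperature"])
        code = int(current["weathercode"])
    except (KeyError, TypeError, ValueError) as exc:
        # Turn a malformed upstream response into a readable tool error.
        raise ValueError(f"Unexpected forecast response: {exc!r}") from exc
    return temperature, codes.get(code, "Unknown")


# Hand-written sample shaped like the fields the tool reads.
sample = {"current_weather": {"temperature": 18.4, "weathercode": 2}}
print(parse_current_weather(sample))  # (18.4, 'Partly cloudy')
```

Inside get_weather, the tool would call this on forecast.json() and pass the two values into WeatherReport.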
Redeploy with the usual git push heroku main, and your hosted server now answers real weather questions from any connected client.
