Advanced · mcp · tooling · governance · integration · security

MCP Gateway

The Model Context Protocol gives hosts, clients, and servers a clean interoperability story. But once you cross the line from one integration to dozens — across teams, tenants, and trust boundaries — you need a place to enforce policy, normalize transports, and see everything that happens. The MCP Gateway is that place.

[Interactive diagram: MCP Gateway Architecture]

📐 The protocol is clean. The operational surface is not.

Why MCP needs a gateway

MCP is intentionally minimal at the edge. A host coordinates clients, each client maintains an isolated stateful session with a server, and servers expose tools, resources, and prompts over JSON-RPC. That design is excellent for composability — you can connect any host to any server and it just works. It is also exactly why teams reach production and suddenly discover they have no central place to manage the system as a whole.

Direct host-to-server topology works fine for a single developer wiring up Claude Desktop to a Git server. It starts failing the moment MCP usage becomes multi-team, multi-tenant, or compliance-sensitive. Every host re-implements its own auth handling. Every server advertises capabilities that no human has approved at a platform level. Rate limits are local, so a runaway agent on one host can saturate an upstream that a dozen other hosts depend on. Tool calls succeed or fail inside individual hosts with no global observability, so incident response means grepping logs on whichever laptop saw the problem.

These are not flaws in MCP. They are governance gaps that any capability-sharing protocol develops at scale, and they are exactly the gaps that operating systems, service meshes, and API gateways learned to fill in earlier eras. The MCP Gateway pattern applies that lesson: insert a policy and routing tier between hosts and capability providers, keep the host-client-server model intact on either side of it, and move cross-cutting concerns into one operator-owned seam.

🔧 One place to enforce auth, approved servers, and quotas

Policy Plane

The policy plane is the first thing a host talks to. Instead of the host authenticating to each upstream MCP server independently, it authenticates to the gateway. The gateway then decides — based on identity, the requested server, and the requested operation — whether the call is allowed at all, and only then routes it to the correct upstream.

This consolidation is what makes the gateway valuable. Authentication stops being a per-host concern. Policy stops being a per-server concern. Rate limiting stops being a per-tool concern. All three move into a control plane that operators actually own, and that every host implicitly inherits. Capability negotiation still happens end-to-end between client and server, but it happens through a broker that can say no.
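As a sketch, the admission path reduces to three steps: authenticate, authorize, route. Everything below is illustrative, not a real implementation: the `Principal` shape, the in-memory policy table, and the stub token format stand in for real OIDC verification and an OPA or Cedar evaluator.

```python
from dataclasses import dataclass

@dataclass
class Principal:
    tenant: str
    subject: str

# Toy policy table: (tenant, server, operation) -> allowed. A real gateway
# would delegate this lookup to an OPA or Cedar evaluator.
POLICIES = {
    ("acme", "git", "tools/call"): True,
    ("acme", "billing", "tools/call"): False,
}

def authenticate(token: str) -> Principal:
    # Stub: real gateways verify OIDC tokens or mTLS client certificates.
    tenant, subject = token.split(":", 1)
    return Principal(tenant, subject)

def authorize(p: Principal, server: str, op: str) -> bool:
    # Default-deny: anything not explicitly allowed is refused.
    return POLICIES.get((p.tenant, server, op), False)

def handle(token: str, server: str, op: str) -> str:
    p = authenticate(token)
    if not authorize(p, server, op):
        return "denied"
    return f"routed to {server}"  # only now does the upstream get involved
```

The key property is the ordering: routing happens only after the policy check, so an unauthorized host never learns whether the upstream even exists.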

[Interactive diagram: Policy Plane]
🪪 Auth & Identity

Every inbound connection carries a verified identity — human operator, service account, or tenant-scoped token — established via OIDC, mTLS, or platform SSO. The upstream server never sees the raw caller credentials, only a gateway-attested principal.

⚖️ Policy Engine

An OPA or Cedar evaluator decides whether a specific principal can invoke a specific tool or read a specific resource. Policies are versioned in source control and reviewed like any other infrastructure change, rather than being scattered across servers.

📒 Server Registry

A central inventory of which MCP servers are approved for which tenants, at which protocol version, and exposing which capabilities. Discovery returns only what the caller is permitted to see — hosts cannot even find servers they are not allowed to use.
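A minimal sketch of tenant-scoped discovery, assuming the registry is held as a plain list; the server entries and tenant names are made up:

```python
# Hypothetical registry entries: server name, pinned protocol version, and
# the set of tenants approved to use it.
REGISTRY = [
    {"name": "git", "version": "2025-06-18", "tenants": {"acme", "globex"}},
    {"name": "billing", "version": "2025-06-18", "tenants": {"globex"}},
]

def discover(tenant: str) -> list[str]:
    # Discovery is filtered at the source: a tenant never sees servers it is
    # not approved for, so it cannot even attempt to connect to them.
    return [s["name"] for s in REGISTRY if tenant in s["tenants"]]
```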

🚦 Quotas & Breakers

Per-tenant and per-tool rate limits prevent a single runaway agent from drowning a shared upstream. Circuit breakers open when an upstream starts failing, shedding load gracefully instead of letting every host retry simultaneously.
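Both mechanisms are standard; here is a compact sketch of each, with illustrative parameters. A production gateway would keep one bucket per (tenant, tool) key and one breaker per upstream.

```python
import time

class TokenBucket:
    """Rate limit: refill at `rate` tokens/second, capped at `burst`."""

    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, then spend one token if we can.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

class CircuitBreaker:
    """Opens after `threshold` consecutive upstream failures; a success resets it."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.failures = 0

    def record(self, ok: bool) -> None:
        self.failures = 0 if ok else self.failures + 1

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold
```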

🔧 Hosts see one interface — whatever the upstream actually speaks

Transport Bridge

MCP currently defines two standard transports: stdio for local subprocess servers, and Streamable HTTP for remote servers (which replaced the older HTTP+SSE transport). Real environments mix both. You might have legacy internal tools wrapped as local stdio servers, modern SaaS-style MCP servers running over Streamable HTTP, and a handful of long-lived SSE deployments that cannot upgrade quickly. Without a gateway, every host has to understand every transport. With one, hosts talk to the gateway in a single way — typically Streamable HTTP — and the gateway handles the impedance mismatch on the other side.

It launches stdio subprocesses on behalf of the caller, manages their lifecycle, and shuttles JSON-RPC messages between sockets and pipes. It keeps SSE compatibility alive during protocol migrations so host teams are not forced to upgrade on the same schedule as server teams.
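The core of the bridging work is message framing: a stdio MCP server exchanges newline-delimited JSON-RPC on stdin/stdout, while Streamable HTTP carries JSON bodies per request. A sketch of the two conversions, with the surrounding subprocess and HTTP plumbing omitted:

```python
import json

def http_to_stdio(message: dict) -> bytes:
    # One JSON-RPC message per line is what a stdio MCP server reads on stdin.
    return (json.dumps(message) + "\n").encode("utf-8")

def stdio_to_http(line: bytes) -> dict:
    # Each line the subprocess writes to stdout is one complete message.
    return json.loads(line.decode("utf-8"))

# In a real bridge these sit around a managed subprocess, roughly:
#   proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
#   proc.stdin.write(http_to_stdio(request)); proc.stdin.flush()
#   response = stdio_to_http(proc.stdout.readline())
```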

[Interactive diagram: Transport Bridge]
🔁 Stdio ↔ Streamable HTTP

A bidirectional bridge: local stdio MCP servers become HTTP-accessible, and remote HTTP-native servers remain reachable from hosts that only know how to speak stdio. Neither side needs to change.

🪢 Session Manager

Allocates and tracks the Mcp-Session-Id header that Streamable HTTP uses for stateful sessions. Cleans up stdio subprocesses when their owning session ends, so there are no orphaned processes consuming resources.
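A sketch of the bookkeeping, assuming cleanup work is expressed as callbacks; for a stdio upstream, the callback would be something like `proc.terminate`. The class name and shape are illustrative:

```python
import secrets

class SessionManager:
    """Tracks Mcp-Session-Id values and the cleanup work owed to each session."""

    def __init__(self):
        self._cleanups: dict[str, list] = {}

    def open(self) -> str:
        # Issue an unguessable id to return in the Mcp-Session-Id header.
        sid = secrets.token_hex(16)
        self._cleanups[sid] = []
        return sid

    def attach(self, sid: str, cleanup) -> None:
        # Register resources tied to the session, e.g. a subprocess to kill.
        self._cleanups[sid].append(cleanup)

    def close(self, sid: str) -> None:
        # Run every cleanup when the session ends, so nothing is orphaned.
        for cleanup in self._cleanups.pop(sid, []):
            cleanup()
```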

🕰️ Protocol-Version Shim

Reads the MCP-Protocol-Version header on incoming requests and negotiates with upstreams that may be on a different version. Lets the platform team roll protocol upgrades gradually instead of as a synchronized cutover.

🛡️ Origin & DNS-Rebind Guard

Validates the Origin header on every HTTP transport call before it reaches an upstream, closing the DNS-rebinding attack surface that local MCP servers would otherwise inherit from the browser.

🔧 The single seam where you can actually see what agents are doing

Observability & Audit

Observability is the secondary reason to deploy a gateway, and often the one that justifies it politically. Every tool call that a host makes flows through one process. That means every tool call can carry the same trace ID from host through gateway into upstream server, every success and failure contributes to the same metrics, and every action leaves the same structured audit record regardless of which upstream served it.

When something goes wrong at three in the morning — a wrong answer, a leaked record, a suspicious invocation — an operator opens one log stream instead of ten. They can replay the exact sequence of tool calls that a specific agent made during a specific session, correlated with the policy decisions that allowed each one. This is not a nice-to-have in regulated environments; it is typically the reason the platform team is allowed to run MCP at all.
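Propagation can be as simple as reusing the caller's W3C `traceparent` header when present and minting one at the gateway when it is not. A sketch:

```python
import uuid

def with_trace(headers: dict) -> dict:
    """Reuse the caller's W3C traceparent if present; otherwise mint one here."""
    out = dict(headers)
    if "traceparent" not in out:
        trace_id = uuid.uuid4().hex       # 32 hex chars
        span_id = uuid.uuid4().hex[:16]   # 16 hex chars
        out["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return out
```

Because the same header is forwarded to the upstream, host, gateway, and server logs all carry the same ID and can be joined without correlation work.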

[Interactive diagram: Observability & Audit]
🧵 End-to-End Trace ID

Each tool call is stamped with a trace ID at the host, propagated through the gateway, and forwarded to the upstream server. Host, gateway, and server logs can be joined on that ID without any further correlation work.

📈 Metrics

Latency distributions, error rates, and QPS are collected per tenant, per server, and per tool. Operators can see at a glance which upstream is degrading and which tenant is generating unusual load.

📊 Audit Log

Every invocation — allowed, denied, or failed — is written to an append-only log with the principal, tool, arguments hash, policy decision, and result. Compliance teams get the trail they need without the server teams having to build one themselves.
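A sketch of one audit entry, hashing the arguments so sensitive payloads never land in the log; the field names are illustrative:

```python
import hashlib
import json
import time

def audit_record(principal: str, tool: str, args: dict,
                 decision: str, result: str) -> dict:
    # Hash the arguments instead of storing them raw, so the log can be
    # retained long-term without duplicating sensitive payloads. Canonical
    # JSON (sorted keys) makes the hash stable for identical arguments.
    canonical = json.dumps(args, sort_keys=True).encode("utf-8")
    return {
        "ts": time.time(),
        "principal": principal,
        "tool": tool,
        "args_sha256": hashlib.sha256(canonical).hexdigest(),
        "decision": decision,  # e.g. "allow" or "deny"
        "result": result,      # e.g. "ok", "error", "denied"
    }
```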

⏮️ Replay & Alerts

Incident tooling can reconstruct a full session from the audit log, and alerts fire on anomalies — spikes in policy denials, breakers opening, or a tenant exceeding historical baselines. The gateway is where ops gets to act.

🎯 Signs this is the right architecture for your situation

When to Use This Pattern

- More than one team is connecting hosts to MCP servers, and you need consistent auth, approval, and quota behavior across all of them
- Agents can invoke tools that touch real systems — billing, CRM, write APIs — and you need auditable policy decisions on every call
- You run a mix of local stdio servers and remote HTTP-native servers and do not want every host to understand both transports
- Protocol-version migrations are coming (SSE → Streamable HTTP, capability revisions) and you want to absorb them centrally instead of per-host
⚖️ What you gain — and what it costs

Trade-offs

| Benefit | Cost |
| --- | --- |
| One policy surface across all MCP traffic — auth, approved servers, per-tenant quotas | An extra network hop adds latency on the critical path of every tool call |
| End-to-end observability and a single audit log for compliance and incident response | The gateway becomes a shared dependency — its availability now determines every integration's availability |
| Transport normalization frees host teams from tracking stdio vs Streamable HTTP vs legacy SSE | Running the gateway is a real platform investment — registry hygiene, policy review, HA, and version tracking |
| Protocol-version migrations happen in one place instead of being per-host migration projects | Policy modeling is a new discipline; early policies are usually too loose or too strict and require iteration |