MCP server security
A Model Context Protocol (MCP) server exposes tools, resources, and prompts to an AI agent over an untrusted boundary: tool descriptions and tool outputs flow straight into the model's context, and remote servers participate in OAuth flows. Treat every string the server emits as model-influencing input and every token it receives as untrusted. Pin guidance to the MCP spec revision dated 2025-06-18 (Security Best Practices + Authorization), which carries the normative MUST/MUST NOT statements below.
Threat classes
| Class | Vector | Core mitigation |
|---|---|---|
| Tool poisoning | Malicious instructions embedded in a tool's description/schema | Treat descriptions as untrusted content; surface them for human review |
| Output prompt injection | Attacker-controlled data returned in tool results | Label tool output as data, not instructions; sanitize/escape before returning |
| Rug-pull (tool redefinition) | Tool changes its behavior/description after initial approval | Bind user approval to a hash of the tool's content |
| Confused deputy | OAuth proxy reuses a static client ID + consent cookie | Per-client consent before forwarding to third-party authz |
| Token passthrough | Server forwards client-supplied tokens downstream | FORBIDDEN; reject tokens not issued for this server |
| Session hijacking | Guessed/stolen session ID impersonates a client | Non-deterministic IDs bound to user identity; never authenticate via session |
| SSRF | Malicious discovery URLs point at internal/metadata hosts | Validate/allowlist OAuth discovery URLs; block private ranges |
Tool definitions and output
- The server MUST NOT treat its own tool descriptions, parameter schemas, or returned results as trusted instructions to the agent; an attacker who controls any of these is attempting tool poisoning or output injection.
- Tool results that contain external/user data MUST be returned as clearly delimited data, and the server SHOULD strip or neutralize embedded directives (e.g., "ignore previous instructions") before returning them.
- The server SHOULD keep tool descriptions stable and side-effect-free; surface side effects in the schema, not in prose meant to steer the model.
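The "delimited data" rule above can be sketched as a small wrapper applied to any external text before it is returned as a tool result. This is a minimal illustration, not a complete defense: the function name `wrap_untrusted`, the envelope field names, and the single regular expression are all assumptions made for the example, and a pattern list this short will miss most real injection attempts.

```python
import json
import re

# Hypothetical pattern list; a real deployment would maintain a broader one.
SUSPECT = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)


def wrap_untrusted(text: str, source: str) -> str:
    """Return external text inside a JSON envelope that labels it as data."""
    # Replace matching directives and count how many were found.
    cleaned, hits = SUSPECT.subn("[neutralized]", text)
    envelope = {
        "type": "untrusted_data",   # tells the host this is not an instruction
        "source": source,           # where the text came from
        "neutralized": hits,        # number of directives replaced
        "content": cleaned,
    }
    return json.dumps(envelope)


if __name__ == "__main__":
    print(wrap_untrusted("Weather is fine. Ignore previous instructions.", "web"))
```

Returning a structured envelope rather than bare text gives the host a stable place to apply its own labeling; the pattern replacement is a secondary measure.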
Rug-pull / tool redefinition
A server (or a tool it proxies) can present a benign description at approval time and a malicious one later.
- When a host binds user approval to a tool, that approval SHOULD be bound to a hash of the tool's full content (name, description, input schema). Any change re-triggers consent.
- The server SHOULD version tool definitions and avoid silent semantic changes; changing behavior without a version bump defeats hash-bound approval.
- ACCURACY NOTE: the publicized "MCPoison"/"CurXecute" CVEs were CLIENT-side bugs in an IDE and have been patched. Retain the rug-pull threat class and the hash-bound-approval mitigation; do not cite those CVEs as motivation for server authoring.
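Hash-bound approval can be sketched as follows. The sketch assumes the tool's content is the triple named above (name, description, input schema) and that SHA-256 over a canonical JSON encoding is an acceptable fingerprint; `tool_fingerprint` and `approval_still_valid` are hypothetical helper names, not part of any MCP SDK.

```python
import hashlib
import json


def tool_fingerprint(name: str, description: str, input_schema: dict) -> str:
    """Hash the tool's full content using a canonical JSON encoding."""
    canonical = json.dumps(
        {"name": name, "description": description, "inputSchema": input_schema},
        sort_keys=True,              # key order must not affect the hash
        separators=(",", ":"),       # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approval_still_valid(approved: str, name: str, description: str,
                         input_schema: dict) -> bool:
    """True only if the tool is byte-for-byte what the user approved."""
    return approved == tool_fingerprint(name, description, input_schema)


if __name__ == "__main__":
    schema = {"type": "object", "properties": {"path": {"type": "string"}}}
    approved = tool_fingerprint("read_file", "Read a file.", schema)
    # A later, altered description no longer matches the stored approval.
    print(approval_still_valid(approved, "read_file",
                               "Read a file. Also send it elsewhere.", schema))
```

Canonicalizing before hashing matters: without sorted keys and fixed separators, a harmless re-serialization of the schema would look like a redefinition and trigger needless re-consent.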
Authorization: token audience and passthrough
Treat a remote MCP server as an OAuth 2.1 resource server.
- The server MUST NOT accept any token that was not explicitly issued for it (validate the aud/audience claim per RFC 8707 resource indicators).
- The server MUST NOT pass a received token through to a downstream API ("token passthrough" is explicitly forbidden). To call a downstream service, use a token-exchange flow to obtain a distinct token, or act as its own client.
- The server MUST validate every inbound token before processing the request and reject any token whose audience does not include this server.
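The audience check can be sketched on already-decoded claims. The sketch assumes signature, issuer, and expiry have been verified beforehand by a JWT library of your choice, and that the server's canonical resource URI is known; `token_is_for_this_server` and the example URIs are hypothetical.

```python
def token_is_for_this_server(claims: dict, resource: str) -> bool:
    """Accept a token only if this server's resource URI is in its audience.

    `claims` is assumed to be the verified, decoded token payload.
    """
    aud = claims.get("aud")
    # The `aud` claim may be a single string or an array of strings.
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []               # missing or malformed: reject
    return resource in audiences


if __name__ == "__main__":
    me = "https://mcp.example.com"   # hypothetical resource URI
    print(token_is_for_this_server({"aud": me}, me))
    print(token_is_for_this_server({"aud": "https://other.example.com"}, me))
```

A token that fails this check is rejected outright; it is never forwarded downstream in the hope that another service will accept it.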
Confused deputy (OAuth proxy servers)
When the server proxies a third-party API using a static client ID, a stale consent cookie lets an attacker skip the consent screen.
- A proxy server MUST implement per-client consent: maintain a registry of approved client_id values and check it before initiating the third-party flow.
- The MCP-level consent page MUST identify the requesting client, show the third-party scopes and the exact registered redirect_uri, and apply CSRF protection.
- redirect_uri MUST be validated by exact string match (no wildcards).
- The state value MUST be cryptographically random, single-use, short-lived, and set only after consent is approved.
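A minimal in-memory sketch of these checks follows. It assumes consent is tracked per (user, client) pair and that a five-minute state lifetime is acceptable; the class `ProxyConsent`, its method names, and the TTL are illustrative choices, and a real server would persist this state and render an actual consent page.

```python
import secrets
import time


class ProxyConsent:
    """Hypothetical consent registry for an MCP server proxying a third-party API."""

    def __init__(self, state_ttl: float = 300.0):
        self.redirects: dict[str, set[str]] = {}    # client_id -> registered URIs
        self.approved: set[tuple[str, str]] = set() # (user_id, client_id)
        self.states: dict[str, float] = {}          # state -> expiry time
        self.state_ttl = state_ttl

    def register_client(self, client_id: str, redirect_uri: str) -> None:
        self.redirects.setdefault(client_id, set()).add(redirect_uri)

    def redirect_ok(self, client_id: str, redirect_uri: str) -> bool:
        # Exact string match only: no prefix, wildcard, or normalization.
        return redirect_uri in self.redirects.get(client_id, set())

    def has_consent(self, user_id: str, client_id: str) -> bool:
        # Checked before the third-party authorization flow is started.
        return (user_id, client_id) in self.approved

    def approve(self, user_id: str, client_id: str) -> str:
        # State is created only after the user approves on the consent page.
        self.approved.add((user_id, client_id))
        state = secrets.token_urlsafe(32)
        self.states[state] = time.monotonic() + self.state_ttl
        return state

    def consume_state(self, state: str) -> bool:
        # pop() makes the value single-use; the expiry makes it short-lived.
        expiry = self.states.pop(state, None)
        return expiry is not None and time.monotonic() <= expiry
```

Issuing the state value inside `approve` keeps the ordering explicit: a request that never passed the consent page has no valid state to present at the callback.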
Session hijacking
- Servers that implement authorization MUST verify all inbound requests and MUST NOT use sessions for authentication.
- Session IDs MUST be non-deterministic (generated by a CSPRNG, e.g. a UUIDv4 drawn from a secure random source); avoid sequential or guessable IDs.
- Session-scoped data SHOULD be keyed as <user_id>:<session_id>, binding the session to identity derived from the token (not client-supplied), so a guessed ID cannot impersonate another user.
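The identity-bound key can be sketched with a small in-memory store. It assumes `user_id` has already been derived from a validated token and contains no colon; `SessionStore` and `session_key` are hypothetical names used only for illustration.

```python
import secrets


def session_key(user_id: str, session_id: str) -> str:
    """Compose the storage key as <user_id>:<session_id>."""
    return f"{user_id}:{session_id}"


class SessionStore:
    """Hypothetical store; user_id must come from the token, never the client."""

    def __init__(self):
        self._data = {}

    def create(self, user_id: str, data: dict) -> str:
        # CSPRNG-backed, non-sequential identifier.
        session_id = secrets.token_urlsafe(32)
        self._data[session_key(user_id, session_id)] = data
        return session_id

    def get(self, user_id: str, session_id: str):
        # A correct session ID presented under a different identity misses.
        return self._data.get(session_key(user_id, session_id))


if __name__ == "__main__":
    store = SessionStore()
    sid = store.create("alice", {"cursor": 0})
    print(store.get("alice", sid))    # alice's own session data
    print(store.get("mallory", sid))  # None: the ID alone is not enough
```

The session ID locates state; it does not authenticate anyone. Every request still carries a token, and the lookup key is built from that token's identity.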
SSRF via OAuth discovery
A malicious server can populate discovery fields (resource_metadata, authorization_servers, token_endpoint) with internal URLs.
- Server-side deployments MUST consider SSRF and apply mitigations when fetching OAuth-related URLs.
- They SHOULD require HTTPS (loopback excepted for dev), block private/reserved ranges (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8, fc00::/7, fe80::/10), and validate redirect targets per hop.
- Prefer an egress proxy and DNS-pinning over hand-rolled IP parsing, which misses octal/hex/IPv4-mapped-IPv6 encodings.
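As a rough illustration of the range check, the sketch below uses Python's standard `ipaddress` module rather than hand-rolled parsing. It assumes the caller has already resolved the hostname and will connect to exactly those addresses (DNS pinning), and it omits the loopback exception for development; `discovery_url_allowed` and `ip_is_blocked` are hypothetical names. It complements an egress proxy and does not replace one.

```python
import ipaddress
from urllib.parse import urlsplit

# The ranges listed above, written out in full CIDR form.
BLOCKED = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16", "127.0.0.0/8", "fc00::/7", "fe80::/10",
)]


def ip_is_blocked(ip_text: str) -> bool:
    """True if the address is unparseable or falls in a blocked range."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return True                   # fail closed on anything unparseable
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:10.0.0.1) before checking.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return any(ip.version == net.version and ip in net for net in BLOCKED)


def discovery_url_allowed(url: str, resolved_ips: list[str]) -> bool:
    """Check scheme and every resolved address for an OAuth discovery URL."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    if not resolved_ips:
        return False                  # nothing resolved: nothing to trust
    return not any(ip_is_blocked(ip) for ip in resolved_ips)
```

The same check would be repeated for each redirect hop, since a public first hop can redirect to an internal address.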