MCP server security

A Model Context Protocol (MCP) server exposes tools, resources, and prompts to an AI agent over an untrusted boundary: tool descriptions and tool outputs flow straight into the model's context, and remote servers participate in OAuth flows. Treat every string the server emits as model-influencing input and every token it receives as untrusted. Pin guidance to the MCP spec revision dated 2025-06-18 (Security Best Practices + Authorization), which carries the normative MUST/MUST NOT statements below.

Threat classes

| Class | Vector | Core mitigation |
|---|---|---|
| Tool poisoning | Malicious instructions embedded in a tool's description/schema | Treat descriptions as untrusted content; surface them for human review |
| Output prompt injection | Attacker-controlled data returned in tool results | Label tool output as data, not instructions; sanitize/escape before returning |
| Rug-pull (tool redefinition) | Tool changes its behavior/description after initial approval | Bind user approval to a hash of the tool's content |
| Confused deputy | OAuth proxy reuses a static client ID + consent cookie | Per-client consent before forwarding to third-party authz |
| Token passthrough | Server forwards client-supplied tokens downstream | FORBIDDEN; reject tokens not issued for this server |
| Session hijacking | Guessed/stolen session ID impersonates a client | Non-deterministic IDs bound to user identity; never authenticate via session |
| SSRF | Malicious discovery URLs point at internal/metadata hosts | Validate/allowlist OAuth discovery URLs; block private ranges |

Tool definitions and output

  • The server MUST NOT treat its own tool descriptions, parameter schemas, or returned results as trusted instructions to the agent; an attacker who controls any of these is attempting tool poisoning or output injection.
  • Tool results that contain external/user data MUST be returned as clearly delimited data, and the server SHOULD strip or neutralize embedded directives (e.g., "ignore previous instructions") before returning them.
  • The server SHOULD keep tool descriptions stable and side-effect-free; surface side effects in the schema, not in prose meant to steer the model.
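The delimiting and neutralizing steps above can be sketched as follows. This is a minimal illustration, not a complete defense: the envelope shape (`type`, `source`, `content`), the function names, and the two regex patterns are all assumptions made for the example, and pattern matching alone will not catch every injected directive.

```python
import json
import re

# Hypothetical directive patterns; a real deployment would tune or replace these.
_DIRECTIVE_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?(system|above) (prompt|instructions)", re.IGNORECASE),
]


def neutralize(text: str) -> str:
    """Replace known directive phrases with an inert marker."""
    for pattern in _DIRECTIVE_PATTERNS:
        text = pattern.sub("[removed directive]", text)
    return text


def wrap_untrusted(source: str, text: str) -> str:
    """Return external data inside a JSON envelope that labels it as data.

    The envelope field names are assumptions for this sketch, not part of MCP.
    """
    envelope = {
        "type": "untrusted_data",
        "source": source,
        "content": neutralize(text),
    }
    return json.dumps(envelope)
```

Serializing the payload as JSON keeps the boundary between data and surrounding text unambiguous, because quotes and newlines in the external content are escaped rather than passed through.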

Rug-pull / tool redefinition

A server (or a tool it proxies) can present a benign description at approval time and a malicious one later.

  • When a host binds user approval to a tool, that approval SHOULD be bound to a hash of the tool's full content (name, description, input schema). Any change re-triggers consent.
  • The server SHOULD version tool definitions and avoid silent semantic changes; changing behavior without a version bump defeats hash-bound approval.
  • ACCURACY NOTE: the publicized "MCPoison"/"CurXecute" CVEs were CLIENT-side bugs in an IDE and have been patched. Retain the rug-pull threat class and the hash-bound-approval mitigation; do not cite those CVEs as motivation for server authoring.
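One way to compute a hash over a tool's name, description, and input schema is sketched below. The function names are hypothetical, and the canonicalization (sorted keys, compact separators) is an assumption chosen so that key order does not change the fingerprint; `inputSchema` is the field name used in MCP tool definitions.

```python
import hashlib
import json


def tool_fingerprint(tool: dict) -> str:
    """Return a SHA-256 hex digest over the fields that define a tool."""
    material = {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    # Canonical JSON: sorted keys and fixed separators give a stable byte string.
    canonical = json.dumps(
        material, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def approval_still_valid(approved_fingerprint: str, current_tool: dict) -> bool:
    """True only if the tool is byte-for-byte what the user approved."""
    return approved_fingerprint == tool_fingerprint(current_tool)
```

A host would store the fingerprint at approval time and re-prompt for consent whenever `approval_still_valid` returns False.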

Authorization: token audience and passthrough

Treat a remote MCP server as an OAuth 2.1 resource server.

  • The server MUST NOT accept any token that was not explicitly issued for it (validate the aud/audience per RFC 8707 resource indicators).
  • The server MUST NOT pass a received token through to a downstream API ("token passthrough" is explicitly forbidden). To call a downstream service, use a token-exchange flow to obtain a distinct token, or act as its own client.
  • The server MUST validate every inbound token before processing the request and reject any token whose audience does not include this server.

Confused deputy (OAuth proxy servers)

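When the server proxies a third-party API using a static client ID while letting MCP clients register dynamically, a consent cookie left over from an earlier approval can cause the third-party authorization server to skip the consent screen, so an attacker's client can obtain an authorization code the user never approved for it.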

  • A proxy server MUST implement per-client consent: maintain a registry of approved client_id values and check it before initiating the third-party flow.
  • The MCP-level consent page MUST identify the requesting client, show the third-party scopes and the exact registered redirect_uri, and apply CSRF protection.
  • redirect_uri MUST be validated by exact string match (no wildcards). The state value MUST be cryptographically random, single-use, short-lived, and set only after consent is approved.

Session hijacking

  • Servers that implement authorization MUST verify all inbound requests and MUST NOT use sessions for authentication.
  • Session IDs MUST be non-deterministic (CSPRNG-generated, e.g. UUIDv4); avoid sequential or guessable IDs.
  • Session-scoped data SHOULD be keyed as <user_id>:<session_id>, binding the session to identity derived from the token (not client-supplied), so a guessed ID cannot impersonate another user.
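A minimal sketch of the ID generation and `<user_id>:<session_id>` keying described above follows. The store class and function names are hypothetical, and the sketch assumes `user_id` comes from the validated token and contains no colon, which it enforces so that composite keys cannot collide.

```python
import secrets


def new_session_id() -> str:
    """Generate an unguessable session ID from the OS CSPRNG."""
    return secrets.token_urlsafe(32)


def session_key(user_id: str, session_id: str) -> str:
    """Build the composite key; user_id must come from the validated token."""
    if ":" in user_id:
        raise ValueError("user_id must not contain ':'")
    return f"{user_id}:{session_id}"


class SessionStore:
    """In-memory store keyed by user identity plus session ID."""

    def __init__(self):
        self._data: dict[str, object] = {}

    def put(self, user_id: str, session_id: str, value: object) -> None:
        self._data[session_key(user_id, session_id)] = value

    def get(self, user_id: str, session_id: str):
        # A correct session ID presented by a different user finds nothing.
        return self._data.get(session_key(user_id, session_id))
```

The session ID here only selects state; the caller is still authenticated on every request by its token, never by the session ID alone.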

SSRF via OAuth discovery

A malicious server can populate discovery fields (resource_metadata, authorization_servers, token_endpoint) with internal URLs.

  • Server-side deployments MUST consider SSRF and apply mitigations when fetching OAuth-related URLs.
  • They SHOULD require HTTPS (loopback excepted for dev), block private/reserved ranges (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8, fc00::/7, fe80::/10), and validate redirect targets per hop.
  • Prefer an egress proxy and DNS-pinning over hand-rolled IP parsing, which misses octal/hex/IPv4-mapped-IPv6 encodings.
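Where an egress proxy is not available, a pre-flight check along the following lines can apply the range list above. This is a sketch under stated assumptions: the function names are hypothetical, the loopback-for-dev exception is omitted, and the injectable `resolve` parameter exists only so the check can be exercised without network access. It uses the standard `ipaddress` module instead of hand-rolled parsing and unwraps IPv4-mapped IPv6 addresses before matching.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# The private/reserved ranges listed above.
BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "169.254.0.0/16", "127.0.0.0/8", "fc00::/7", "fe80::/10",
)]


def ip_is_blocked(ip_text: str) -> bool:
    """True if the address falls in a blocked range."""
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])  # drop any IPv6 zone ID
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped  # treat ::ffff:a.b.c.d as the IPv4 address it wraps
    return any(ip.version == net.version and ip in net for net in BLOCKED_NETWORKS)


def discovery_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Require HTTPS and reject hosts that resolve to any blocked address."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, parts.port or 443, proto=socket.IPPROTO_TCP)
    except OSError:
        return False
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and not any(ip_is_blocked(a) for a in addresses)
```

A check like this only vets the addresses seen at resolution time. To avoid DNS rebinding, the subsequent fetch should connect to one of the vetted addresses (DNS pinning) and repeat the check on every redirect hop.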

version: 1.0.0
tags: mcp, security, ai
author: Mike Fullerton
modified: 2026-06-09

Change History

| Version | Date | Author | Summary |
|---|---|---|---|
| 1.0.0 | 2026-06-09 | Mike Fullerton | Initial creation |