MCP Gateway — Detailed Specification
Technical specification for a centralized MCP Gateway with detailed functional and non-functional requirements, security considerations, and stakeholder sign-off matrix.
The MCP Gateway provides a secure, observable, and self-service entry point for AI agents to invoke Model Context Protocol (MCP) tools exposed by engineering teams across the company. By routing all MCP traffic through a centralized gateway, the platform enforces consistent authentication, access control, and audit logging — without burdening individual product teams with those concerns.
Stakeholders
| Stakeholder | Description | Contact Persons |
|---|---|---|
| AI Platform | Governs and drives AI-related initiatives | Sons, Kirsten / VP; Meyner, Felix / Platform Owner; Neurohr, Erik / Principal Engineer; Sheikh, Imrul Hassan / Engineering Manager |
| Product Team | Exposes existing or new capabilities via MCP | Duraj, Ervis / Principal Engineer |
| Cyber Security | Ensures the gateway and the deployed MCP tools are production-grade from a security standpoint | Dimitriadis, Dimitrios / Lead; Horodetskyy Nagaychuk, Orest / Exp. Engineer; Rathnayaka, Nuwan / Exp. Engineer |
| API Platform | Potential operator of the gateway; provides automated self-service and guidance for onboarding new MCP tools | Thomas, Lee / Senior Engineer |
Functional Requirements
F1 Discoverable
- A central MCP catalogue lists all registered MCP Servers and their available tools.
- AI agents can query the catalogue at runtime to enumerate available tools without prior knowledge of upstream server addresses.
- The catalogue is accessible to authorized consumers via a stable API; no manual coordination with product teams is required to discover tools.
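As a rough illustration of F1, the catalogue could be modelled as a registry that agents query for servers and tools. This is only a sketch: the class names, fields, and the example servers and tools are hypothetical and not part of this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ToolEntry:
    server: str       # hypothetical name of a registered MCP Server
    name: str         # tool name as advertised by that server
    description: str


@dataclass
class Catalogue:
    # server name -> tools registered for that server
    _tools: Dict[str, List[ToolEntry]] = field(default_factory=dict)

    def register(self, entry: ToolEntry) -> None:
        self._tools.setdefault(entry.server, []).append(entry)

    def list_servers(self) -> List[str]:
        return sorted(self._tools)

    def list_tools(self, server: Optional[str] = None) -> List[ToolEntry]:
        """Enumerate tools, optionally restricted to a single server."""
        if server is not None:
            return list(self._tools.get(server, []))
        return [t for s in sorted(self._tools) for t in self._tools[s]]


# Hypothetical example data
catalogue = Catalogue()
catalogue.register(ToolEntry("webshop", "get_order", "Look up an order"))
catalogue.register(ToolEntry("billing", "get_invoice", "Fetch an invoice"))
```

An agent only needs the catalogue API to enumerate tools; it never sees upstream server addresses.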
F2 Access Control
- The gateway controls who has access to which MCP tools.
- The gateway must implement token-based authentication:
  - It must support authentication according to the OAuth 2.1 specification to identify the user behind the AI agent, ideally via the company-standard OAuth provider (Entra ID).
  - It should also support authentication for external services, via either an internal or an external OAuth provider.
  - It should support authentication with a service-account API key to identify AI agents.
- Identities should then be propagated to the MCP Servers for further role-based access control on specific resources (e.g., user viewing their own webshop account data)
- No human user should call the MCP endpoints exposed on the gateway directly.
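The access decision and identity propagation described above might look roughly like the following. This is a minimal sketch that assumes token and API-key validation has already happened; the policy shape, the `X-Agent-Identity` and `X-User-Id` header names, and the example identities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Caller:
    agent_identity: str            # derived from the service-account API key
    user_id: Optional[str] = None  # derived from a validated OAuth token, if any


# Hypothetical policy: agent identity -> allowed (mcp_server, tool_name) pairs
POLICY: Dict[str, Set[Tuple[str, str]]] = {
    "support-bot": {("webshop", "get_order")},
}


def is_allowed(caller: Caller, server: str, tool: str) -> bool:
    """Deny by default; allow only pairs explicitly granted to the agent."""
    return (server, tool) in POLICY.get(caller.agent_identity, set())


def upstream_headers(caller: Caller) -> Dict[str, str]:
    """Identity headers forwarded so the MCP Server can apply its own RBAC."""
    headers = {"X-Agent-Identity": caller.agent_identity}  # assumed header name
    if caller.user_id is not None:
        headers["X-User-Id"] = caller.user_id               # assumed header name
    return headers
```

The gateway decides whether a tool may be called at all, while the resource-level check (for example, a user viewing only their own account data) stays with the MCP Server.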
F3 Traffic Routing
- The gateway must route each request to the correct upstream MCP Server based on a pre-configured mapping from subpath to upstream hosts and routes.
- Each upstream MCP Server should follow the company-standard versioning scheme.
- Routing setups may differ per development stage where necessary.
- Routing configuration is managed declaratively (GitOps).
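The subpath-to-upstream mapping can be sketched as a lookup over a declarative table. The sketch below assumes longest-prefix matching on path segments; the subpath layout and the `*.internal.example` hosts are hypothetical placeholders, not actual configuration.

```python
from typing import Dict, Optional

# Hypothetical declarative routing table (would live in a Git repository)
ROUTES: Dict[str, str] = {
    "/mcp/webshop/v1": "https://webshop-mcp.internal.example/mcp",
    "/mcp/billing/v1": "https://billing-mcp.internal.example/mcp",
}


def resolve(path: str) -> Optional[str]:
    """Return the upstream URL for the longest matching subpath, or None."""
    best: Optional[str] = None
    for prefix in ROUTES:
        # Match whole path segments only, so /v1 does not match /v10
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return None
    # Append whatever follows the matched subpath to the upstream route
    return ROUTES[best] + path[len(best):]
```

Matching on whole segments keeps versioned subpaths such as `/v1` and `/v10` from shadowing each other.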
F4 Proxy Related
- The gateway should act as a transparent MCP proxy.
- The gateway must support Streamable HTTP as the transport protocol and should also support HTTP+SSE for backward compatibility.
- The gateway must be able to transform the payload in specific cases:
  - Removal or masking of sensitive information
  - Performing additional authentication
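Masking of sensitive information could, for instance, be done by walking the JSON payload and replacing the values of configured keys. This is a sketch under the assumption that sensitive data is identifiable by key name; `SENSITIVE_KEYS` and the mask string are hypothetical.

```python
from typing import Any

SENSITIVE_KEYS = {"password", "iban", "email"}  # hypothetical configuration
MASK = "***"


def mask(value: Any) -> Any:
    """Return a copy of a JSON-like value with sensitive keys masked."""
    if isinstance(value, dict):
        return {
            key: MASK if key.lower() in SENSITIVE_KEYS else mask(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [mask(item) for item in value]
    return value  # scalars pass through unchanged
```

Sensitive values embedded in free text would not be caught by key matching and would need a separate mechanism.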
Non-Functional Requirements
NF1 Latency
- P99 gateway overhead for a tool-call request must be < 100 ms, excluding upstream MCP Server processing time. This budget includes features such as response masking and additional authentication.
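One way to check this budget is to subtract the upstream processing time from the total request duration per request and take the 99th percentile of the result. The sketch below assumes the nearest-rank percentile definition and uses made-up sample timings; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def gateway_overhead_ms(samples: Sequence[Tuple[float, float]]) -> List[float]:
    """Each sample is (total_ms, upstream_ms); overhead is the difference."""
    return [total - upstream for total, upstream in samples]


# Made-up timings: 99 requests with 20 ms overhead, one outlier with 150 ms
samples = [(120.0, 100.0)] * 99 + [(350.0, 200.0)]
overheads = gateway_overhead_ms(samples)
p99 = percentile(overheads, 99)
within_budget = p99 < 100.0
```

With these samples the single outlier falls outside the 99th percentile, so the budget is met even though one request exceeded 100 ms.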
NF2 Scalability
- The gateway must handle 100+ backend MCP Servers across all domains.
- The gateway must be horizontally scalable with no single point of failure.
- The onboarding process should be scalable and independent, i.e., the speed of onboarding must not be limited by the team that owns the gateway.
NF3 Availability
- The gateway must be 99.XXXX% available
- The gateway must be hosted in multiple regions.
- The gateway health must be independent of degradation or outage of any upstream MCP Servers.
NF4 Monitoring
Appropriate metrics and tags should be provided similar to any standard REST API:
Standard Metrics:
- requests_total
- request_body_bytes_total
- response_body_bytes_total
- request_duration_seconds
Standard Tags:
- response_code
- upstream_response_code
MCP-Specific Tags:
- mcp_method
- tool_name
- resource_uri
- prompt_name
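The MCP-specific tags can be derived from the JSON-RPC request body. The sketch below assumes the standard MCP request shapes, where `tools/call` and `prompts/get` carry the name in `params.name` and `resources/read` carries the URI in `params.uri`; the function name `mcp_tags` is hypothetical.

```python
from typing import Any, Dict


def mcp_tags(request: Dict[str, Any]) -> Dict[str, str]:
    """Extract metric tags from a parsed MCP JSON-RPC request."""
    method = request.get("method", "")
    params = request.get("params") or {}
    tags = {"mcp_method": method}
    if method == "tools/call" and "name" in params:
        tags["tool_name"] = params["name"]
    elif method == "resources/read" and "uri" in params:
        tags["resource_uri"] = params["uri"]
    elif method == "prompts/get" and "name" in params:
        tags["prompt_name"] = params["name"]
    return tags
```

Tool names and resource URIs can have high cardinality, so it may be worth confirming that the metrics backend tolerates them as tag values.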
NF5 Auditability
Each invocation of MCP Server should emit an audit entry with the following metadata:
- timestamp
- user_id (when applicable)
- agent_identity
- mcp_server
- mcp_method (tools/list, tools/call…)
- For tool invocation: tool_name
- For resource invocation: resource_uri
- For prompt invocation: prompt_name
- response_code
- upstream_response_code
Audit traces should be retained for 7 days in production environments.
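An audit entry with the metadata listed above could be represented as a small record serialized to one JSON line per invocation. This is a sketch only: the class name, the field types, and the choice to omit fields that do not apply are assumptions, and the example values are made up.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    timestamp: str
    agent_identity: str
    mcp_server: str
    mcp_method: str
    response_code: int
    upstream_response_code: Optional[int] = None
    user_id: Optional[str] = None       # only when a user is identified
    tool_name: Optional[str] = None     # tool invocations only
    resource_uri: Optional[str] = None  # resource invocations only
    prompt_name: Optional[str] = None   # prompt invocations only

    def to_json(self) -> str:
        """Serialize to a JSON line, omitting fields that do not apply."""
        record = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(record, sort_keys=True)


# Made-up example entry for a tool call
entry = AuditEntry(
    timestamp=datetime(2025, 1, 1, tzinfo=timezone.utc).isoformat(),
    agent_identity="support-bot",
    mcp_server="webshop",
    mcp_method="tools/call",
    response_code=200,
    upstream_response_code=200,
    tool_name="get_order",
)
```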
NF6 Self Service
The gateway owner should provide self-service, on a best-effort basis, for:
- Onboarding a new MCP Server
- Decommissioning an MCP Server
- Adding a new consumer
- Modifying access control
- Modifying upstream hosts
NF7 Approval Bypass in Freeze Period for MCP Gateway
A clear bypass process should be defined for last-minute changes shortly before or during the freeze period. It should include:
- Criteria for what qualifies as critical
- Escalation path
- Final decision maker (e.g., platform owner and VP)
NF8 Separation of MCP Gateway Instance
- Traffic towards MCP Servers should not affect the performance of the gateway that serves classic APIs.
NF9 Security
- The final gateway implementation should undergo a complete security assessment by the cyber security team.
- Each MCP Server should pass basic OWASP security checks as part of the go-live checklist.
- An approval process must be defined for onboarding new APIs and for granting access.
- Rate limiting must be implemented to mitigate abusive or runaway AI-agent traffic. Limits can be applied at the company, MCP Server, tool, and session level.
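Rate limiting at several levels at once could be sketched with one token bucket per (level, key) pair, where a request is admitted only if every applicable bucket has a token. The token-bucket algorithm is an assumption here, since this specification does not prescribe one; the class names, limits, and keys are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated_at: float

    def refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now


class MultiLevelLimiter:
    def __init__(self, limits: Dict[str, Tuple[float, float]]) -> None:
        self._limits = limits  # level -> (capacity, refill_per_sec)
        self._buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, keys: Dict[str, str], now: float) -> bool:
        """keys maps level -> key, e.g. {"session": "s1", "tool": "get_order"}."""
        buckets: List[TokenBucket] = []
        for level, key in keys.items():
            if level not in self._limits:
                continue  # no limit configured for this level
            capacity, refill = self._limits[level]
            bucket = self._buckets.setdefault(
                (level, key), TokenBucket(capacity, refill, capacity, now)
            )
            bucket.refill(now)
            buckets.append(bucket)
        # Check all levels first so a rejection consumes no tokens anywhere
        if any(b.tokens < 1.0 for b in buckets):
            return False
        for b in buckets:
            b.tokens -= 1.0
        return True


# Hypothetical limits: 2 calls per session burst, 3 calls per tool burst
limiter = MultiLevelLimiter({"session": (2.0, 1.0), "tool": (3.0, 1.0)})
```

Checking every level before consuming tokens prevents a request rejected at the tool level from draining the session budget.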
Stakeholder Sign-Offs
| # | Requirements | AI Platform | Cyber Security | API Platform |
|---|---|---|---|---|
| F1 | Discoverable | ✅/❌ | ✅/❌ | ✅/❌ |
| F2 | Access Control | ✅/❌ | ✅/❌ | ✅/❌ |
| F3 | Traffic Routing | ✅/❌ | ✅/❌ | ✅/❌ |
| F4 | Proxy Related | ✅/❌ | ✅/❌ | ✅/❌ |
| NF1 | Latency | ✅/❌ | ✅/❌ | ✅/❌ |
| NF2 | Scalability | ✅/❌ | ✅/❌ | ✅/❌ |
| NF3 | Availability | ✅/❌ | ✅/❌ | ✅/❌ |
| NF4 | Monitoring | ✅/❌ | ✅/❌ | ✅/❌ |
| NF5 | Auditability | ✅/❌ | ✅/❌ | ✅/❌ |
| NF6 | Self Service | ✅/❌ | ✅/❌ | ✅/❌ |
| NF7 | Approval Bypass in Freeze Period for MCP Gateway | ✅/❌ | ✅/❌ | ✅/❌ |
| NF8 | Separation of MCP Gateway Instance | ✅/❌ | ✅/❌ | ✅/❌ |
| NF9 | Security | ✅/❌ | ✅/❌ | ✅/❌ |
Points to Clarify
- AI Platform: What is the foreseeable number of MCP tools?
- API Platform: What is the appropriate SLA for MCP?
- API Platform: Do we have a use case for seamlessly transforming an OpenAPI Specification (OAS) into an MCP Server?