
Scaling MCP adoption: Our reference architecture for simpler, safer and cheaper enterprise deployments of MCP

We at Cloudflare have aggressively adopted Model Context Protocol (MCP) as a core part of our AI strategy. This shift has moved well beyond our engineering organization, with employees across product, sales, marketing, and finance teams now using agentic workflows to drive efficiency in their daily tasks. But the adoption of agentic workflows with MCP is not without its security risks. These include authorization sprawl, prompt injection, and supply chain risks. To secure this broad company-wide adoption, we have integrated a suite of security controls from both our Cloudflare One (SASE) platform and our Cloudflare Developer platform, allowing us to govern AI usage with MCP without slowing down our workforce.

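In this blog we’ll walk through our own best practices for securing MCP workflows, by putting different parts of our platform together to create a unified security architecture for the era of autonomous AI. We’ll also share two new concepts that support enterprise MCP deployments: Code Mode with MCP server portals, which reduces the token cost of MCP usage, and shadow MCP detection with Cloudflare Gateway, which discovers the use of unauthorized remote MCP servers.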

We also talk about how our organization approached deploying MCP, and how we built out our MCP security architecture using Cloudflare products including remote MCP servers, Cloudflare Access, MCP server portals, and AI Gateway.

Remote MCP servers provide better visibility and control

MCP is an open standard that enables developers to build a two-way connection between AI applications and the data sources they need to access. In this architecture, the MCP client is the integration point with the LLM or other AI agent, and the MCP server sits between the MCP client and the corporate resources.

The separation between MCP clients and MCP servers allows agents to autonomously pursue goals and take actions while maintaining a clear boundary between the AI (integrated at the MCP client) and the credentials and APIs of the corporate resource (integrated at the MCP server). 

Our workforce at Cloudflare is constantly using MCP servers to access information in various internal resources, including our project management platform, our internal wiki, documentation and code management platforms, and more. 

Very early on, we realized that locally-hosted MCP servers were a security liability. Local MCP server deployments may rely on unvetted software sources and versions, which increases the risk of supply chain attacks or tool injection attacks. They prevent IT and security administrators from administering these servers, leaving it up to individual employees and developers to choose which MCP servers they want to run and how they want to keep them up to date. This is a losing game.

Instead, we have a centralized team at Cloudflare that manages our MCP server deployment across the enterprise. This team built a shared MCP platform inside our monorepo that provides governed infrastructure out of the box. When an employee wants to expose an internal resource via MCP, they first get approval from our AI governance team, and then they copy a template, write their tool definitions, and deploy, all the while inheriting default-deny write controls with audit logging, auto-generated CI/CD pipelines, and secrets management for free. This means standing up a new governed MCP server is minutes of scaffolding. The governance is baked into the platform itself, which is what allowed adoption to spread so quickly. 

Our CI/CD pipeline deploys them as remote MCP servers on custom domains on Cloudflare’s developer platform. This gives us visibility into which MCP servers are being used by our employees, while maintaining control over software sources. As an added bonus, every remote MCP server on the Cloudflare developer platform is automatically deployed across our global network of data centers, so MCP servers can be accessed by our employees with low latency, regardless of where they might be in the world.

Cloudflare Access provides authentication

Some of our MCP servers sit in front of public resources, like our Cloudflare documentation MCP server or Cloudflare Radar MCP server, and thus we want them to be accessible to anyone. But many of the MCP servers used by our workforce sit in front of our private corporate resources. These MCP servers require user authentication to ensure that they are off limits to everyone but authorized Cloudflare employees. To achieve this, our monorepo template for MCP servers integrates Cloudflare Access as the OAuth provider. Cloudflare Access secures login flows and issues access tokens to resources, while acting as an identity aggregator that verifies end user single sign-on (SSO), multifactor authentication (MFA), and a variety of contextual attributes such as IP addresses, location, or device certificates.

MCP server portals centralize discovery and governance

MCP server portals unify governance and control for all AI activity.

As the number of our remote MCP servers grew, we hit a new wall: discovery. We wanted to make it easy for every employee (especially those that are new to MCP) to find and work with all the MCP servers that are available to them. Our MCP server portals product provided a convenient solution. The employee simply connects their MCP client to the MCP server portal, and the portal immediately reveals every internal and third-party MCP server they are authorized to use.

Beyond this, our MCP server portals provide centralized logging, consistent policy enforcement, and data loss prevention (DLP) guardrails. Our administrators can see who logged into which MCP server portal and create DLP rules that prevent certain data, like personally identifiable information (PII), from being shared with certain MCP servers.

We can also create policies that control who has access to the portal itself, and what tools from each MCP server should be exposed. For example, we could set up one MCP server portal that is only accessible to employees that are part of our finance group that exposes just the read-only tools for the MCP server in front of our internal code repository. Meanwhile, a different MCP server portal, accessible only to employees on their corporate laptops that are in our engineering team, could expose more powerful read/write tools to our code repository MCP server.
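To make the policy idea concrete, here is a minimal sketch of how a portal could decide which tools to expose to a given employee. It is an illustration only: the portal names, group names, tool names, and the visibleTools helper are all hypothetical, and real MCP server portals are configured through Cloudflare Access policies rather than code like this.

```javascript
// Hypothetical portal definitions: who may connect, and which tools each portal exposes.
const portals = [
  {
    name: "finance-portal",
    allow: (user) => user.groups.includes("finance"),
    tools: ["repo_read_file", "repo_search"], // read-only tools
  },
  {
    name: "engineering-portal",
    allow: (user) => user.groups.includes("engineering") && user.managedDevice,
    tools: ["repo_read_file", "repo_search", "repo_create_branch", "repo_commit"],
  },
];

// Return the tools a user can see through a named portal; default-deny otherwise.
function visibleTools(portalName, user) {
  const portal = portals.find((p) => p.name === portalName);
  if (!portal || !portal.allow(user)) return [];
  return portal.tools;
}

const analyst = { groups: ["finance"], managedDevice: true };
const engineer = { groups: ["engineering"], managedDevice: true };

console.log(visibleTools("finance-portal", analyst));      // read-only tools
console.log(visibleTools("engineering-portal", analyst));  // [] (not in engineering)
console.log(visibleTools("engineering-portal", engineer)); // read and write tools
```

The same upstream server appears behind both portals; only the set of exposed tools and the access condition differ.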

An overview of our MCP server portal architecture is shown above. The portal supports both remote MCP servers hosted on Cloudflare, and third-party MCP servers hosted anywhere else. What makes this architecture uniquely performant is that all these security and networking components run on the same physical machine within our global network. When an employee's request moves through the MCP server portal, a Cloudflare-hosted remote MCP server, and Cloudflare Access, their traffic never needs to leave the same physical machine. 

Code Mode with MCP server portals reduces costs

After months of high-volume MCP deployments, we’ve paid out our fair share of tokens. We’ve also started to think most people are doing MCP wrong.

The standard approach to MCP requires defining a separate tool for every API operation that is exposed via an MCP server. But this static and exhaustive approach quickly exhausts an agent’s context window, especially for large platforms with thousands of endpoints.

We previously wrote about how we used server-side Code Mode to power Cloudflare’s MCP server, allowing us to expose the thousands of endpoints in the Cloudflare API while reducing token use by 99.9%. The Cloudflare MCP server exposes just two tools: a search tool lets the model write JavaScript to explore what’s available, and an execute tool lets it write JavaScript to call the tools it finds. The model discovers what it needs on demand, rather than receiving everything upfront.

We like this pattern so much, we had to make it available for everyone. So we have now launched the ability to use the “Code Mode” pattern with MCP server portals. Now you can front all of your MCP servers with a centralized portal that performs audit controls and progressive tool disclosure, in order to reduce token costs.

Here is how it works. Instead of exposing every tool definition to a client, all of your underlying MCP servers collapse into just two MCP portal tools: portal_codemode_search and portal_codemode_execute. The search tool gives the model access to a codemode.tools() function that returns all the tool definitions from every connected upstream MCP server. The model then writes JavaScript to filter and explore these definitions, finding exactly the tools it needs without every schema being loaded into context. The execute tool provides a codemode proxy object where each upstream tool is available as a callable function. The model writes JavaScript that calls these tools directly, chaining multiple operations, filtering results, and handling errors in code. All of this runs in a sandboxed environment on the MCP server portal powered by Dynamic Workers.

Here is an example of an agent that needs to find a Jira ticket and update it with information from Google Drive. It first searches for the right tools:

// portal_codemode_search
async () => {
 const tools = await codemode.tools();
 return tools
  .filter(t => t.name.includes("jira") || t.name.includes("drive"))
  .map(t => ({ name: t.name, params: Object.keys(t.inputSchema.properties || {}) }));
}

The model now knows the exact tool names and parameters it needs, without the full schemas of tools ever entering its context. It then writes a single execute call to chain the operations together:

// portal_codemode_execute
async () => {
 // Find the in-progress tickets in the BLOG project
 const tickets = await codemode.jira_search_jira_with_jql({
  jql: 'project = BLOG AND status = "In Progress"',
  fields: ["summary", "description"]
 });
 // Fetch the Google Drive document to append
 const doc = await codemode.google_workspace_drive_get_content({
  fileId: "1aBcDeFgHiJk"
 });
 // Append the document content to the first ticket's description
 await codemode.jira_update_jira_ticket({
  issueKey: tickets[0].key,
  fields: { description: tickets[0].description + "\n\n" + doc.content }
 });
 return { updated: tickets[0].key };
}

This is just two tool calls. The first discovers what's available, the second does the work. Without Code Mode, this same workflow would have required the model to receive the full schemas of every tool from both MCP servers upfront, and then make three separate tool invocations.
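To experiment with snippets like these outside a portal, one can stand in for the codemode object locally. The sketch below is an assumption about the general shape, not Cloudflare's implementation: it uses a JavaScript Proxy so that each upstream tool name becomes a callable function, and the upstream registry, its two tools, and the makeCodemode helper are hypothetical.

```javascript
// Hypothetical registry of upstream tools, keyed by tool name.
const upstream = {
  jira_get_ticket: {
    inputSchema: { properties: { issueKey: {} } },
    handler: async ({ issueKey }) => ({ key: issueKey, status: "In Progress" }),
  },
  drive_get_content: {
    inputSchema: { properties: { fileId: {} } },
    handler: async ({ fileId }) => ({ fileId, content: "draft text" }),
  },
};

// Build a stand-in for `codemode`: tools() lists the definitions, and any other
// known tool name resolves to a function that calls the matching upstream handler.
function makeCodemode(registry) {
  const base = {
    tools: async () =>
      Object.entries(registry).map(([name, t]) => ({ name, inputSchema: t.inputSchema })),
  };
  return new Proxy(base, {
    get(target, prop) {
      if (prop in target) return target[prop];
      const known = typeof prop === "string" &&
        Object.prototype.hasOwnProperty.call(registry, prop);
      if (!known) return undefined; // unknown tools are simply not callable
      return (args) => registry[prop].handler(args);
    },
  });
}

const codemode = makeCodemode(upstream);

// Same shape as a search call followed by an execute call.
async function demo() {
  const names = (await codemode.tools()).map((t) => t.name);
  const ticket = await codemode.jira_get_ticket({ issueKey: "BLOG-1" });
  return { names, ticket };
}

demo().then((result) => console.log(result));
```

Because the tool list lives behind a function call rather than in the prompt, the model only pays for the definitions it chooses to return.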

Let’s put the savings in perspective: when our internal MCP server portal is connected to just four of our internal MCP servers, it exposes 52 tools that consume approximately 9,400 tokens of context just for their definitions. With Code Mode enabled, those 52 tools collapse into 2 portal tools consuming roughly 600 tokens, a 94% reduction. And critically, this cost stays fixed. As we connect more MCP servers to the portal, the token cost of Code Mode doesn’t grow.
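As a quick check of that arithmetic, using only the approximate figures quoted above:

```javascript
// Figures quoted above: 52 tool definitions versus 2 Code Mode portal tools.
const toolCount = 52;
const withoutCodeMode = 9400; // approximate tokens for the 52 tool definitions
const withCodeMode = 600;     // approximate tokens for the 2 portal tools

const saved = withoutCodeMode - withCodeMode;
const reductionPct = Math.round((saved / withoutCodeMode) * 100);
const perTool = Math.round(withoutCodeMode / toolCount);

console.log(`${saved} tokens saved, about ${reductionPct}% less context`);
// prints "8800 tokens saved, about 94% less context"
console.log(`roughly ${perTool} tokens per tool definition without Code Mode`);
// prints "roughly 181 tokens per tool definition without Code Mode"
```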

Code Mode can be activated on an MCP server portal by adding a query parameter to the URL. Instead of connecting to your portal over its usual URL (e.g. https://myportal.example.com/mcp), you attach ?codemode=search_and_execute to the URL (e.g. https://myportal.example.com/mcp?codemode=search_and_execute).
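If you build the portal URL in code, the standard URL API keeps the query string well-formed. This small sketch uses the example hostname from the paragraph above:

```javascript
// Start from the portal's usual URL and add the Code Mode query parameter.
const portalUrl = new URL("https://myportal.example.com/mcp");
portalUrl.searchParams.set("codemode", "search_and_execute");

console.log(portalUrl.toString());
// prints "https://myportal.example.com/mcp?codemode=search_and_execute"
```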

AI Gateway provides extensibility and cost controls

We aren’t done yet. We plug AI Gateway into our architecture by positioning it on the connection between the MCP client and the LLM. This allows us to quickly switch between various LLM providers (to prevent vendor lock-in) and to enforce cost controls (by limiting the number of tokens each employee can burn through). The full architecture is shown below.
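To illustrate what a per-employee token limit means in practice, here is a minimal sketch of the bookkeeping involved. It is not how AI Gateway is configured or implemented: the daily limit, the in-memory map, and the tryConsume helper are all hypothetical.

```javascript
// Hypothetical per-employee daily token budget; the limit is illustrative.
const DAILY_TOKEN_LIMIT = 200000;
const usage = new Map(); // employee id -> tokens used today

// Record a request's token count and report whether it fits within the budget.
function tryConsume(employeeId, tokens) {
  const used = usage.get(employeeId) ?? 0;
  if (used + tokens > DAILY_TOKEN_LIMIT) {
    return { allowed: false, remaining: DAILY_TOKEN_LIMIT - used };
  }
  usage.set(employeeId, used + tokens);
  return { allowed: true, remaining: DAILY_TOKEN_LIMIT - used - tokens };
}

console.log(tryConsume("alice", 150000)); // { allowed: true, remaining: 50000 }
console.log(tryConsume("alice", 60000));  // { allowed: false, remaining: 50000 }
console.log(tryConsume("alice", 50000));  // { allowed: true, remaining: 0 }
```

Placing this kind of check on the connection between the MCP client and the LLM means it applies regardless of which LLM provider sits behind it.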

Cloudflare Gateway discovers and blocks shadow MCP

Now that we’ve provided governed access to authorized MCP servers, let’s look into dealing with unauthorized MCP servers. We can perform shadow MCP discovery using Cloudflare Gateway. Cloudflare Gateway is our comprehensive secure web gateway that provides enterprise security teams with visibility and control over their employees’ Internet traffic.

We can use the Cloudflare Gateway API to perform a multi-layer scan to find remote MCP servers that are not being accessed via an MCP server portal. This is possible using a variety of existing Gateway and Data Loss Prevention (DLP) selectors, including:

  • Using the Gateway httpHost selector to scan for 

    • known MCP server hostnames (like mcp.stripe.com)

    • mcp.* subdomains using wildcard hostname patterns 

  • Using the Gateway httpRequestURI selector to scan for MCP-specific URL paths like /mcp and /mcp/sse 

  • Using DLP-based body inspection to find MCP traffic, even if that traffic uses URIs that do not contain the telltale mentions of mcp or sse. Specifically, we use the fact that MCP uses JSON-RPC over HTTP, which means every request contains a "method" field with values like "tools/call", "prompts/get", or "initialize". Here are some regex rules that can be used to detect MCP traffic in the HTTP body:

const DLP_REGEX_PATTERNS = [
  {
    name: "MCP Initialize Method",
    regex: '"method"\\s{0,5}:\\s{0,5}"initialize"',
  },
  {
    name: "MCP Tools Call",
    regex: '"method"\\s{0,5}:\\s{0,5}"tools/call"',
  },
  {
    name: "MCP Tools List",
    regex: '"method"\\s{0,5}:\\s{0,5}"tools/list"',
  },
  {
    name: "MCP Resources Read",
    regex: '"method"\\s{0,5}:\\s{0,5}"resources/read"',
  },
  {
    name: "MCP Resources List",
    regex: '"method"\\s{0,5}:\\s{0,5}"resources/list"',
  },
  {
    name: "MCP Prompts List or Get",
    regex: '"method"\\s{0,5}:\\s{0,5}"prompts/(list|get)"',
  },
  {
    name: "MCP Sampling Create Message",
    regex: '"method"\\s{0,5}:\\s{0,5}"sampling/createMessage"',
  },
  {
    name: "MCP Protocol Version",
    regex: '"protocolVersion"\\s{0,5}:\\s{0,5}"202[4-9]',
  },
  {
    name: "MCP Notifications Initialized",
    regex: '"method"\\s{0,5}:\\s{0,5}"notifications/initialized"',
  },
  {
    name: "MCP Roots List",
    regex: '"method"\\s{0,5}:\\s{0,5}"roots/list"',
  },
];
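Before loading patterns like these into a DLP profile, it can help to try them against sample traffic locally. The sketch below is a local illustration only, not Gateway configuration: the mcpSignals helper and the sample request are hypothetical, it uses a subset of the patterns above, and JavaScript's regex engine may differ in small ways from the one Gateway uses.

```javascript
// A subset of the patterns above, compiled into RegExp objects.
const patterns = [
  { name: "MCP Initialize Method", regex: '"method"\\s{0,5}:\\s{0,5}"initialize"' },
  { name: "MCP Tools Call", regex: '"method"\\s{0,5}:\\s{0,5}"tools/call"' },
  { name: "MCP Protocol Version", regex: '"protocolVersion"\\s{0,5}:\\s{0,5}"202[4-9]' },
].map((p) => ({ name: p.name, regex: new RegExp(p.regex) }));

// Hypothetical helper: report which layers (hostname, path, body) flag a request.
function mcpSignals({ host, path, body }) {
  const signals = [];
  if (/^mcp\./.test(host)) signals.push("host: mcp.* subdomain");
  if (/^\/mcp(\/sse)?\/?$/.test(path)) signals.push("path: MCP endpoint");
  for (const p of patterns) {
    if (p.regex.test(body)) signals.push(`body: ${p.name}`);
  }
  return signals;
}

// A request whose hostname and path give nothing away, but whose body does.
const request = {
  host: "tools.example.com",
  path: "/api/v2/rpc",
  body: JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/call", params: { name: "search" } }),
};

console.log(mcpSignals(request)); // [ 'body: MCP Tools Call' ]
```

This is the case the body-inspection layer exists for: the hostname and path checks miss the request, and the JSON-RPC method field still identifies it.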

The Gateway API supports additional automation. For example, one can use the custom DLP profile we defined above to block traffic, to redirect it, or just to log and inspect MCP payloads. Putting this together, Gateway can be used to provide comprehensive detection of unauthorized remote MCP servers accessed via an enterprise network.

For more information on how to build this out, see this tutorial.

Public-facing MCP Servers are protected with AI Security for Apps

So far, we’ve been focused on protecting our workforce’s access to our internal MCP servers. But, like many other organizations, we also have public-facing MCP servers that our customers can use to agentically administer and operate Cloudflare products. These MCP servers are hosted on Cloudflare’s developer platform. (You can find a list of individual MCPs for specific products here, or refer back to our new approach for providing more efficient access to the entire Cloudflare API using Code Mode.)

We believe that every organization should publish official, first-party MCP servers for their products. The alternative is that your customers source unvetted servers from public repositories where packages may contain dangerous trust assumptions, undisclosed data collection, and any range of unsanctioned behaviors. By publishing your own MCP servers, you control the code, update cadence, and security posture of the tools your customers use.

Since every remote MCP server is an HTTP endpoint, we can put it behind the Cloudflare Web Application Firewall (WAF). Customers can enable the AI Security for Apps feature within the WAF to automatically inspect inbound MCP traffic for prompt injection attempts and sensitive data leakage, and to classify it by topic. Public-facing MCP servers are protected just like any other web API.

The future of MCP in the enterprise

We hope our experience, products, and reference architectures will be useful to other organizations as they continue along their own journey towards broad enterprise-wide adoption of MCP.

We’ve secured our own MCP workflows by: 

  • Offering our developers a templated framework for building and deploying remote MCP servers on our developer platform using Cloudflare Access for authentication

  • Ensuring secure, identity-based access to authorized MCP servers by connecting our entire workforce to MCP server portals

  • Controlling costs using AI Gateway to mediate access to the LLMs powering our workforce’s MCP clients, and using Code Mode in MCP server portals to reduce token consumption and context bloat

  • Discovering shadow MCP usage with Cloudflare Gateway

For organizations advancing on their own enterprise MCP journeys, we recommend starting by putting your existing remote and third-party MCP servers behind Cloudflare MCP server portals and enabling Code Mode to start benefiting from cheaper, safer, and simpler enterprise deployments of MCP.

Acknowledgements: This reference architecture and blog represent the work of many people across many different roles and business units at Cloudflare. This is just a partial list of contributors: Ann Ming Samborski, Kate Reznykova, Mike Nomitch, James Royal, Liam Reese, Yumna Moazzam, Simon Thorpe, Rian van der Merwe, Rajesh Bhatia, Ayush Thakur, Gonzalo Chavarri, Maddy Onyehara, and Haley Campbell.

Managed OAuth for Access: make internal apps agent-ready in one click

We have thousands of internal apps at Cloudflare. Some are things we’ve built ourselves, others are self-hosted instances of software built by others. They range from business-critical apps nearly every person uses, to side projects and prototypes.

All of these apps are protected by Cloudflare Access. But when we started using and building agents — particularly for uses beyond writing code — we hit a wall. People could access apps behind Access, but their agents couldn’t.

Access sits in front of internal apps. You define a policy, and then Access will send unauthenticated users to a login page to choose how to authenticate. 

Example of a Cloudflare Access login page

This flow worked great for humans. But all agents could see was a redirect to a login page that they couldn’t act on.

Providing agents with access to internal app data is so vital that we immediately implemented a stopgap for our own internal use. We modified OpenCode’s web fetch tool such that for specific domains, it triggered the cloudflared CLI to open an authorization flow to fetch a JWT (JSON Web Token). By appending this token to requests, we enabled secure, immediate access to our internal ecosystem.

While this solution was a temporary answer to our own dilemma, today we’re retiring this workaround and fixing this problem for everyone. Now in open beta, every Access application supports managed OAuth. One click to enable it for an Access app, and agents that speak OAuth 2.0 can easily discover how to authenticate (RFC 9728), send the user through the auth flow, and receive back an authorization token (the same JWT from our initial solution). 

Now, the flow works smoothly for both humans and agents. Cloudflare Access has a generous free tier. And building off our newly-introduced Organizations beta, you’ll soon be able to bridge identity providers across Cloudflare accounts too.

How managed OAuth works

For a given internal app protected by Cloudflare Access, you enable managed OAuth in one click.

Once managed OAuth is enabled, Cloudflare Access acts as the authorization server. It returns the www-authenticate header, telling unauthorized agents where to look up information on how to get an authorization token. They find this at https://<your-app-domain>/.well-known/oauth-authorization-server. Equipped with that direction, agents can just follow OAuth standards: 

  1. The agent dynamically registers itself as a client (a process known as Dynamic Client Registration, RFC 7591).

  2. The agent sends the human through a PKCE (Proof Key for Code Exchange) authorization flow (RFC 7636).

  3. The human authorizes access, which grants a token to the agent that it can use to make authenticated requests on behalf of the user.
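The first two steps above can be sketched in a few lines of Node.js. This is a minimal illustration of the standards involved rather than Cloudflare Access's implementation: the registration fields come from RFC 7591 and the challenge derivation from RFC 7636, while the client name, redirect URI, client ID, and authorization endpoint are placeholders. A real agent would read the endpoint from the discovered metadata and the client ID from the registration response.

```javascript
const crypto = require("node:crypto");

// RFC 7636 (S256): the code challenge is BASE64URL(SHA-256(code_verifier)).
function pkceChallenge(verifier) {
  return crypto.createHash("sha256").update(verifier).digest("base64url");
}

// A fresh high-entropy verifier: 32 random bytes encode to 43 base64url characters.
function newVerifier() {
  return crypto.randomBytes(32).toString("base64url");
}

// RFC 7591 registration body for a public client; all values are placeholders.
const registration = {
  client_name: "example-agent",
  redirect_uris: ["http://127.0.0.1:8976/callback"],
  grant_types: ["authorization_code"],
  response_types: ["code"],
  token_endpoint_auth_method: "none",
};

// Build the authorization URL that the human is sent to.
function authorizationUrl(authorizationEndpoint, clientId, verifier, state) {
  const url = new URL(authorizationEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", registration.redirect_uris[0]);
  url.searchParams.set("code_challenge", pkceChallenge(verifier));
  url.searchParams.set("code_challenge_method", "S256");
  url.searchParams.set("state", state);
  return url.toString();
}

const verifier = newVerifier();
console.log(authorizationUrl("https://app.example.com/authorize", "client-123", verifier, "xyz"));
```

The agent keeps the verifier to itself and sends it only when exchanging the authorization code for a token, which is what ties the token to the client that started the flow.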

Here’s what the authorization flow looks like:

If this authorization flow looks familiar, that’s because it’s what the Model Context Protocol (MCP) uses. We originally built support for this into our MCP server portals product, which proxies and controls access to many MCP servers, to allow the portal to act as the OAuth server. Now, we’re bringing this to all Access apps so agents can access not only MCP servers that require authorization, but also web pages, web apps, and REST APIs.

Mass upgrading your internal apps to be agent-ready

Upgrading the long tail of internal software to work with agents is a daunting task. In principle, to be agent-ready, every internal and external app would have discoverable APIs, a CLI, a well-crafted MCP server, and support for the many emerging agent standards.

AI adoption is not something that can wait for everything to be retrofitted. Most organizations have a significant backlog of apps built over many years. And many internal “apps” work great when treated by agents as simple websites. For something like an internal wiki, all you really need is to enable Markdown for Agents, turn on managed OAuth, and agents have what they need to read protected content.

To make the basics work across the widest set of internal applications, we use Managed OAuth. By putting Access in front of your legacy internal apps, you make them agent-ready instantly. No code changes, no retrofitting. Instead, just immediate compatibility.

It’s the user’s agent. No service accounts and tokens needed

Agents need to act on behalf of users inside organizations. One of the biggest anti-patterns we’ve seen is people provisioning service accounts for their agents and MCP servers, authenticated using static credentials. These have their place in simple use cases and quick prototypes, and Cloudflare Access supports service tokens for this purpose.

But the service account approach quickly shows its limits when fine-grained access controls and audit logs are required. We believe that every action an agent performs must be easily attributable to the human who initiated it, and that an agent must only be able to perform actions that its human operator is likewise authorized to do. Service accounts and static credentials become points at which attribution is lost. Agents that launder all of their actions through a service account are susceptible to confused deputy problems and result in audit logs that appear to originate from the agent itself.

For security and accountability, agents must use security primitives capable of expressing this user–agent relationship. OAuth is the industry standard protocol for requesting and delegating access to third parties. It gives agents a way to talk to your APIs on behalf of the user, with a token scoped to the user’s identity, so that access controls correctly apply and audit logs correctly attribute actions to the end user.

Standards for the win: how agents can and should adopt RFC 9728 in their web fetch tools

RFC 9728 is the OAuth standard that makes it possible for agents to discover where and how to authenticate. It standardizes where this information lives and how it’s structured. This RFC became official in April 2025 and was quickly adopted by the Model Context Protocol (MCP), which now requires that both MCP servers and clients support it.

But outside of MCP, agents should adopt RFC 9728 for an even more essential use case: making requests to web pages that are protected behind OAuth and making requests to plain old REST APIs.

Most agents have a tool for making basic HTTP requests to web pages. This is commonly called the “web fetch” tool. It’s similar to using the fetch() API in JavaScript, often with some additional post-processing on the response. It’s what lets you paste a URL into your agent and have your agent go look up the content.

Today, most agents’ web fetch tools won’t do anything with the www-authenticate header that a URL returns. The underlying model might choose to introspect the response headers and figure this out on its own, but the tool itself does not follow www-authenticate, look up /.well-known/oauth-authorization-server, and act as the client in the OAuth flow. But it can, and we strongly believe it should! Agents already do this to act as remote MCP clients.

To demonstrate this, we’ve put up a draft pull request that adapts the web fetch tool in OpenCode to show this in action. Before making a request, the adapted tool first checks whether it already has credentials; if it does, it uses them to make the initial request. If the tool gets back a 401 or a 403 with a www-authenticate header, it asks the user for consent to be sent through the server’s OAuth flow.
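The decision logic described here is small enough to sketch. The code below is not the code from the draft pull request; it is a minimal illustration in which the fetchWithDiscovery helper, the injected fetchImpl, the stand-in fakeFetch server, and the example hostnames are all hypothetical. The metadata field names (authorization_endpoint, token_endpoint, registration_endpoint) are the standard OAuth authorization server metadata fields.

```javascript
// Sketch of a web fetch wrapper. `fetchImpl` is injected so the flow can be
// exercised without a network; in an agent it would be the global fetch().
async function fetchWithDiscovery(url, fetchImpl, token) {
  const headers = token ? { Authorization: `Bearer ${token}` } : {};
  const res = await fetchImpl(url, { headers });

  const challenge = res.headers.get("www-authenticate");
  if (![401, 403].includes(res.status) || !challenge) {
    return { kind: "response", response: res };
  }

  // Look up the metadata document at the well-known path on the same origin.
  const metadataUrl = new URL("/.well-known/oauth-authorization-server", url).toString();
  const metadata = await (await fetchImpl(metadataUrl, { headers: {} })).json();

  // The caller would now ask the human for consent and start the OAuth flow.
  return {
    kind: "needs_authorization",
    authorizationEndpoint: metadata.authorization_endpoint,
    tokenEndpoint: metadata.token_endpoint,
    registrationEndpoint: metadata.registration_endpoint,
  };
}

// Stand-in server for illustration: a protected page plus a metadata document.
const fakeFetch = async (url, { headers }) => {
  if (url.endsWith("/.well-known/oauth-authorization-server")) {
    return {
      status: 200,
      headers: new Map(),
      json: async () => ({
        authorization_endpoint: "https://wiki.example.com/authorize",
        token_endpoint: "https://wiki.example.com/token",
        registration_endpoint: "https://wiki.example.com/register",
      }),
    };
  }
  const authorized = headers.Authorization === "Bearer good-token";
  return {
    status: authorized ? 200 : 401,
    headers: new Map(authorized ? [] : [["www-authenticate", "Bearer"]]),
    json: async () => ({}),
  };
};

fetchWithDiscovery("https://wiki.example.com/page", fakeFetch).then((r) => console.log(r.kind));
// prints "needs_authorization"
```

Returning a needs_authorization result instead of starting the flow immediately leaves room for the consent prompt, so the human decides whether the agent may proceed.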

Here’s how that OAuth flow works. If you give the agent a URL that is protected by OAuth and complies with RFC 9728, the agent prompts the human for consent to open the authorization flow:

…sending the human to the login page:

…and then to a consent dialog that prompts the human to grant access to the agent:

Once the human grants access to the agent, the agent uses the token it has received to make an authenticated request:

Any agent from Codex to Claude Code to Goose and beyond can implement this, and there’s nothing bespoke to Cloudflare. It’s all built using OAuth standards.

We think this flow is powerful, and that supporting RFC 9728 can help agents with more than just making basic web fetch requests. If a REST API supports RFC 9728 (and the agent does too), the agent has everything it needs to start making authenticated requests against that API. If the REST API supports RFC 9727, then the client can discover a catalog of REST API endpoints on its own, and do even more without additional documentation, agent skills, MCP servers or CLIs. 
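As a rough illustration of the catalog idea, the sketch below reads an API catalog and lists each API with its machine-readable description. It assumes the catalog is served in the Linkset JSON shape that RFC 9727 builds on; the sample catalog, its URLs, and the describedApis helper are made up for the example.

```javascript
// Hypothetical catalog, in the Linkset JSON shape (anchor plus link relations).
const catalog = {
  linkset: [
    {
      anchor: "https://api.example.com/tickets",
      "service-desc": [
        { href: "https://api.example.com/tickets/openapi.json", type: "application/json" },
      ],
    },
    {
      anchor: "https://api.example.com/docs",
      "service-doc": [
        { href: "https://api.example.com/docs/index.html", type: "text/html" },
      ],
    },
  ],
};

// List each API with its machine-readable description, if one is advertised.
function describedApis(doc) {
  return (doc.linkset ?? []).map((entry) => ({
    api: entry.anchor,
    description: entry["service-desc"]?.[0]?.href ?? null,
  }));
}

console.log(describedApis(catalog));
```

An agent that finds a description link can fetch it and work out request shapes on its own, which is the step that removes the need for hand-written documentation or a dedicated MCP server.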

Each of these plays an important role with agents: Cloudflare itself provides an MCP server for the Cloudflare API (built using Code Mode), the Wrangler CLI, Agent Skills, and a Plugin. But supporting RFC 9728 helps ensure that even when none of these are preinstalled, agents have a clear path forward. If the agent has a sandbox to execute untrusted code, it can just write and execute code that calls the API that the human has granted it access to. We’re working on supporting this for Cloudflare’s own APIs, to help your agents understand how to use Cloudflare.

Coming soon: share one identity provider (IdP) across many Cloudflare accounts

At Cloudflare, our own internal apps are deployed to dozens of different Cloudflare accounts, which are all part of an Organization, a newly introduced way for administrators to manage users and configurations and to view analytics across many Cloudflare accounts. We have had the same challenge as many of our customers: each Cloudflare account has to separately configure an IdP, so Cloudflare Access uses our identity provider. It’s critical that this is consistent across an organization: you don’t want one Cloudflare account to inadvertently allow people to sign in just with a one-time PIN, rather than requiring that they authenticate via single sign-on (SSO).

To solve this, we’re currently working on making it possible to share an identity provider across Cloudflare accounts, giving organizations a way to designate a single primary IdP for use across every account in their organization.

As new Cloudflare accounts are created within an organization, administrators will be able to configure a bridge to the primary IdP with a single click, so Access applications across accounts can be protected by one identity provider. This removes the need to manually configure IdPs account by account, which is a process that doesn’t scale for organizations with many teams and individuals each operating their own accounts.

What’s next

Across companies, people in every role and business function are now using agents to build internal apps, and expect their agents to be able to access context from internal apps. We are responding to this step function growth in internal software development by making the Workers Platform and Cloudflare One work better together — so that it is easier to build and secure internal apps on Cloudflare. 

Expect more to come soon, including:

  • More direct integration between Cloudflare Access and Cloudflare Workers, without the need to validate JWTs or remember which of many routes a particular Worker is exposed on.

  • wrangler dev --tunnel: an easy way to share your local development server with others when you’re building something new, before deploying it.

  • A CLI for Cloudflare Access and the entire Cloudflare API

  • More announcements to come during Agents Week 2026

Enable Managed OAuth for your internal apps behind Cloudflare Access

Managed OAuth is now available, in open beta, to all Cloudflare customers. Head over to the Cloudflare dashboard to enable it for your Access applications. You can use it for any internal app, whether it’s one built on Cloudflare Workers, or hosted elsewhere. And if you haven’t built internal apps on the Workers Platform yet — it’s the fastest way for your team to go from zero to deployed (and protected) in production.
