Vercel AI Gateway + Archestra

Vercel AI Gateway Archestra comparison: where the Vercel model proxy stops, where Archestra picks up for MCP routing, prompt injection, and enterprise audit.

Written by

Mack Chi

Vercel AI Gateway Archestra: Where Each Layer Fits

Vercel AI Gateway Archestra is not an either-or stack. Vercel AI Gateway is the right front-end for a Next.js AI app that needs streaming, edge routing, and provider fan-out across Anthropic, OpenAI, Google, Groq, Bedrock, and Mistral. Archestra is the agent-layer security and governance plane that sits underneath: MCP tool routing, prompt-injection defense, identity-aware execution, and a per-user audit trail. The two solve different problems on the same call path. Teams that close their first enterprise deal on a Vercel-hosted AI SDK app and then face an SSO, audit, and MCP checklist typically end up running both.

Archestra is an open-source security and governance layer between the AI app and the tools and models behind it. MCP brokering, prompt-injection defense, identity-aware tool routing, and an audit trail a security review can read. MIT-licensed, self-hostable. The full architecture is in the Vercel AI example.

The Vercel AI Gateway Archestra split, up front: Vercel AI Gateway is the right front-end for a Next.js AI app. Streaming, edge routing, model fan-out, billing across providers. Archestra is what gets bolted on the day the app gains MCP tools, enterprise customers, and a security review. Different concerns, same stack.

Axis	Vercel AI Gateway	Archestra
Layer it lives at	Model routing for Vercel-hosted apps	Agent execution, MCP tool routing, model proxy
What it terminates	Provider API calls from Next.js / AI SDK	Agent turns and tool calls
Auth it handles	Vercel project keys, BYOK per provider	Provider keys, virtual keys, OAuth 2.1 client credentials, user OAuth, enterprise JWKS, MCP OAuth and OBO
Guardrails	Rate limits, basic spend caps	Prompt-injection defense (dual-LLM), tool allowlist, dynamic tool engine, MCP sandboxing
Observability	Token spend, latency per provider, basic logs	Agent decision trace, tool calls, identity-bound audit, Prometheus and OpenTelemetry
Tools / MCP	Out of scope	Native MCP gateway, per-identity tool routing
Provider fan-out	Anthropic, OpenAI, Google, Groq, Bedrock, Mistral, others	OpenAI-compatible Model Router across configured providers
Deployment	Hosted on Vercel	Self-hosted Docker, runs anywhere
License	Proprietary (Vercel SaaS)	MIT

What Vercel AI Gateway Is Great At

Vercel AI Gateway solves a real problem cleanly. For a Next.js app on Vercel where the bottleneck is "swap GPT-4 for Claude without rewriting the route handler," the gateway is the right answer. Changing a model identifier in streamText swaps providers, billed through the Vercel account, with one set of credentials and one set of logs.

The fan-out story is strong. Anthropic, OpenAI, Google, Groq, Bedrock, Mistral, and a long tail of smaller providers behind the same ai package interface. Streaming works at the edge. Latency is measured per provider. A dashboard ships out of the box. No infrastructure to run. For a single-product team shipping consumer AI features on Vercel, that is the correct level of abstraction, and self-hosting the equivalent is not worth the cost.

Many Next.js AI apps start this way. The first version ships fast because Vercel hides the boring parts. That is a feature.

Where Vercel AI Gateway Stops

The gateway treats the LLM call as the unit of work. A request arrives from a Next.js route, a request goes out to Anthropic or OpenAI, the response streams back, the bill gets tracked. That model is correct for a model proxy. It is also why the gateway does not, and reasonably should not, handle the work that appears the moment the app stops being a chat UI and starts being an agent talking to real tools on behalf of a real user.

Concrete examples. The gateway does not know which MCP server an agent is about to call, because MCP servers are application concerns and a different network endpoint entirely. It does not validate that the human behind the call has access to the tools the agent wants to invoke, because there is no concept of "this user can read Jira but cannot post in Slack" inside a model proxy. It does not sandbox the code an MCP server runs. It does not defend against prompt injection in tool results, because by the time a tool result hits the gateway it is already text inside the next messages array. And it does not produce a per-user audit trail mapping employee to agent decision to tool call. The audit Vercel ships is per-project, not per-identity.

None of this is a knock on Vercel. These problems live at the agent layer. A model gateway is the wrong place to solve them. The mistake is assuming that because the product is called a "gateway," it covers the agent surface. It does not, and teams that learn this on the first procurement call with a regulated customer pay for the discovery.

Where Archestra Picks Up

Archestra terminates the agent turn, not just the model call. Same shift described in the LiteLLM post. The unit of work changes, and so does what the gateway can do.

Wiring it into a Vercel AI SDK app is small. One URL changes. In createOpenAI (or the equivalent for the chosen provider) baseURL points at Archestra instead of the Vercel gateway, and the rest of the streamText code keeps working. The full walkthrough, with code, is in the Vercel AI example. One line of config buys the layer underneath.

What that one line delivers. Archestra's MCP Gateway exposes one endpoint for every MCP tool the agent might use, with sandboxing and per-identity routing. The dual-LLM sub-agent quarantines untrusted tool output so a malicious GitHub issue or a poisoned document cannot rewrite agent instructions. The dynamic tool engine hides sensitive tools from the agent's view when the context is untrusted. JWKS validation against the customer's Okta or Entra ID produces an audit trail that names the employee at the customer who ran the tool, instead of "your Vercel project key was used 1,847 times today." The LLM proxy underneath handles provider fan-out across OpenAI, Anthropic, Bedrock, Vertex, Azure, Gemini, Groq, Mistral, Cerebras, OpenRouter, vLLM, Ollama and others, with an OpenAI-compatible front, so AI SDK code on the Next.js side does not change.

For "did the model respond, did it stream, did the bill land correctly," the Vercel gateway is enough. For "did the agent do the right thing on behalf of the right person with the right tools, and can it be proven to a customer's security team," that is the job Archestra was built for. The Why We Founded Archestra post is the longer version of why the agent layer is a separate problem rather than a feature bolted onto a model proxy.

How They Stack

The Vercel AI Gateway Archestra topology that ships for enterprise-grade Next.js apps:

Next.js app on Vercel  -->  Archestra (agent + MCP + guardrails + identity)  -->  Vercel AI Gateway (provider fan-out + streaming)  -->  OpenAI / Anthropic / Google / ...
                                          |
                                          +-->  MCP servers (sandboxed, identity-routed)

Archestra fronts the agent. When the agent needs a model call, Archestra's model router points at Vercel AI Gateway as the upstream provider, because the gateway speaks the OpenAI API on the front and Archestra's proxy speaks it on the back. Vercel keeps doing what it is good at: provider routing, streaming at the edge, billing across providers. Archestra does the agent-layer work above it: MCP brokering, dual-LLM, tool allowlists, identity-aware routing, decision-level audit against the customer's IdP.

For some teams the Vercel gateway becomes redundant once Archestra is in place, because Archestra's own LLM proxy covers the providers in scope. For other teams the Vercel gateway stays because the spend reporting and edge streaming are good enough and migration cost is not worth paying. Both choices are reasonable. The decision is operational, not architectural.

Picking

Use Vercel AI Gateway alone when the app is a Next.js app on Vercel, users are individual signed-in accounts on the product, and there are no MCP tools, enterprise customers, or security reviews on the calendar. The gateway is the right answer and adding Archestra is overkill.
Use Archestra alone when deploying outside Vercel, or starting fresh and wanting the agent layer, the MCP gateway, and the LLM proxy in one stack with one auth story and one audit trail. The Archestra LLM proxy covers most of what teams need; see supported providers.
Use both when already on Vercel and an enterprise customer has just sent a checklist. Archestra goes in front for the agent and MCP layer, Archestra's model router points at Vercel AI Gateway as an upstream provider, and Vercel keeps handling streaming and cross-provider billing. Four weeks is enough time to ship this path.

Two gateways, two different jobs. The Vercel AI Gateway Archestra pairing matches the layer of the actual problem, and stacks when the problem spans both. For Next.js AI apps with enterprise customers, it usually does.