Adding LLM Providers


Overview

This guide covers how to add a new LLM provider to Archestra Platform. Each provider requires:

  1. LLM Proxy - The proxy that sits between clients and LLM providers. Handles security policies, tool invocation controls, metrics, and observability. Clients send requests to the proxy, which forwards them to the provider. It must handle both streaming and non-streaming provider responses.

  2. Chat - The built-in chat interface.

LLM Proxy

Provider Registration

Defines the provider identity used throughout the codebase for type safety and runtime checks.

File | Description
shared/model-constants.ts | Add provider to SupportedProvidersSchema enum
shared/model-constants.ts | Add to SupportedProvidersDiscriminatorSchema - format is provider:endpoint
shared/model-constants.ts | Add display name to providerDisplayNames
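
For illustration, a minimal sketch of these three additions for a hypothetical "acme" provider, assuming the schemas are plain Zod enums and a record (the real file may list more providers and be structured differently):

// shared/model-constants.ts (sketch; "acme" is a hypothetical provider)
import { z } from "zod";

export const SupportedProvidersSchema = z.enum([
  "openai",
  "anthropic",
  "gemini",
  "acme", // 1. register the provider identity
]);

// 2. discriminator in the provider:endpoint format
export const SupportedProvidersDiscriminatorSchema = z.enum([
  "openai:chatCompletions",
  "anthropic:messages",
  "acme:chatCompletions",
]);

// 3. human-readable name used by the UI
export const providerDisplayNames: Record<z.infer<typeof SupportedProvidersSchema>, string> = {
  openai: "OpenAI",
  anthropic: "Anthropic",
  gemini: "Gemini",
  acme: "Acme AI",
};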

Type Definitions

Each provider needs Zod schemas defining its API contract. TypeScript types are inferred from these schemas.

File | Description
backend/src/types/llm-providers/{provider}/api.ts | Request body schema, response schema, and headers schema (for extracting API keys)
backend/src/types/llm-providers/{provider}/messages.ts | Message array schemas - defines the structure of conversation history (user/assistant/tool messages)
backend/src/types/llm-providers/{provider}/tools.ts | Tool definition schemas - how tools are declared in requests (function calling format)
backend/src/types/llm-providers/{provider}/index.ts | Namespace export that groups all types under {Provider}.Types
backend/src/types/interaction.ts | Add provider schemas to InteractionRequestSchema, InteractionResponseSchema, and SelectInteractionSchema discriminated union
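
A rough sketch of what api.ts might contain for a hypothetical "acme" provider; the field names below are illustrative, not a real provider's API contract:

// backend/src/types/llm-providers/acme/api.ts (sketch)
import { z } from "zod";

// Headers schema: used to extract the API key forwarded by the client
export const AcmeHeadersSchema = z.object({
  authorization: z.string().optional(),
});

// Request body schema: the fields the proxy needs to inspect and rewrite
export const AcmeChatCompletionRequestSchema = z.object({
  model: z.string(),
  messages: z.array(
    z.object({
      role: z.enum(["system", "user", "assistant", "tool"]),
      content: z.string().nullable(),
    })
  ),
  stream: z.boolean().optional(),
});

// Response schema
export const AcmeChatCompletionResponseSchema = z.object({
  id: z.string(),
  model: z.string(),
  choices: z.array(
    z.object({ message: z.object({ role: z.string(), content: z.string().nullable() }) })
  ),
  usage: z.object({ prompt_tokens: z.number(), completion_tokens: z.number() }).optional(),
});

export type AcmeChatCompletionRequest = z.infer<typeof AcmeChatCompletionRequestSchema>;
export type AcmeChatCompletionResponse = z.infer<typeof AcmeChatCompletionResponseSchema>;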

Adapter Implementation

The adapter pattern provides a provider-agnostic API for business logic. LLMProxy operates entirely through adapters, never touching provider-specific types directly.

File | Description
backend/src/routes/proxy/adapterV2/{provider}.ts | Implement all adapter classes
backend/src/routes/proxy/adapterV2/index.ts | Export the {provider}AdapterFactory function

Adapters to Implement (a skeleton sketch follows this list):

  • RequestAdapter: Provides read/write access to the request data (model, messages, tools).
  • ResponseAdapter: Provides read/write access to the response data (id, model, text, tool calls, usage).
  • StreamAdapter: Processes streaming chunks incrementally, accumulating the data required by the LLMProxy logic.
  • LLMProvider: Creates adapters, extracts API keys from headers, creates provider SDK clients, and executes requests.
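
A skeleton for a hypothetical "acme" provider; the method names below are assumptions, since the real adapter interfaces live in backend/src/routes/proxy/adapterV2:

// backend/src/routes/proxy/adapterV2/acme.ts (skeleton sketch)
import type { AcmeChatCompletionRequest, AcmeChatCompletionResponse } from "../../../types/llm-providers/acme/api";

// Read/write access to the request data
class AcmeRequestAdapter {
  constructor(private body: AcmeChatCompletionRequest) {}
  getModel() { return this.body.model; }
  setModel(model: string) { this.body.model = model; }
  getMessages() { return this.body.messages; }
  setMessages(messages: AcmeChatCompletionRequest["messages"]) { this.body.messages = messages; }
}

// Read/write access to the response data
class AcmeResponseAdapter {
  constructor(private response: AcmeChatCompletionResponse) {}
  getText() { return this.response.choices[0]?.message.content ?? ""; }
  getUsage() { return this.response.usage; }
}

// Accumulates streamed chunks into the data LLMProxy needs
class AcmeStreamAdapter {
  private text = "";
  processChunk(chunk: { delta?: string }) { this.text += chunk.delta ?? ""; }
  getAccumulatedText() { return this.text; }
}

// Creates adapters, extracts the API key, and owns the provider SDK client
class AcmeLLMProvider {
  extractApiKey(headers: Record<string, string | undefined>) {
    return headers.authorization?.replace(/^Bearer /, "");
  }
  createRequestAdapter(body: AcmeChatCompletionRequest) { return new AcmeRequestAdapter(body); }
  createResponseAdapter(response: AcmeChatCompletionResponse) { return new AcmeResponseAdapter(response); }
  createStreamAdapter() { return new AcmeStreamAdapter(); }
}

// Factory re-exported from adapterV2/index.ts
export const acmeAdapterFactory = () => new AcmeLLMProvider();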

Route Handler

HTTP endpoint that receives client requests and delegates to handleLLMProxy().

File | Description
shared/routes.ts | Add RouteId constants for the new provider (e.g., {Provider}ChatCompletionsWithDefaultAgent, {Provider}ChatCompletionsWithAgent)
backend/src/routes/proxy/routesv2/{provider}.ts | Fastify route that validates request, extracts context (agent ID, org ID), and calls handleLLMProxy(body, headers, reply, adapterFactory, context)
backend/src/routes/index.ts | Export the new route module
backend/src/server.ts | Register the route with Fastify and add request/response schemas to the global Zod registry for OpenAPI generation
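
A sketch of such a route for a hypothetical "acme" provider; handleLLMProxy's argument order comes from the table above, but the URL path, import locations, and context shape are assumptions:

// backend/src/routes/proxy/routesv2/acme.ts (sketch)
import type { FastifyInstance } from "fastify";
import { handleLLMProxy } from "../llm-proxy"; // assumed location of the shared handler
import { acmeAdapterFactory } from "../adapterV2"; // factory exported in the previous step

export async function acmeRoutes(fastify: FastifyInstance) {
  fastify.post("/v1/acme/:agentId/chat/completions", async (request, reply) => {
    const { agentId } = request.params as { agentId: string };
    const context = {
      agentId,
      organizationId: request.headers["x-organization-id"] as string, // assumed header
    };
    return handleLLMProxy(request.body, request.headers, reply, acmeAdapterFactory, context);
  });
}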

Important: Deterministic Codegen

Routes must always be registered regardless of whether the provider is enabled. This ensures OpenAPI schema generation is deterministic across environments.

  • Register routes unconditionally (for schema generation)
  • Conditionally register HTTP proxy only when provider is enabled (has baseUrl configured)
  • Return a 500 error in route handlers if provider is not configured at runtime
// ✅ Correct: Routes always registered, proxy conditionally registered
if (config.llm.{provider}.enabled) {
  await fastify.register(fastifyHttpProxy, { upstream: config.llm.{provider}.baseUrl as string, ... });
}

// In route handlers, check at runtime:
if (!config.llm.{provider}.enabled) {
  return reply.status(500).send({
    error: { message: "{Provider} is not configured. Set ARCHESTRA_{PROVIDER}_BASE_URL to enable.", type: "api_internal_server_error" }
  });
}

Configuration

Base URL configuration allows routing to custom endpoints (e.g., Azure OpenAI, local proxies, testing mocks).

File | Description
backend/src/config.ts | Add llm.{provider}.baseUrl and llm.{provider}.enabled (typically Boolean(baseUrl)) with environment variable (e.g., ARCHESTRA_{PROVIDER}_BASE_URL)
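
A minimal sketch of the config addition, assuming values are read straight from process.env (the real config object contains many more keys):

// backend/src/config.ts (sketch for a hypothetical "acme" provider)
const acmeBaseUrl = process.env.ARCHESTRA_ACME_BASE_URL;

export const config = {
  llm: {
    acme: {
      baseUrl: acmeBaseUrl,
      // Enabled only when a base URL is configured (see "Deterministic Codegen" above)
      enabled: Boolean(acmeBaseUrl),
    },
  },
};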

Feature Flags

Expose provider availability to the frontend for conditional UI rendering.

File | Description
backend/src/routes/features.ts | Add {provider}Enabled boolean to the features schema and response
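
For example, the features response might gain a flag like this (the surrounding schema and handler shape are assumptions):

// backend/src/routes/features.ts (sketch for a hypothetical "acme" provider)
import { config } from "../config";

export function getFeatures() {
  return {
    // ...existing flags...
    acmeEnabled: config.llm.acme.enabled,
  };
}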

Tokenizer

Note: This is a known abstraction leak that we're planning to address in future versions. Thanks for bearing with us!

Tokenizers estimate token counts for provider messages. Used by Model Optimization and Tool Results Compression.

File | Description
backend/src/tokenizers/base.ts | Add provider message type to ProviderMessage union
backend/src/tokenizers/base.ts | Update BaseTokenizer.getMessageText() if provider has a different message format
backend/src/tokenizers/index.ts | Add case to getTokenizer() switch - return appropriate tokenizer (or fallback to TiktokenTokenizer)
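
A sketch of the getTokenizer() case, assuming the TiktokenTokenizer fallback mentioned in the table (the module path and existing cases are assumptions):

// backend/src/tokenizers/index.ts (sketch for a hypothetical "acme" provider)
import { TiktokenTokenizer } from "./tiktoken"; // assumed module path

export function getTokenizer(provider: string) {
  switch (provider) {
    // ...existing cases...
    case "acme":
      // No provider-specific tokenizer, so fall back to an approximate tiktoken-based count
      return new TiktokenTokenizer();
    default:
      return new TiktokenTokenizer();
  }
}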

Model Optimization

Note: This is a known abstraction leak that we're planning to address in future versions. Thanks for bearing with us!

Model optimization evaluates token counts to switch to cheaper models when possible.

File | Description
backend/src/routes/proxy/utils/cost-optimization.ts | Add provider to ProviderMessages type mapping (e.g., gemini: Gemini.Types.GenerateContentRequest["contents"])

Tool Results Compression

Note: This is a known abstraction leak that we're planning to address in future versions. Thanks for bearing with us!

TOON (Token-Oriented Object Notation) compression converts JSON tool results to a more token-efficient format. Each provider needs its own implementation because message structures differ.

File | Description
backend/src/routes/proxy/adapterV2/{provider}.ts | Implement convertToolResultsToToon() function that traverses provider-specific message array and compresses tool result content

The function must (a sketch follows this list):

  1. Iterate through provider-specific message array structure
  2. Find tool result messages (e.g., role: "tool" in OpenAI, tool_result blocks in Anthropic, functionResponse parts in Gemini)
  3. Parse JSON content and convert to TOON format using @toon-format/toon
  4. Calculate token savings using the appropriate tokenizer
  5. Return compressed messages and compression statistics
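
A condensed sketch of this flow for an OpenAI-style message array; the encode() import from @toon-format/toon, the countTokens helper parameter, and the statistics shape are assumptions:

// Sketch of convertToolResultsToToon() for an OpenAI-style message array
import { encode } from "@toon-format/toon"; // assumed export name

type Message = { role: string; content: string | null; tool_call_id?: string };

export function convertToolResultsToToon(messages: Message[], countTokens: (text: string) => number) {
  let tokensSaved = 0;
  const compressed = messages.map((message) => {
    // 1-2: iterate the message array and find tool result messages
    if (message.role !== "tool" || !message.content) return message;
    try {
      const json = JSON.parse(message.content); // 3: parse JSON content
      const toon = encode(json);                //    and convert it to TOON
      tokensSaved += countTokens(message.content) - countTokens(toon); // 4: token savings
      return { ...message, content: toon };
    } catch {
      return message; // not JSON, leave untouched
    }
  });
  // 5: return compressed messages and compression statistics
  return { messages: compressed, stats: { tokensSaved } };
}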

Dual LLM

Note: This is a known abstraction leak that we're planning to address in future versions. Thanks for bearing with us!

Dual LLM pattern uses a secondary LLM for Q&A verification of tool invocations. Each provider needs its own client implementation.

File | Description
backend/src/routes/proxy/utils/dual-llm-client.ts | Create {Provider}DualLlmClient class implementing DualLlmClient interface with chat() and chatWithSchema() methods
backend/src/routes/proxy/utils/dual-llm-client.ts | Add case to createDualLlmClient() factory switch
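
A skeleton of the client and factory; the chat() and chatWithSchema() method names come from the table above, while the constructor arguments and factory signature are assumptions:

// backend/src/routes/proxy/utils/dual-llm-client.ts (sketch for a hypothetical "acme" provider)
class AcmeDualLlmClient /* implements DualLlmClient */ {
  constructor(private apiKey: string, private model: string) {}

  // Free-form Q&A call used for tool invocation verification
  async chat(prompt: string): Promise<string> {
    // call the provider SDK here and return the text completion
    throw new Error("not implemented in this sketch");
  }

  // Same call, but the answer must conform to a JSON schema
  async chatWithSchema<T>(prompt: string, schema: object): Promise<T> {
    throw new Error("not implemented in this sketch");
  }
}

export function createDualLlmClient(provider: string, apiKey: string, model: string) {
  switch (provider) {
    // ...existing cases...
    case "acme":
      return new AcmeDualLlmClient(apiKey, model);
    default:
      throw new Error(`Unsupported dual-LLM provider: ${provider}`);
  }
}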

Metrics

Note: This is a known abstraction leak that we're planning to address in future versions. Thanks for bearing with us!

Prometheus metrics for request duration, token usage, and costs. Requires instrumenting provider SDK clients.

For example, the OpenAI and Anthropic SDKs accept a custom fetch function, so we inject an instrumented fetch via getObservableFetch(). The Gemini SDK doesn't expose fetch, so we wrap the SDK instance directly via getObservableGenAI().

File | Description
backend/src/llm-metrics.ts | Implement instrumented API calls for the SDK
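
To illustrate the custom-fetch approach for an OpenAI-compatible SDK: the OpenAI client does accept a fetch option, but the wrapper body below is a stand-in for the real getObservableFetch(), and "acme" is a hypothetical provider:

// Sketch: instrumenting an OpenAI-compatible SDK client with a wrapped fetch
import OpenAI from "openai";

// Minimal instrumented fetch standing in for getObservableFetch()
function getObservableFetch(provider: string): typeof fetch {
  return async (input, init) => {
    const start = Date.now();
    const response = await fetch(input, init);
    const durationSeconds = (Date.now() - start) / 1000;
    // record durationSeconds (and, after parsing the body, token usage and cost) in Prometheus here
    console.debug(`[metrics] ${provider} request took ${durationSeconds}s`);
    return response;
  };
}

function createInstrumentedAcmeClient(apiKey: string, baseUrl: string) {
  return new OpenAI({
    apiKey,               // extracted from the incoming request headers
    baseURL: baseUrl,     // the provider's configured base URL
    fetch: getObservableFetch("acme"),
  });
}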

Frontend: Logs UI

Interaction handlers parse stored request/response data for display in the LLM Proxy Logs UI (/logs/llm-proxy).

File | Description
frontend/src/lib/llmProviders/{provider}.ts | Implement InteractionUtils interface for parsing provider-specific request/response JSON
frontend/src/lib/interaction.utils.ts | Add case to getInteractionClass() switch to route discriminator to handler

E2E Tests

Each provider must be added to the LLM Proxy e2e tests to ensure all features work correctly.

File | Description
helm/e2e-tests/mappings/{provider}-*.json | WireMock stub mappings for mocking provider API responses (models list, chat completions, tool calls, etc.)
.github/values-ci.yaml | Add provider base URL pointing to WireMock (e.g., ARCHESTRA_{PROVIDER}_BASE_URL: "http://e2e-tests-wiremock:8080/v1")
e2e-tests/tests/api/llm-proxy/tool-invocation.spec.ts | Tool invocation policy tests - add {provider}Config to testConfigs array
e2e-tests/tests/api/llm-proxy/tool-persistence.spec.ts | Tool call persistence tests - add {provider}Config to testConfigs array
e2e-tests/tests/api/llm-proxy/tool-result-compression.spec.ts | TOON compression tests - add {provider}Config to testConfigs array
e2e-tests/tests/api/llm-proxy/model-optimization.spec.ts | Model optimization tests - add {provider}Config to testConfigs array
e2e-tests/tests/api/llm-proxy/token-cost-limits.spec.ts | Token cost limits tests - add {provider}Config to testConfigs array

Chat Support

Below is the list of modifications required to support a new provider in the built-in Archestra Chat.

Configuration

Environment variables for API keys and base URLs.

File | Description
backend/src/config.ts | Add chat.{provider}.apiKey and baseUrl

Chat Provider Registration

Allows users to select this provider's models in the Chat UI.

File | Description
backend/src/types/chat-api-key.ts | Add to SupportedChatProviderSchema

Model Listing

Each provider has a different API for listing available models.

File | Description
backend/src/routes/chat-models.ts | Add fetch{Provider}Models() function and register in modelFetchers
backend/src/routes/chat-models.ts | Add case to getProviderApiKey() switch
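
A sketch of a model-listing fetcher, assuming the provider exposes an OpenAI-style GET /models endpoint (the endpoint path and response shape are assumptions for a hypothetical "acme" provider):

// backend/src/routes/chat-models.ts (sketch)
async function fetchAcmeModels(apiKey: string, baseUrl: string): Promise<string[]> {
  const response = await fetch(`${baseUrl}/models`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Failed to list Acme models: ${response.status}`);
  }
  const body = (await response.json()) as { data: Array<{ id: string }> };
  return body.data.map((model) => model.id);
}

// Register alongside the existing fetchers:
// modelFetchers.acme = fetchAcmeModels;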

LLM Client

Chat uses the Vercel AI SDK, which requires provider-specific model creation.

File | Description
backend/src/services/llm-client.ts | Add to detectProviderFromModel() - model naming conventions differ (e.g., gpt-*, claude-*)
backend/src/services/llm-client.ts | Add case to resolveProviderApiKey() switch
backend/src/services/llm-client.ts | Add case to createLLMModel() - AI SDK requires provider-specific initialization
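
A sketch of the detection and model-creation pieces; createOpenAI is shown because OpenAI-compatible providers can reuse it, but the acme- model prefix and helper shapes are assumptions:

// backend/src/services/llm-client.ts (sketch for a hypothetical "acme" provider)
import { createOpenAI } from "@ai-sdk/openai";

// detectProviderFromModel(): map model naming conventions to a provider
function detectProviderFromModel(model: string): string | undefined {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("acme-")) return "acme"; // assumed prefix
  return undefined;
}

// Would live inside the createLLMModel() switch: provider-specific AI SDK initialization.
// Acme is assumed to be OpenAI-compatible, so the OpenAI provider is reused with a custom base URL.
function createAcmeModel(model: string, apiKey: string, baseUrl: string) {
  const acme = createOpenAI({ apiKey, baseURL: baseUrl });
  return acme(model);
}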

Error Handling

Each provider SDK wraps errors differently, requiring provider-specific parsing.

File | Description
shared/chat-error.ts | Add {Provider}ErrorTypes constants
backend/src/routes/chat/errors.ts | Add parse{Provider}Error() and map{Provider}ErrorToCode() functions
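
A sketch of the two helpers for a hypothetical "acme" provider; the error shape thrown by the SDK and the chat error codes used here are assumptions:

// backend/src/routes/chat/errors.ts (sketch)
type ParsedProviderError = { status?: number; message: string };

export function parseAcmeError(error: unknown): ParsedProviderError {
  // The Acme SDK is assumed to throw Error instances carrying an HTTP status
  if (error instanceof Error) {
    const status = (error as Error & { status?: number }).status;
    return { status, message: error.message };
  }
  return { message: String(error) };
}

export function mapAcmeErrorToCode(parsed: ParsedProviderError): string {
  if (parsed.status === 401) return "invalid_api_key"; // assumed error code
  if (parsed.status === 429) return "rate_limited";    // assumed error code
  return "provider_error";
}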

Frontend UI

UI components for Chat need provider-specific configuration.

File | Description
frontend/public/icons/{provider}.png | Provider logo (64x64px PNG recommended)
frontend/src/components/chat/model-selector.tsx | Add provider to providerToLogoProvider mapping
frontend/src/components/chat-api-key-form.tsx | Add provider entry to PROVIDER_CONFIG with name, icon path, placeholder, and console URL
frontend/src/app/chat/page.tsx | Update hasValidApiKey logic if provider doesn't require API key (e.g., local providers like vLLM/Ollama)

Reference Implementations

Existing provider implementations for reference:

Full implementations (custom API formats):

  • OpenAI: backend/src/routes/proxy/routesv2/openai.ts, backend/src/routes/proxy/adapterV2/openai.ts
  • Anthropic: backend/src/routes/proxy/routesv2/anthropic.ts, backend/src/routes/proxy/adapterV2/anthropic.ts
  • Gemini: backend/src/routes/proxy/routesv2/gemini.ts, backend/src/routes/proxy/adapterV2/gemini.ts

OpenAI-compatible implementations (reuse OpenAI types/adapters with minor modifications):

  • vLLM: backend/src/routes/proxy/routesv2/vllm.ts, backend/src/routes/proxy/adapterV2/vllm.ts
  • Ollama: backend/src/routes/proxy/routesv2/ollama.ts, backend/src/routes/proxy/adapterV2/ollama.ts

Tip: If adding support for an OpenAI-compatible provider (e.g., Azure OpenAI, Together AI, Groq), use the vLLM/Ollama implementations as starting points - they reuse OpenAI's type definitions and adapters.

Smoke Testing

Use PROVIDER_SMOKE_TEST.md during development to verify basic functionality. This is a quick, non-exhaustive list.

Note that Archestra Chat uses streaming for all LLM interactions. To test non-streaming responses, use an external client such as the n8n Chat node.