Can you use Vellum with Claude and GPT in 2026?

Quick Answer: Yes. As of April 2026, Vellum supports Anthropic Claude (Sonnet, Opus, Haiku), OpenAI GPT (4.5, 4 Turbo, 4o), Google Gemini, Mistral, and others through a unified prompt and workflow interface. Teams can swap or A/B-test models without rewriting prompts.

Using Vellum With Claude and GPT

Vellum is a multi-provider LLM platform. Connecting Claude, GPT, and others is a configuration step rather than a code change.

Supported Providers

As of April 2026, Vellum supports:

  • Anthropic — Claude Sonnet 4.5, Opus 4.5, Haiku
  • OpenAI — GPT-4.5, GPT-4 Turbo, GPT-4o, GPT-4o mini, o3, o4-mini
  • Google — Gemini 1.5 and 2.0 (Pro, Flash)
  • Mistral — Mistral Large, Codestral
  • Additional providers — Cohere, Meta Llama, Amazon Bedrock, Azure OpenAI

Setup

  1. In Vellum admin, navigate to Settings → API Keys
  2. Add API keys for each provider
  3. In the Prompt or Workflow editor, select the model from the dropdown

Each prompt version pins a specific model and parameters. Switching models creates a new version, preserving the previous configuration for rollback.
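The pin-and-rollback behavior can be illustrated with a minimal sketch. The class and field names below are illustrative only, not Vellum's actual data model:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PromptVersion:
    """One immutable prompt configuration: model plus parameters."""
    model: str
    temperature: float

@dataclass
class Prompt:
    """Switching models appends a new version; old versions stay for rollback."""
    versions: list = field(default_factory=list)

    def switch_model(self, model: str, temperature: float = 0.7) -> PromptVersion:
        version = PromptVersion(model=model, temperature=temperature)
        self.versions.append(version)
        return version

    def rollback(self) -> PromptVersion:
        # Drop the latest version and restore the previous configuration.
        self.versions.pop()
        return self.versions[-1]

prompt = Prompt()
prompt.switch_model("claude-sonnet-4-5")
prompt.switch_model("gpt-4o")
previous = prompt.rollback()  # back on the Claude configuration
```

The point of the sketch: because each version is immutable, "switching models" never overwrites anything, which is what makes rollback safe.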

A/B Testing Models

Vellum supports prompt experiments where the same input is run against multiple models and outputs are compared via:

  • Side-by-side diff
  • Automated metrics (latency, cost, exact match, semantic similarity)
  • Custom Python evaluators

A common pattern: prototype on Claude Sonnet, run the same evaluation suite against GPT-4o, and deploy the winner.
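A custom Python evaluator in the spirit of the metrics above might look like the following sketch. The function signatures are illustrative; Vellum's actual evaluator interface may differ:

```python
import re

def _tokens(text: str) -> set:
    """Lowercased alphanumeric word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def exact_match(output: str, target: str) -> float:
    """1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return 1.0 if output.strip().lower() == target.strip().lower() else 0.0

def token_overlap(output: str, target: str) -> float:
    """Crude semantic-similarity proxy: Jaccard overlap of word sets."""
    a, b = _tokens(output), _tokens(target)
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Score two models' outputs for the same input against a reference answer.
outputs = {
    "claude-sonnet-4-5": "Paris is the capital of France.",
    "gpt-4o": "The capital of France is Paris.",
}
target = "Paris is the capital of France."
scores = {m: (exact_match(o, target), token_overlap(o, target))
          for m, o in outputs.items()}
```

Note how the two metrics disagree on the GPT-4o output: exact match fails on word order while token overlap is perfect, which is exactly why side-by-side experiments use more than one metric.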

Cost and Latency Tracking

Vellum tracks token usage and latency per provider in observability dashboards. Teams use the data to right-size model choice — for example, downgrading to Haiku or GPT-4o mini for low-stakes classification while keeping Opus for reasoning-heavy tasks.
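The right-sizing decision reduces to a simple routing table. In this sketch the model names are real, but the per-token prices and task categories are placeholders, not actual provider rates:

```python
# Illustrative cost table: dollars per 1K tokens (placeholder values).
COST_PER_1K_TOKENS = {
    "gpt-4o-mini": 0.0006,
    "claude-haiku": 0.001,
    "claude-opus-4-5": 0.03,
}

# Route low-stakes work to cheap models; keep the strong model for reasoning.
ROUTES = {
    "classification": "gpt-4o-mini",
    "extraction": "claude-haiku",
    "reasoning": "claude-opus-4-5",
}

def route(task_type: str) -> str:
    """Pick a model for a task, defaulting to the cheapest tier."""
    return ROUTES.get(task_type, "gpt-4o-mini")

def estimated_cost(task_type: str, tokens: int) -> float:
    """Rough spend estimate for a task, using the placeholder rates above."""
    return COST_PER_1K_TOKENS[route(task_type)] * tokens / 1000
```

With dashboards supplying the real per-model usage numbers, the same table becomes a data-driven policy rather than a guess.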

BYO Provider

For self-hosted models or non-listed providers, Vellum supports a custom HTTP endpoint via the Workflow Custom Node, letting teams call Llama 3.x, internal APIs, or fine-tuned models.
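A Custom Node call of this kind might build a request like the sketch below. The endpoint URL and model name are hypothetical, and the payload follows the OpenAI-compatible chat-completions convention that many self-hosted servers (vLLM, for example) expose; adjust to whatever schema your endpoint actually speaks:

```python
import json
from urllib import request

def build_custom_node_request(endpoint: str, model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completions POST for a self-hosted model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical internal endpoint and model name, for illustration only.
req = build_custom_node_request(
    "http://internal-llm.example.com/v1/chat/completions",
    "llama-3.3-70b",
    "Summarize this ticket.",
)
```

Sending the request (`request.urlopen(req)`) is deliberately left out of the sketch so it stays runnable without a live endpoint.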

By Rafal Fila