Can you use Vellum with Claude and GPT in 2026?
Quick Answer: Yes. As of April 2026, Vellum supports Anthropic Claude (Sonnet, Opus, Haiku), OpenAI GPT (4.5, 4 Turbo, 4o), Google Gemini, Mistral, and others through a unified prompt and workflow interface. Teams can swap or A/B-test models without rewriting prompts.
Using Vellum With Claude and GPT
Vellum is a multi-provider LLM platform. Connecting Claude, GPT, and others is a configuration step rather than a code change.
Supported Providers
As of April 2026, Vellum supports:
- Anthropic — Claude Sonnet 4.5, Opus 4.5, Haiku
- OpenAI — GPT-4.5, GPT-4 Turbo, GPT-4o, GPT-4o mini, o3, o4-mini
- Google — Gemini 1.5 and 2.0 (Pro, Flash)
- Mistral — Mistral Large, Codestral
- Cohere, Meta Llama, Amazon Bedrock, and Azure OpenAI as additional providers
Setup
- In Vellum admin, navigate to Settings → API Keys
- Add API keys for each provider
- In the Prompt or Workflow editor, select the model from the dropdown
Each prompt version pins a specific model and parameters. Switching models creates a new version, preserving the previous configuration for rollback.
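The pin-and-rollback behavior can be sketched as a small data model. `PromptVersion` and `PromptDeployment` below are illustrative names for this sketch, not Vellum's actual SDK objects:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PromptVersion:
    """One immutable prompt configuration: a model plus its parameters."""
    model: str
    temperature: float
    max_tokens: int

@dataclass
class PromptDeployment:
    """Tracks the active version and keeps history for rollback."""
    history: list = field(default_factory=list)

    def pin(self, version: PromptVersion) -> None:
        # Switching models appends a new version; old ones are preserved.
        self.history.append(version)

    @property
    def active(self) -> PromptVersion:
        return self.history[-1]

    def rollback(self) -> PromptVersion:
        # Discard the newest version and restore the previous one.
        self.history.pop()
        return self.active

deployment = PromptDeployment()
deployment.pin(PromptVersion("claude-sonnet-4-5", 0.2, 1024))
deployment.pin(PromptVersion("gpt-4o", 0.2, 1024))
previous = deployment.rollback()  # back to the Claude configuration
```

The key property is that switching models never overwrites the prior configuration, so rollback is a pointer move rather than a re-edit.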
A/B Testing Models
Vellum supports prompt experiments where the same input is run against multiple models and outputs are compared via:
- Side-by-side diff
- Automated metrics (latency, cost, exact match, semantic similarity)
- Custom Python evaluators
A common pattern: prototype on Claude Sonnet, run the same evaluation suite against GPT-4o, and deploy the winner.
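In spirit, an experiment run reduces to scoring each model's outputs on the same inputs. The snippet below is a simplified stand-in for Vellum's built-in evaluators: the run data is hard-coded stubs, and only exact match, latency, and cost are scored:

```python
# Stubbed experiment results: the same two inputs run through two models.
runs = {
    "claude-sonnet-4-5": [
        {"input": "2+2?", "output": "4", "latency_ms": 420, "cost_usd": 0.0031},
        {"input": "Capital of France?", "output": "Paris", "latency_ms": 390, "cost_usd": 0.0028},
    ],
    "gpt-4o": [
        {"input": "2+2?", "output": "4", "latency_ms": 510, "cost_usd": 0.0024},
        {"input": "Capital of France?", "output": "Lyon", "latency_ms": 470, "cost_usd": 0.0022},
    ],
}
expected = {"2+2?": "4", "Capital of France?": "Paris"}

def score(results):
    """Exact-match accuracy plus mean latency and total cost."""
    hits = sum(r["output"] == expected[r["input"]] for r in results)
    return {
        "accuracy": hits / len(results),
        "mean_latency_ms": sum(r["latency_ms"] for r in results) / len(results),
        "total_cost_usd": sum(r["cost_usd"] for r in results),
    }

scores = {model: score(results) for model, results in runs.items()}
winner = max(scores, key=lambda m: scores[m]["accuracy"])
```

A real experiment would add semantic-similarity metrics and custom Python evaluators on top of this exact-match baseline.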
Cost and Latency Tracking
Vellum tracks token usage and latency per provider in observability dashboards. Teams use the data to right-size model choice — for example, downgrading to Haiku or GPT-4o mini for low-stakes classification while keeping Opus for reasoning-heavy tasks.
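Right-sizing is ultimately arithmetic over per-token prices. The prices below are placeholders for illustration, not current rates; check each provider's pricing page before relying on them:

```python
# Placeholder per-million-token prices (input, output) in USD.
PRICES = {
    "claude-opus-4-5": (15.00, 75.00),
    "claude-haiku":    (0.80, 4.00),
    "gpt-4o-mini":     (0.15, 0.60),
}

def monthly_cost(model, calls, in_tokens, out_tokens):
    """Estimated monthly spend for a workload on a given model."""
    p_in, p_out = PRICES[model]
    per_call = (in_tokens * p_in + out_tokens * p_out) / 1_000_000
    return calls * per_call

# A low-stakes classifier: 100k calls/month, 500 input / 10 output tokens.
opus_cost = monthly_cost("claude-opus-4-5", 100_000, 500, 10)
haiku_cost = monthly_cost("claude-haiku", 100_000, 500, 10)
```

At these placeholder rates the classifier costs roughly 19x more on Opus than on Haiku, which is why high-volume, low-stakes tasks get downgraded first.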
BYO Provider
For self-hosted models or providers not listed above, Vellum supports calling a custom HTTP endpoint through a Workflow Custom Node, letting teams reach Llama 3.x, internal APIs, or fine-tuned models.
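A Custom Node ultimately sends an HTTP request. The builder below sketches that request assuming an OpenAI-compatible chat endpoint for a self-hosted Llama server; the endpoint URL and model name are placeholders, not real infrastructure:

```python
import json

def build_custom_node_request(
    prompt,
    model="llama-3.3-70b",  # placeholder model name
    endpoint="https://llm.internal.example.com/v1/chat/completions",  # placeholder URL
):
    """Build the HTTP request a Custom Node would send to a
    self-hosted, OpenAI-compatible chat-completions endpoint."""
    return {
        "url": endpoint,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        }),
    }

req = build_custom_node_request("Summarize this ticket.")
```

Because the payload follows the widely adopted OpenAI chat-completions shape, the same node can target vLLM, Ollama, or an internal gateway by changing only the URL.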