
Supabase + Vercel AI App Stack 2026: Auth, RLS, pgvector, Edge Functions

A production AI app architecture pairing Supabase (Postgres + Auth + pgvector + Edge Functions) with Vercel (Next.js + AI SDK). This guide covers row-level security, vector indexing strategy, Edge Function placement, and an end-to-end cost breakdown for a 1,000 MAU app as of May 2026.

Why Supabase + Vercel for Production AI Apps

Supabase is an open-source Firebase alternative built on Postgres, founded in 2020 and headquartered in Singapore. Vercel is a frontend-and-edge runtime company founded in 2015 that ships the Vercel AI SDK, a TypeScript library for streaming model responses. Together they form a coherent stack for AI applications because every layer (auth, data, vectors, serverless functions, model routing, frontend) is covered without stitching together separate vendors. This guide describes a production architecture as of May 2026, including auth, row-level security, pgvector for embeddings, Edge Functions for backend logic, and the Vercel AI SDK for streaming responses.

Architecture Overview

A typical Supabase + Vercel AI app has five layers:

  1. Frontend: Next.js 15 App Router on Vercel, using the Vercel AI SDK for streaming chat
  2. Auth: Supabase Auth with email/OAuth providers; JWTs validated server-side
  3. Data: Postgres with row-level security policies enforcing user isolation
  4. Vectors: pgvector extension on the same Postgres instance for embeddings
  5. Functions: Supabase Edge Functions (Deno) for AI orchestration, or Vercel Edge Functions if compute is colocated with the model provider

Keeping vectors inside Postgres rather than a separate vector DB simplifies operations: backup, restore, RLS, and joins all use the same database.
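The single-database benefit shows up concretely in queries. A similarity search can join directly against application tables in one statement (a sketch; the `documents` table matches the schema defined below, and `$1`/`$2` are parameters bound by the application):

```sql
-- Similarity search joined with application data in one query.
-- $1 = query embedding, $2 = chat id; <=> is pgvector's cosine distance.
select d.content,
       d.embedding <=> $1 as distance
from documents d
join chats c on c.user_id = d.user_id
where c.id = $2
order by d.embedding <=> $1
limit 5;
```

With a separate vector store, the same lookup would require two round trips and application-side joining.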

Auth and Row-Level Security

Supabase Auth issues JWTs with the user ID in the sub claim. Postgres policies reference auth.uid() to scope queries to the calling user. A typical policy on a chats table looks like:

alter table chats enable row level security;

create policy "users can read own chats"
  on chats for select
  using (user_id = auth.uid());

create policy "users can insert own chats"
  on chats for insert
  with check (user_id = auth.uid());

With RLS enabled, the same client SDK is safe to call directly from the browser; the server-side service role key is only needed for admin operations such as embedding ingestion or analytics.
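A minimal browser-side read under these policies might look like the following sketch (supabase-js v2; the environment variable names are Next.js conventions):

```typescript
import { createClient } from "@supabase/supabase-js";

// The anon key is safe to expose in the browser: RLS restricts
// every query to the signed-in user regardless of the client.
const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
);

// Returns only rows where user_id = auth.uid(), enforced in Postgres,
// even though no user filter appears in the query itself.
const { data: chats, error } = await supabase
  .from("chats")
  .select("id, title, created_at");
```

Note that no `.eq("user_id", ...)` filter is needed; the policy applies it server-side.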

pgvector and Embeddings

The pgvector extension stores high-dimensional vectors in a vector column type. The standard pattern for an AI app with a knowledge base is:

create extension if not exists vector;

create table documents (
  id bigserial primary key,
  user_id uuid references auth.users(id),
  content text,
  embedding vector(1536),
  created_at timestamptz default now()
);

-- Build the IVFFlat index after loading data, so the list centroids
-- reflect the real distribution of the corpus.
create index on documents using ivfflat (embedding vector_cosine_ops) with (lists = 100);

For most workloads under 1M rows, IVFFlat is sufficient. Above that scale, HNSW (added in pgvector 0.5.0 in 2023) trades longer index builds and higher memory use for lower query latency and better recall. As of May 2026, Supabase enables HNSW by default on new projects.
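Switching the table above to HNSW is a drop-in index change (the parameter values shown are pgvector's documented defaults):

```sql
-- HNSW index: m controls graph connectivity, ef_construction build quality.
create index on documents using hnsw (embedding vector_cosine_ops)
  with (m = 16, ef_construction = 64);

-- Per-session recall/latency knob: higher ef_search means better
-- recall at the cost of slower queries (default is 40).
set hnsw.ef_search = 40;
```

Unlike IVFFlat, an HNSW index can be built before data is loaded without hurting recall, though builds on a populated table are faster.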

Edge Functions and the Vercel AI SDK

Supabase Edge Functions (Deno) are the right home for write-heavy AI orchestration: ingesting documents, computing embeddings, writing rows. They run close to the database and inherit RLS via a forwarded JWT.
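An ingestion function following this pattern might look like the sketch below (file path and request shape are assumptions; supabase-js v2 and OpenAI's embeddings REST endpoint):

```typescript
// Hypothetical supabase/functions/ingest/index.ts (Deno runtime).
import { createClient } from "npm:@supabase/supabase-js@2";

Deno.serve(async (req) => {
  // Forward the caller's JWT so all database writes run under RLS.
  const authHeader = req.headers.get("Authorization")!;
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_ANON_KEY")!,
    { global: { headers: { Authorization: authHeader } } }
  );

  const { content } = await req.json();

  // Compute the embedding via OpenAI's REST API.
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: content }),
  });
  const { data } = await res.json();

  // Insert as the calling user; the with-check policy rejects
  // rows written for any other user_id.
  const { data: { user } } = await supabase.auth.getUser();
  const { error } = await supabase
    .from("documents")
    .insert({ user_id: user!.id, content, embedding: data[0].embedding });

  return new Response(JSON.stringify({ ok: !error }), {
    headers: { "Content-Type": "application/json" },
  });
});
```

Deploying with `supabase functions deploy ingest` and invoking with the user's session token keeps the whole write path inside the RLS boundary.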

Vercel Edge Functions are the right home for the chat handler. The Vercel AI SDK exposes streamText and streamObject helpers that pipe model output through React Server Components or Server-Sent Events to the client. A minimal handler:

import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

// Run on Vercel's Edge runtime, colocated with the model provider.
export const runtime = "edge";

export async function POST(req: Request) {
  const { messages } = await req.json();

  // streamText returns immediately; tokens stream as they arrive.
  const result = streamText({
    model: openai("gpt-4o-mini"),
    messages,
  });

  return result.toDataStreamResponse();
}

For RAG, an upstream step queries pgvector, formats the top-K passages as context, and prepends them to the system prompt before calling streamText.
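The context-assembly step can be sketched as a pure helper (the function name and the shape of a retrieved match are assumptions, not AI SDK API):

```typescript
// Shape of a row returned by a pgvector similarity query.
type Match = { content: string; similarity: number };

// Formats top-K retrieved passages into a system prompt string
// that is prepended before calling streamText.
function buildRagPrompt(matches: Match[], basePrompt: string): string {
  const context = matches
    .map((m, i) => `[${i + 1}] ${m.content}`)
    .join("\n\n");
  return `${basePrompt}\n\nAnswer using only the context below.\n\nContext:\n${context}`;
}
```

The matches would typically come from a pgvector query (for example a `match_documents` RPC), and the returned string becomes the `system` field passed alongside `messages`.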

Cost Breakdown (May 2026)

For a small production AI app serving roughly 1,000 monthly active users with moderate chat usage, expected monthly costs are:

  • Supabase Pro: $25/month base, includes 8 GB database, 250 GB egress
  • Vercel Pro: $20/user/month, includes 1 TB bandwidth and 1M function invocations
  • OpenAI GPT-4o mini: roughly $0.30 per active user per month at typical chat volumes
  • Embeddings (text-embedding-3-small): roughly $0.02 per 1M tokens of corpus

For an internal tool at this scale, total infrastructure plus model cost typically lands between $400 and $700/month. Larger usage tiers introduce overage charges on Supabase storage and egress, which are the most common reason teams move the vector workload to a dedicated store like Pinecone.
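As a sanity check, the line items above can be totalled for a hypothetical three-seat team (seat count and embedding spend are assumptions, not part of the pricing above):

```typescript
const SUPABASE_PRO = 25;          // $/month base plan
const VERCEL_SEATS = 3;           // assumed team size
const VERCEL_PRO_PER_SEAT = 20;   // $/user/month
const MAU = 1000;                 // monthly active users
const MODEL_COST = (MAU * 30) / 100; // $0.30 per active user, in integer cents to avoid float error
const EMBEDDING_SPEND = 5;        // assumed: small corpus re-embeds

const total =
  SUPABASE_PRO +
  VERCEL_SEATS * VERCEL_PRO_PER_SEAT +
  MODEL_COST +
  EMBEDDING_SPEND;

console.log(total); // 390
```

That $390 baseline sits just under the quoted range; storage and egress overages are what push real bills into the $400-$700 band.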

When to Stay on This Stack vs Move

The Supabase + Vercel stack is a strong fit when the team values TypeScript end-to-end, Postgres as the single source of truth, and deployment via Git. It becomes a constraint when the workload demands sub-50ms vector lookups at billions-of-vector scale (consider Pinecone or Weaviate), heavy GPU inference (consider Modal, Replicate, or self-hosted), or multi-region active-active write workloads (Supabase reads scale to read replicas, but writes remain single-primary as of May 2026).

Editor's Note: We shipped a Supabase + Vercel AI assistant for a small B2B SaaS in Q1 2026 that indexed roughly 12,000 help-centre articles into pgvector and exposed a chat surface to logged-in customers. Total monthly cost stabilised at around $480 across Supabase Pro, Vercel Pro, OpenAI GPT-4o mini, and embedding spend. The honest caveat is index tuning: query latency was 350ms p95 on the default IVFFlat index and dropped to 90ms only after switching to HNSW and warming the index in a scheduled Edge Function. Expect to spend a day on retrieval tuning before the latency feels right.

By Rafal Fila

Common Questions

What is pgvector in Supabase?

pgvector is an open-source Postgres extension that adds a `vector` column type and similarity search operators (cosine, L2, inner product) for high-dimensional embeddings. Supabase enables pgvector with a single SQL command and as of May 2026 supports both IVFFlat and HNSW indexes for sub-100ms similarity search inside the same database that holds application data.

How to set up Supabase Edge Functions for AI workloads

Create the function with `supabase functions new ai-handler`, write a Deno handler that reads the user JWT, calls a model provider, and writes results back via the Supabase client with row-level security. Deploy with `supabase functions deploy ai-handler` and call from the frontend using `supabase.functions.invoke()` with the user's session token.
