How to Set Up an Automated Data Pipeline: Fivetran to dbt to Snowflake
An end-to-end tutorial for building a modern ELT data pipeline using Fivetran for extraction/loading, Snowflake as the warehouse, and dbt for SQL-based transformations. Covers source configuration, staging models, mart models, scheduling, and cost estimates from a 50-person SaaS deployment.
Overview
A modern data pipeline follows the ELT (Extract, Load, Transform) pattern: extract data from source systems, load it into a cloud data warehouse, then transform it using SQL. Fivetran handles extraction and loading; dbt handles transformation; Snowflake (or BigQuery, Redshift, Databricks) serves as the warehouse. This tutorial covers the end-to-end setup.
Architecture
Source Systems → Fivetran (Extract & Load) → Snowflake (Warehouse) → dbt (Transform) → BI Tools
As of April 2026, this stack is the most commonly adopted ELT architecture for mid-size and enterprise data teams, according to data infrastructure surveys by dbt Labs and Fivetran.
Prerequisites
- Fivetran account (Standard plan, usage-based pricing starting at $1/credit)
- Snowflake account (standard tier, consumption-based pricing)
- dbt Cloud account (Team plan at $100/month) or dbt Core installed locally
- Source system credentials (Salesforce, Stripe, PostgreSQL, or any Fivetran-supported source)
Step 1: Configure Snowflake as the Destination
In Snowflake, create a dedicated database, warehouse, and service user for Fivetran:
CREATE DATABASE raw_data;
CREATE WAREHOUSE fivetran_wh WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60;
CREATE ROLE fivetran_role;
GRANT USAGE ON DATABASE raw_data TO ROLE fivetran_role;
GRANT CREATE SCHEMA ON DATABASE raw_data TO ROLE fivetran_role;
GRANT USAGE ON WAREHOUSE fivetran_wh TO ROLE fivetran_role;
CREATE USER fivetran_user PASSWORD = 'strong_password' DEFAULT_ROLE = fivetran_role;
GRANT ROLE fivetran_role TO USER fivetran_user;
In the Fivetran dashboard, navigate to Destinations > Add Destination > Snowflake. Enter the account URL, database name, warehouse, user, and password. Click "Test Connection" to verify.
Step 2: Add Source Connectors in Fivetran
For each data source, click "Add Connector" in Fivetran:
Example: Salesforce
- Search for "Salesforce" and click "Set Up"
- Click "Authorize" to authenticate via OAuth
- Choose the schema name (default: salesforce)
- Select tables to sync: Account, Contact, Opportunity, Lead, Task, Campaign
- Set sync frequency to 15 minutes
- Click "Save & Test," then start the initial sync
Example: PostgreSQL (Application Database)
- Search for "PostgreSQL" and click "Set Up"
- Enter connection details: host, port, database, username (read-only), password
- Fivetran uses logical replication (WAL) for incremental updates
- Select tables and set sync frequency
The initial sync for a Salesforce org with 500,000 records typically takes 2-4 hours. Subsequent syncs transfer only changed records and complete in minutes.
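The incremental behavior described above can be sketched as a cursor-based sync: after the initial full load, only rows modified since the last saved cursor are fetched. This is a toy illustration of the pattern, not Fivetran's actual implementation; the row shape and function names are invented.

```python
from datetime import datetime, timezone

def incremental_sync(source_rows, last_cursor):
    """Return rows changed since last_cursor, plus the advanced cursor value."""
    changed = [r for r in source_rows if r["updated_at"] > last_cursor]
    new_cursor = max((r["updated_at"] for r in changed), default=last_cursor)
    return changed, new_cursor

# Two source rows; only the one updated after the saved cursor is re-synced.
rows = [
    {"id": 1, "updated_at": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2026, 1, 3, tzinfo=timezone.utc)},
]
changed, cursor = incremental_sync(rows, datetime(2026, 1, 2, tzinfo=timezone.utc))
```

This is why subsequent syncs complete in minutes: the volume scales with changed rows, not table size.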
Step 3: Verify Raw Data in Snowflake
After the initial sync completes, query the raw data in Snowflake:
SELECT * FROM raw_data.salesforce.opportunity LIMIT 10;
Fivetran creates schemas matching the source system name and tables matching the source table names. It adds metadata columns: _fivetran_synced (timestamp) and _fivetran_deleted (boolean for soft deletes).
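To make the metadata columns concrete, here is a small illustration with invented sample rows shaped like Fivetran's raw output. Soft-deleted rows remain in the raw table, so downstream queries must filter them out explicitly:

```python
# Sample rows shaped like Fivetran raw output (values are invented).
raw_opportunities = [
    {"id": "006A", "amount": 5000, "_fivetran_synced": "2026-04-01T12:00:00Z", "_fivetran_deleted": False},
    {"id": "006B", "amount": 1200, "_fivetran_synced": "2026-04-01T12:00:00Z", "_fivetran_deleted": True},
]

# Exclude soft-deleted rows, as the staging models do in Step 5.
live_rows = [r for r in raw_opportunities if not r["_fivetran_deleted"]]
```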
Step 4: Initialize a dbt Project
If using dbt Core locally:
pip install dbt-core dbt-snowflake
dbt init analytics_project
Configure ~/.dbt/profiles.yml with Snowflake credentials. Create a separate Snowflake user and warehouse for dbt:
analytics_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: xy12345.us-east-1
      user: dbt_user
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: analytics
      warehouse: dbt_wh
      schema: dbt_dev
      threads: 4
If using dbt Cloud, create a new project in the dbt Cloud UI and connect it to the Snowflake warehouse and a Git repository.
Step 5: Write Staging Models
Create staging models that clean and standardize the raw data. One staging model per source table:
-- models/staging/salesforce/stg_salesforce__opportunities.sql
WITH source AS (
    SELECT * FROM {{ source('salesforce', 'opportunity') }}
    WHERE NOT _fivetran_deleted
),

renamed AS (
    SELECT
        id AS opportunity_id,
        account_id,
        name AS opportunity_name,
        stage_name AS stage,
        amount,
        close_date,
        created_date,
        is_won,
        is_closed,
        owner_id AS sales_rep_id
    FROM source
)

SELECT * FROM renamed
Define sources in models/staging/salesforce/src_salesforce.yml:
sources:
  - name: salesforce
    database: raw_data
    schema: salesforce
    tables:
      - name: opportunity
      - name: account
      - name: contact
Step 6: Write Mart Models
Create analytics-ready models that join and aggregate staged data:
-- models/marts/sales/fct_monthly_revenue.sql
SELECT
    DATE_TRUNC('month', o.close_date) AS month,
    a.industry,
    COUNT(DISTINCT o.opportunity_id) AS deals_closed,
    SUM(o.amount) AS total_revenue,
    AVG(o.amount) AS avg_deal_size
FROM {{ ref('stg_salesforce__opportunities') }} o
LEFT JOIN {{ ref('stg_salesforce__accounts') }} a
    ON o.account_id = a.account_id
WHERE o.is_won = TRUE
GROUP BY 1, 2
Step 7: Run and Test
Execute the full pipeline:
dbt build
This runs all models in dependency order and executes all tests. Review the output for failures. Add custom tests for business logic validation:
-- tests/assert_positive_revenue.sql
SELECT * FROM {{ ref('fct_monthly_revenue') }}
WHERE total_revenue < 0
Step 8: Schedule with dbt Cloud
In dbt Cloud, create a Job:
- Navigate to Jobs > Create Job
- Set the command to dbt build
- Schedule: Run every 6 hours (or trigger after Fivetran sync via the Fivetran-dbt integration)
- Enable notifications for failures
Fivetran offers a native dbt Cloud integration: after each Fivetran sync completes, it triggers the dbt Cloud job automatically. This ensures transformations always run on fresh data.
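Under the hood, triggering a dbt Cloud job is a POST to its REST API, which is also how you can wire up the same "run after sync" behavior from your own scripts. A hedged sketch that only assembles the request (account ID, job ID, and token placeholder are hypothetical; check the dbt Cloud API docs for the exact endpoint available on your plan):

```python
# Placeholder identifiers, not real account values.
ACCOUNT_ID = 12345
JOB_ID = 67890

def build_trigger_request(account_id, job_id, cause="Triggered by Fivetran sync"):
    """Assemble URL, headers, and body for a dbt Cloud job-run request (not sent)."""
    return {
        "url": f"https://cloud.getdbt.com/api/v2/accounts/{account_id}/jobs/{job_id}/run/",
        "headers": {"Authorization": "Token <DBT_CLOUD_API_TOKEN>"},
        "json": {"cause": cause},
    }

request = build_trigger_request(ACCOUNT_ID, JOB_ID)
```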
Cost Estimates
For a mid-size deployment syncing 5 sources with 10 million monthly active rows:
| Component | Monthly Cost |
|---|---|
| Fivetran (Standard, 10M MAR) | ~$500-800 |
| Snowflake (XSMALL warehouse, 8 hrs/day) | ~$200-400 |
| dbt Cloud (Team plan) | $100 |
| Total | ~$800-1,300/month |
Editor's Note: We built this exact stack (Fivetran + Snowflake + dbt Cloud) for a 50-person B2B SaaS company syncing Salesforce, Stripe, PostgreSQL (application DB), and Google Analytics. Initial setup took 2 weeks. Fivetran handled 15 million MAR at $650/month. Snowflake costs were $280/month on an XSMALL warehouse running 10 hours/day. dbt Cloud at $100/month. Total: $1,030/month. Before this stack, the data team spent 15 hours per week maintaining custom Python ETL scripts. After migration, maintenance dropped to 2 hours/week (mostly updating dbt models for new business requirements). The main caveat: Fivetran MAR pricing is unpredictable when source data has high update frequency. Stripe transaction tables generated 3x more MAR than expected because every status change counts as an active row.
Common Questions
How to set up data transformations with dbt
dbt (data build tool) transforms raw data in a warehouse by running SQL models. Initialize a project with `dbt init`, configure the warehouse connection in `profiles.yml`, write SQL model files, run `dbt build` to execute transformations, and test with `dbt test`.
How to set up a data pipeline with Fivetran
Fivetran automates data pipeline creation by connecting to source systems, replicating data to a destination warehouse, and maintaining schema consistency with zero code. Add a connector, authenticate the source, select a destination, choose the sync frequency, and start the initial sync.
What are the best Fivetran alternatives in 2026?
The leading Fivetran alternatives in 2026 are Airbyte (open-source ELT), dbt combined with Apache Airflow (transformation-first), Informatica (enterprise data management), and Segment (customer data focus). Airbyte offers the strongest open-source option with 350+ connectors.
What are the best Informatica alternatives in 2026?
The top Informatica alternatives in 2026 are Fivetran (managed ELT), Airbyte (open-source data integration), dbt (SQL-based transformation), and Talend (open-source data integration suite). Fivetran provides the most hands-off managed experience, while Airbyte offers the best open-source option.