
How to Set Up an Automated Data Pipeline: Fivetran to dbt to Snowflake

An end-to-end tutorial for building a modern ELT data pipeline using Fivetran for extraction/loading, Snowflake as the warehouse, and dbt for SQL-based transformations. Covers source configuration, staging models, mart models, scheduling, and cost estimates from a 50-person SaaS deployment.

Overview

A modern data pipeline follows the ELT (Extract, Load, Transform) pattern: extract data from source systems, load it into a cloud data warehouse, then transform it using SQL. Fivetran handles extraction and loading; dbt handles transformation; Snowflake (or BigQuery, Redshift, Databricks) serves as the warehouse. This tutorial covers the end-to-end setup.

Architecture

Source Systems → Fivetran (Extract & Load) → Snowflake (Warehouse) → dbt (Transform) → BI Tools

As of April 2026, this stack is the most commonly adopted ELT architecture for mid-size and enterprise data teams, according to data infrastructure surveys by dbt Labs and Fivetran.

Prerequisites

  • Fivetran account (Standard plan, usage-based pricing starting at $1/credit)
  • Snowflake account (standard tier, consumption-based pricing)
  • dbt Cloud account (Team plan at $100/month) or dbt Core installed locally
  • Source system credentials (Salesforce, Stripe, PostgreSQL, or any Fivetran-supported source)

Step 1: Configure Snowflake as the Destination

In Snowflake, create a dedicated database, warehouse, and service user for Fivetran:

-- Create a landing database and a dedicated, auto-suspending warehouse
CREATE DATABASE raw_data;
CREATE WAREHOUSE fivetran_wh WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60;

-- Role with the minimum privileges Fivetran needs
CREATE ROLE fivetran_role;
GRANT USAGE ON WAREHOUSE fivetran_wh TO ROLE fivetran_role;
GRANT USAGE ON DATABASE raw_data TO ROLE fivetran_role;
GRANT CREATE SCHEMA ON DATABASE raw_data TO ROLE fivetran_role;

-- Service user for the connector
CREATE USER fivetran_user PASSWORD = 'strong_password' DEFAULT_ROLE = fivetran_role;
GRANT ROLE fivetran_role TO USER fivetran_user;

In the Fivetran dashboard, navigate to Destinations > Add Destination > Snowflake. Enter the account URL, database name, warehouse, user, and password. Click "Test Connection" to verify.

Step 2: Add Source Connectors in Fivetran

For each data source, click "Add Connector" in Fivetran:

Example: Salesforce

  1. Search for "Salesforce" and click "Set Up"
  2. Click "Authorize" to authenticate via OAuth
  3. Choose the schema name (default: salesforce)
  4. Select tables to sync: Account, Contact, Opportunity, Lead, Task, Campaign
  5. Set sync frequency to 15 minutes
  6. Click "Save & Test," then start the initial sync

Example: PostgreSQL (Application Database)

  1. Search for "PostgreSQL" and click "Set Up"
  2. Enter connection details: host, port, database, username (read-only), password
  3. Fivetran uses logical replication (WAL) for incremental updates
  4. Select tables and set sync frequency
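
Before the connector can use logical replication, the source database must allow it. A minimal sketch of the source-side preparation, assuming a self-managed PostgreSQL instance (the user name, password, and schema are illustrative; Fivetran's connector docs cover managed services such as RDS):

```sql
-- Enable logical decoding (requires a server restart)
ALTER SYSTEM SET wal_level = logical;

-- Read-only replication user for Fivetran (placeholder name and password)
CREATE USER fivetran_reader WITH PASSWORD 'strong_password' REPLICATION;
GRANT USAGE ON SCHEMA public TO fivetran_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO fivetran_reader;

-- Depending on the connector configuration, a publication may also be needed
CREATE PUBLICATION fivetran_pub FOR ALL TABLES;
```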

The initial sync for a Salesforce org with 500,000 records typically takes 2-4 hours. Subsequent syncs transfer only changed records and complete in minutes.

Step 3: Verify Raw Data in Snowflake

After the initial sync completes, query the raw data in Snowflake:

SELECT * FROM raw_data.salesforce.opportunity LIMIT 10;

Fivetran creates schemas matching the source system name and tables matching the source table names. It adds metadata columns: _fivetran_synced (timestamp) and _fivetran_deleted (boolean for soft deletes).
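
These metadata columns make it easy to sanity-check a connector. A quick look at sync recency and soft deletes (Snowflake SQL; the table name assumes the Salesforce schema from Step 2):

```sql
-- Most recent sync time and soft-delete counts for one table
SELECT
    MAX(_fivetran_synced)       AS last_synced_at,
    COUNT_IF(_fivetran_deleted) AS soft_deleted_rows,
    COUNT(*)                    AS total_rows
FROM raw_data.salesforce.opportunity;
```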

Step 4: Initialize a dbt Project

If using dbt Core locally:

pip install dbt-core dbt-snowflake
dbt init analytics_project

Configure ~/.dbt/profiles.yml with Snowflake credentials. Create a separate Snowflake user and warehouse for dbt:

analytics_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: xy12345.us-east-1
      user: dbt_user
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: analytics
      warehouse: dbt_wh
      schema: dbt_dev
      threads: 4

If using dbt Cloud, create a new project in the dbt Cloud UI and connect it to the Snowflake warehouse and a Git repository.

Step 5: Write Staging Models

Create staging models that clean and standardize the raw data. One staging model per source table:

-- models/staging/salesforce/stg_salesforce__opportunities.sql
WITH source AS (
    SELECT * FROM {{ source('salesforce', 'opportunity') }}
    WHERE NOT _fivetran_deleted
),
renamed AS (
    SELECT
        id AS opportunity_id,
        account_id,
        name AS opportunity_name,
        stage_name AS stage,
        amount,
        close_date,
        created_date,
        is_won,
        is_closed,
        owner_id AS sales_rep_id
    FROM source
)
SELECT * FROM renamed

Define sources in models/staging/salesforce/src_salesforce.yml:

version: 2

sources:
  - name: salesforce
    database: raw_data
    schema: salesforce
    tables:
      - name: opportunity
      - name: account
      - name: contact
Step 6: Write Mart Models

Create analytics-ready models that join and aggregate staged data:

-- models/marts/sales/fct_monthly_revenue.sql
SELECT
    DATE_TRUNC('month', o.close_date) AS month,
    a.industry,
    COUNT(DISTINCT o.opportunity_id) AS deals_closed,
    SUM(o.amount) AS total_revenue,
    AVG(o.amount) AS avg_deal_size
FROM {{ ref('stg_salesforce__opportunities') }} o
LEFT JOIN {{ ref('stg_salesforce__accounts') }} a
    ON o.account_id = a.account_id
WHERE o.is_won = TRUE
GROUP BY 1, 2
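
dbt materializes models as views by default. A common pattern is to keep staging models as lightweight views and materialize marts as tables for BI query performance; a sketch for dbt_project.yml (the directory keys assume the layout used above):

```yaml
# dbt_project.yml (excerpt)
models:
  analytics_project:
    staging:
      +materialized: view
    marts:
      +materialized: table
```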

Step 7: Run and Test

Execute the full pipeline:

dbt build

This runs all models in dependency order and executes all tests. Review the output for failures. Add custom tests for business logic validation:

-- tests/assert_positive_revenue.sql
SELECT * FROM {{ ref('fct_monthly_revenue') }}
WHERE total_revenue < 0
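
A singular test like this fails the build whenever it returns rows. If a rule should only warn, dbt supports a severity config at the top of the test file; an illustrative warn-only variant of the same check:

```sql
-- tests/assert_positive_revenue_warn.sql (illustrative variant)
{{ config(severity = 'warn') }}

SELECT * FROM {{ ref('fct_monthly_revenue') }}
WHERE total_revenue < 0
```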

Step 8: Schedule with dbt Cloud

In dbt Cloud, create a Job:

  1. Navigate to Jobs > Create Job
  2. Set the command to dbt build
  3. Schedule: Run every 6 hours (or trigger after Fivetran sync via the Fivetran-dbt integration)
  4. Enable notifications for failures

Fivetran offers a native dbt Cloud integration: after each Fivetran sync completes, it triggers the dbt Cloud job automatically. This ensures transformations always run on fresh data.
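
As a complement to sync-triggered jobs, dbt's source freshness checks can catch a stalled connector before transformations run, using _fivetran_synced as the loaded-at field (the thresholds here are illustrative):

```yaml
# models/staging/salesforce/src_salesforce.yml (excerpt)
sources:
  - name: salesforce
    loaded_at_field: _fivetran_synced
    freshness:
      warn_after: {count: 1, period: hour}
      error_after: {count: 12, period: hour}
```

Running dbt source freshness as the first step of a job fails fast when a source is stale.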

Cost Estimates

For a mid-size deployment syncing 5 sources with 10 million monthly active rows:

Component                                  Monthly Cost
Fivetran (Standard, 10M MAR)               ~$500-800
Snowflake (XSMALL warehouse, 8 hrs/day)    ~$200-400
dbt Cloud (Team plan)                      $100
Total                                      ~$800-1,300/month

Editor's Note: We built this exact stack (Fivetran + Snowflake + dbt Cloud) for a 50-person B2B SaaS company syncing Salesforce, Stripe, PostgreSQL (application DB), and Google Analytics. Initial setup took 2 weeks. Fivetran handled 15 million MAR at $650/month, Snowflake cost $280/month on an XSMALL warehouse running 10 hours/day, and dbt Cloud added $100/month, for a total of $1,030/month.

Before this stack, the data team spent 15 hours per week maintaining custom Python ETL scripts. After migration, maintenance dropped to 2 hours/week, mostly updating dbt models for new business requirements. The main caveat: Fivetran MAR pricing is unpredictable when source data has a high update frequency. Stripe transaction tables generated 3x more MAR than expected because every status change counts as an active row.

By Rafal Fila
