
When Temporal Beat Airflow for a Fintech ETL Replay Job

Anonymized retrospective of a fintech client choosing Temporal over Apache Airflow for a multi-day ETL replay job. Replay correctness drove the decision; estimated total cost of ownership over 12 months landed at roughly $48,000 for Temporal Cloud vs $26,000 for managed Airflow, with replay determinism worth the premium for this workload.

The Bottom Line: Temporal beat Airflow for this fintech ETL replay job because durable, deterministic replay was the central requirement, but Airflow remains the better default for scheduled batch ETL. The right choice depends on whether replay or schedule is your hard requirement.

The Workload

A fintech client (consumer-lending sub-segment, ~140 employees) needed to rebuild a six-month transaction ledger after a vendor data correction. The replay job had to:

  • Process roughly 2.4 billion transaction records.
  • Apply a chain of 11 transformations, three of which depended on external API enrichment with rate limits.
  • Be re-runnable from any intermediate step on partial failure, with deterministic output.
  • Produce an audit trail acceptable to the company's internal compliance review.

The team already ran managed Airflow (Amazon MWAA) for nightly batch ETL across about 80 DAGs. The reasonable starting assumption was: extend Airflow.

Why We Did Not Extend Airflow

After a week of prototyping in MWAA, three problems hardened the decision against it:

  1. Replay semantics. Airflow's "rerun from failed task" works well, but rerunning a task chain that calls a rate-limited external API exposes the orchestration to non-determinism. The third-party API can change behavior between the original run and the replay. The team needed to be confident that a replay would produce identical output if the inputs were unchanged.
  2. Long-running activities. Several transformations took 6-12 hours of compute. Airflow can run them, but the task model is not designed for long-running, retryable activity calls with structured progress checkpoints.
  3. State persistence. The replay job had to be restartable across days, including across MWAA upgrade windows. Airflow can do this, but the team would have had to build the durable state machinery on top.

Temporal's programming model handled all three out of the box: deterministic workflow code, activities with explicit retry policies, durable state checkpointing built in.
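To make that concrete, here is a minimal sketch of the pattern in the Temporal Go SDK. This is illustrative, not the client's actual code; the workflow and activity names are hypothetical, and the timeout and retry values are placeholders rather than the figures used in the project.

```go
// Sketch: deterministic workflow code calling activities with explicit
// retry policies, and heartbeats for long-running steps. All
// non-deterministic work (API calls, I/O) lives in activities; the
// workflow only orchestrates, which is what makes replay possible.
package ledgerreplay

import (
	"time"

	"go.temporal.io/sdk/temporal"
	"go.temporal.io/sdk/workflow"
)

// ReplayLedgerWorkflow chains transformation activities (names hypothetical).
func ReplayLedgerWorkflow(ctx workflow.Context, monthRange string) error {
	ao := workflow.ActivityOptions{
		// Long-running transformations: generous timeout plus heartbeats,
		// so a stalled worker is detected and the activity retried.
		StartToCloseTimeout: 14 * time.Hour,
		HeartbeatTimeout:    5 * time.Minute,
		RetryPolicy: &temporal.RetryPolicy{
			InitialInterval:    30 * time.Second,
			BackoffCoefficient: 2.0,
			MaximumInterval:    10 * time.Minute,
			MaximumAttempts:    0, // 0 = unlimited; rate-limited APIs recover
		},
	}
	ctx = workflow.WithActivityOptions(ctx, ao)

	var out string
	// Each completed activity result is recorded in workflow history, so a
	// restart resumes from the next step instead of re-running finished ones.
	for _, step := range []string{"NormalizeActivity", "EnrichActivity"} {
		if err := workflow.ExecuteActivity(ctx, step, monthRange, out).Get(ctx, &out); err != nil {
			return err
		}
	}
	return nil
}
```

The key design constraint is that the workflow function itself must be deterministic: no direct clock reads, random numbers, or network calls, which is why all eleven transformations lived in activities.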

What We Built

A single Temporal workflow in Go, running on Temporal Cloud (Starter then Production tier), calling 11 activities. Each activity was a containerized worker on the company's existing EKS cluster. The workflow ran for 18 days end-to-end, restarted twice (once for an EKS node-pool upgrade, once for a deliberate code change in activity 7), and produced a bit-for-bit identical result on a smaller verification replay run on a known historical month.

Editor's Note: We deployed Temporal Cloud for a 140-person fintech client to run a 2.4-billion-row ledger replay over 18 days. The job survived two interruptions (EKS upgrade, code change in one activity) without losing progress, and the verification replay on a known month produced bit-for-bit identical output. The honest caveat: the team had no prior Temporal experience. The first week was almost entirely learning the SDK and writing activities idempotently. If the workflow had been smaller, that learning cost would not have been justified.
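The "writing activities idempotently" point deserves a concrete shape. A common discipline, shown here as a toy sketch with hypothetical names (not the client's code and not a Temporal API), is to derive a deterministic key from an activity's inputs and apply the side effect at most once per key, so a retried or replayed step cannot double-write:

```go
package main

import "fmt"

// IdempotentStore records which operation keys have already been applied.
// In a real system this would be a durable table, not an in-memory map.
type IdempotentStore struct {
	applied map[string]string // key -> previously recorded result
}

func NewIdempotentStore() *IdempotentStore {
	return &IdempotentStore{applied: make(map[string]string)}
}

// Apply runs fn only if key has not been seen; on a retry or replay it
// returns the recorded result instead of repeating the side effect.
func (s *IdempotentStore) Apply(key string, fn func() string) string {
	if res, ok := s.applied[key]; ok {
		return res
	}
	res := fn()
	s.applied[key] = res
	return res
}

func main() {
	store := NewIdempotentStore()
	calls := 0
	work := func() string { calls++; return "ledger-batch-written" }

	first := store.Apply("txn-batch-0042", work)
	second := store.Apply("txn-batch-0042", work) // replay: no second write
	fmt.Println(first == second, calls)           // true 1
}
```

With this pattern, Temporal's at-least-once activity execution becomes effectively exactly-once from the ledger's point of view, which is what made the bit-for-bit verification replay possible.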

The Cost Comparison

12-month total cost of ownership estimate, prepared during the decision:

Component                         Managed Airflow (MWAA)     Temporal Cloud
Orchestration platform            ~$8,400/yr (medium env)    ~$28,000/yr (estimated)
Worker compute (existing EKS)     ~$14,000/yr                ~$14,000/yr
Engineering time (initial build)  ~$2,800 (familiar)         ~$5,500 (learning + build)
Audit and replay tooling (build)  ~$1,000                    included in platform
Total year 1                      ~$26,200                   ~$47,500

The premium for Temporal was around $21,300 for the year. The compliance team valued deterministic replay and the persistent audit trail at meaningfully more than that, so the decision was clear once the cost was on the table.

What Airflow Still Does Better

Honest list, not marketing copy:

  • Scheduled batch ETL. Airflow's scheduler model and the broader DAG-style ecosystem (sensors, hooks, operators for every cloud service) are more mature for the daily/hourly batch case.
  • Cost. For workloads that are mostly schedule-driven and do not need durable replay, Airflow is materially cheaper.
  • Talent pool. As of mid-2026, the data-engineering hiring pool with Airflow experience is roughly an order of magnitude larger than Temporal's.
  • Observability ecosystem. Airflow integrates with Datadog, lineage tools (Marquez, OpenLineage), and dbt Cloud out of the box.

When To Pick Each

  • Pick Temporal when: durable, deterministic replay matters; activities are long-running; the workflow logic is genuinely complex enough to deserve a programming model rather than a DAG.
  • Pick Airflow when: the workload is scheduled batch ETL with relatively short tasks; the team already has Airflow expertise; cost sensitivity matters more than replay semantics.

Editor's Note: The fintech kept Airflow for the existing 80 DAGs and ran Temporal alongside it just for replay-class workloads. We have not seen a case yet where a single client should be all-in on one or the other. They solve overlapping but different problems.

Caveats

Cost figures are estimates from the decision document, accurate as of the project window in early 2026. Temporal Cloud and MWAA pricing have both evolved since. The 18-day total runtime included two non-Temporal-caused interruptions; on a clean run we estimated ~14 days. The team's prior Airflow expertise made the comparison less favorable to Temporal than it would be for a team with neither tool deployed.

By Rafal Fila
