I compiled this categorized list of popular tools used in ETL/ELT workflows with the help of ChatGPT.
🚀 Orchestration & Workflow Management
Tools that schedule, coordinate, and monitor data pipelines.
- Apache Airflow
- Prefect
- Dagster
- Luigi
- Azkaban
- Apache Oozie
- Argo Workflows (Kubernetes-native)
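To make "schedule and coordinate" concrete, here is a minimal sketch of the core idea these tools share: run tasks in dependency order. It uses only Python's standard library, not the API of any tool listed above, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract_orders": set(),
    "extract_customers": set(),
    "load_raw": {"extract_orders", "extract_customers"},
    "transform": {"load_raw"},
    "publish_report": {"transform"},
}


def run_pipeline(dependencies, run_task):
    """Run every task once, only after all of its upstream tasks have run."""
    completed = []
    for task in TopologicalSorter(dependencies).static_order():
        run_task(task)
        completed.append(task)
    return completed


if __name__ == "__main__":
    run_pipeline(PIPELINE, lambda name: print(f"running {name}"))
```

Real orchestrators layer scheduling, retries, and monitoring on top of this ordering step.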
🏗️ Transformation Frameworks (ELT)
Tools that transform data inside warehouses (dbt-like).
- dbt (data build tool)
- Dataform (now part of Google Cloud)
- SQLMesh
- Transform (metrics layer)
- Coalesce
- Meltano (plugin-based pipelines including dbt)
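As a rough illustration of "transform data inside the warehouse", the sketch below loads raw rows first and then builds a derived table with SQL that runs inside the database. SQLite stands in for a real warehouse here, and the table names, columns, and the `build_daily_revenue` helper are made up for the example. It is not how dbt or any other tool above is implemented.

```python
import sqlite3


def build_daily_revenue(conn):
    """Rebuild a derived table from raw data using SQL run inside the database."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS daily_revenue;
        CREATE TABLE daily_revenue AS
        SELECT order_date, SUM(amount_cents) AS revenue_cents
        FROM raw_orders
        WHERE status = 'paid'
        GROUP BY order_date;
        """
    )


def demo():
    # In-memory database standing in for a warehouse (assumption for the demo).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_orders (order_date TEXT, amount_cents INTEGER, status TEXT)"
    )
    # "Load" step: raw rows land untransformed.
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [
            ("2024-01-01", 1000, "paid"),
            ("2024-01-01", 2500, "paid"),
            ("2024-01-01", 900, "refunded"),
            ("2024-01-02", 400, "paid"),
        ],
    )
    # "Transform" step: SQL model materialized as a table.
    build_daily_revenue(conn)
    return conn.execute(
        "SELECT order_date, revenue_cents FROM daily_revenue ORDER BY order_date"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

The point of the ELT pattern is that the raw table stays untouched, so the model can be rebuilt at any time by rerunning the SQL.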
🔄 ETL/ELT Platforms (Full Stack)
All-in-one platforms for extraction, loading, and some transformation.
- Talend
- Informatica PowerCenter / Informatica Cloud
- Fivetran
- Stitch
- Matillion
- Airbyte
- Hevo Data
- Pentaho Data Integration (Kettle)
- SSIS (SQL Server Integration Services)
- AWS Glue
🧰 Streaming & Real-Time ETL
For event-driven or continuous data processing.
- Apache Kafka
- Kafka Connect
- Apache Flink
- Apache Spark Streaming
- ksqlDB
- Redpanda
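For a feel of what "continuous data processing" means, here is a small sketch of a tumbling-window count over an event stream, written as a plain Python generator. It assumes events arrive ordered by timestamp and uses made-up event keys. The tools above handle out-of-order data, state, and scaling, which this sketch ignores.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, counts) for each fixed-size, non-overlapping window.

    `events` is an iterable of (timestamp_seconds, key) pairs,
    assumed to be ordered by timestamp.
    """
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        # An event in a new window closes the previous one.
        if current_start is not None and window_start != current_start:
            yield current_start, dict(counts)
            counts = Counter()
        current_start = window_start
        counts[key] += 1
    # Flush the last open window when the stream ends.
    if current_start is not None:
        yield current_start, dict(counts)


if __name__ == "__main__":
    sample = [(0, "click"), (3, "view"), (9, "click"), (10, "click"), (25, "view")]
    for start, window in tumbling_window_counts(sample, 10):
        print(start, window)
```

Results are emitted as the stream advances instead of after a full batch has been collected, which is the main difference from batch ETL.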
💾 Data Processing Engines
Heavy-duty compute frameworks for batch or streaming.
- Apache Spark
- Apache Beam
- Databricks
- Google Dataflow
🏢 Cloud-Native ETL Services
Managed ETL options provided by cloud platforms.
- AWS Glue
- AWS Data Pipeline
- Azure Data Factory
- Google Cloud Dataflow
- Google Cloud Data Fusion
- Snowflake Tasks & Streams
📊 Data Integration + iPaaS (Low-code)
Often business-friendly and UI-driven.
- Zapier (simple workflows)
- Make.com
- Workato
- Boomi
- Tray.io
Here is an expanded, more exhaustive list of ETL/ELT tools, including many lesser-known, specialized, and enterprise options, again broken down by category.
🔥 More ETL / ELT Tools (Extended List)
🛠️ Orchestration & Workflow Engines (More)
Beyond Airflow, Prefect, Dagster, Luigi:
- Airlift
- Metaflow (Netflix)
- Kedro (QuantumBlack)
- Flyte (Lyft)
- Kubeflow Pipelines
- Apache NiFi
- Control-M (BMC)
- Tecton (feature pipelines)
- Google Cloud Composer (managed Airflow)
- Azure Data Factory Pipelines
- Astronomer (managed Airflow)
🧱 Transformation-Only Tools (More ELT / SQL Modeling)
Beyond dbt, Dataform, SQLMesh:
- Cube (metrics layer, modeling)
- MetricFlow by Transform / dbt Semantic Layer
- MindsDB (AI transforms)
- LookML (Looker): transformation/model layer on the BI side
- Y42 (full stack, but dbt-like transformations)
- Narrator.ai (modeling methodology)
- PipeRider (data testing and profiling)
- Datafold (data quality + diffs for SQL)
🔄 Extraction & Loading Tools (More Connectors)
In addition to Fivetran, Stitch, Airbyte, Hevo:
- Singer (Tap/Target framework)
- RudderStack (CDP + pipelines)
- Segment (customer data pipelines)
- Blendo
- Etleap
- Xplenty (now Integrate.io)
- CloverDX
- FlyData
- Matillion (also supports transformations)
- Portable.io (long-tail connector provider)
- Grouparoo (open-source reverse ETL)
- Hightouch (reverse ETL)
- Census (reverse ETL)
🧬 Reverse ETL (More)
Sending data back to SaaS tools:
- Polytomic
- OmniAnalytics reverse ETL
- Workato (enterprise integration)
⚡ Real-Time & Streaming ETL (More)
Beyond Kafka, Flink, Spark Streaming:
- Apache Pulsar
- Apache Storm
- Materialize (streaming SQL)
- Rockset
- Confluent Cloud (managed Kafka ecosystem)
- StreamSets
- Decodable
- Quix
- Estuary Flow
💾 Processing Engines & Compute Frameworks (More)
In addition to Spark, Beam, Dataflow:
- Snowpark (Snowflake’s compute engine)
- Presto / Trino
- Dask
- Ray
- Hive
- ClickHouse pipelines
- Delta Live Tables (Databricks)
- EMR (AWS for Spark/Hadoop)
🌥️ Cloud ETL / Integration Tools (More)
Cloud-specific ETL services:
AWS
- AWS Glue Studio
- AWS Lambda pipelines
- AWS Step Functions
- Amazon AppFlow
Azure
- Azure Databricks
- Synapse Pipelines
- Logic Apps
Google Cloud
- BigQuery Data Transfer Service (DTS)
- GKE + Argo Workflows
- Workflows (GCP native orchestration)
🤖 ML-Focused Pipelines
ETL for feature engineering, ML data prep:
- Feast (feature store)
- Tecton (enterprise feature store)
- ZenML
- MLflow Pipelines (since renamed MLflow Recipes)
- Hopsworks
🧩 Enterprise ETL Platforms (More)
Besides Talend, Informatica, SSIS:
- IBM DataStage
- SAS Data Integration Studio
- SAP Data Services
- Oracle Data Integrator (ODI)
- Ab Initio
- Alteryx
- Qlik Replicate (formerly Attunity)
- SnapLogic
- Informatica MDM
- Collibra Data Quality (DQ Pipelines)