<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Isha Vason</title>
    <description>The latest articles on DEV Community by Isha Vason (@isha_vason).</description>
    <link>https://dev.to/isha_vason</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3807297%2Fffbefcd3-65a4-46dc-b5a7-31c067eb9339.jpg</url>
      <title>DEV Community: Isha Vason</title>
      <link>https://dev.to/isha_vason</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/isha_vason"/>
    <language>en</language>
    <item>
      <title>Orchestrating Our Way Out of Chaos: How I Compared Airflow, Prefect, and Dagster (and Picked What to Ship)</title>
      <dc:creator>Isha Vason</dc:creator>
      <pubDate>Thu, 05 Mar 2026 07:55:39 +0000</pubDate>
      <link>https://dev.to/isha_vason/orchestrating-our-way-out-of-chaos-how-i-compared-airflow-prefect-and-dagster-and-picked-what-23np</link>
      <guid>https://dev.to/isha_vason/orchestrating-our-way-out-of-chaos-how-i-compared-airflow-prefect-and-dagster-and-picked-what-23np</guid>
      <description>&lt;p&gt;A few quarters ago I inherited a lovable mess: ad‑hoc cron jobs, a couple of shell scripts duct‑taped to a BI refresh, and one heroic Python file that only ran if you patted it gently. My task was simple on paper: &lt;strong&gt;pick an orchestrator&lt;/strong&gt; that wouldn’t implode the moment we added a new source or missed a weekend run.&lt;/p&gt;

&lt;p&gt;This post is the story of how I evaluated &lt;strong&gt;Apache Airflow&lt;/strong&gt;, &lt;strong&gt;Prefect&lt;/strong&gt;, and &lt;strong&gt;Dagster&lt;/strong&gt; on a real project, with prototypes, production constraints, and the occasional oh‑no‑why‑is‑nothing‑running moment. I’ll share what I tested, what surprised me, and where each tool &lt;em&gt;shone or stumbled&lt;/em&gt; for us.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you want docs‑level definitions, they exist. Here, you’ll find the parts that mattered in practice, with links when I cite a factual claim or version detail.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The problem I had to solve (quick context)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  Ingest nightly data from 3 sources (S3 drops, a SaaS API, and a warehouse copy).&lt;/li&gt;
&lt;li&gt;  Run dbt transforms, publish a few derived tables, and trigger a downstream dashboard.&lt;/li&gt;
&lt;li&gt;  Add observability and reduce “zombie jobs” without scaling workers to the moon.&lt;/li&gt;
&lt;li&gt;  Make it easy for another engineer to onboard next sprint.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: &lt;strong&gt;solid batch orchestration, good visibility, and room to grow&lt;/strong&gt;—without a six‑week platform build.&lt;/p&gt;




&lt;h2&gt;
  
  
  Round 1: Airflow—the dependable heavyweight
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What I tried&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I spun up &lt;strong&gt;Airflow&lt;/strong&gt; on Kubernetes using the official Helm chart and Docker images. That gave me a web UI, a scheduler, and workers with minimal friction. The sheer size of the &lt;strong&gt;provider&lt;/strong&gt; ecosystem helped me quickly wire GCP, AWS, Snowflake, Slack, and dbt without reinventing hooks. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The pleasant surprise&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Our long‑wait steps (e.g., waiting on BigQuery or S3 sensors) didn’t hog workers thanks to &lt;strong&gt;deferrable operators&lt;/strong&gt;. You run a lightweight &lt;strong&gt;triggerer&lt;/strong&gt; process; tasks “defer” during idle time so your cluster isn’t just… waiting. For us, this was the difference between needing more workers and reusing what we had. &lt;/p&gt;
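&lt;p&gt;To make the worker-slot argument concrete, here is a toy illustration in plain &lt;code&gt;asyncio&lt;/code&gt;. It is not Airflow code and not how the triggerer is implemented; &lt;code&gt;wait_for_partner&lt;/code&gt;, the sensor names, and the durations are made up. It only shows why parking many idle waits on one event loop is cheap compared with holding one blocked worker per wait.&lt;/p&gt;

```python
import asyncio
import time


async def wait_for_partner(name, seconds):
    # Stand-in for "wait until an external system is ready".
    # While this coroutine sleeps, the event loop is free to run others.
    await asyncio.sleep(seconds)
    return name


async def run_all_waits(n_waits, seconds):
    # One loop (think: one triggerer-like process) holds every pending wait.
    waits = [wait_for_partner(f"sensor_{i}", seconds) for i in range(n_waits)]
    return await asyncio.gather(*waits)


started = time.monotonic()
finished = asyncio.run(run_all_waits(n_waits=50, seconds=0.2))
elapsed = time.monotonic() - started

# The 50 waits overlap instead of queueing behind each other.
print(len(finished), "waits finished in", round(elapsed, 2), "seconds")
```

&lt;p&gt;Fifty sequential 0.2-second waits would take about ten seconds; overlapped on one loop they finish in roughly the time of a single wait. That is the intuition behind deferring idle tasks instead of pinning a worker to each one.&lt;/p&gt;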

&lt;p&gt;&lt;strong&gt;Why Airflow is still Airflow in 2026&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Airflow’s newest 3.x cycle modernizes the UI and internals (service‑oriented components, faster DAG parsing). That matters operationally: less mysterious sluggishness when you have lots of DAGs, better developer ergonomics, and a cleaner path for upgrades. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where I felt the drag&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Authoring is Pythonic (TaskFlow API), but you still think in &lt;strong&gt;DAGs&lt;/strong&gt; first. That’s great for explicit control; it can feel heavy when all you want is “run this Python, fan out over these 200 files, retry smartly.” Still, if your world is full of external tools and strict schedules, Airflow is home base. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Airflow takeaway&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If your stack touches “everything” and reliability under scale is non‑negotiable, start here. Use deferrables, lean on providers, and sleep better. &lt;/p&gt;




&lt;h2&gt;
  
  
  Round 2: Prefect—the Python‑native fast mover
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What I tried&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I ported the pipeline into &lt;strong&gt;Prefect&lt;/strong&gt; using &lt;code&gt;@flow&lt;/code&gt; and &lt;code&gt;@task&lt;/code&gt;. It felt like writing normal Python—&lt;em&gt;with perks.&lt;/em&gt; Retries, caching, and &lt;strong&gt;concurrent submits&lt;/strong&gt; are built in. For our “fan‑out over N files then aggregate” steps, the code stayed tidy and readable. &lt;/p&gt;
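&lt;p&gt;For contrast, this is roughly what I would have had to hand-roll for “fan out over N files with retries, then aggregate” using only the standard library. It is a sketch, not Prefect’s implementation: &lt;code&gt;with_retries&lt;/code&gt;, the flaky &lt;code&gt;ingest_one&lt;/code&gt; stand-in, and the file names are hypothetical, and the delays are shortened so it runs instantly.&lt;/p&gt;

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(fn, delays):
    # Wrap fn so it is retried once per entry in delays, sleeping in between.
    def wrapped(*args):
        for delay in delays:
            try:
                return fn(*args)
            except Exception:
                time.sleep(delay)
        return fn(*args)  # final attempt: let any error propagate
    return wrapped


attempts = {}


def ingest_one(path):
    # Hypothetical flaky step: every file fails on its first attempt.
    attempts[path] = attempts.get(path, 0) + 1
    if attempts[path] == 1:
        raise IOError("transient failure for " + path)
    return 1


files = ["a.csv", "b.csv", "c.csv"]
safe_ingest = with_retries(ingest_one, delays=[0.01, 0.02, 0.04])

# Fan out over the files, then aggregate the per-file results.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(safe_ingest, files))

print(total)
```

&lt;p&gt;None of this is hard, but it is boilerplate that every pipeline would otherwise repeat, and it still lacks run history, caching, and a UI. Having the orchestrator own it is what kept our flow code short.&lt;/p&gt;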

&lt;p&gt;&lt;strong&gt;The pleasant surprise&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The &lt;strong&gt;hybrid Prefect Cloud&lt;/strong&gt; model fit our security posture: we kept code/data in our VPC while using Cloud for orchestration metadata, UI, RBAC, and automations. Spinning up workers where the data lives kept latency predictable. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where it clicked for the team&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Debugging and iteration were fast. New engineers could run flows locally, push a &lt;strong&gt;deployment&lt;/strong&gt;, and watch runs in the Cloud UI—all without touching Kubernetes on day one. For a contracting engagement with a tight runway, this mattered more than we expected. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade‑offs I felt&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Compared to Airflow’s ocean of providers, you’ll sometimes write a little glue code yourself. Not a blocker, but worth knowing if your org standardizes on ready‑made operators for everything. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Prefect takeaway&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If your team is Python‑heavy, iterates quickly, and wants a &lt;strong&gt;low‑friction path&lt;/strong&gt; from laptop to production with governance, Prefect is a joy. The retries/mapping model is simple and powerful. &lt;/p&gt;

&lt;h2&gt;
  
  
  Round 3: Dagster—the data‑product mindset
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What I tried&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I rewrote the pipeline as &lt;strong&gt;software‑defined assets (SDAs)&lt;/strong&gt;. Instead of “run task A then B,” I declared, “the &lt;code&gt;orders_clean&lt;/code&gt; table &lt;em&gt;exists&lt;/em&gt; and depends on &lt;code&gt;raw_orders&lt;/code&gt;.” Dagster then gave me lineage graphs, asset health, and &lt;strong&gt;re‑materialization&lt;/strong&gt; controls out of the box. &lt;/p&gt;
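&lt;p&gt;The asset-first idea can be sketched without any orchestrator: declare what each table depends on, then derive the run order and the set of stale downstream tables from the graph. This is a conceptual illustration using the standard library’s &lt;code&gt;graphlib&lt;/code&gt;, not Dagster’s API; &lt;code&gt;daily_revenue&lt;/code&gt; and both helper functions are hypothetical names added for the example.&lt;/p&gt;

```python
from graphlib import TopologicalSorter

# Each asset maps to the set of assets it reads from.
asset_deps = {
    "raw_orders": set(),
    "orders_clean": {"raw_orders"},
    "daily_revenue": {"orders_clean"},
}


def materialization_order(deps):
    # Upstream assets always come before the assets that read them.
    return list(TopologicalSorter(deps).static_order())


def downstream_of(deps, changed):
    # Everything that becomes stale when `changed` is rebuilt.
    stale = {changed}
    for asset in materialization_order(deps):
        if not deps[asset].isdisjoint(stale):
            stale.add(asset)
    return stale - {changed}


print(materialization_order(asset_deps))
print(sorted(downstream_of(asset_deps, "raw_orders")))
```

&lt;p&gt;Once dependencies are data rather than call order, lineage views and selective re-runs fall out of the same graph, which is the part Dagster’s tooling builds on.&lt;/p&gt;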

&lt;p&gt;&lt;strong&gt;The pleasant surprise&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
We could think in &lt;strong&gt;data products&lt;/strong&gt; instead of job steps. With &lt;strong&gt;asset sensors&lt;/strong&gt; and freshness policies, it was easy to trigger downstream work when an upstream asset changed, and to backfill just the partitions we cared about. This was perfect for dbt‑heavy transformations and ML feature tables. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What the team noticed&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The UI’s asset catalog is more than pretty pictures—it made onboarding easier. New teammates grasped the pipeline by &lt;em&gt;reading the graph&lt;/em&gt;, not spelunking code. For governance and re‑runs, that visibility was gold. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade‑offs I felt&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Switching to an &lt;em&gt;asset‑first&lt;/em&gt; mental model can be a paradigm shift. If your engineers are used to task DAGs, there’s a small learning curve. And while Dagster OSS is strong, &lt;strong&gt;Dagster+&lt;/strong&gt; introduces a credits model for managed materializations—fine for many teams, but something to price out. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Dagster takeaway&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If lineage, partitions/backfills, and &lt;strong&gt;data contracts&lt;/strong&gt; are front‑and‑center, Dagster makes those concerns &lt;em&gt;first‑class&lt;/em&gt; rather than bolt‑ons. &lt;/p&gt;

&lt;h2&gt;
  
  
  Head‑to‑head (from my notebook)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Integrations at enterprise scale → Airflow&lt;/strong&gt;. The provider catalog saved me days connecting to warehouses, clouds, and SaaS. Paired with &lt;strong&gt;deferrables&lt;/strong&gt;, it’s efficient for long waits. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Developer velocity &amp;amp; hybrid Cloud → Prefect&lt;/strong&gt;. Python decorators, clean retries/mapping, and metadata‑only Cloud made shipping fast and safe. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lineage, selective re‑runs, partitions → Dagster&lt;/strong&gt;. SDAs + sensors/freshness gave us surgical control over data products and great visibility. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There’s no absolute winner—only the right fit for your constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually shipped (and why)
&lt;/h2&gt;

&lt;p&gt;For this client, we chose &lt;strong&gt;Prefect&lt;/strong&gt; for the initial rollout:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  We needed speed, low ceremony, and a gentle onboarding curve.&lt;/li&gt;
&lt;li&gt;  The workloads were Python‑heavy with dynamic fan‑out.&lt;/li&gt;
&lt;li&gt;  Security wanted a managed control plane &lt;strong&gt;without&lt;/strong&gt; data leaving our VPC.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prefect hit those goals with minimal ops and let us keep momentum while we stabilized sources and schemas. &lt;strong&gt;If&lt;/strong&gt; we had a sprawling integration surface or strict batch SLAs across many third‑party systems, I would’ve pushed Airflow. &lt;strong&gt;If&lt;/strong&gt; the mandate had been strict lineage, partitions, and data governance from day one, I’d have argued for Dagster. &lt;/p&gt;




&lt;h2&gt;
  
  
  A few practical tips from the trenches
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prototype your pain point, not a toy DAG.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If long waits are killing you, test Airflow’s &lt;strong&gt;deferrable operators&lt;/strong&gt; with &lt;em&gt;your&lt;/em&gt; warehouse jobs. The results are tangible in a day. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don’t over‑optimize day 1.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
I’ve seen teams spend weeks perfecting Kubernetes before a single pipeline is reliable. Prefect’s local → deployment workflow can buy you that time to prove value. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If governance will matter later, model as assets now.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Even if you don’t adopt Dagster, design pipelines as &lt;strong&gt;data products&lt;/strong&gt; (clear inputs/outputs, contracts). It pays off when audits or lineage questions arrive. Dagster just bakes this into the tooling. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pick one, design cleanly, keep migration possible.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Encapsulate business logic away from orchestrator glue. Then swapping engines—should you ever need to—becomes an engineering task, not a re‑platform.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
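&lt;p&gt;As a minimal sketch of the last tip, assuming a made-up order-cleaning step: keep the transformation in a plain function with no orchestrator imports, and let a thin adapter handle I/O. The names &lt;code&gt;clean_orders&lt;/code&gt; and &lt;code&gt;run_clean_orders&lt;/code&gt; and the row shape are illustrative, not taken from the client project.&lt;/p&gt;

```python
def clean_orders(rows):
    # Pure business logic: no orchestrator imports, easy to unit test.
    cleaned = []
    for row in rows:
        if row.get("order_id") is None:
            continue  # drop rows without a key
        amount = round(float(row.get("amount", 0)), 2)
        cleaned.append({"order_id": row["order_id"], "amount": amount})
    return cleaned


def run_clean_orders(read_rows, write_rows):
    # Thin adapter: the only piece an orchestrator task, flow, or asset calls.
    # read_rows and write_rows are injected so the I/O stays swappable.
    rows = clean_orders(read_rows())
    write_rows(rows)
    return len(rows)


# Local usage with in-memory stand-ins for the source and the warehouse.
sink = []
count = run_clean_orders(
    read_rows=lambda: [
        {"order_id": 1, "amount": "19.999"},
        {"order_id": None, "amount": "5"},
    ],
    write_rows=sink.extend,
)
print(count, sink)
```

&lt;p&gt;Wrapping &lt;code&gt;run_clean_orders&lt;/code&gt; in an Airflow task, a Prefect task, or a Dagster asset is then a few lines of glue, and the tests for &lt;code&gt;clean_orders&lt;/code&gt; survive any migration untouched.&lt;/p&gt;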




&lt;h2&gt;
  
  
  Starter snippets I used (trimmed)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Airflow (TaskFlow with a deferrable wait)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.decorators&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.sensors.time_sensor&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TimeSensorAsync&lt;/span&gt;  &lt;span class="c1"&gt;# deferrable
&lt;/span&gt;
&lt;span class="nd"&gt;@dag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2025&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@daily&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;daily_ingest&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pull_s3_keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# list objects...
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt;

    &lt;span class="c1"&gt;# free the worker while waiting for a window
&lt;/span&gt;    &lt;span class="n"&gt;wait&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TimeSensorAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;window&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;03:00&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_and_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# load &amp;amp; process in warehouse
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pull_s3_keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;wait&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;load_and_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;daily_ingest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Airflow TaskFlow keeps you in Python while the &lt;strong&gt;triggerer&lt;/strong&gt; handles idle time for deferrable tasks/sensors. &lt;/p&gt;

&lt;h3&gt;
  
  
  Prefect (fan‑out with retries)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;prefect&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;

&lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retry_delay_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ingest_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# read, validate, write to bronze
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="nd"&gt;@flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;log_prints&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;nightly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ingest_one&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;@flow&lt;/code&gt;/&lt;code&gt;@task&lt;/code&gt; feels like straight Python. Retries/concurrency are built in, and you can register the flow as a &lt;strong&gt;deployment&lt;/strong&gt; to Prefect Cloud (hybrid). &lt;/p&gt;

&lt;h3&gt;
  
  
  Dagster (assets with lineage)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dagster&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dg&lt;/span&gt;

&lt;span class="nd"&gt;@dg.asset&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;raw_orders&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://bucket/raw/orders.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nd"&gt;@dg.asset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;raw_orders&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;orders_clean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# transform + write to warehouse
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="c1"&gt;# add schedules or sensors for downstream triggers
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;strong&gt;software‑defined assets&lt;/strong&gt;, you’ll see lineage and materializations in the UI and can set freshness/backfill policies. &lt;/p&gt;




&lt;h2&gt;
  
  
  Closing thoughts
&lt;/h2&gt;

&lt;p&gt;If you’re choosing an orchestrator this quarter, let your &lt;strong&gt;constraints&lt;/strong&gt; pick the winner:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Airflow&lt;/strong&gt; if integrations + strict scheduling at scale are the game.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Prefect&lt;/strong&gt; if you want Python‑first velocity and hybrid Cloud governance. &lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Dagster&lt;/strong&gt; if data products, lineage, and partitions are the north star. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We shipped Prefect first, and I’d make the same call given the same pressure and team. But all three are excellent—pick one, keep your business logic clean, and your future self (or the next contractor) will thank you.&lt;/p&gt;

</description>
      <category>bigdata</category>
      <category>elt</category>
      <category>airflow</category>
      <category>dagster</category>
    </item>
  </channel>
</rss>
