Vedant Vyas

Bridging Cloud and On-Prem: Running On-Prem Jobs from Apache Airflow in AWS using an ssh-rsa key pair

What is this about?

In many organizations, modernization isn’t about going fully cloud-native overnight - it’s about bridging what already works.

When we moved from Appworx to Apache Airflow, our goal wasn’t to rebuild everything from scratch. We wanted a modern, scalable scheduler, but we still needed to run our on-prem scripts and SQL jobs from Airflow hosted on AWS EC2.

That turned out to be the real challenge: orchestrating on-prem jobs from the cloud.

The answer came through a secure SSH bridge, where Airflow connects to on-prem servers via SSH, executes scripts, and returns results - all without exposing credentials.

In this article, I’ll break down how we built that bridge - from generating and placing ssh-rsa key pairs to wiring up AWS Secrets Manager and IAM roles - so we could modernize our pipelines without disrupting the systems that already worked.

Understanding the Shift: From Legacy Workflow Orchestrators (like Appworx) to Airflow

What Does “Workflow Orchestration” Really Mean?

Before we dive into migration stories, it’s important to understand what workflow orchestration actually is.

At its core, workflow orchestration is about making sure a set of interconnected tasks run in the right order, at the right time, with the right inputs. It’s not just about automation - it’s about coordination.

A good orchestrator handles dependencies, retries, alerts, and visibility - so when something fails, you know exactly what happened and why. Think of it as the conductor of your data orchestra - each instrument (or job) plays its part in harmony to create a reliable, repeatable process.

For example, consider a daily data pipeline:

  1. A script extracts data from an API.
  2. Once that finishes, another job loads it into a data warehouse.
  3. Finally, a third process sends out a summary report to Slack or email.

A workflow orchestrator ensures all of this happens automatically, with checks, logs, and notifications - so engineers can focus on logic instead of manual babysitting.
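
To make that concrete, here’s a minimal Airflow sketch of that same three-step pipeline (the callables extract_from_api, load_to_warehouse, and notify_slack are hypothetical placeholders):

from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

# Placeholder callables - in a real pipeline these hold the actual logic.
def extract_from_api():
    print("extracting from API")

def load_to_warehouse():
    print("loading into warehouse")

def notify_slack():
    print("sending summary report")

with DAG('daily_pipeline', start_date=datetime(2025, 1, 1),
         schedule_interval='@daily', catchup=False) as dag:
    extract = PythonOperator(task_id='extract', python_callable=extract_from_api)
    load = PythonOperator(task_id='load', python_callable=load_to_warehouse)
    report = PythonOperator(task_id='report', python_callable=notify_slack)

    # Run in order: extract -> load -> report
    extract >> load >> report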

(If you want a quick visual explanation, this YouTube video does a great job explaining it.)

From Appworx to Airflow - Why Teams Make the Switch

Most legacy enterprises use tools like Appworx (Automic/UC4) - robust, job-centric schedulers that have powered back-office operations for decades. They’re great at running controlled, repetitive tasks - but as data engineering evolved, the need for flexibility, code-first control, and cloud integration grew.

That’s where Apache Airflow steps in. Airflow is built for engineers - it’s Python-based, modular, and extensible, with native integrations for cloud services and data platforms. It transforms static job definitions into dynamic, version-controlled pipelines that live in Git.

Here’s what makes the migration compelling:

  1. Code-First Workflows: In Appworx, you build jobs and workflows through a GUI - which works fine until you have to manage hundreds of them. Airflow, on the other hand, lets you define DAGs (Directed Acyclic Graphs) in Python.

That means:

  • Workflows are code, not config files.
  • They live in Git, with version control, review, and rollback.
  • You can use modules, libraries, and templates for reuse.

For data teams, this shift is transformative. Suddenly, scheduling becomes part of your software development lifecycle - reviewable, testable, and deployable like any other codebase.

  2. Dynamic and Programmatic Workflows: One of Airflow’s biggest superpowers is that it’s dynamic. You can generate tasks programmatically - for example, looping through 100 clients and creating 100 ingestion tasks with a few lines of Python.

In Appworx, each of those 100 jobs would be a separate object you’d have to configure manually. Airflow, by contrast, gives you concepts like TaskGroups, dynamic task mapping, and branching logic - a flexible way to model complex data workflows that scale.

(Here’s a nice deep dive on this concept from Aldo Escobar's blog)
(More about Apache Airflow)
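
As a rough sketch of that dynamic pattern - assuming a hypothetical client list and ingest_client function - the loop might look like this:

from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime

# In practice this list could come from a config file or a database.
CLIENTS = ['client_a', 'client_b', 'client_c']

def ingest_client(client_name):
    print(f"ingesting data for {client_name}")  # placeholder for the real ingestion logic

with DAG('client_ingestion', start_date=datetime(2025, 1, 1),
         schedule_interval='@daily', catchup=False) as dag:
    # One task per client, generated in a loop - no manual GUI configuration.
    for client in CLIENTS:
        PythonOperator(
            task_id=f'ingest_{client}',
            python_callable=ingest_client,
            op_args=[client],
        )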

Migration Patterns - How Teams Actually Move

There’s no single right way to move from Appworx to Airflow. Different teams use different strategies depending on their timelines and risk appetite. But in general, migrations fall into three broad patterns:

Pattern A - Lift & Shift

Recreate Appworx jobs as simple Airflow DAGs using operators like SSHOperator or BashOperator to run the same scripts.

Pros: Fast, low effort, great for proof-of-concept.
Cons: You’re still bound by the old assumptions - brittle and not fully optimized.

Pattern B - Refactor & Improve

Take it a step further: map old jobs to Airflow’s specific operators. For example, if a job runs SQL, use PostgresOperator or SnowflakeOperator instead of a shell script.

Add:

  • Retries and alerts for resilience.
  • Templated variables for flexibility.
  • Better logging and notifications for observability.

This pattern gives you tangible improvements without overhauling everything.
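
As an illustration, a refactored SQL job under this pattern might look roughly like the sketch below (the connection id, SQL, and alert email are assumptions for the example):

from airflow import DAG
from airflow.providers.postgres.operators.postgres import PostgresOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'data-team',
    'retries': 2,                          # resilience: retry transient failures
    'retry_delay': timedelta(minutes=5),
    'email_on_failure': True,              # alerting (requires SMTP configured in Airflow)
    'email': ['data-alerts@example.com'],
}

with DAG('refactored_sql_job', start_date=datetime(2025, 1, 1),
         schedule_interval='@daily', catchup=False, default_args=default_args) as dag:
    load_daily = PostgresOperator(
        task_id='load_daily',
        postgres_conn_id='onprem_postgres',
        # Templated variable: {{ ds }} is the DAG run's logical date;
        # load_partition() is a hypothetical stored procedure.
        sql="SELECT load_partition('{{ ds }}');",
    )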

Pattern C - Modularize and Decompose

Large Appworx chains can be split into multiple smaller DAGs, connected via TriggerDagRunOperator or ExternalTaskSensor.

This modular approach improves:

  • Parallelism (run independent tasks simultaneously),
  • Testability (smaller, focused DAGs), and
  • Maintainability (easier debugging and versioning).
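
For instance, a parent DAG can hand off to a smaller downstream DAG via TriggerDagRunOperator (the DAG ids here are hypothetical):

from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from datetime import datetime

with DAG('parent_chain', start_date=datetime(2025, 1, 1),
         schedule_interval='@daily', catchup=False) as dag:
    # Hand off to a smaller, focused DAG instead of one giant chain.
    trigger_ingestion = TriggerDagRunOperator(
        task_id='trigger_ingestion_dag',
        trigger_dag_id='ingestion_dag',
        wait_for_completion=False,   # set True to block until the child DAG finishes
    )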

In short:
Airflow gives you a foundation that’s not just modern - it’s scalable, transparent, and designed for engineers. The move isn’t just a migration of jobs; it’s a migration of mindset - from operations-driven to code-driven orchestration.

Building the Bridge: From Airflow on EC2 to On-Prem Servers

So now that we understand why we moved from Appworx to Airflow, let’s talk about the how.

This part of the story is hands-on - the kind where you roll up your sleeves and actually connect your cloud-hosted Airflow instance to your on-prem servers. We’re talking about secure SSH communication, IAM roles, AWS Secrets Manager, and Airflow operators that make it all work together.

If you’ve ever had to run on-prem shell or SQL scripts from a cloud environment, you know it’s not as simple as “just SSH into it.” You need to handle keys, roles, permissions, and trust - all while keeping things production-grade and secure.

This section is a copy-paste-friendly guide, walking you step-by-step through:

  • Hosting an Airflow instance on EC2,
  • Creating an IAM role that can access a private key stored in AWS Secrets Manager,
  • Writing that key to the EC2 instance,
  • Adding your on-prem server’s public key to authorized_keys using vim,
  • Populating the known_hosts file for secure connections,
  • And finally, executing .sh and .sql jobs on-prem using Airflow’s SSHOperator and SQL operators (like PostgresOperator).

No hand-waving, no skipped steps - everything you need to build this bridge securely and confidently.

What Exactly Is an SSH-RSA Key Pair?

At the heart of this setup lies the SSH RSA key pair - the foundation of secure, password-less authentication between Airflow (running on EC2) and your on-prem servers.

In simple terms, an SSH RSA key pair is a set of two cryptographically linked keys used to authenticate over SSH using the Rivest–Shamir–Adleman (RSA) algorithm. Here’s what that means in practice:

  • Asymmetric Cryptography: RSA uses two keys - a public key and a private key - that work together but can’t be derived from each other.
  • Authentication: Instead of typing a password, your EC2 instance uses its private key to prove its identity to the on-prem server.
  • Handshake: When Airflow connects, the server issues a challenge based on the stored public key. Airflow signs it with its private key; if the signature verifies, the connection is trusted.
  • Security: Strong key sizes (RSA-4096 or higher) provide far stronger protection than typical password-based logins.

(If you want to know how SSH actually works under the hood, this DigitalOcean article gives a great visual breakdown of the connection process.)

The Core Building Blocks of Our Setup

Before diving into the commands and code, let’s clarify the main components you’ll be working with:

  • EC2 Instance (Airflow Host): The virtual machine that runs your Airflow scheduler, webserver, and workers. You can use a single node for a simple setup or a multi-node architecture for scale.
  • IAM Role: An AWS identity attached to your EC2 instance. It lets the VM call AWS APIs (like Secrets Manager) without hardcoding credentials. You’ll grant it the secretsmanager:GetSecretValue permission so it can securely retrieve the SSH private key at runtime.
  • AWS Secrets Manager: The secure vault for storing your private SSH key. Airflow fetches this key just before connecting to the on-prem servers - meaning no key files are stored in plain text or checked into Git.
  • SSH Key Pair (RSA-4096): The actual key pair that forms the authentication bridge. The private key lives in Secrets Manager and is fetched to /home/airflow/.ssh/airflow_onprem_rsa (with 0600 permissions). The public key is added to the ~/.ssh/authorized_keys file on the on-prem servers using vim.
  • known_hosts File: A file on your EC2 instance (~/.ssh/known_hosts) that stores trusted host fingerprints. This prevents man-in-the-middle attacks by ensuring Airflow only connects to servers it has previously verified.
  • SSHOperator & SSHHook (Airflow): Provided by the apache-airflow-providers-ssh package, these let Airflow open SSH connections and execute remote commands. SSHHook handles the connection setup, authentication, and session management, while SSHOperator runs your .sh scripts over that connection. SQL jobs are handled by SQL operators (like PostgresOperator), which come from their own database provider packages.

Together, these layers form a secure, reproducible bridge that lets your Airflow workflows orchestrate on-prem tasks - cleanly, automatically, and without exposing any credentials.

Workflow Diagram

Now that we understand the architecture of our hybrid setup - with the Airflow instance on EC2 securely assuming an IAM role, retrieving its private key from Secrets Manager, and connecting to on-prem servers via SSH - let’s dive into how to configure each piece end-to-end.

Step 1 : Generate an RSA 4096 Key Pair (on EC2 or locally)

You can generate the key pair directly on your EC2 instance or locally, and then securely store the private key in AWS Secrets Manager.

# Generate RSA 4096 key pair and store in ~/.ssh

ssh-keygen -t rsa -b 4096 -m PEM -f ~/.ssh/airflow_onprem_rsa -C "airflow@ec2" -N ""

# -N "" : empty passphrase (use Secrets Manager for security)
# -m PEM : ensures traditional PEM formatting (broad compatibility)


This produces:

  • ~/.ssh/airflow_onprem_rsa (private key)
  • ~/.ssh/airflow_onprem_rsa.pub (public key)

Security Note: If you generate the key locally, upload the private key securely to Secrets Manager and then delete the local copy.

Step 2 : Install the Public Key on On-Prem Servers

Two options:

Option A - Using ssh-copy-id:

ssh-copy-id -i ~/.ssh/airflow_onprem_rsa.pub airflow-runner@onprem-host.example.com


Option B - Manual Method:

cat ~/.ssh/airflow_onprem_rsa.pub  # copy the single-line result
ssh airflow-runner@onprem-host.example.com
mkdir -p ~/.ssh && chmod 700 ~/.ssh
vim ~/.ssh/authorized_keys  # paste the key, then :wq
chmod 600 ~/.ssh/authorized_keys
chown airflow-runner:airflow-runner ~/.ssh/authorized_keys

Permissions matter:
~/.ssh → 700
authorized_keys → 600
The SSH daemon rejects key-based logins when these files have looser permissions.
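
Before wiring anything into Airflow, it’s worth a quick end-to-end test of the key. Here’s a small Paramiko sketch (Paramiko ships with the Airflow SSH provider; the hostname, user, and key path reuse the example values above):

import os
import paramiko

key_path = os.path.expanduser('~/.ssh/airflow_onprem_rsa')

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # fine for a one-off test; known_hosts comes next
client.connect(
    hostname='onprem-host.example.com',
    username='airflow-runner',
    key_filename=key_path,
)
_, stdout, _ = client.exec_command('echo connected')
print(stdout.read().decode().strip())  # expect: connected
client.close()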

Step 3 : Add the On-Prem Host to known_hosts

Add your on-prem server’s host key to avoid terminal prompts and verify identity.

# Add by hostname
ssh-keyscan -H onprem-host.example.com >> ~/.ssh/known_hosts

# Or by IP
ssh-keyscan -H 10.0.1.5 >> ~/.ssh/known_hosts

# Verify
tail -n 5 ~/.ssh/known_hosts

-H hashes hostnames for privacy.
If you prefer, open ~/.ssh/known_hosts and confirm the host key line.

Step 4 : Store the Private Key in AWS Secrets Manager

Store the private key only (not the public key).

aws secretsmanager create-secret \
--name "airflow/onprem/private_key" \
--description "Private SSH key for Airflow EC2 to connect to on-prem servers" \
--secret-string file://~/.ssh/airflow_onprem_rsa

This returns an ARN like:

arn:aws:secretsmanager:ap-south-1:123456789012:secret:airflow/onprem/private_key-abc123

Save this ARN for your IAM policy.

Step 5 : Create IAM Role (Instance Profile) with Least-Privilege Access

Your EC2 instance will assume this role to read the secret.

Trust policy (trust-policy.json):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ec2.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}

Least-privilege policy (airflow-secrets-policy.json):

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid":"ReadAirflowSSHPrivateKey",
      "Effect":"Allow",
      "Action":[
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret"
      ],
      "Resource":"arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:airflow/onprem/private_key-XXXXXXXX"
    }
  ]
}

Commands:

# 1) Create role
aws iam create-role \
--role-name AirflowEC2SecretsRole \
--assume-role-policy-document file://trust-policy.json

# 2) Attach inline policy
aws iam put-role-policy \
--role-name AirflowEC2SecretsRole \
--policy-name AllowSecretsRead \
--policy-document file://airflow-secrets-policy.json

# 3) Create instance profile and attach role
aws iam create-instance-profile --instance-profile-name AirflowEC2InstanceProfile
aws iam add-role-to-instance-profile \
--instance-profile-name AirflowEC2InstanceProfile \
--role-name AirflowEC2SecretsRole

# 4) Attach profile to EC2
aws ec2 associate-iam-instance-profile \
--instance-id i-0123456789abcdef0 \
--iam-instance-profile Name=AirflowEC2InstanceProfile


Verification:

curl -s http://169.254.169.254/latest/meta-data/iam/info   # if IMDSv2 is enforced, fetch a session token first
aws sts get-caller-identity

Step 6 : Fetch the Private Key on EC2 Securely

Fetch the secret and write it to disk with strict permissions.

export AWS_REGION=ap-south-1

# Ensure the .ssh directory exists with strict permissions
mkdir -p /home/airflow/.ssh && chmod 700 /home/airflow/.ssh

aws secretsmanager get-secret-value \
--secret-id "airflow/onprem/private_key" \
--query SecretString \
--output text > /home/airflow/.ssh/airflow_onprem_rsa

chmod 600 /home/airflow/.ssh/airflow_onprem_rsa
chown airflow:airflow /home/airflow/.ssh/airflow_onprem_rsa

Run this during EC2 startup (user-data/bootstrap) so the Airflow SSHOperator has the key ready before any DAGs start.
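
If you prefer doing this from Python inside your bootstrap, a roughly equivalent boto3 sketch (using the same secret name and target path) could look like this:

import os
import boto3

# Fetch the private key from Secrets Manager and write it with strict permissions.
sm = boto3.client('secretsmanager', region_name='ap-south-1')
secret = sm.get_secret_value(SecretId='airflow/onprem/private_key')

key_path = '/home/airflow/.ssh/airflow_onprem_rsa'
os.makedirs(os.path.dirname(key_path), mode=0o700, exist_ok=True)
with open(key_path, 'w') as f:
    f.write(secret['SecretString'])
os.chmod(key_path, 0o600)  # SSH refuses private keys with looser permissions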

Step 7 : Configure the SSH Connection in Airflow

You can register the SSH connection via the Airflow UI or CLI, then reference it by conn_id in your DAG code.

Option A - Airflow CLI:

airflow connections add 'onprem_ssh' \
--conn-type 'ssh' \
--conn-host 'onprem-host.example.com' \
--conn-login 'airflow-runner' \
--conn-port '22' \
--conn-extra '{"key_file":"/home/airflow/.ssh/airflow_onprem_rsa"}'

Option B - DAG Code Example:

from airflow import DAG
from airflow.providers.ssh.operators.ssh import SSHOperator
from datetime import datetime, timedelta

default_args = {'owner': 'data-team', 'retries': 1, 'retry_delay': timedelta(minutes=5)}

with DAG(dag_id='onprem_run',
         start_date=datetime(2025,1,1),
         schedule_interval='@daily',
         catchup=False,
         default_args=default_args) as dag:

    run_remote = SSHOperator(
        task_id='run_script_on_onprem',
        ssh_conn_id='onprem_ssh',
        command='bash /opt/scripts/daily_refresh.sh',
        timeout=60*60
    )

Advanced Airflow Operators and Patterns

1. SSH Basics : SSHHook and SSHOperator

SSHHook: lower-level helper built on Paramiko for programmatic SSH connections.

SSHOperator: wraps SSHHook to execute shell commands remotely.

Use them when:

  • You need to run on-prem binaries/scripts.
  • Airflow can’t directly access the target DB/network.
  • You want retries, logging, and alerting.

Example:

from airflow import DAG
from airflow.providers.ssh.operators.ssh import SSHOperator
from datetime import datetime, timedelta

default_args = {'owner': 'data-team', 'retries': 2, 'retry_delay': timedelta(minutes=5)}

with DAG('onprem_ssh_dag', start_date=datetime(2025,1,1),
         schedule_interval='0 2 * * *', catchup=False, default_args=default_args) as dag:
    run_shell = SSHOperator(
        task_id='run_refresh_script',
        ssh_conn_id='onprem_ssh',
        command='bash /opt/scripts/daily_refresh.sh',
        timeout=60*60,
        do_xcom_push=False
    )

Tips:

  • Never embed private keys in DAGs.
  • Use strict file permissions (0600).
  • Set timeouts and avoid large XCom pushes.
  • Use Airflow pools to protect on-prem resources.

2. Running SQL : PostgresOperator / MySqlOperator / MSSqlOperator / SqlSensor

Use SQL operators when Airflow can directly reach the DB.
Use SSHOperator + local script when Airflow must connect indirectly.

Example:

from airflow.providers.postgres.operators.postgres import PostgresOperator

# Note: .sql files are rendered as Jinja templates; make sure the path is
# resolvable via the DAG's template_searchpath.
run_sql = PostgresOperator(
    task_id='cleanup_db',
    postgres_conn_id='onprem_postgres',
    sql='/opt/sql/cleanup.sql',
    autocommit=True
)

3. SSHHook : Programmatic SSH Control

from airflow.providers.ssh.hooks.ssh import SSHHook

hook = SSHHook(ssh_conn_id='onprem_ssh')
client = hook.get_conn()
stdin, stdout, stderr = client.exec_command('bash /opt/scripts/do_work.sh')
out, err = stdout.read().decode(), stderr.read().decode()
exit_code = stdout.channel.recv_exit_status()
client.close()

Best Practices:

  • Always close connections.
  • Handle non-zero exit codes.
  • Use timeouts to prevent blocking.
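
Putting those practices together, the earlier snippet could be hardened along these lines (a sketch, not the only way to do it):

from airflow.providers.ssh.hooks.ssh import SSHHook

hook = SSHHook(ssh_conn_id='onprem_ssh', conn_timeout=30)  # conn_timeout: recent SSH provider versions
client = hook.get_conn()
try:
    stdin, stdout, stderr = client.exec_command('bash /opt/scripts/do_work.sh', timeout=3600)
    exit_code = stdout.channel.recv_exit_status()  # blocks until the remote command finishes
    if exit_code != 0:
        raise RuntimeError(f"Remote script failed ({exit_code}): {stderr.read().decode()}")
finally:
    client.close()  # always release the connection, even on failure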

4. ExternalTaskSensor : Multi-DAG Dependencies

Use it to make one DAG wait for a task in another DAG to complete.

from airflow.sensors.external_task import ExternalTaskSensor

wait_for_A = ExternalTaskSensor(
    task_id='wait_for_producer',
    external_dag_id='producer_dag',
    external_task_id='produce_file',
    allowed_states=['success'],
    failed_states=['failed','skipped'],
    mode='reschedule',
    timeout=60*60*6
)

Use mode='reschedule' or deferrable sensors to free worker slots.

5. Putting It Together : Hybrid DAG Example

from airflow import DAG
from datetime import datetime, timedelta
from airflow.providers.postgres.operators.postgres import PostgresOperator
from airflow.providers.ssh.operators.ssh import SSHOperator
from airflow.sensors.external_task import ExternalTaskSensor

default_args = {'owner':'data-team','retries':1,'retry_delay':timedelta(minutes=5)}

with DAG('hybrid_onprem_flow', default_args=default_args,
         start_date=datetime(2025,1,1), schedule_interval='@daily', catchup=False) as dag:

    wait_for_producer = ExternalTaskSensor(
        task_id='wait_for_producer',
        external_dag_id='producer_dag',
        external_task_id='produce_ready_flag',
        mode='reschedule',
        timeout=60*60*3
    )

    cleanup_local = PostgresOperator(
        task_id='cleanup_local_db',
        postgres_conn_id='onprem_postgres',
        sql='sql/cleanup.sql'
    )

    cleanup_remote = SSHOperator(
        task_id='cleanup_remote_db',
        ssh_conn_id='onprem_ssh',
        command='psql -U db_user -d prod_db -f /opt/sql/cleanup.sql',
        timeout=30*60
    )

    wait_for_producer >> [cleanup_local, cleanup_remote]


Common Pitfalls & Troubleshooting

  • Permission denied (SSH): check file perms (700/600) and ownership.
  • Host key verification failed: confirm known_hosts entry.
  • Secrets Manager AccessDenied: validate IAM policy ARN.
  • Airflow connection not found: ensure correct conn_id.
  • Hanging SSH tasks: set timeout and do_xcom_push=False.
  • Key not available on startup: verify EC2 bootstrap script timing.

Conclusion

At its core, this migration wasn’t just about moving to Airflow - it was about making the cloud and on-prem talk securely. Using SSHHook and SSHOperator, our EC2-based Airflow instance could assume an IAM role, fetch the private key from Secrets Manager, and establish an SSH connection to the on-prem servers holding the matching public key.

That simple bridge turned a tricky hybrid setup into a clean, reliable workflow. Airflow handled the orchestration, retries, and visibility - while the scripts kept running exactly where they were meant to. A small key-pair, a few lines of config, and suddenly, modernization felt less like a rebuild and more like a handshake between old and new.
