nafisah badmos

Automating an EL pipeline using Azure Functions (Python)

In one of my recent projects, I needed a reliable way to trigger pipelines whenever new files landed in Azure Blob Storage.

Previously, I had a separate ADF pipeline for every copy activity, each relying on its own trigger. This wasn't efficient: errors cropped up every now and then, and the pipelines didn't scale easily. I wanted something event-driven, secure, and easy to extend.

In this post, I’ll walk through how I used Azure Functions (Python) to build an event-driven pipeline for multiple EL jobs using just one function app. When a new file is uploaded, an Event Grid event fires, an Azure Function processes the event, and the right pipeline is triggered to copy data into the correct locations.

All of the code for this post is available publicly on GitHub:
👉 My GitHub

Problem & Context

The problem I wanted to solve
As a data engineer, I was in charge of developing pipelines to move data from one location to another. The existing pipelines were working, but they weren't efficient or scalable. We had a trigger on each of the ADF pipelines that kicked off a transfer every time a new blob was created, but that wasn't enough; we were looking at the long term. We wanted something reproducible, simple, fast and scalable, which led to this Azure Functions architecture.

Architecture at a glance

  1. A new file is uploaded to a source Storage Account.

  2. Event Grid raises a BlobCreated event.

  3. An Azure Function (Python, Event Grid trigger) receives the event.

  4. The function (a minimal sketch follows this list):

  • Parses the event payload to identify the container / blob.

  • Determines the origin (data source) and what should happen next using environment variables.

  • Triggers the appropriate pipeline.

  5. The pipeline executes, copying the file(s) to the correct folders in the destination Storage Account(s) or triggering the ADF pipeline to run.
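
To make the flow concrete, here is a minimal sketch of what the Event Grid-triggered function could look like, using the Python v2 programming model. The container names and handler functions are hypothetical placeholders for illustration, not the exact code from my repo:

```python
# function_app.py -- minimal sketch of an Event Grid-triggered function (Python v2 model).
# The routing table and handler names below are illustrative placeholders.
import logging

import azure.functions as func

app = func.FunctionApp()


@app.event_grid_trigger(arg_name="event")
def on_blob_created(event: func.EventGridEvent):
    """Entry point for BlobCreated events delivered by Event Grid."""
    data = event.get_json()
    blob_url = data.get("url", "")

    # The subject looks like: /blobServices/default/containers/<container>/blobs/<path>
    parts = event.subject.split("/")
    container = parts[4]
    blob_path = "/".join(parts[6:])

    logging.info("BlobCreated: container=%s blob=%s url=%s", container, blob_path, blob_url)

    # Hypothetical routing: which handler deals with which landing container.
    if container == "csv-landing":
        handle_csv_copy(blob_url, blob_path)
    elif container == "parquet-landing":
        handle_parquet_copy(blob_url, blob_path)
    else:
        logging.warning("No handler configured for container %s", container)


def handle_csv_copy(blob_url: str, blob_path: str) -> None:
    # Placeholder: a real handler would copy the blob to each configured destination.
    logging.info("CSV handler invoked for %s", blob_path)


def handle_parquet_copy(blob_url: str, blob_path: str) -> None:
    # Placeholder: a real handler would copy the Parquet file to its destination.
    logging.info("Parquet handler invoked for %s", blob_path)
```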

The architecture allows one Azure Function App to host multiple Azure Functions. The functions perform similar but distinct, related tasks. There are four functions in total: two for copying CSV files, one for Apache Parquet file transfer, and one for triggering an ADF pipeline that handles SQL .bak files.

For each function, an Event Grid subscription is created on the storage container where we expect that file type to be deposited. For example, the Azure Function that handles Apache Parquet files is associated with the storage account that receives such files, so the pipeline runs smoothly. The copy step itself is sketched below.
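
As an illustration of what the copy functions might do under the hood, here is a hedged sketch of a server-side blob copy with azure-storage-blob. The account and container names are placeholders, and it assumes the source URL is readable by the destination storage service (for example via the function's identity or a SAS token):

```python
# copy_blob.py -- illustrative sketch of the copy step a CSV/Parquet function might perform.
# Account/container names are placeholders; source access (RBAC, SAS) depends on your setup.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient


def copy_blob_to_destination(source_blob_url: str, dest_account: str,
                             dest_container: str, dest_blob_name: str) -> None:
    credential = DefaultAzureCredential()
    dest_service = BlobServiceClient(
        account_url=f"https://{dest_account}.blob.core.windows.net",
        credential=credential,
    )
    dest_blob = dest_service.get_blob_client(dest_container, dest_blob_name)

    # Start an asynchronous, server-side copy: Azure Storage pulls the data directly
    # from the source URL, so the function never streams the file itself.
    dest_blob.start_copy_from_url(source_blob_url)


# Example: copy the newly created blob into the "latest" folder of a destination account.
# copy_blob_to_destination(blob_url, "destaccount01", "csv", "latest/report.csv")
```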

Config & env vars

Rather than hardcoding storage paths and pipelines, I pushed most of the configuration into environment variables on the Function App.

For example:

  • Source containers

  • Destination folders (e.g. latest, archive)

  • The ADF pipeline to call for a given “origin”

This makes it much easier to:

  • Support multiple data sources

  • Promote the solution across environments (dev/test/prod)

  • Update destinations without redeploying the function code

For example, one of our data sources for the CSV file transfer was Alderhey. The data had to be transferred to three destinations, so the environment variables were configured as "Alderhey_1_destination_folder", "Alderhey_1_destination_storage_account", and so on, with the same pattern repeated for each of the destinations.
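
At runtime, the function can resolve those settings for a given origin. This is a rough sketch of how that lookup might work; the helper name and the idea of counting up until a setting is missing are my own assumptions, but the variable naming pattern mirrors the Alderhey example above:

```python
# config.py -- sketch of resolving per-origin destinations from Function App settings.
# Follows the "<Origin>_<n>_destination_*" pattern shown above; stopping at the first
# missing index is my own assumption about how the list ends.
import os
from dataclasses import dataclass


@dataclass
class Destination:
    storage_account: str
    folder: str


def destinations_for(origin: str) -> list[Destination]:
    destinations = []
    index = 1
    while True:
        account = os.environ.get(f"{origin}_{index}_destination_storage_account")
        folder = os.environ.get(f"{origin}_{index}_destination_folder")
        if not account or not folder:
            break  # no more destinations configured for this origin
        destinations.append(Destination(storage_account=account, folder=folder))
        index += 1
    return destinations


# Example: with Alderhey_1_destination_storage_account / _folder (and _2, _3, ...) set
# as app settings, destinations_for("Alderhey") returns one entry per destination.
```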

Auth & Networking

For authentication, I used an App Registration, and its details (tenant ID, client ID, client secret) were added to the environment variables as well; a sketch of how they are used follows the role list below. The app was granted roles such as:

  • Storage Blob Data Contributor on the relevant Storage Accounts
  • Data Factory Contributor on the ADF instance.
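
For the function that hands off to Data Factory, triggering a pipeline run with the app registration's credentials could look roughly like this. The environment variable names and the pipeline parameter are illustrative assumptions; the azure-identity and azure-mgmt-datafactory calls themselves are standard:

```python
# trigger_adf.py -- sketch of starting an ADF pipeline run using the app registration.
# Environment variable names and the pipeline parameter are illustrative placeholders.
import os

from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient


def trigger_adf_pipeline(pipeline_name: str, blob_path: str) -> str:
    credential = ClientSecretCredential(
        tenant_id=os.environ["TENANT_ID"],
        client_id=os.environ["CLIENT_ID"],
        client_secret=os.environ["CLIENT_SECRET"],
    )
    adf_client = DataFactoryManagementClient(credential, os.environ["SUBSCRIPTION_ID"])

    # create_run starts the pipeline and returns a run ID we can log or poll later.
    run = adf_client.pipelines.create_run(
        resource_group_name=os.environ["ADF_RESOURCE_GROUP"],
        factory_name=os.environ["ADF_FACTORY_NAME"],
        pipeline_name=pipeline_name,
        parameters={"sourceBlobPath": blob_path},  # assumes the pipeline defines this parameter
    )
    return run.run_id
```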

Depending on your setup, you may also need to handle:

  • VNet integration for the Function App

  • Private endpoints for Storage and Data Factory

  • Firewall rules that allow Event Grid → Function and Function → Storage/ADF

This part is where most of the “it works on my machine” issues show up, so I recommend testing connectivity step by step.

How to deploy

At a high level, deployment looks like this:

  1. Create an Azure Function App (Python runtime) in your chosen region.

  2. Deploy the function code from VS Code or Azure DevOps/GitHub Actions.

  3. Configure Application Settings (environment variables) for:

  • Source/destination storage details

  • Pipeline names / resource IDs

  4. Create the App Registration and assign the required roles.

  5. Create an Event Grid subscription on your Storage Account for BlobCreated events, pointing to the Function endpoint.

  6. Upload a test file and confirm the pipeline runs and data lands in the expected locations (a quick check is sketched below).
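
For that final check, a quick script like the one below can upload a sample file to the source container; the BlobCreated event then fires, and you can watch the Function App logs and the destination folders. Account, container, and file names here are placeholders:

```python
# smoke_test.py -- quick sketch for step 6: upload a sample file to the source container,
# then check the Function App logs and destination folders. Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

source_service = BlobServiceClient(
    account_url="https://sourceaccount01.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)

blob_client = source_service.get_blob_client("csv-landing", "test/sample.csv")
with open("sample.csv", "rb") as data:
    blob_client.upload_blob(data, overwrite=True)  # raises the BlobCreated event

print("Uploaded test blob; check the Function App logs and destination containers.")
```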

Lessons Learned

Working on this project was interesting, as it showed me what's possible when implementing EL pipelines with Azure Functions: minimal coding, reusability and automation guaranteed, and best of all, it's replicable anywhere (dev, another tenant, more destinations, you name it!). I love how easy it was to implement overall.

If you’re working with Azure Functions, Event Grid, or ADF and want to swap ideas, feel free to reach out.
You can find the full code here: My GitHub, and connect with me on LinkedIn: Nafisah Badmos.
