Building a Serverless Image Processing Pipeline on AWS with Terraform

Introduction

I recently built a production-ready pipeline that lets users upload images, has them processed (resized and optimized), and stores the results in a separate location, all with minimal manual intervention. The project is available on GitHub at aws-image-processing-pipeline. The goal was to leverage serverless architecture and Infrastructure as Code so that the solution is decoupled, scalable, and maintainable.

The Problem

Many applications need to offload image processing and other heavy tasks so that the front end stays responsive and system components can scale independently. Introducing a message queue and event-driven processing separates upload, processing, and storage, which enables high throughput, isolates errors, and reduces operational overhead.

Architecture Overview

The workflow is as follows:

  1. A user uploads an image to an S3 bucket for uploads
  2. A message is sent to an SQS queue
  3. A Lambda function polls the queue, downloads the image, processes it, and uploads the result to the processed bucket
  4. Messages that fail after retries go to a Dead-Letter Queue for inspection
  5. Monitoring is provided via CloudWatch logs and metrics
  6. Terraform defines all infrastructure for versioning and reuse

Tech Stack and Tools

  • AWS S3 for storing raw and processed images
  • AWS SQS for decoupled messaging
  • AWS Lambda with Python 3.11 for processing logic
  • Terraform for defining and deploying resources
  • Bash scripts for deployment and testing

Implementation Details

The project uses a modular Terraform structure with separate modules for S3, SQS and Lambda. Each module is reusable and focused on a single responsibility.
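
To illustrate the layout, the root configuration wires the modules together along these lines. The module paths and variable names here are my sketch, not necessarily the repo's exact interface:

# Hypothetical root module wiring; names are illustrative
module "s3" {
  source           = "./modules/s3"
  uploads_bucket   = var.uploads_bucket
  processed_bucket = var.processed_bucket
}

module "sqs" {
  source = "./modules/sqs"
}

module "lambda" {
  source           = "./modules/lambda"
  queue_arn        = module.sqs.queue_arn
  uploads_bucket   = module.s3.uploads_bucket
  processed_bucket = module.s3.processed_bucket
}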

Terraform Examples

provider.tf


provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "uploads" {
  bucket = "uploads-bucket-example"
  acl    = "private"
}

resource "aws_s3_bucket" "processed" {
  bucket = "processed-bucket-example"
  acl    = "private"
}


resource "aws_sqs_queue" "image_queue" {
  name = "image-processing-queue"
}
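
The Dead-Letter Queue from step 4 isn't shown in the minimal snippet above. Extending the queue definition, the wiring looks roughly like this; the maxReceiveCount of 3 is my choice, not necessarily what the repo uses:

resource "aws_sqs_queue" "image_dlq" {
  name = "image-processing-dlq"
}

# The main queue, extended with a redrive policy: after 3 failed
# receives a message is moved to the DLQ for inspection
resource "aws_sqs_queue" "image_queue" {
  name = "image-processing-queue"
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.image_dlq.arn
    maxReceiveCount     = 3
  })
}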


resource "aws_lambda_function" "image_processor" {
  function_name = "image_processor"
  handler       = "image_processor.lambda_handler"
  runtime       = "python3.11"
  role          = aws_iam_role.lambda_role.arn
  filename      = "lambda/image_processor.zip"

  environment {
    variables = {
      UPLOADS_BUCKET   = aws_s3_bucket.uploads.bucket
      PROCESSED_BUCKET = aws_s3_bucket.processed.bucket
    }
  }
}
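
The aws_iam_role.lambda_role the function references isn't shown in the post. A minimal sketch of what it needs follows; the resource names and exact policy scope are my assumptions:

resource "aws_iam_role" "lambda_role" {
  name = "image-processor-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Managed policy covering SQS polling plus CloudWatch Logs
resource "aws_iam_role_policy_attachment" "sqs_exec" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole"
}

# Read raw uploads, write processed results
resource "aws_iam_role_policy" "s3_access" {
  name = "s3-access"
  role = aws_iam_role.lambda_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:GetObject"]
        Resource = "${aws_s3_bucket.uploads.arn}/*"
      },
      {
        Effect   = "Allow"
        Action   = ["s3:PutObject"]
        Resource = "${aws_s3_bucket.processed.arn}/*"
      }
    ]
  })
}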

resource "aws_lambda_event_source_mapping" "sqs_trigger" {
  event_source_arn = aws_sqs_queue.image_queue.arn
  function_name    = aws_lambda_function.image_processor.arn
}
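
One piece the snippets above don't cover is how step 2's message reaches the queue: an S3 event notification on the uploads bucket. A sketch of that wiring, which first has to allow S3 to send to the queue:

resource "aws_sqs_queue_policy" "allow_s3" {
  queue_url = aws_sqs_queue.image_queue.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.image_queue.arn
      Condition = { ArnEquals = { "aws:SourceArn" = aws_s3_bucket.uploads.arn } }
    }]
  })
}

# Every object created in the uploads bucket enqueues a message
resource "aws_s3_bucket_notification" "uploads" {
  bucket = aws_s3_bucket.uploads.id

  queue {
    queue_arn = aws_sqs_queue.image_queue.arn
    events    = ["s3:ObjectCreated:*"]
  }

  depends_on = [aws_sqs_queue_policy.allow_s3]
}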

image_processor.py

import json
import os
import tempfile
import urllib.parse

import boto3
from PIL import Image

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # The function is triggered by SQS, so each record body is the
    # S3 event notification serialized as JSON
    for record in event['Records']:
        s3_event = json.loads(record['body'])

        # S3 emits a one-off s3:TestEvent when the notification is first
        # configured; it has no Records and can be skipped
        if 'Records' not in s3_event:
            continue

        for s3_record in s3_event['Records']:
            bucket = s3_record['s3']['bucket']['name']
            # Object keys arrive URL-encoded in event notifications
            key = urllib.parse.unquote_plus(s3_record['s3']['object']['key'])

            # Keys may contain slashes, so keep only the file name locally
            filename = os.path.basename(key)
            download_path = os.path.join(tempfile.gettempdir(), filename)
            upload_path = os.path.join(tempfile.gettempdir(), "processed-" + filename)

            s3.download_file(bucket, key, download_path)

            with Image.open(download_path) as image:
                image = image.resize((800, 600))
                image.save(upload_path)

            s3.upload_file(upload_path, os.environ['PROCESSED_BUCKET'], key)

Deployment Walkthrough

  1. Configure AWS credentials locally with the AWS CLI

  2. Clone the repository from GitHub

  3. Copy terraform.tfvars.example to terraform.tfvars and update the bucket names

  4. Zip the Lambda code:

cd lambda
zip -r image_processor.zip image_processor.py
mv image_processor.zip ../
cd ..


  5. From the project root, run:

terraform init
terraform plan
terraform apply

  6. Upload an image to the uploads bucket to test the pipeline (a CLI example follows these steps)

  7. Monitor Lambda execution in CloudWatch Logs

  8. Inspect the processed bucket and the Dead-Letter Queue for failures

  9. Clean up with:

terraform destroy
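
For steps 6 through 8, a quick CLI pass looks like this; the bucket names are the example placeholders from the Terraform above:

# Upload a test image, watch the function run, then check the output
aws s3 cp test.jpg s3://uploads-bucket-example/
aws logs tail /aws/lambda/image_processor --follow
aws s3 ls s3://processed-bucket-example/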

Challenges and Learnings

  • Configuring the SQS visibility timeout and retry logic required careful planning (see the sketch after this list)
  • IAM role policies had to be restrictive yet functional
  • Handling large images without Lambda timeouts required optimization
  • Storage costs were controlled with S3 lifecycle policies
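
On the first and last points, the relevant knobs look roughly like this. The numbers are illustrative, not the repo's values; the one firm rule is that AWS recommends a queue visibility timeout of at least six times the function timeout for SQS-triggered Lambdas:

# Keep the function timeout and queue visibility timeout in sync
resource "aws_lambda_function" "image_processor" {
  # ...arguments as shown earlier, plus:
  timeout     = 30
  memory_size = 512
}

resource "aws_sqs_queue" "image_queue" {
  # ...arguments as shown earlier, plus:
  visibility_timeout_seconds = 180
}

# Expire raw uploads after 30 days to keep storage costs down
resource "aws_s3_bucket_lifecycle_configuration" "uploads" {
  bucket = aws_s3_bucket.uploads.id

  rule {
    id     = "expire-uploads"
    status = "Enabled"
    filter {}

    expiration {
      days = 30
    }
  }
}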

Conclusion

This pipeline provides a solid foundation for serverless asynchronous workloads. Possible extensions include notifications when processing completes, multiple image transformations, or CDN integration. Building this project deepened my skills in AWS, Terraform, and event-driven architectures.