Running Python Scripts with AWS Lambda

#webdev #programming #python #aws

As a software engineer, you sometimes have to write a one-off script to add or update records that you don't want to handle in your codebase. For a few records, you can run the script locally, and for hundreds of records you can use multithreading. But when you have to deal with thousands of records, these methods are not enough. This is where AWS Lambda or Google Cloud Functions come in: they streamline the whole script, so you don't have to worry about rerunning it after a network or machine interruption.

In this article, you will learn how to handle large record updates efficiently using AWS Lambda, with a driver function and a target function.

Set up your Lambdas:

Set up two AWS Lambda functions: a driver function and a target function. The driver function handles authentication with the source (the location of the records) and any data processing, then invokes the target function asynchronously.

For example, if you need to modify data stored in an S3 bucket, here is a simple driver/iterator function:


import json

import boto3

client = boto3.client('lambda')

def lambda_handler(event, context):
    # Advance the loop counter passed in from the previous iteration
    index = event['iterator']['index'] + 1

    # Update the payload according to your requirements.
    # InvocationType='Event' invokes the target asynchronously.
    response = client.invoke(
        FunctionName='LAMBDA_TO_INVOKE',
        InvocationType='Event',
        Payload=json.dumps({
            'bucket_name': 'YOUR_BUCKET_NAME',
            'file_key': 'YOUR_FILE_KEY'
        })
    )

    # Return the updated iterator state; 'continue' says whether to loop again
    return {
        'index': index,
        'continue': index < event['iterator']['count'],
        'count': event['iterator']['count']
    }
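
The shape of the driver's return value (`index`, `continue`, `count`) resembles the iterator pattern often used with AWS Step Functions, where a state machine keeps re-invoking the function until `continue` is false. The article does not say what drives the loop, so treat the following as a local sketch under that assumption. `fake_driver` is a hypothetical stand-in for the handler above, with the `client.invoke` call replaced by a list append, and `run_loop` mimics what an orchestrator would do.

```python
def fake_driver(event, context, invoked):
    """Hypothetical stand-in for the driver: same counter logic, no AWS call."""
    index = event['iterator']['index'] + 1
    # In the real driver this is where client.invoke(...) would be called
    invoked.append(index)
    return {
        'index': index,
        'continue': index < event['iterator']['count'],
        'count': event['iterator']['count'],
    }


def run_loop(count):
    """Mimic an orchestrator that feeds each result back in as the next 'iterator'."""
    invoked = []
    state = {'index': 0, 'continue': count > 0, 'count': count}
    while state['continue']:
        state = fake_driver({'iterator': state}, None, invoked)
    return invoked


if __name__ == '__main__':
    print(run_loop(3))  # prints [1, 2, 3]
```

Running this locally is a cheap way to check that the counter stops where you expect before wiring the real functions together.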

The target function does the actual work: it performs the read-write operations or other modifications at the source.

Here is a simple example of a target/invoked function:


import boto3
s3_client = boto3.client('s3')

def lambda_handler(event, context):
    # Extracting necessary information from the event
    bucket_name = event['bucket_name']
    file_key = event['file_key']

    # Get file from S3
    s3_response = s3_client.get_object(Bucket=bucket_name, Key=file_key)
    file_content = s3_response['Body'].read()

    # Perform modification on the file content (example: convert to uppercase)
    modified_content = file_content.upper()

    # Upload the modified file back to S3
    s3_client.put_object(Bucket=bucket_name, Key=file_key, Body=modified_content)

    return {
        'statusCode': 200,
        'body': 'File modified successfully'
    }
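
Before pointing the target function at real data, it can help to exercise the same read, modify, write-back steps locally. The sketch below assumes you are willing to pass the client in as a parameter. `FakeS3Client` and `modify_object` are hypothetical names, and the fake only imitates the two calls used above (`get_object` and `put_object`), not the full boto3 client.

```python
import io


class FakeS3Client:
    """Hypothetical in-memory stand-in exposing only get_object and put_object."""

    def __init__(self, objects):
        self.objects = dict(objects)  # maps (bucket, key) -> bytes

    def get_object(self, Bucket, Key):
        # boto3 returns the content as a readable stream under 'Body'
        return {'Body': io.BytesIO(self.objects[(Bucket, Key)])}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body
        return {}


def modify_object(s3, bucket_name, file_key):
    """Same read, uppercase, write-back steps as the target handler."""
    content = s3.get_object(Bucket=bucket_name, Key=file_key)['Body'].read()
    s3.put_object(Bucket=bucket_name, Key=file_key, Body=content.upper())
    return {'statusCode': 200, 'body': 'File modified successfully'}


if __name__ == '__main__':
    fake = FakeS3Client({('demo-bucket', 'notes.txt'): b'hello'})
    print(modify_object(fake, 'demo-bucket', 'notes.txt'))
    print(fake.objects[('demo-bucket', 'notes.txt')])  # prints b'HELLO'
```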

Below are some valuable tips for automation.
Before you run your script, make sure of the following for better results:

Add proper logging to track your modified and remaining records and overall progress.
Keep your OAuth credentials in a credentials manager or another secure location.
Structure your program so that if you have to rerun it for any reason, it doesn't modify data that has already been modified.
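
As a rough illustration of the logging and rerun-safety tips, here is a minimal sketch. It assumes each record has an id and that you keep a set of ids that were already handled. The names `update_records` and `already_done` are hypothetical, and in a real run that set would need to live somewhere persistent (a file, a table, object metadata) rather than in memory.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('record_updater')


def update_records(records, already_done, modify):
    """Apply modify() to each record id not in already_done; return counts."""
    modified = skipped = 0
    total = len(records)
    for record_id in records:
        if record_id in already_done:
            skipped += 1
            logger.info('skip %s (already modified)', record_id)
            continue
        modify(record_id)
        already_done.add(record_id)  # mark as done only after modify() succeeds
        modified += 1
        logger.info('modified %s (%d of %d handled)',
                    record_id, modified + skipped, total)
    return {'modified': modified, 'skipped': skipped}


if __name__ == '__main__':
    done = {'a'}  # pretend 'a' was handled by an earlier run
    touched = []
    print(update_records(['a', 'b', 'c'], done, touched.append))
    # A second run changes nothing because every id is already marked done
    print(update_records(['a', 'b', 'c'], done, touched.append))
```

The second call reports zero modified records, which is the behaviour you want when a run is repeated after an interruption.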

When you use a package that is not available in the Lambda Python environment, such as pandas, you will probably face an issue like "package not resolved". To fix it, create a Python environment, install all the required dependencies, and then package them together with your code to upload to Lambda.
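
One possible way to do the packaging step is sketched below. It assumes a hypothetical `build/` directory that already contains your handler file plus the dependencies installed with `pip install --target build/ -r requirements.txt`. The function simply zips that directory so the files sit at the root of the archive. Packages with compiled parts, such as pandas, also need builds that match the Lambda runtime's platform, which this sketch does not handle.

```python
import os
import zipfile


def build_zip(source_dir, zip_path):
    """Zip everything under source_dir with paths relative to source_dir."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path


if __name__ == '__main__':
    # 'build' and 'function.zip' are hypothetical paths; keep the zip
    # outside the source directory so it is not added to itself
    print(build_zip('build', 'function.zip'))
```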

If you are facing a packaging issue, I plan to discuss that in a separate article.
Will write it soon :)
