Listening and Reacting To AWS Batch Events

Harshit Singh

Scenario

You are using AWS Batch for some processing, and you want to keep track of your tasks/jobs as they transition from one state to another. Perhaps you want to persist state changes in a database, send out a notification, or take some other action.

In this post
we'll take a look at how you can trigger actions based on Batch events, persist that information in a DynamoDB table, and send out a notification to a Webex Teams space (I'll be writing another blog on that part). I assume some level of familiarity with DynamoDB, Lambda, SQS, and of course AWS Batch. If you'd like a post on any of those, do let me know in the comments.
This is a rough sketch of what we are going for:
Stack Diagram

Let's Get Started

What Are AWS Batch Events?
Whenever the state of a Batch job changes, AWS sends an event to CloudWatch Events with the details of that change. In general, this is what the event message looks like:

{
  "version": "0",
  "id": "c8f9c4b5-76e5-d76a-f980-7011e206042b",
  "detail-type": "Batch Job State Change",
  "source": "aws.batch",
  "account": "aws_account_id",
  "time": "2017-10-23T17:56:03Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:batch:us-east-1:aws_account_id:job/4c7599ae-0a82-49aa-ba5a-4727fcce14a8"
  ],
  "detail": {
    "jobName": "event-test",
    "jobId": "4c7599ae-0a82-49aa-ba5a-4727fcce14a8",
    "jobQueue": "arn:aws:batch:us-east-1:aws_account_id:job-queue/HighPriority",
    "status": "RUNNABLE",
    "attempts": [],
    "createdAt": 1508781340401,
    "retryStrategy": {
      "attempts": 1
    },
    "dependsOn": [],
    "jobDefinition": "arn:aws:batch:us-east-1:aws_account_id:job-definition/first-run-job-definition:1",
    "parameters": {},
    "container": {
      "image": "busybox",
      "vcpus": 2,
      "memory": 2000,
      "command": [
        "echo",
        "'hello world'"
      ],
      "volumes": [],
      "environment": [],
      "mountPoints": [],
      "ulimits": []
    }
  }
}

You can see that the message body contains pretty much all the information about the job; in particular, take note of jobId, jobName, status, and the timestamps. Note that if the container has environment variables, they would show up under the environment attribute.
With no one listening, these messages get lost like tears in rain. In order to prevent that, we'll

Set Up a CloudWatch Event Rule

This rule will allow us to filter Batch events per our requirements and send them forward for further processing.
There are 3 filters that we are looking for:
a- The event should have AWS Batch as its source.
b- It should correspond to a Batch job state change.
c- It should belong to a particular job queue, otherwise you'll end up listening to all the Batch jobs running in that account.
Once these conditions are met, we want to send the event message to SQS so that it can be processed further. Now if you use CloudFormation for resource orchestration, you can create a rule, add the filters, and specify a target by writing something like this:

{
    "Type": "AWS::Events::Rule",
    "Properties": {
        "Description": "Batch Event Rule Description",
        "Name": {
            "Ref": "BatchEventRuleName"
        },
        "State": "ENABLED",
        "EventPattern": {
            "source": ["aws.batch"],
            "detail-type": [
                "Batch Job State Change"
            ],
            "detail": {
                "jobQueue": [{
                    "Ref": "specificQueueToObserve"
                }]
            }

        },
        "Targets": [{
            "Arn": {
                "Fn::GetAtt": [
                    "forwardSQSQueue",
                    "Arn"
                ]
            },
            "Id": {
                "Ref": "targetId"
            }
        }]
    }
}

Note that it's important to provide an Id (name) for the target (in this case SQS) to which you want to forward the event.
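One gotcha worth calling out: CloudWatch Events can deliver to the queue only if the queue's resource policy allows it. A minimal sketch of the companion AWS::SQS::QueuePolicy is below; it reuses the forwardSQSQueue logical name from the template above, and the rest (statement id, policy shape) is my own assumption of a reasonable minimum:

```
{
    "Type": "AWS::SQS::QueuePolicy",
    "Properties": {
        "Queues": [{ "Ref": "forwardSQSQueue" }],
        "PolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Sid": "AllowEventsToSendMessage",
                "Effect": "Allow",
                "Principal": { "Service": "events.amazonaws.com" },
                "Action": "sqs:SendMessage",
                "Resource": { "Fn::GetAtt": ["forwardSQSQueue", "Arn"] }
            }]
        }
    }
}
```

Without this, the rule will fire but the messages will never land in the queue.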

Now if you have a Lambda listening to the SQS queue, you can process the event message, extract the useful information (such as jobId, jobName, current status, and timestamp), and persist it in DynamoDB. (I am not going to write about the DynamoDB details here; maybe in a different post, or I'll edit this one.)
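As a rough sketch, here is one way that Lambda could pull the key fields out of the message body. The class name and the regex-based extraction are my own illustrative choices (kept dependency-free for the example); in real code you'd more likely deserialize the body with a JSON library such as Jackson or Gson:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BatchEventParser {

    // Extracts the value of a string field such as "status": "RUNNABLE"
    // from the raw event JSON. Crude but good enough to show the idea.
    static String extractField(String json, String field) {
        Matcher m = Pattern.compile("\"" + field + "\"\\s*:\\s*\"([^\"]+)\"")
                           .matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Trimmed-down version of the event message shown earlier.
        String body = "{ \"detail\": { \"jobName\": \"event-test\", "
                + "\"jobId\": \"4c7599ae-0a82-49aa-ba5a-4727fcce14a8\", "
                + "\"status\": \"RUNNABLE\" } }";
        System.out.println(extractField(body, "jobId"));   // 4c7599ae-0a82-49aa-ba5a-4727fcce14a8
        System.out.println(extractField(body, "status"));  // RUNNABLE
    }
}
```

From here, writing the extracted fields to DynamoDB is a straightforward PutItem call.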

Side Note:
The event message carries its timestamps as epoch milliseconds. While that is cool, it's not really human readable, so if you are coding in Java you could use the following to convert it into a simple date format:

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Converts an epoch-milliseconds string (e.g. the createdAt value
// "1508781340401") into a readable GMT timestamp like "20171023.175540".
public static String epochToDateTime(String epoch) {
    Date date = new Date(Long.parseLong(epoch));
    SimpleDateFormat simpleDateFormat = new SimpleDateFormat("yyyyMMdd.HHmmss");
    simpleDateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
    return simpleDateFormat.format(date);
}
