AWS Batch is a great service for running asynchronous and background work. It's a managed service that adds job queues and compute scaling to the AWS container orchestration services, ECS and EKS, for these types of workloads. AWS Batch is optimized for jobs that run for at least a few minutes, and I've run days-long processes on it.
Sometimes you need to connect to the underlying EC2 instance to debug or to inspect the outputs of running containers, but the EC2 instance ID is not directly available from the Batch API. For jobs running on ECS, I wrote a small Python script that looks up the underlying instance ID for a job using boto3 (and a little regex) and then prints a CLI command to connect to the instance using an AWS Systems Manager session. The script takes an AWS Batch job ID as a required parameter.
#!/usr/bin/env python
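import boto3
import regex  # third-party package (pip install regex); the standard-library re module also handles this pattern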
from argparse import ArgumentParser
# get a job id from the command line
parser = ArgumentParser()
parser.add_argument("job_id", help="The ID of the AWS Batch job whose EC2 instance ID you want to look up.")
args = parser.parse_args()
# Get a Boto session
session = boto3.Session()
# Get a client for AWS Batch
batch_client = session.client('batch')
# Get a client for AWS ECS
ecs_client = session.client('ecs')
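# Describe the Batch job
job_description = batch_client.describe_jobs(jobs=[args.job_id])
if not job_description["jobs"]:
    raise SystemExit("No AWS Batch job found with ID " + args.job_id)
# containerInstanceArn may be absent, for example before the job has started running
container_instance_arn = job_description["jobs"][0]["container"].get("containerInstanceArn")
if container_instance_arn is None:
    raise SystemExit("Job " + args.job_id + " has no container instance ARN (it may not have started running yet).")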
# regex for pulling out the ECS cluster ID and container instance ID from a container instance ARN
regex_pattern = r"arn:aws:ecs:(?P<region>.*):(?P<account_id>.*):container-instance/(?P<cluster_id>.*)/(?P<container_instance_id>.*)"
match = regex.match(regex_pattern, container_instance_arn)
cluster_id = match.group("cluster_id")
container_instance_id = match.group("container_instance_id")
# Describe a container instance and get the instance ID
container_instance_description = ecs_client.describe_container_instances(cluster=cluster_id, containerInstances=[container_instance_id])
ec2_instance_id = container_instance_description["containerInstances"][0]["ec2InstanceId"]
print("To connect to the EC2 instance use the AWS CLI like so:")
print("aws ssm start-session --target " + ec2_instance_id)
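The regex in the script assumes the container instance ARN includes the cluster name. As a minimal sketch, and not part of the original script, the same two fields can be pulled out with plain string splitting. The function name `parse_container_instance_arn` is hypothetical, and the handling of an ARN without a cluster segment is an assumption about the older, shorter ARN form you might still come across.

```python
def parse_container_instance_arn(arn):
    """Split an ECS container instance ARN into (cluster_id, container_instance_id).

    Hypothetical helper for illustration. cluster_id is None when the ARN
    has no cluster segment.
    """
    # Expected shape:
    # arn:<partition>:ecs:<region>:<account_id>:container-instance/<cluster>/<id>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "ecs":
        raise ValueError("Not an ECS ARN: " + arn)

    resource = parts[5].split("/")
    if resource[0] != "container-instance" or len(resource) not in (2, 3):
        raise ValueError("Not a container instance ARN: " + arn)

    if len(resource) == 3:
        # ARN form that includes the cluster name
        return resource[1], resource[2]
    # Assumed shorter form: no cluster name in the ARN
    return None, resource[1]


if __name__ == "__main__":
    # Made-up ARN, for illustration only
    example = "arn:aws:ecs:us-east-1:123456789012:container-instance/my-cluster/abc123"
    cluster_id, container_instance_id = parse_container_instance_arn(example)
    print(cluster_id, container_instance_id)
```

If the cluster comes back as `None`, you would need to supply the cluster name some other way before calling `describe_container_instances`.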
If you need to find and connect to the underlying EC2 Instance for an AWS Batch job, I hope this script helps.
Updated Jan 22, 2024 to use a boto3.Session, as per advice from this post, which I agree with.