One of the key features of AWS DevOps Agent is multi-account support. In this hands-on article, we'll verify the multi-account functionality and discuss practical considerations for real-world implementation.
Disclaimer
This blog content is based on testing during the preview phase.
The information may change as updates are released.
Target Audience
- Those who want to explore AWS DevOps Agent's multi-account functionality
- Those who want to understand considerations when testing multi-account setups
Key Takeaways
- Fault detection in multi-account configurations is achievable
- Multi-account setup itself is straightforward, but practical implementation requires careful consideration
- IAM role names and permission management need to be determined before implementing DevOps Agent
About the Test Environment
We'll conduct verification using the test sample code from the official AWS DevOps Agent documentation. The samples provide two tests: an EC2 CPU stress test (Test A) and a Lambda function that intentionally raises errors (Test B).
The implementation steps are described in the guide, so we'll omit them from this blog.
Note: Since AWS DevOps Agent is in preview, please create resources in the US East (N. Virginia) region.
Note: The Test A and Test B YAML files require indentation corrections.
Test A
AWSTemplateFormatVersion: '2010-09-09'
Description: 'AWS DevOps Agent EC2 CPU Test Stack'
Parameters:
  MyIP:
    Type: String
    Description: Your current IP address for SSH access (find at https://whatismyipaddress.com)
    Default: '0.0.0.0/0'
Resources:
  # Security Group for SSH access
  TestSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupName: AWS-DevOpsAgent-test-sg
      GroupDescription: AWS DevOps Agent beta testing security group
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: !Ref MyIP
          Description: SSH access from your IP
      Tags:
        - Key: Name
          Value: AWS-DevOpsAgent-Test-SG
        - Key: Purpose
          Value: AWS-DevOpsAgent-Testing
  # Key Pair for SSH access
  TestKeyPair:
    Type: AWS::EC2::KeyPair
    Properties:
      KeyName: AWS-DevOpsAgent-test-key
      KeyType: rsa
      Tags:
        - Key: Name
          Value: AWS-DevOpsAgent-Test-Key
        - Key: Purpose
          Value: AWS-DevOpsAgent-Testing
  # EC2 Instance for CPU testing
  TestInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      ImageId: '{{resolve:ssm:/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-6.1-x86_64}}'
      KeyName: !Ref TestKeyPair
      SecurityGroupIds:
        - !Ref TestSecurityGroup
      UserData:
        Fn::Base64:
          !Sub |
            #!/bin/bash
            yum update -y
            yum install -y htop
            # Create the CPU stress test script
            cat > /home/ec2-user/cpu-stress-test.sh << 'EOF'
            #!/bin/bash
            echo "Starting AWS DevOpsAgent CPU Stress Test"
            echo "Time: $(date)"
            echo "Instance: $(curl -s http://169.254.169.254/latest/meta-data/instance-id)"
            echo ""
            # Get number of CPU cores
            CORES=$(nproc)
            echo "CPU Cores: $CORES"
            echo ""
            echo "Starting stress test (5 minutes)..."
            echo "This will generate >70% CPU usage to trigger CloudWatch alarm"
            echo ""
            # Create CPU load using yes command
            echo "Starting CPU load processes..."
            for i in $(seq 1 $CORES); do
                (yes > /dev/null) &
                CPU_PID=$!
                echo "Started CPU load process $i (PID: $CPU_PID)"
                echo $CPU_PID >> /tmp/cpu_test_pids
            done
            # Auto-cleanup after 5 minutes
            (sleep 300 && echo "Stopping CPU load processes..." && kill $(cat /tmp/cpu_test_pids 2>/dev/null) 2>/dev/null && rm -f /tmp/cpu_test_pids) &
            echo ""
            echo "CPU load processes started for 5 minutes"
            echo "Check CloudWatch for alarm trigger in 3-5 minutes"
            EOF
            chmod +x /home/ec2-user/cpu-stress-test.sh
            chown ec2-user:ec2-user /home/ec2-user/cpu-stress-test.sh
            # Create auto-shutdown script (safety mechanism)
            cat > /home/ec2-user/auto-shutdown.sh << 'SHUTDOWN_EOF'
            #!/bin/bash
            echo "Auto-shutdown scheduled for 2 hours from now: $(date)"
            sleep 7200
            echo "Auto-shutdown executing at: $(date)"
            sudo shutdown -h now
            SHUTDOWN_EOF
            chmod +x /home/ec2-user/auto-shutdown.sh
            nohup /home/ec2-user/auto-shutdown.sh > /home/ec2-user/auto-shutdown.log 2>&1 &
            echo "AWS DevOpsAgent test setup completed at $(date)" > /home/ec2-user/setup-complete.txt
      Tags:
        - Key: Name
          Value: AWS-DevOpsAgent-Test-Instance
        - Key: Purpose
          Value: AWS-DevOpsAgent-Testing
  # CloudWatch Alarm for CPU utilization
  CPUAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: AWS-DevOpsAgent-EC2-CPU-Test
      AlarmDescription: AWS-DevOpsAgent beta test - EC2 CPU utilization alarm
      MetricName: CPUUtilization
      Namespace: AWS/EC2
      Statistic: Average
      Period: 60
      EvaluationPeriods: 1
      Threshold: 70
      ComparisonOperator: GreaterThanThreshold
      Dimensions:
        - Name: InstanceId
          Value: !Ref TestInstance
      TreatMissingData: notBreaching
Outputs:
  InstanceId:
    Description: EC2 Instance ID for testing
    Value: !Ref TestInstance
  SecurityGroupId:
    Description: Security Group ID
    Value: !Ref TestSecurityGroup
  AlarmName:
    Description: CloudWatch Alarm Name
    Value: !Ref CPUAlarm
  SSHCommand:
    Description: SSH command to connect to instance
    Value: !Sub 'ssh -i "AWS-DevOpsAgent-test-key.pem" ec2-user@${TestInstance.PublicDnsName}'
Test B
AWSTemplateFormatVersion: '2010-09-09'
Description: 'AWS DevOpsAgent Lambda Error Test Stack'
Resources:
  # IAM Role for Lambda function
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: AWS-DevOpsAgentLambdaTestRole
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Tags:
        - Key: Name
          Value: AWS-DevOpsAgent-Lambda-Test-Role
        - Key: Purpose
          Value: AWS-DevOpsAgent-Testing
  # Lambda function that generates errors
  TestLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: AWS-DevOpsAgent-test-lambda
      Runtime: python3.12
      Handler: index.lambda_handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          import json
          import random
          import time
          from datetime import datetime
          def lambda_handler(event, context):
              print(f"AWS DevOpsAgent Test Lambda - {datetime.now()}")
              print(f"Event: {json.dumps(event)}")
              # Intentionally generate errors for testing
              error_scenarios = [
                  "Simulated database connection timeout",
                  "Test API rate limit exceeded",
                  "Intentional validation error for AWS DevOpsAgent testing"
              ]
              # Always throw an error for testing purposes
              error_message = random.choice(error_scenarios)
              print(f"Generating test error: {error_message}")
              # This will create a Lambda error that CloudWatch will detect
              raise Exception(f"AWS DevOpsAgent Test Error: {error_message}")
      Description: AWS DevOpsAgent beta test function - intentionally generates errors
      Timeout: 30
      Tags:
        - Key: Name
          Value: AWS-DevOpsAgent-Test-Lambda
        - Key: Purpose
          Value: AWS-DevOpsAgent-Testing
  # CloudWatch Alarm for Lambda errors
  LambdaErrorAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: AWS-DevOpsAgent-Lambda-Error-Test
      AlarmDescription: AWS-DevOpsAgent beta test - Lambda error rate alarm
      MetricName: Errors
      Namespace: AWS/Lambda
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      Dimensions:
        - Name: FunctionName
          Value: !Ref TestLambdaFunction
      TreatMissingData: notBreaching
Outputs:
  LambdaFunctionName:
    Description: Lambda Function Name for testing
    Value: !Ref TestLambdaFunction
  LambdaFunctionArn:
    Description: Lambda Function ARN
    Value: !GetAtt TestLambdaFunction.Arn
  AlarmName:
    Description: CloudWatch Alarm Name
    Value: !Ref LambdaErrorAlarm
  TestCommand:
    Description: AWS CLI command to test the function
    Value: !Sub 'aws lambda invoke --function-name ${TestLambdaFunction} --payload "{\"test\":\"AWS DevOpsAgent validation\"}" response.json'
Once the CloudFormation processing completes successfully, the test environment is ready.
Since it takes time for error conditions to occur, let's configure AWS DevOps Agent in the meantime.
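The Lambda error alarm can also be triggered on demand by invoking the test function a few times. The following is a minimal sketch, assuming boto3 is installed and credentials for the target account are configured. `invoke_repeatedly` is a hypothetical helper name, and the client is passed in as an argument so the logic can be exercised without AWS access.

```python
import json


def invoke_repeatedly(lambda_client, function_name, count=4):
    """Invoke a Lambda function `count` times and return how many calls failed.

    `lambda_client` is expected to behave like boto3.client("lambda").
    A failed invocation is signalled by a `FunctionError` key in the response.
    """
    payload = json.dumps({"test": "AWS DevOpsAgent validation"}).encode("utf-8")
    failures = 0
    for _ in range(count):
        response = lambda_client.invoke(FunctionName=function_name, Payload=payload)
        if "FunctionError" in response:
            failures += 1
    return failures


# Usage against a real account (assumes boto3 and valid credentials):
#   import boto3
#   client = boto3.client("lambda", region_name="us-east-1")
#   print(invoke_repeatedly(client, "AWS-DevOpsAgent-test-lambda"))
```

Because the test function always raises an exception, every invocation should count as a failure, which is enough to push the `Errors` metric above the alarm threshold of 0.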
AWS DevOps Agent Multi-Account Configuration
First, create an AWS DevOps Agent for multi-account integration.
As a side note, the top screen shown when creating an AgentSpace has changed.
Since this is a preview version, breaking changes like this are likely to keep occurring.
Link the target AWS account to the Secondary Source.
Click the Add button.
After clicking, IAM role information to be created in the target account is displayed. Navigate to the target AWS account and create the IAM role.
As an experiment, let's create the role with a name different from the one shown in the instructions.
The IAM role has been created, but don't forget to configure the inline policy.
The inline policy grants only a minimal set of permissions. If you have additional needs, such as working with logs stored in S3 or monitoring Control Tower, you'll need to tune the inline policy.
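As an illustration of that kind of tuning, the sketch below appends a read-only S3 statement to a policy document. It is only an example under assumptions: the bucket name `example-log-bucket`, the `Sid`, and both helper functions are hypothetical, and `base_policy` is an empty placeholder rather than the actual inline policy shown in the console.

```python
import json


def s3_read_statement(bucket_name):
    """Build an IAM policy statement granting read-only access to one bucket."""
    return {
        "Sid": "ReadLogBucket",
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
    }


def extend_policy(policy, statement):
    """Return a copy of `policy` with `statement` appended; the input is not modified."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also allows a single statement object
        statements = [statements]
    return {**policy, "Statement": [*statements, statement]}


# Placeholder only: the real inline policy from the console is not reproduced here.
base_policy = {"Version": "2012-10-17", "Statement": []}
updated = extend_policy(base_policy, s3_read_statement("example-log-bucket"))
print(json.dumps(updated, indent=2))
```

Building the statement in code rather than editing JSON by hand makes it easier to review which extra permissions were granted on top of the defaults.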
The IAM role for AWS DevOps Agent should follow these rules:
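By this point the alarms should be visible in the target account, so let's check them.
The Lambda alarm entered the ALARM state, but the EC2 alarm did not fire. The stress test script did not appear to be running either. Looking at the Test A template, UserData only writes cpu-stress-test.sh to the instance and never executes it, so the script presumably has to be started manually over SSH. It may be simpler to create your own verification resources.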
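The alarm states can also be checked programmatically instead of through the console. This is a minimal sketch, assuming boto3 and credentials for the target account; `alarm_states` and `not_in_alarm` are hypothetical helper names, and the alarm names are the ones defined in the two test templates.

```python
def alarm_states(cloudwatch_client, alarm_names):
    """Return {alarm name: state} for the given CloudWatch metric alarms.

    `cloudwatch_client` is expected to behave like boto3.client("cloudwatch").
    """
    response = cloudwatch_client.describe_alarms(AlarmNames=list(alarm_names))
    return {a["AlarmName"]: a["StateValue"] for a in response.get("MetricAlarms", [])}


def not_in_alarm(states, alarm_names):
    """List the alarms that are missing or not currently in the ALARM state."""
    return [name for name in alarm_names if states.get(name) != "ALARM"]


TEST_ALARMS = [
    "AWS-DevOpsAgent-EC2-CPU-Test",
    "AWS-DevOpsAgent-Lambda-Error-Test",
]

# Usage against a real account (assumes boto3 and valid credentials):
#   import boto3
#   cw = boto3.client("cloudwatch", region_name="us-east-1")
#   print(not_in_alarm(alarm_states(cw, TEST_ALARMS), TEST_ALARMS))
```

In the situation described above, this check would report the EC2 alarm as the one that never fired.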
Return to the AWS console on the AWS DevOps Agent side and configure multi-account settings. Configure the following:
Once configuration is complete, the AWS account for multi-account setup will be displayed in the Secondary Source.
If the IAM role name registered here does not match the role created in the target account, an error occurs.
After correcting the IAM role name, the status changes to Valid.
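A mismatch like this can be caught before registration by comparing the role ARN you intend to register with the name of the role that was actually created. The sketch below is illustrative only: the helper names, the account ID `123456789012`, and the role names are all hypothetical.

```python
def role_name_from_arn(role_arn):
    """Extract the role name from an IAM role ARN, dropping any path prefix."""
    prefix, _, resource = role_arn.partition(":role/")
    if not prefix.startswith("arn:") or not resource:
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    return resource.rsplit("/", 1)[-1]


def role_name_matches(registered_arn, created_role_name):
    """Return True when the registered ARN points at the role that was created."""
    return role_name_from_arn(registered_arn) == created_role_name


# Hypothetical values for illustration.
registered = "arn:aws:iam::123456789012:role/ExampleDevOpsAgentRole"
print(role_name_matches(registered, "ExampleDevOpsAgentRole"))
print(role_name_matches(registered, "MyCustomRoleName"))
```

The first call prints `True` and the second prints `False`, which corresponds to the error case seen when the role was created under a different name.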
Testing with AWS DevOps Agent
Now let's verify whether AWS DevOps Agent can detect Lambda failures.
The test alarm name appears in the list, so the multi-account integration is functioning properly.
The agent identified the cause as Lambda code intentionally written to raise test errors, which confirms that detection is working properly.
The following was output as the investigation result:
The Lambda function AWS-DevOpsAgent-test-lambda, intentionally designed to generate test errors, experienced a 100% error rate (4 errors out of 4 invocations) between 13:40:00 and 13:41:00 UTC. This function was deployed via CloudFormation at 13:15:04 UTC with the description "AWS DevOpsAgent beta test function - intentionally generates errors". Log analysis revealed that the function code (index.py, line 21) intentionally raises exceptions with three different test scenarios: "Test API rate limit exceeded", "Intentional validation error for AWS DevOpsAgent testing", and "Simulated database connection timeout". All errors follow the pattern "AWS DevOpsAgent Test Error: {error_message}". Metrics confirmed these were immediate failures (5-19 millisecond durations) without throttling or timeout issues, consistent with intentional error generation. The CloudWatch alarm "AWS-DevOpsAgent-Lambda-Error-Test" triggered as designed when detecting these errors. This is expected behavior for this test function and not an incident requiring remediation in production.
Insights Gained Through Verification
Through multi-account configuration verification, we gained several insights.
Insight 1: Configuration itself is not difficult with the guide
Since the AWS DevOps Agent multi-account configuration screen provides guidance on what policies and roles to configure, even beginners can handle the setup. However, the default policy permissions are minimal, so customization may be necessary depending on the project.
Insight 2: There are risks with current specifications when configuring for user environments
When providing operational support or configuring monitoring, settings are likely needed in both the vendor's AWS account (where DevOps Agent is set up) and the user's AWS account (where the IAM role is created). If IAM role names and permission settings are not agreed on in advance, the agent may be unable to perform the troubleshooting you expect.
Insight 3: Introducing multi-account makes DevOps Agent operations more complex
In this test we targeted only one AWS account, so management was not complex. When managing multiple AWS accounts with a single DevOps Agent, however, it becomes difficult to track which AgentSpace manages which AWS account, because you have to check each AgentSpace's accounts one by one.
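One way to reduce that overhead is to keep your own inventory and invert it so that each account can be looked up directly. This is a rough sketch under assumptions: the AgentSpace names and account IDs are made up, the inventory is maintained by hand or exported from IaC rather than read from any DevOps Agent API, and it assumes an account may be linked to more than one AgentSpace.

```python
from collections import defaultdict


def accounts_to_agent_spaces(agent_spaces):
    """Invert {AgentSpace name: [account IDs]} into {account ID: [AgentSpace names]}."""
    index = defaultdict(list)
    for space_name, account_ids in agent_spaces.items():
        for account_id in account_ids:
            index[account_id].append(space_name)
    return {account_id: sorted(names) for account_id, names in index.items()}


# Hypothetical inventory for illustration.
inventory = {
    "prod-agentspace": ["111111111111", "222222222222"],
    "dev-agentspace": ["222222222222", "333333333333"],
}
index = accounts_to_agent_spaces(inventory)
print(index["222222222222"])  # ['dev-agentspace', 'prod-agentspace']
```

With the inverted index, answering "which AgentSpace manages this account?" becomes a single lookup instead of a walk through every AgentSpace.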
Future Outlook
Writing this blog surfaced several topics that need further consideration:
- Permission management for multi-account configurations (IAM Identity Center)
- Managing DevOps Agent itself with IaC
- Verifying investigation accuracy when multiple accounts are registered and the same failure occurs in them simultaneously
Additionally, we'd like to conduct verification from these perspectives:
- Integration with unsupported observability tools
- Implementation ideas for enterprise environments

















