DEV Community

Yaar Naumenko

Lambda Fleet Monitoring with OpenSearch: Real-Time Insights at Scale

Do you manage multiple AWS accounts with countless Lambda functions and feel overwhelmed by the complexity of monitoring them all?
The Lambda Fleet Monitoring Solution is a fully automated, cross-account approach that tracks metrics in near real time (invocations, errors, duration, and even cold starts) and funnels them into an OpenSearch cluster for analysis and visualization.
This article walks through the solution's architecture, features, and setup. For the code and additional details, see the opensearch-monitoring GitHub repository.

Why This Matters

As serverless adoption grows, monitoring Lambda metrics becomes increasingly challenging, especially if you have multiple AWS accounts.

With the Lambda Fleet Monitoring Solution, you gain:
• Visibility into every function’s performance and execution patterns.
• Centralized dashboards for easier troubleshooting.
• Scalability that covers as many AWS accounts as you need.

High-Level Architecture

Key Components:

  1. Amazon EventBridge: Schedules the monitoring Lambda to run on a configurable interval.
  2. Monitoring Lambda: Assumes roles in other AWS accounts to gather CloudWatch metrics and push them to OpenSearch.
  3. OpenSearch Domain: Serves as the data store for all metrics.
  4. OpenSearch Dashboards: Provides out-of-the-box (and customizable) visualization tools.

Core Features

• Cross-Account Monitoring: Leverage IAM roles to gather data from multiple AWS accounts.
• Real-Time Metrics: Track invocation rates, error counts, memory usage, duration statistics, cold starts, etc.
• Custom Dashboards: Quickly visualize performance trends and identify anomalies.
• Automated Setup: Minimal manual configuration required; Terraform automates resource creation.
• Customizable Alerts: Integrate with AWS services or third-party tools for alerting on critical thresholds.
• Memory & Timeout Insights: Optimize Lambda performance and costs based on usage patterns.
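
The cross-account flow listed under Key Components can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the role name `LambdaFleetMonitoringRole`, the session name, and all helper names are assumptions made for the example. The STS client is passed in as a parameter so the helpers can be exercised without AWS credentials; in a real Lambda it would typically be `boto3.client("sts")`.

```python
# Sketch only: helper names and the role name are hypothetical,
# not taken from the opensearch-monitoring repository.
MONITORING_ROLE_NAME = "LambdaFleetMonitoringRole"  # assumed role name


def role_arn(account_id, role_name=MONITORING_ROLE_NAME):
    """Build the ARN of the role to assume in a monitored account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def assume_role_credentials(sts_client, account_id):
    """Assume the monitoring role; return kwargs suitable for boto3.client()."""
    response = sts_client.assume_role(
        RoleArn=role_arn(account_id),
        RoleSessionName="lambda-fleet-monitoring",  # assumed session name
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def lambda_metric_queries(function_name, period_seconds=300):
    """Build MetricDataQueries entries for one function's core metrics."""
    specs = [("Invocations", "Sum"), ("Errors", "Sum"), ("Duration", "Average")]
    return [
        {
            "Id": f"m{i}",  # query IDs must start with a lowercase letter
            "Label": metric,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric,
                    "Dimensions": [
                        {"Name": "FunctionName", "Value": function_name}
                    ],
                },
                "Period": period_seconds,
                "Stat": stat,
            },
        }
        for i, (metric, stat) in enumerate(specs)
    ]
```

With real clients, the returned credentials would feed `boto3.client("cloudwatch", **creds)`, and the query list would be passed as `MetricDataQueries` to `get_metric_data` together with `StartTime` and `EndTime`.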

Metrics You’ll See

  1. Invocation Count
  2. Error Rates
  3. Duration Statistics
  4. Memory Utilization
  5. Cold Start Frequency
  6. Timeout Proximity
  7. Runtime Distribution
  8. Cost Metrics
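
Some of these metrics are published directly in the standard AWS/Lambda CloudWatch namespace (Invocations, Errors, Duration), while others, such as memory utilization, cold starts, and timeout proximity, usually have to be derived. One common way to derive them is to parse the REPORT line that each invocation writes to CloudWatch Logs. The sketch below assumes that approach; the repository may compute these values differently, and the function and field names here are hypothetical.

```python
import re

# Matches the tab-separated fields of a Lambda REPORT log line.
# "Init Duration" appears only when the invocation was a cold start.
REPORT_PATTERN = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<used_mb>\d+) MB"
    r"(?:\s+Init Duration: (?P<init_ms>[\d.]+) ms)?"
)


def derive_metrics(report_line, timeout_seconds):
    """Derive per-invocation metrics from a REPORT line, or None if no match."""
    match = REPORT_PATTERN.search(report_line)
    if match is None:
        return None
    duration_ms = float(match["duration_ms"])
    memory_mb = int(match["memory_mb"])
    used_mb = int(match["used_mb"])
    return {
        "duration_ms": duration_ms,
        # Share of configured memory actually used.
        "memory_utilization_pct": round(100 * used_mb / memory_mb, 1),
        # How close the run came to the configured timeout.
        "timeout_proximity_pct": round(
            100 * duration_ms / (timeout_seconds * 1000), 1
        ),
        "cold_start": match["init_ms"] is not None,
    }


def error_rate_pct(errors, invocations):
    """Errors as a percentage of invocations; 0.0 when nothing ran."""
    return 0.0 if invocations == 0 else round(100 * errors / invocations, 2)
```

A function configured with a 3-second timeout that runs for 1,500 ms would report a timeout proximity of 50%, which is the kind of signal the Memory & Timeout Insights feature is meant to surface.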

Prerequisites

To get started, ensure you have:
• AWS CLI configured with the right permissions.
• Terraform v1.5.0+ installed.
• Python 3.9+ installed.
• Cross-account IAM roles set up in each AWS account you wish to monitor.
• Permission to create:
    • Lambda functions
    • OpenSearch domains
    • IAM roles and policies
    • EventBridge (CloudWatch Events) rules
    • S3 buckets
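
To make the cross-account role prerequisite more concrete, here is a rough sketch of the two policy documents such a role typically needs: a trust policy that lets the monitoring account assume it, and a read-only permissions policy. The account ID `111111111111`, the function names, and the exact action list are assumptions for illustration; check the repository for the permissions it actually requires.

```python
import json


def trust_policy(monitoring_account_id):
    """Trust policy allowing the monitoring account to assume this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: trusts the whole account. Narrowing this to the
                # monitoring Lambda's execution role ARN is tighter.
                "Principal": {
                    "AWS": f"arn:aws:iam::{monitoring_account_id}:root"
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


def read_only_permissions():
    """Read-only access to Lambda metadata and CloudWatch metrics (assumed set)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:GetMetricData",
                    "cloudwatch:ListMetrics",
                    "lambda:ListFunctions",
                    "lambda:GetFunctionConfiguration",
                ],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # 111111111111 is a placeholder for the monitoring account ID.
    print(json.dumps(trust_policy("111111111111"), indent=2))
    print(json.dumps(read_only_permissions(), indent=2))
```

Keeping the permissions policy read-only fits the least-privilege guidance: the monitoring Lambda only needs to list functions and read metrics in each monitored account.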

QuickStart Installation

  1. Clone the Repository

git clone https://github.com/cloudon-one/opensearch-monitoring.git
cd opensearch-monitoring/lambda/terraform

  2. Configure Variables

In a terraform.tfvars file, define your settings:

aws_region                      = "us-west-1"
monitored_accounts              = ["123456789012", "098765432109"]
opensearch_master_user_password = "your-secure-password"
opensearch_instance_type        = "t3.small.search"
opensearch_instance_count       = 1
opensearch_volume_size          = 10

  3. Initialize Terraform

terraform init

  4. Plan & Apply

terraform plan
terraform apply

This will provision the OpenSearch domain, monitoring Lambda, IAM roles, and other necessary resources.
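
Once these resources exist, the monitoring Lambda writes metric documents to the OpenSearch domain. As a rough sketch of what that payload might look like, the helper below builds a request body for the OpenSearch `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line per record. The index prefix `lambda-metrics`, the daily index naming, and the document field names are assumptions for illustration, not the repository's actual schema.

```python
import json
from datetime import datetime


def index_name(prefix, timestamp):
    """Daily index name, e.g. lambda-metrics-2024.05.01 (assumed convention)."""
    return f"{prefix}-{timestamp:%Y.%m.%d}"


def bulk_body(documents, prefix="lambda-metrics"):
    """Build an NDJSON body for the _bulk API from metric documents.

    Each document must carry an ISO 8601 "@timestamp" with an explicit
    offset such as +00:00 (hypothetical field layout).
    """
    lines = []
    for doc in documents:
        timestamp = datetime.fromisoformat(doc["@timestamp"])
        # Action line: tells OpenSearch which index receives the next line.
        lines.append(json.dumps({"index": {"_index": index_name(prefix, timestamp)}}))
        # Source line: the metric document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = {
        "@timestamp": "2024-05-01T12:00:00+00:00",
        "account_id": "123456789012",
        "function_name": "my-fn",  # hypothetical function
        "invocations": 42,
        "errors": 1,
    }
    print(bulk_body([sample]), end="")
```

The resulting string would be sent as a POST to the domain's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header, signed or authenticated according to how the domain's access policy is configured.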

Securing Your Setup

  1. Regular Rotation: Rotate access keys and review roles periodically.
  2. Access Logging: Enable CloudTrail logging for all AWS API activity.
  3. Least Privilege: Minimize permissions where possible and remove unused policies.
  4. Organization Controls: Use AWS Organizations Service Control Policies (SCPs) for additional governance.

Wrapping Up
The Lambda Fleet Monitoring Solution offers a scalable way to track and analyze performance for all your AWS Lambda functions, regardless of how many accounts you manage. By combining CloudWatch metrics with the visualization power of OpenSearch, it helps you stay on top of function behavior, performance trends, and potential cost optimizations.
For a deeper dive, including best practices, troubleshooting tips, and advanced configuration options, head to the opensearch-monitoring GitHub repository and explore the documentation.

Feel free to fork, submit issues, or contribute enhancements!
Have thoughts or questions?

Comment below or open an issue on GitHub to share your ideas.
Happy monitoring!
