Subnet IP exhaustion can lead to operational bottlenecks and service outages in cloud environments. To address this challenge, we’ll build an AWS Lambda function to monitor the availability of free IP addresses in subnets across your VPCs and send real-time Slack alerts when the count drops below a threshold.
But wait a sec: doesn't AWS provide monitoring for this out of the box?
- It does. Amazon VPC IPAM covers this, and to be honest I would always prefer the managed solution when the budget allows it. The catch is pricing: IPAM charges $0.00027 per hour per monitored IP. For 1,000 IPs that would not have been a big deal, but in my case, with hundreds of thousands of IPs, the budget was not approved, so I needed a custom, cost-effective solution.
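To put the pricing in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: the 730 hours per month and the 200,000-IP fleet are assumptions chosen for illustration, and the only figure taken from this article is the $0.00027 hourly rate.

```python
# Rough monthly IPAM cost estimate.
PRICE_PER_IP_HOUR = 0.00027  # rate quoted in this article
HOURS_PER_MONTH = 730        # assumption: average number of hours in a month


def monthly_ipam_cost(monitored_ips: int) -> float:
    """Return the estimated monthly cost in USD for the given number of IPs."""
    return monitored_ips * PRICE_PER_IP_HOUR * HOURS_PER_MONTH


# 200_000 is a hypothetical stand-in for "hundreds of thousands of IPs".
for ips in (1_000, 200_000):
    print(f"{ips:>7} IPs: ${monthly_ipam_cost(ips):,.2f} per month")
```

Under these assumptions the small fleet costs about $197 per month, while the large one lands near $39,000 per month, which is the gap that motivated the custom approach.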
Key Features
- Proactive Monitoring: Checks the available IPs in each subnet at regular intervals.
- Real-Time Alerts: Sends Slack notifications when IP availability in a subnet is critically low.
- Secure Configuration: Keeps the Slack webhook URL in AWS Secrets Manager instead of hardcoding it in the source; the SAM template passes it to the function as an environment variable.
Prerequisites
- AWS CLI installed and configured
- AWS SAM CLI installed
- Python 3.x
- Basic understanding of AWS Lambda, Slack API, and VPC networking
- A Slack webhook URL stored in AWS Secrets Manager
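If the webhook URL is not in Secrets Manager yet, one way to store it is with boto3's `create_secret` call. This is a minimal sketch, not a required step: the helper name `store_slack_webhook` is hypothetical, the secret name `MySecret` simply matches the name used in the template later in this article, and the secret is assumed to hold the URL as a plain string.

```python
def store_slack_webhook(webhook_url, secret_name="MySecret", client=None):
    """Create a Secrets Manager secret holding the Slack webhook URL.

    `client` is optional so the function can be exercised with a stub;
    by default a real boto3 Secrets Manager client is created.
    """
    if client is None:
        import boto3  # imported lazily so the helper can be tested without AWS
        client = boto3.client("secretsmanager")

    # The secret value is stored as a plain string (assumption).
    response = client.create_secret(Name=secret_name, SecretString=webhook_url)
    return response["ARN"]
```

Calling `store_slack_webhook("<your-webhook-url>")` once with credentials that allow `secretsmanager:CreateSecret` should be enough; the call fails if a secret with that name already exists.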
Step 1: Create the SAM Project
Initialize a new SAM project:
sam init
Choose the following options:
- Template: AWS Quick Start Templates
- Runtime: python3.12 (or the latest version), matching the Runtime set in template.yaml
- Application: Hello World Example
Navigate to the project directory:
cd <your-project-directory>
Step 2: Implement the Lambda Code
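Replace the contents of app.py with the following code. Because it imports requests, which is not bundled with the Lambda Python runtime, make sure requests is listed in the requirements.txt next to app.py so that sam build packages it: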
    import boto3
    import logging
    import requests
    import os
    from botocore.exceptions import ClientError

    # Logging setup
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    # AWS client
    ec2_client = boto3.client('ec2')

    # Fetch the Slack webhook URL from environment variable
    SLACK_WEBHOOK_URL = os.getenv('SLACK_WEBHOOK_URL')
    if not SLACK_WEBHOOK_URL:
        raise ValueError("SLACK_WEBHOOK_URL environment variable is not set.")


    def fetch_subnets():
        """Retrieve all subnets in all VPCs."""
        try:
            # The paginator follows NextToken so large accounts are fully covered
            paginator = ec2_client.get_paginator('describe_subnets')
            for page in paginator.paginate():
                for subnet in page['Subnets']:
                    yield subnet
        except ClientError as e:
            logger.error(f"Error fetching subnets: {e}")
            raise


    def send_slack_notification(message):
        """Send a message to Slack."""
        try:
            # A timeout keeps a slow Slack endpoint from consuming the Lambda timeout
            response = requests.post(SLACK_WEBHOOK_URL, json={"text": message}, timeout=10)
            if response.status_code != 200:
                logger.error(f"Failed to send Slack notification: {response.text}")
            else:
                logger.info("Slack notification sent successfully.")
        except Exception as e:
            logger.error(f"Error sending Slack notification: {e}")


    def lambda_handler(event, context):
        """Main Lambda handler to monitor subnets and send alerts."""
        logger.info("Starting subnet monitoring.")
        try:
            for subnet in fetch_subnets():
                subnet_id = subnet['SubnetId']
                cidr_block = subnet['CidrBlock']
                available_ips = subnet['AvailableIpAddressCount']
                logger.info(f"Subnet {subnet_id} ({cidr_block}) has {available_ips} available IPs.")
                # Alert when fewer than 10 free IPs remain
                if available_ips < 10:
                    message = (f"🚨 *Low IP Alert*: Subnet {subnet_id} ({cidr_block}) has only {available_ips} free IPs left. "
                               f"Immediate action may be required!")
                    send_slack_notification(message)
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
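The handler above uses a fixed threshold of 10 free IPs, which behaves very differently for a /28 than for a /16. A possible refinement is to alert on the share of free addresses instead. The sketch below is one way to do that with the standard `ipaddress` module; the helper names and the 10% default are hypothetical and not part of the original function. It relies on the fact that AWS reserves five addresses in every subnet (the first four and the last one).

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_IPS = 5


def free_ip_ratio(cidr_block: str, available_ips: int) -> float:
    """Return the fraction of usable addresses in the subnet that are still free."""
    usable = ipaddress.ip_network(cidr_block).num_addresses - AWS_RESERVED_IPS
    if usable <= 0:
        return 0.0
    return available_ips / usable


def is_low_on_ips(cidr_block: str, available_ips: int, min_ratio: float = 0.10) -> bool:
    """Return True when less than `min_ratio` of the usable addresses are free.

    The 10% default is an illustrative assumption; tune it to your environment.
    """
    return free_ip_ratio(cidr_block, available_ips) < min_ratio
```

Inside `lambda_handler`, the check `available_ips < 10` could then be swapped for `is_low_on_ips(cidr_block, available_ips)`. Note that this only looks at the subnet's primary IPv4 CIDR block, as the original handler does.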
Step 3: Configure the SAM Template
Update template.yaml to define the Lambda function and required permissions:
    Resources:
      SubnetMonitorFunction:
        Type: AWS::Serverless::Function
        Properties:
          CodeUri: .
          Handler: app.lambda_handler
          Runtime: python3.12
          Timeout: 60
          Policies:
            # AWS managed policies, referenced by name
            - AmazonEC2ReadOnlyAccess
            - SecretsManagerReadWrite
          Environment:
            Variables:
              # Resolved from Secrets Manager by CloudFormation at deploy time
              SLACK_WEBHOOK_URL: '{{resolve:secretsmanager:MySecret}}'
          Events:
            ScheduledEvent:
              Type: Schedule
              Properties:
                Schedule: rate(5 minutes)
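One thing to be aware of: the `{{resolve:secretsmanager:...}}` reference is resolved at deploy time, so the webhook URL ends up as a regular environment variable on the function. If you would rather read the secret at runtime, a possible alternative looks like the sketch below. It is not what the code in this article does. The helper name, the `SLACK_SECRET_ID` environment variable, and the module-level cache are assumptions for illustration, and the secret is assumed to contain the URL as a plain string.

```python
import os

# Simple in-memory cache so warm Lambda invocations skip the API call.
_secret_cache = {}


def get_slack_webhook_url(client=None, secret_id=None):
    """Fetch the Slack webhook URL from Secrets Manager, caching the result.

    `SLACK_SECRET_ID` is a hypothetical environment variable; it falls back
    to the secret name used in template.yaml.
    """
    secret_id = secret_id or os.getenv("SLACK_SECRET_ID", "MySecret")
    if secret_id not in _secret_cache:
        if client is None:
            import boto3  # available by default in the Lambda Python runtime
            client = boto3.client("secretsmanager")
        response = client.get_secret_value(SecretId=secret_id)
        _secret_cache[secret_id] = response["SecretString"]
    return _secret_cache[secret_id]
```

With this approach the function needs permission for `secretsmanager:GetSecretValue` on that secret, and `send_slack_notification` would call `get_slack_webhook_url()` instead of reading `SLACK_WEBHOOK_URL`.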
Step 4: Build and Deploy
Build your application:
sam build
Deploy your application:
sam deploy --guided
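During deployment, you’ll be prompted to provide a stack name and parameter values. Make sure the secret holding your Slack webhook URL already exists in Secrets Manager under the name referenced in template.yaml (MySecret in this example), because the reference is resolved during deployment.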
Step 5: Test and Monitor
After deployment:
- Check the Lambda execution logs in CloudWatch Logs for subnet data and alert activity.
- Trigger a test scenario by reducing the available IP count in a subnet (e.g., by launching instances).
- Verify that alerts appear in your Slack channel.
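Launching instances just to drain a subnet can be slow and costly, so it may help to exercise the threshold logic locally first. The sketch below pulls the check into a small pure function and feeds it hand-built dictionaries shaped like entries from `describe_subnets`. The function name `find_low_ip_subnets` and the sample subnet IDs are made up for illustration.

```python
def find_low_ip_subnets(subnets, threshold=10):
    """Return the subnets whose free IP count is below the threshold.

    Mirrors the `available_ips < 10` check used in the Lambda handler.
    """
    return [s for s in subnets if s["AvailableIpAddressCount"] < threshold]


# Hypothetical sample data with the same keys the handler reads.
sample_subnets = [
    {"SubnetId": "subnet-aaa", "CidrBlock": "10.0.1.0/24", "AvailableIpAddressCount": 3},
    {"SubnetId": "subnet-bbb", "CidrBlock": "10.0.2.0/24", "AvailableIpAddressCount": 200},
]

low_subnets = find_low_ip_subnets(sample_subnets)
for subnet in low_subnets:
    print(f"{subnet['SubnetId']} ({subnet['CidrBlock']}) would trigger an alert")
```

Only the first sample subnet falls below the threshold, so it is the only one that would produce a Slack message.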
Summary
By following this tutorial, you’ve created a serverless solution to:
- Proactively monitor free IP availability in VPC subnets.
- Notify your team on Slack when IP resources are critically low.
This system ensures you can address IP exhaustion issues before they impact your applications, giving you peace of mind in managing your cloud network resources.