Eric Rodríguez

Day 60: Decoupling State and CloudWatch FinOps

Today, on Day 60 of my 100 Days of Cloud challenge, I had to pause building new features to clean up some critical technical debt in my Serverless AI Financial Agent. When you are preparing an application to scale and handle real users, the little things you hardcoded during the sandbox phase will inevitably come back to haunt you. I faced two specific operational leaks today: one affecting my application's state, and the other quietly threatening my cloud billing.

Fix 1: Decoupling Identity with Lambda Environment Variables

My application was generating duplicate user reports: two emails went out at exactly the same time every afternoon. After thoroughly investigating Amazon DynamoDB and confirming the table held no duplicate records, I realized the issue was hiding inside the Lambda function code itself. During my initial testing weeks ago, I had left a mock USER_ID hardcoded as a fallback in my Python logic.

Because this hardcoded ID didn't match my real Amazon Cognito UUID in the database, the code generated a fake profile in memory and merged it with the real database records just before processing the SQS queue, so the same report was produced twice. The solution was to completely decouple the configuration from the code. I stripped the hardcoded ID from the Python script and now supply the target user ID through an AWS Lambda environment variable. The code no longer carries any identity of its own, which keeps it stateless and removes the identity collision that would get in the way of handling multiple tenants. Your configuration should always live outside your code.
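As a rough illustration of the idea, here is a minimal sketch of reading the user ID from the environment and failing fast when it is missing. The variable name `USER_ID` comes from my original fallback; the helper name `get_target_user_id` and the handler body are hypothetical and simplified, not my exact production code.

```python
import os


def get_target_user_id(env=os.environ):
    """Return the user ID from the environment, failing fast if it is missing."""
    user_id = env.get("USER_ID", "").strip()
    if not user_id:
        # No silent fallback: a mock default is what created the phantom profile.
        raise RuntimeError("USER_ID environment variable is not set")
    return user_id


def lambda_handler(event, context):
    user_id = get_target_user_id()
    # ... load this user's records and build the report (omitted) ...
    return {"user_id": user_id}
```

Raising an error instead of falling back means a misconfigured function fails loudly in the logs rather than quietly inventing a second user.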

Fix 2: The Infinite Log Trap in Amazon CloudWatch

The second issue was a silent FinOps time bomb. If you use AWS Lambda, you know that everything your function writes to stdout and stderr is sent to Amazon CloudWatch Logs automatically. However, you might not realize that by default, the retention policy for these Log Groups is set to "Never expire".

If you have a high-traffic application, retaining debug logs indefinitely will eventually result in a hefty and unnecessary storage bill. I navigated to the CloudWatch console and changed the retention policy on my Lambda functions' Log Groups to 14 days. This quick 30-second fix acts as an automated garbage collector. It gives me a comfortable two-week sliding window to troubleshoot any bugs, while AWS automatically deletes older log events so I stop paying to store them past that window.
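I made the change by hand in the console, but the same fix can be scripted. The sketch below is one possible approach using boto3's CloudWatch Logs client, assuming the default `/aws/lambda/` log group prefix; the function name `apply_retention` is hypothetical. It only touches groups that have no retention set, so existing policies are left alone.

```python
def apply_retention(logs_client, prefix="/aws/lambda/", days=14):
    """Set a retention policy on log groups under `prefix` that never expire."""
    updated = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        for group in page["logGroups"]:
            # Groups without a "retentionInDays" key are the "Never expire" ones.
            if "retentionInDays" not in group:
                logs_client.put_retention_policy(
                    logGroupName=group["logGroupName"],
                    retentionInDays=days,
                )
                updated.append(group["logGroupName"])
    return updated


# Usage (needs boto3 and credentials allowing logs:DescribeLogGroups
# and logs:PutRetentionPolicy):
#   import boto3
#   print(apply_retention(boto3.client("logs")))
```

Passing the client in as an argument keeps the function easy to dry-run against a stub before pointing it at a real account.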

Architecture is not just about what you build, but also about what you actively choose not to keep. Never hardcode your state, and never keep your logs forever!
