Generative AI calls are stateless: the model doesn't remember that it answered the same question five seconds ago. That leads to redundant token usage and unnecessary latency. Today, I solved this by implementing a Memoization Pattern using DynamoDB.
The Strategy
Instead of blindly calling bedrock.invoke_model(), I wrapped the call in a caching function (sketched in code after the steps below):
Fingerprinting: I serialize the transaction list and hash it using MD5. This creates a unique key (cache_key).
Lookup: The Lambda checks DynamoDB for this key. On a cache hit it returns the stored analysis immediately; on a miss it calls Bedrock and writes the fresh result back to the table.
TTL (Time To Live): I set a 1-hour expiration on cached items using DynamoDB's native TTL feature.
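Here's a minimal sketch of that wrapper. The table name (bedrock_cache), the model ID, and the prompt are assumptions for illustration; the real handler will look different, but the fingerprint → lookup → TTL flow is the same.

```python
import hashlib
import json
import time

import boto3

dynamodb = boto3.resource("dynamodb")
cache_table = dynamodb.Table("bedrock_cache")  # assumed table name
bedrock = boto3.client("bedrock-runtime")

CACHE_TTL_SECONDS = 3600  # 1-hour expiration


def analyze_transactions(transactions: list[dict]) -> str:
    # 1. Fingerprinting: serialize deterministically and hash with MD5.
    payload = json.dumps(transactions, sort_keys=True)
    cache_key = hashlib.md5(payload.encode("utf-8")).hexdigest()

    # 2. Lookup: return the cached analysis if the key already exists.
    cached = cache_table.get_item(Key={"cache_key": cache_key}).get("Item")
    if cached:
        return cached["analysis"]

    # Cache miss: call Bedrock (model ID and prompt are illustrative).
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [
                {"role": "user", "content": f"Analyze these transactions: {payload}"}
            ],
        }),
    )
    analysis = json.loads(response["body"].read())["content"][0]["text"]

    # 3. TTL: store the result with a 1-hour expiration timestamp.
    cache_table.put_item(Item={
        "cache_key": cache_key,
        "analysis": analysis,
        "ttl": int(time.time()) + CACHE_TTL_SECONDS,
    })
    return analysis
```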
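One detail worth remembering: DynamoDB only purges expired items once TTL is enabled on the table and pointed at the attribute holding the expiry epoch. A one-time setup call (the ttl attribute name matches the sketch above and is my own choice) looks roughly like this:

```python
import boto3

# One-time setup: tell DynamoDB which attribute holds the expiry epoch.
boto3.client("dynamodb").update_time_to_live(
    TableName="bedrock_cache",  # assumed table name
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)
```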
Now, when I refresh my dashboard, I see a ⚡ icon next to the analysis. That tells me: "This didn't cost you a penny."
