How I Built an Autonomous AI Customer Retention Agent with AWS Bedrock AgentCore

Built for the AWS AI Agent Global Hackathon


Introduction

After building a serverless data analytics pipeline for customer churn, I had clean, query-ready customer data sitting in Amazon Athena. The next logical step was to make that data actionable — not just for analysts, but for customers themselves.

That's where the Customer Retention Agent comes in. This is a fully autonomous AI agent built on AWS Bedrock AgentCore that identifies at-risk customers and proactively offers them personalized retention deals through natural conversation. I built this as part of the AWS AI Agent Global Hackathon, and it's a natural continuation of my previous project.

Before diving into the build, I spent time going through the Amazon Bedrock AgentCore Samples repository. The tutorials there were incredibly helpful for getting up to speed with AgentCore concepts — from Runtime and Gateway to Memory and Identity. If you're new to AgentCore, I highly recommend starting there.

The goal was simple: What if customers could talk to an AI agent that knows their churn risk and can instantly generate personalized discount codes? No forms, no waiting for customer service — just a conversation that might save their subscription.


Architecture

Here's the high-level design:

[Architecture diagram]

Core Components:

  • Amazon Bedrock AgentCore (Runtime, Gateway, Memory) — The brain of the system. Runtime hosts the agent, Gateway connects to external tools, and Memory persists conversation context.
  • Claude 3.7 Sonnet — Powers autonomous reasoning and multi-step decision-making.
  • Next.js Frontend — Chat interface deployed on Vercel with streaming responses.
  • AWS Lambda (3 functions) — Churn Data Query, Retention Offer, and Web Search, exposed via the MCP protocol.
  • Amazon Athena — Queries the Telco customer churn dataset (from my previous project).
  • Amazon Cognito — Dual authentication: web client for users, M2M client for agent-to-Gateway communication.
  • Bedrock Knowledge Base — RAG implementation with company policies and troubleshooting guides.
  • Amazon S3 — Stores customer data and knowledge base documents.

You can find the full implementation here: https://github.com/ajithmanmu/customer-retention-agent


Demo Video

https://www.youtube.com/watch?v=nt2-iE_qBIw
URL: https://customer-retention-agent.vercel.app/
Demo showing the agent in action - analyzing churn risk and generating discount codes


Walkthrough

1. The User Journey

When a customer logs into the chat interface:

  1. Authentication: Frontend authenticates via Cognito, receives JWT token
  2. JWT Mapping: The token contains the Cognito user ID (a UUID), which is mapped to the actual customer ID in the dataset (e.g., "3916-NRPAP")
  3. Conversation Starts: User sends a message, AgentCore Runtime receives request with JWT
  4. Memory Retrieval: Before responding, agent pulls customer context from Memory
  5. Agent Reasoning: Claude 3.7 Sonnet decides which tools to call (if any)
  6. Tool Execution: Agent calls Lambda functions via Gateway for data/actions
  7. Response Generation: Claude synthesizes response with retrieved data
  8. Memory Saving: Interaction gets saved to Memory for future conversations
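For illustration, here's a minimal sketch of the decode-and-map step (step 2). The helper, the mapping table, and the sub value are hypothetical, not from the repo:

import base64
import json

# Hypothetical mapping from Cognito user IDs (the JWT "sub" claim)
# to customer IDs in the Telco dataset
COGNITO_SUB_TO_CUSTOMER_ID = {
    "8f3c2e1a-0b4d-4c5e-9f6a-7b8c9d0e1f2a": "3916-NRPAP",
}

def customer_id_from_jwt(jwt_token: str) -> str:
    """Decode the JWT payload (signature verification is handled by
    AgentCore Runtime) and map the sub claim to a customer ID."""
    payload_b64 = jwt_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return COGNITO_SUB_TO_CUSTOMER_ID[claims["sub"]]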

[Chat UI screenshot]

2. Dual Authentication Architecture

This was one of the trickier parts. The system needs two separate authentication flows:

Web Client (User → Runtime):

  • User logs in with username/password
  • Cognito returns JWT token
  • Frontend includes JWT in every request to AgentCore Runtime
  • Token contains sub field with user ID

M2M Client (Agent → Gateway):

  • Agent needs to call Lambda functions via Gateway
  • Uses OAuth 2.0 client credentials flow
  • Confidential client with client secret stored in SSM
  • Access token validates at Gateway before allowing tool calls
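For reference, a minimal sketch of that client-credentials exchange, assuming hypothetical SSM parameter names and a placeholder Cognito domain:

import boto3
import requests

ssm = boto3.client("ssm", region_name="us-east-1")

def get_m2m_token() -> str:
    """Fetch an OAuth 2.0 access token for agent-to-Gateway calls."""
    # Hypothetical parameter names under this project's SSM prefix
    client_id = ssm.get_parameter(
        Name="/customer-retention-agent/m2m-client-id")["Parameter"]["Value"]
    client_secret = ssm.get_parameter(
        Name="/customer-retention-agent/m2m-client-secret",
        WithDecryption=True)["Parameter"]["Value"]

    # Cognito exposes the token endpoint on the user pool's hosted domain
    resp = requests.post(
        "https://<your-domain>.auth.us-east-1.amazoncognito.com/oauth2/token",
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]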

Working with Cognito was more complicated than I expected — configuring two different clients, getting the OAuth flows right, and debugging token scopes took several iterations. But it was a valuable learning experience in production authentication patterns.

3. The Agent's Brain: AgentCore Runtime + Memory

The agent runs on AgentCore Runtime, which is a fully managed, serverless platform for hosting AI agents. No servers to manage, auto-scaling built-in.

Memory Integration is what makes this agent truly conversational:

import boto3

class CustomerRetentionMemoryHooks:
    def __init__(self, memory_id, customer_id, session_id, region):
        self.memory_client = boto3.client('bedrock-agent-runtime', region_name=region)
        self.memory_id = memory_id
        self.actor_id = customer_id    # Maps to customer in dataset
        self.session_id = session_id   # Scopes events to this conversation

Three memory strategies work together:

  • USER_PREFERENCE: Stores explicit preferences ("I prefer email contact")
  • SEMANTIC: Vector-based semantic memory for conversation context
  • SUMMARIZATION: Condensed conversation summaries

This means if a customer says "My customer ID is 3916-NRPAP" in one session, the agent remembers it in future conversations.
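As a sketch, configuring those strategies with the bedrock-agentcore Python SDK looks roughly like this. The strategy key names and namespace templates follow the AgentCore samples; verify them against the current SDK:

from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-east-1")

# Namespaces are templated per actor so each customer's memories stay isolated
memory = client.create_memory_and_wait(
    name="CustomerRetentionMemory",
    strategies=[
        {"userPreferenceMemoryStrategy": {
            "name": "Preferences",
            "namespaces": ["/users/{actorId}/preferences"]}},
        {"semanticMemoryStrategy": {
            "name": "Semantic",
            "namespaces": ["/users/{actorId}/semantic"]}},
        {"summaryMemoryStrategy": {
            "name": "Summaries",
            "namespaces": ["/summaries/{actorId}/{sessionId}"]}},
    ],
)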

4. Tools Layer: Lambda Functions via Gateway

I created three Lambda functions, each with a specific purpose:

Churn Data Query Lambda:

# Queries Athena with SQL
query = f"""
SELECT customerid, churn_risk_score, tenure, contract, monthlycharges 
FROM telco_augmented_vw 
WHERE customerid = '{customer_id}'
"""

This function:

  • Hits Amazon Athena (the data from my previous pipeline project!)
  • Returns customer profile, churn risk score, usage patterns
  • Uses the cancel_intent field as our "synthetic churn model", so no separate ML training was needed
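Here's a minimal sketch of how that Lambda drives Athena; the database name and output bucket are placeholders, and in production you'd prefer Athena's parameterized queries over f-strings:

import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

def query_churn_data(customer_id: str) -> list:
    """Run the churn query and poll Athena until results are ready."""
    execution = athena.start_query_execution(
        QueryString=(
            "SELECT customerid, churn_risk_score, tenure, contract, monthlycharges "
            "FROM telco_augmented_vw "
            f"WHERE customerid = '{customer_id}'"  # prefer ExecutionParameters in prod
        ),
        QueryExecutionContext={"Database": "telco_db"},  # placeholder database
        ResultConfiguration={"OutputLocation": "s3://<results-bucket>/athena/"},
    )
    query_id = execution["QueryExecutionId"]
    while True:  # poll until the query reaches a terminal state
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(0.5)
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]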

Retention Offer Lambda:

  • Generates personalized discount codes based on risk level
  • High risk (>70%): 20-30% off for 3 months (code: SAVE25)
  • Medium risk (40-70%): 15-25% off for 2 months
  • Low risk (<40%): Service upgrades and add-ons
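A sketch of the tiering logic, using the thresholds above (the SAVE20 code and the exact lower-tier values are illustrative):

def generate_retention_offer(churn_data: dict) -> dict:
    """Map a churn risk score (0-100) to a personalized retention offer."""
    risk = churn_data["churn_risk_score"]
    if risk > 70:        # high risk: deepest discount
        return {"code": "SAVE25", "discount_pct": 25, "duration_months": 3}
    elif risk >= 40:     # medium risk: moderate discount
        return {"code": "SAVE20", "discount_pct": 20, "duration_months": 2}
    else:                # low risk: upsell instead of discounting
        return {"offer": "service_upgrade",
                "addons": ["streaming_bundle", "premium_support"]}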

Web Search Lambda:

  • DuckDuckGo API for real-time information
  • Helps agent answer general retention strategy questions
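A minimal sketch against DuckDuckGo's public Instant Answer API (error handling trimmed):

import requests

def web_search(query: str) -> str:
    """Query the DuckDuckGo Instant Answer API for a quick summary."""
    resp = requests.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json", "no_html": 1},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # Prefer the abstract; fall back to the first few related topics
    return data.get("AbstractText") or str(data.get("RelatedTopics", [])[:3])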

Internal Tool: Product Catalog

In addition to the three external Lambda functions, the agent has an internal tool that runs directly within the AgentCore Runtime, with no external API calls. The get_product_catalog() tool provides information about available telecom plans, pricing, add-on services, and retention offers, so the agent can answer questions like "What plans do you offer?" or "Tell me about your premium features" without leaving the Runtime. Keeping this tool in-process means faster responses for these common queries.

@tool  # decorator from the agent framework (the AgentCore samples use Strands Agents)
def get_product_catalog() -> str:
    """Get information about available telecom plans and services."""
    # Returns a pre-built string with plan details, pricing,
    # features, and current retention offers
    return formatted_catalog_info

This demonstrates a key architectural pattern: use internal tools for static/reference data that doesn't require external systems, and use external tools (via Gateway) for dynamic data queries or actions that need database access.

All three Lambda functions are exposed via AgentCore Gateway using MCP (the Model Context Protocol). The Gateway handles authentication, request routing, and response formatting.

[Gateway architecture diagram]

5. The Autonomous Reasoning Flow

Here's what happens when a customer asks: "Can you give me a discount code?"

  1. Agent Receives Request: Claude reads the prompt and system instructions
  2. Decision Making: Agent decides it needs customer churn data first
  3. Tool Call #1: Calls churn_data_query via Gateway → Lambda → Athena
  4. Risk Analysis: Receives churn risk score (e.g., 85% — HIGH risk)
  5. Decision Making: Agent decides to generate retention offer
  6. Tool Call #2: Calls retention_offer with customer data
  7. Offer Generation: Lambda generates SAVE25 discount code (25% off)
  8. Response: Agent synthesizes natural response with discount code

The agent makes all these decisions autonomously — I didn't hardcode the workflow. The system prompt guides the agent, but Claude decides when and how to use tools.

6. RAG with Bedrock Knowledge Base

The Knowledge Base stores:

  • Company policies
  • Troubleshooting guides
  • FAQ documents

RAG Flow:

User Query → Agent → Knowledge Base → Retrieved Context → Enhanced Response

Using Amazon Titan Embeddings, documents get vectorized for semantic search. When a customer asks about policies, the agent retrieves relevant sections and includes them in the response.
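The retrieval call itself uses the Bedrock Knowledge Base Retrieve API; here's a minimal sketch with a placeholder knowledge base ID:

import boto3

kb_client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

def retrieve_policy_context(query: str) -> list:
    """Pull the most relevant policy/FAQ chunks for a customer question."""
    response = kb_client.retrieve(
        knowledgeBaseId="<your-kb-id>",  # placeholder
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": 3}},
    )
    return [r["content"]["text"] for r in response["retrievalResults"]]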

7. Data Connection: From Previous Project

The customer data comes from my previous serverless pipeline project. That pipeline:

  • Ingested the Kaggle Telco dataset
  • Converted CSV to Parquet with Glue ETL
  • Partitioned data in S3
  • Made it queryable via Athena

This agent project is the natural next step — taking that clean, query-ready data and making it accessible through conversational AI.


Key Technical Decisions

Why AgentCore Over DIY?

I could have built this with raw Lambda functions and LangChain, but AgentCore provided:

  • Built-in Memory: No need to build my own vector database
  • Gateway with MCP: Standardized protocol for tool integration
  • Managed Runtime: No ECS clusters or container management
  • Observability: CloudWatch integration out of the box

Why Dual Cognito Architecture?

  • Security: Separates user authentication from agent-to-service authentication
  • Scalability: M2M tokens can be cached and reused
  • Best Practice: Follows OAuth 2.0 patterns for service-to-service communication

Why Synthetic Churn Model?

The dataset includes a cancel_intent field which acts as our "pretend ML model." For a hackathon demo, this works perfectly without needing to train and deploy a separate ML model. In production, you'd integrate with SageMaker for real churn predictions.


Security

Even for a hackathon project, I applied production security practices:

  • IAM Roles: Least-privilege access for Lambda, Runtime, and Gateway
  • JWT Authentication: Secure token-based auth with Cognito
  • SSM Parameter Store: All secrets and config stored securely
  • S3 Encryption: SSE-S3 for data at rest
  • Private Lambda (TODO): Current Lambdas are public; production would use VPC

Challenges & Learnings

1. Cognito Complexity

Setting up dual authentication was harder than expected. Key lessons:

  • USER_PASSWORD_AUTH flow must be explicitly enabled
  • M2M clients need proper scopes configured
  • Discovery URLs must be exact (.well-known/openid-configuration)
  • Token decoding requires proper base64 padding

The setup was more involved than I anticipated, but it forced me to deeply understand OAuth 2.0 flows and JWT token structure.

2. Cold Start Problem

The first request to the agent often timed out. Classic serverless cold start:

  • AgentCore Runtime takes time to spin up
  • Solution: Better error handling and retry logic
  • Future: Consider provisioned concurrency for production
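A minimal sketch of the retry idea; the invoke function and the exception you catch depend on your client:

import time

def invoke_with_retry(invoke_fn, payload, max_attempts=3):
    """Retry agent invocations with exponential backoff to absorb cold starts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke_fn(payload)
        except TimeoutError:  # substitute your client's timeout exception
            if attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # 2s, 4s, ... while the Runtime warms up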

3. Multi-Step Tool Calling

Getting Claude to call churn_data_query first, then pass that data to retention_offer required explicit prompt engineering:

SYSTEM_PROMPT = """
IMPORTANT: When customers ask for discount codes, you MUST:
1. First call the churn_data_query tool to get customer data
2. Then call the retention_offer tool with the complete churn_data
"""

Learning: LLMs need very explicit instructions for sequential workflows.

4. SSM Parameter Store Permissions

The auto-created Runtime execution role didn't include SSM permissions. Quick fix:

{
    "Effect": "Allow",
    "Action": ["ssm:GetParameter"],
    "Resource": "arn:aws:ssm:*:*:parameter/customer-retention-agent/*"
}

Learning: Always verify IAM permissions when integrating AWS services.

5. Local Development Setup

Testing locally before deploying was crucial:

  • Used agentcore invoke --local to simulate Runtime
  • Created automated test suite (test_invoke_local.py)
  • Tested with real AWS services (Lambda, Athena, Memory)

Learning: Local-first development saves time and AWS costs.
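As a hedged sketch, a test in test_invoke_local.py might look like this, assuming the standard AgentCore Runtime contract (POST /invocations on port 8080); the payload shape is app-specific:

import requests

def test_discount_request():
    """Hit the locally running Runtime and expect a discount code back."""
    resp = requests.post(
        "http://localhost:8080/invocations",
        json={"prompt": "Can you give me a discount code?"},  # app-specific payload
        timeout=120,
    )
    assert resp.status_code == 200
    assert "SAVE" in resp.text  # expect a code like SAVE25 in the reply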

6. On-Demand Throughput Not Supported

Discovered that not all Bedrock models can be invoked on-demand with a direct model ID; some require an inference profile or provisioned throughput. I had to adjust my model selection accordingly.

Learning: Read the AWS documentation carefully for service limitations.

7. Boto3 Sessions

Lambda functions need proper boto3 session management:

import boto3

# Explicit region avoids surprises if the execution environment's default differs
athena_client = boto3.client('athena', region_name='us-east-1')

Learning: Always specify region explicitly in Lambda functions.


What I Learned

Technical:

  • AgentCore primitives (Runtime, Gateway, Memory) work incredibly well together
  • MCP protocol standardizes tool integration
  • Memory strategies: USER_PREFERENCE for explicit data, SEMANTIC for context
  • JWT token structure and OAuth 2.0 flows
  • RAG implementation with Bedrock Knowledge Base
  • Serverless cold starts are real — plan accordingly

Architectural:

  • Dual authentication is complex but necessary for production systems
  • Tool design matters: focused, single-responsibility functions compose well
  • Explicit prompt engineering is crucial for multi-step workflows
  • Local testing infrastructure saves time and money

Data:

  • Synthetic data (like cancel_intent) works great for demos
  • Previous data pipeline projects can be extended with AI layers
  • Parquet + Athena = fast, cost-effective queries

Next Steps

If I continue this project:

  1. Security Enhancements:

    • Make Lambdas private
    • Set up VPC and subnets
    • Add Web Application Firewall (WAF)
  2. Responsible AI:

    • Content moderation with Bedrock Guardrails
    • Human oversight for high-value offers
    • Policy checks before generating discounts
  3. Production Features:

    • Real-time alerts when high-risk customers detected
    • A/B testing for retention strategies
    • Analytics dashboard for offer effectiveness
    • Sentiment analysis for conversation tone
  4. Integration:

    • Connect to Confluence for live policy updates (Bedrock KB supports this!)
    • Integrate with CRM (Salesforce/HubSpot)
    • Multi-channel support (SMS, email, phone)

Conclusion

Building the Customer Retention Agent taught me that autonomous AI agents are production-ready today. With AWS Bedrock AgentCore, I went from idea to working demo faster than expected.

The hardest parts weren't the AI — they were the authentication, cold starts, and getting all the AWS services to work together. But that's the reality of building production systems.

This project is a natural continuation of my data pipeline work. The pipeline gave me clean data in Athena; the agent makes that data actionable through conversation. Together, they demonstrate how serverless + AI can solve real business problems.

Key takeaway: Modern cloud platforms make it possible to build sophisticated AI agents without managing infrastructure. The future of customer service is autonomous, personalized, and conversational.

Thanks to Devpost for hosting the AWS AI Agent Global Hackathon, and to the AWS team for building AgentCore. Building with these tools has been an incredible learning experience! 🚀

