<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: YUVARAJ. R</title>
    <description>The latest articles on DEV Community by YUVARAJ. R (@yuvaraj_r_25388937f9607d).</description>
    <link>https://dev.to/yuvaraj_r_25388937f9607d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3885661%2F0120d4ae-3217-4e41-97d4-dd5132641a1b.png</url>
      <title>DEV Community: YUVARAJ. R</title>
      <link>https://dev.to/yuvaraj_r_25388937f9607d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yuvaraj_r_25388937f9607d"/>
    <language>en</language>
    <item>
      <title>AWS ChatBot</title>
      <dc:creator>YUVARAJ. R</dc:creator>
      <pubDate>Sat, 18 Apr 2026 07:26:27 +0000</pubDate>
      <link>https://dev.to/yuvaraj_r_25388937f9607d/aws-chatbot-2nfi</link>
      <guid>https://dev.to/yuvaraj_r_25388937f9607d/aws-chatbot-2nfi</guid>
      <description>&lt;p&gt;Chatbot with Amazon Bedrock Agent + Knowledge Base RAG — Ecological Laundry&lt;br&gt;
How we built a conversational AI assistant for an eco-friendly laundry using Amazon Bedrock Agent, Knowledge Base RAG, AWS Lambda and a 100% static landing page — without servers, without heavy frameworks, without complications.&lt;/p&gt;

&lt;p&gt;Why Amazon Bedrock Agent?&lt;br&gt;
Most simple chatbots only respond with generic text. A Bedrock Agent goes further:&lt;/p&gt;

&lt;p&gt;Native session memory — the agent remembers the context of the conversation without you having to implement anything&lt;br&gt;
Integrated RAG Knowledge Base — connect the agent to your own documents (PDFs, FAQs, manuals) and respond with real business information, not hallucinations&lt;br&gt;
Automatic orchestration — the agent decides when to query the KB, when to respond directly, and how to combine sources&lt;br&gt;
Nova-lite model — fast, economical and powerful enough for conversational use cases in production&lt;br&gt;
No ML infrastructure — no models to maintain, no GPUs, no training pipelines&lt;br&gt;
For a local business like a laundromat, this means having an assistant who knows your prices, schedules, green processes, and policies — all automatically pulled from your own documents.&lt;/p&gt;
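&lt;p&gt;To make "native session memory" concrete, here is a small sketch of what reusing a sessionId looks like from the caller's side (the agent/alias IDs are placeholders and the client is injected so the helper can be exercised without AWS; the real call goes through boto3's bedrock-agent-runtime invoke_agent):&lt;/p&gt;

```python
import uuid

def ask(client, agent_id: str, alias_id: str, session_id: str, text: str) -> str:
    """Send one turn to a Bedrock Agent and collect the streamed reply."""
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=alias_id,
        sessionId=session_id,   # reusing the same sessionId keeps the context
        inputText=text,
    )
    parts = []
    for event in response.get("completion", []):
        chunk = event.get("chunk", {})
        if "bytes" in chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)

# Two turns in the SAME session: Bedrock links the follow-up question to the
# first one with no session-handling code on our side.
session_id = str(uuid.uuid4())
# client = boto3.client("bedrock-agent-runtime")            # real usage
# ask(client, "AGENT_ID", "ALIAS_ID", session_id, "How much is suit cleaning?")
# ask(client, "AGENT_ID", "ALIAS_ID", session_id, "And for a coat?")
```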

&lt;p&gt;Complete architecture&lt;br&gt;
Client (browser)&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
  CloudFront CDN&lt;br&gt;
  ┌─────────────────────────────────────────┐&lt;br&gt;
  │  /          → S3 Bucket                 │&lt;br&gt;
  │             (index.html + widget JS)    │&lt;br&gt;
  │                                         │&lt;br&gt;
  │  /chat      → API Gateway HTTP API      │&lt;br&gt;
  │             → Lambda (Flask)            │&lt;br&gt;
  │             → Bedrock Agent Runtime     │&lt;br&gt;
  │               ┌──────────────────────┐  │&lt;br&gt;
  │               │  Bedrock Agent       │  │&lt;br&gt;
  │               │  (Nova-Lite v1)      │  │&lt;br&gt;
  │               │       │              │  │&lt;br&gt;
  │               │  Knowledge Base RAG  │  │&lt;br&gt;
  │               │  (OpenSearch + S3)   │  │&lt;br&gt;
  │               └──────────────────────┘  │&lt;br&gt;
  └─────────────────────────────────────────┘&lt;br&gt;
&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;AWS service&lt;/th&gt;&lt;th&gt;Role in the project&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Amazon Bedrock Agent&lt;/td&gt;&lt;td&gt;Orchestrates the conversation, holds the session, decides when to consult the KB&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Bedrock Knowledge Base&lt;/td&gt;&lt;td&gt;Indexes business documents, responds with RAG (Retrieval-Augmented Generation)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Amazon Nova-Lite v1&lt;/td&gt;&lt;td&gt;Fast and economical language model for chat in production&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;AWS Lambda&lt;/td&gt;&lt;td&gt;Runs the Flask backend without servers, scales to zero when there is no traffic&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;API Gateway (HTTP API)&lt;/td&gt;&lt;td&gt;Exposes the /chat endpoint with low latency and CORS handling&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Amazon S3&lt;/td&gt;&lt;td&gt;Stores the static frontend and the Knowledge Base source documents&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Amazon CloudFront&lt;/td&gt;&lt;td&gt;Global CDN — serves the frontend and routes /chat to the backend from the same domain&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;AWS IAM&lt;/td&gt;&lt;td&gt;Controls minimum permissions between Lambda and Bedrock&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;How the flow of a question works&lt;br&gt;
User types: "How much does it cost to clean a suit?"&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
chatbot-wiget.js  →  POST /chat  { sessionId, message, context }&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
Lambda (Flask)  →  bedrock_client.py&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
invoke_agent(agentId, agentAliasId, sessionId, inputText)&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
Bedrock Agent evaluates:&lt;br&gt;
  Do I have this info in the KB?  →  Yes&lt;br&gt;
        │&lt;br&gt;
        ▼&lt;br&gt;
Knowledge Base RAG:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Embed the question&lt;/li&gt;
&lt;li&gt;Semantic search in OpenSearch&lt;/li&gt;
&lt;li&gt;Retrieve the relevant fragments from the pricing PDF&lt;/li&gt;
&lt;li&gt;Nova-Lite generates a contextualized answer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Response stream  →  Lambda  →  API Gateway  →  Widget&lt;/p&gt;

&lt;p&gt;Bedrock Agent configuration&lt;br&gt;
Create the Knowledge Base&lt;/p&gt;

&lt;p&gt;# 1. Upload the source documents to the S3 bucket&lt;br&gt;
aws s3 cp documentos/ s3://tintorerias-kb-docs/ --recursive&lt;/p&gt;

&lt;p&gt;# 2. Create the Knowledge Base (from the console or CLI)&lt;/p&gt;

&lt;p&gt;aws bedrock-agent create-knowledge-base \&lt;br&gt;
  --name "tintorerias-ecologicas-kb" \&lt;br&gt;
  --description "Service, pricing and eco-friendly process documents" \&lt;br&gt;
  --role-arn arn:aws:iam::ACCOUNT_ID:role/AmazonBedrockExecutionRoleForKnowledgeBase \&lt;br&gt;
  --knowledge-base-configuration '{&lt;br&gt;
    "type": "VECTOR",&lt;br&gt;
    "vectorKnowledgeBaseConfiguration": {&lt;br&gt;
      "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v1"&lt;br&gt;
    }&lt;br&gt;
  }' \&lt;br&gt;
  --storage-configuration '{&lt;br&gt;
    "type": "OPENSEARCH_SERVERLESS",&lt;br&gt;
    "opensearchServerlessConfiguration": {&lt;br&gt;
      "collectionArn": "arn:aws:aoss:us-east-1:ACCOUNT_ID:collection/COLLECTION_ID",&lt;br&gt;
      "vectorIndexName": "tintorerias-index",&lt;br&gt;
      "fieldMapping": {&lt;br&gt;
        "vectorField": "embedding",&lt;br&gt;
        "textField": "text",&lt;br&gt;
        "metadataField": "metadata"&lt;br&gt;
      }&lt;br&gt;
    }&lt;br&gt;
  }'&lt;br&gt;
Create the Data Source and synchronize&lt;/p&gt;

&lt;p&gt;# Create the data source pointing at the S3 bucket&lt;/p&gt;

&lt;p&gt;aws bedrock-agent create-data-source \&lt;br&gt;
  --knowledge-base-id YOUR_KB_ID \&lt;br&gt;
  --name "documentos-tintorerias" \&lt;br&gt;
  --data-source-configuration '{&lt;br&gt;
    "type": "S3",&lt;br&gt;
    "s3Configuration": {&lt;br&gt;
      "bucketArn": "arn:aws:s3:::tintorerias-kb-docs"&lt;br&gt;
    }&lt;br&gt;
  }'&lt;/p&gt;

&lt;p&gt;# Start synchronization (document indexing)&lt;/p&gt;

&lt;p&gt;aws bedrock-agent start-ingestion-job \&lt;br&gt;
  --knowledge-base-id YOUR_KB_ID \&lt;br&gt;
  --data-source-id YOUR_DATA_SOURCE_ID&lt;/p&gt;
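&lt;p&gt;Ingestion runs asynchronously, so scripts usually poll until it finishes. A small boto3-style sketch (the helper below is hypothetical; get_ingestion_job is the boto3 counterpart of the CLI status check, and the client is injected so the loop can be tested without AWS):&lt;/p&gt;

```python
import time

def wait_for_ingestion(client, kb_id: str, ds_id: str, job_id: str,
                       poll_seconds: float = 5.0, max_polls: int = 120) -> str:
    """Poll get_ingestion_job until the job leaves the in-progress states."""
    for _ in range(max_polls):
        job = client.get_ingestion_job(
            knowledgeBaseId=kb_id, dataSourceId=ds_id, ingestionJobId=job_id
        )["ingestionJob"]
        status = job["status"]
        if status not in ("STARTING", "IN_PROGRESS"):
            return status            # e.g. COMPLETE or FAILED
        time.sleep(poll_seconds)
    raise TimeoutError("ingestion job did not finish in time")
```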

&lt;p&gt;# Check the synchronization status&lt;/p&gt;

&lt;p&gt;aws bedrock-agent get-ingestion-job \&lt;br&gt;
  --knowledge-base-id YOUR_KB_ID \&lt;br&gt;
  --data-source-id YOUR_DATA_SOURCE_ID \&lt;br&gt;
  --ingestion-job-id YOUR_JOB_ID&lt;br&gt;
Create the Agent and associate the Knowledge Base&lt;/p&gt;

&lt;p&gt;# Create the agent&lt;/p&gt;

&lt;p&gt;aws bedrock-agent create-agent \&lt;br&gt;
  --agent-name "asistente-tintorerias" \&lt;br&gt;
  --agent-resource-role-arn arn:aws:iam::ACCOUNT_ID:role/AmazonBedrockExecutionRoleForAgents \&lt;br&gt;
  --foundation-model "amazon.nova-lite-v1:0" \&lt;br&gt;
  --instruction "You are the virtual assistant for an eco-friendly laundry. Answer questions about services, prices, schedules and eco-friendly cleaning processes in a friendly and concise way. Use the information in the documents to give accurate answers."&lt;/p&gt;

&lt;p&gt;# Associate the Knowledge Base with the agent&lt;/p&gt;

&lt;p&gt;aws bedrock-agent associate-agent-knowledge-base \&lt;br&gt;
  --agent-id YOUR_AGENT_ID \&lt;br&gt;
  --agent-version DRAFT \&lt;br&gt;
  --knowledge-base-id YOUR_KB_ID \&lt;br&gt;
  --description "Services and pricing knowledge base"&lt;/p&gt;

&lt;p&gt;# Prepare and publish the agent&lt;/p&gt;

&lt;p&gt;aws bedrock-agent prepare-agent --agent-id YOUR_AGENT_ID&lt;/p&gt;

&lt;p&gt;aws bedrock-agent create-agent-alias \&lt;br&gt;
  --agent-id YOUR_AGENT_ID \&lt;br&gt;
  --agent-alias-name "produccion"&lt;br&gt;
Bedrock client code&lt;br&gt;
Invoke the agent (with streaming)&lt;br&gt;
import boto3&lt;/p&gt;

&lt;p&gt;client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def chat_con_agente(mensaje: str, session_id: str) -&amp;gt; str:
    response = client.invoke_agent(
        agentId="PPJ6T6QBDD",
        agentAliasId="VQKXHLAVG1",
        sessionId=session_id,       # Bedrock keeps the context per sessionId
        inputText=mensaje,
    )

    # The response arrives as a stream of events
    texto_completo = ""
    for event in response.get("completion", []):
        chunk = event.get("chunk", {})
        if "bytes" in chunk:
            texto_completo += chunk["bytes"].decode("utf-8")

    return texto_completo or "(no response)"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;# Usage&lt;/p&gt;

&lt;p&gt;respuesta = chat_con_agente("How much does it cost to clean a leather jacket?", "sesion-123")&lt;br&gt;
print(respuesta)&lt;br&gt;
Query the Knowledge Base directly (without the agent)&lt;br&gt;
def consultar_kb(pregunta: str, kb_id: str) -&amp;gt; str:&lt;br&gt;
    response = client.retrieve_and_generate(&lt;br&gt;
        input={"text": pregunta},&lt;br&gt;
        retrieveAndGenerateConfiguration={&lt;br&gt;
            "type": "KNOWLEDGE_BASE",&lt;br&gt;
            "knowledgeBaseConfiguration": {&lt;br&gt;
                "knowledgeBaseId": kb_id,&lt;br&gt;
                "modelArn": "amazon.nova-lite-v1:0",&lt;br&gt;
                "generationConfiguration": {&lt;br&gt;
                    "promptTemplate": {&lt;br&gt;
                        "textPromptTemplate": (&lt;br&gt;
                            "You are an eco-friendly laundry assistant. "&lt;br&gt;
                            "Answer based on this information:\n\n"&lt;br&gt;
                            "$search_results$"&lt;br&gt;
                        )&lt;br&gt;
                    }&lt;br&gt;
                },&lt;br&gt;
            },&lt;br&gt;
        },&lt;br&gt;
    )&lt;br&gt;
    return response["output"]["text"]&lt;br&gt;
Multi-context support (multiple clients, one backend)&lt;/p&gt;

&lt;p&gt;# bedrock_client.py — pattern used in this project&lt;/p&gt;

&lt;p&gt;CONFIGS = {&lt;br&gt;
    'aquamax': {&lt;br&gt;
        'type': 'agent',&lt;br&gt;
        'agent_id': 'PPJ6T6QBDD',&lt;br&gt;
        'agent_alias_id': 'VQKXHLAVG1',&lt;br&gt;
    },&lt;br&gt;
    'otro_cliente': {&lt;br&gt;
        'type': 'kb',&lt;br&gt;
        'kb_id': 'CLIENT_KB_ID',&lt;br&gt;
        'model_arn': 'amazon.nova-lite-v1:0',&lt;br&gt;
        'prompt': "You are the assistant for...\n\n$search_results$",&lt;br&gt;
    },&lt;br&gt;
}&lt;/p&gt;

&lt;p&gt;def chat(mensaje: str, session_id: str, contexto: str) -&amp;gt; str:&lt;br&gt;
    cfg = CONFIGS.get(contexto, CONFIGS['aquamax'])&lt;br&gt;
    if cfg['type'] == 'agent':&lt;br&gt;
        return _invoke_agent(mensaje, session_id, cfg)&lt;br&gt;
    return _retrieve_and_generate(mensaje, cfg)&lt;br&gt;
Deployment of the backend in Lambda&lt;br&gt;
Package and create the function&lt;br&gt;
cd tintorerias-chatbot&lt;/p&gt;

&lt;p&gt;# Install the dependencies into a local folder&lt;/p&gt;

&lt;p&gt;pip install flask flask-cors boto3 python-dotenv -t package/&lt;br&gt;
cp app.py bedrock_client.py config.py lambda_handler.py package/&lt;/p&gt;
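&lt;p&gt;The lambda_handler.py copied above is the glue between API Gateway and the chat logic. A minimal sketch of its shape (hypothetical: the real project routes through Flask, and the chat function here is a stand-in for bedrock_client.chat):&lt;/p&gt;

```python
import json

def chat(message: str, session_id: str, contexto: str) -> str:
    # Stand-in for bedrock_client.chat — the real one calls Bedrock.
    return f"echo: {message}"

def handler(event, context=None):
    """Parse an API Gateway HTTP API (v2) event, call the chat logic,
    and return a Lambda proxy response."""
    try:
        body = json.loads(event.get("body") or "{}")
        message = body["message"]
        session_id = body.get("sessionId", "anonymous")
        chat_context = body.get("context", "aquamax")
    except (ValueError, KeyError):
        return {"statusCode": 400, "body": json.dumps({"error": "bad request"})}

    reply = chat(message, session_id, chat_context)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"response": reply}),
    }
```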

&lt;p&gt;# Create the deployment ZIP&lt;/p&gt;

&lt;p&gt;cd package &amp;amp;&amp;amp; zip -r ../lambda_deployment.zip . &amp;amp;&amp;amp; cd ..&lt;/p&gt;

&lt;p&gt;# Get the Account ID&lt;/p&gt;

&lt;p&gt;ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)&lt;/p&gt;

&lt;p&gt;# Create an IAM role with minimum permissions&lt;/p&gt;

&lt;p&gt;aws iam create-role \&lt;br&gt;
  --role-name tintorerias-lambda-role \&lt;br&gt;
  --assume-role-policy-document '{&lt;br&gt;
    "Version":"2012-10-17",&lt;br&gt;
    "Statement":[{&lt;br&gt;
      "Effect":"Allow",&lt;br&gt;
      "Principal":{"Service":"lambda.amazonaws.com"},&lt;br&gt;
      "Action":"sts:AssumeRole"&lt;br&gt;
    }]&lt;br&gt;
  }'&lt;/p&gt;

&lt;p&gt;aws iam attach-role-policy \&lt;br&gt;
  --role-name tintorerias-lambda-role \&lt;br&gt;
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole&lt;/p&gt;

&lt;p&gt;aws iam put-role-policy \&lt;br&gt;
  --role-name tintorerias-lambda-role \&lt;br&gt;
  --policy-name BedrockAccess \&lt;br&gt;
  --policy-document '{&lt;br&gt;
    "Version":"2012-10-17",&lt;br&gt;
    "Statement":[{&lt;br&gt;
      "Effect":"Allow",&lt;br&gt;
      "Action":[&lt;br&gt;
        "bedrock:InvokeAgent",&lt;br&gt;
        "bedrock:RetrieveAndGenerate",&lt;br&gt;
        "bedrock:Retrieve",&lt;br&gt;
        "bedrock:InvokeModel"&lt;br&gt;
      ],&lt;br&gt;
      "Resource":"*"&lt;br&gt;
    }]&lt;br&gt;
  }'&lt;/p&gt;

&lt;p&gt;# Create the Lambda function&lt;/p&gt;

&lt;p&gt;aws lambda create-function \&lt;br&gt;
  --function-name tintorerias-chatbot \&lt;br&gt;
  --runtime python3.11 \&lt;br&gt;
  --role arn:aws:iam::${ACCOUNT_ID}:role/tintorerias-lambda-role \&lt;br&gt;
  --handler lambda_handler.handler \&lt;br&gt;
  --zip-file fileb://lambda_deployment.zip \&lt;br&gt;
  --timeout 30 \&lt;br&gt;
  --memory-size 256 \&lt;br&gt;
  --environment Variables="{KB_ID=YOUR_KB_ID,MODEL_ARN=amazon.nova-lite-v1:0}"&lt;/p&gt;

&lt;p&gt;(AWS_REGION is omitted on purpose: it is a reserved key that the Lambda runtime sets automatically, and create-function rejects it.)&lt;/p&gt;

&lt;p&gt;Create the API Gateway and connect it to Lambda&lt;/p&gt;

&lt;p&gt;# Create the HTTP API&lt;/p&gt;

&lt;p&gt;API_ID=$(aws apigatewayv2 create-api \&lt;br&gt;
  --name "tintorerias-chat-api" \&lt;br&gt;
  --protocol-type HTTP \&lt;br&gt;
  --query 'ApiId' --output text)&lt;/p&gt;

&lt;p&gt;# Create the Lambda integration&lt;/p&gt;

&lt;p&gt;LAMBDA_ARN="arn:aws:lambda:us-east-1:${ACCOUNT_ID}:function:tintorerias-chatbot"&lt;/p&gt;

&lt;p&gt;INTEGRATION_ID=$(aws apigatewayv2 create-integration \&lt;br&gt;
  --api-id $API_ID \&lt;br&gt;
  --integration-type AWS_PROXY \&lt;br&gt;
  --integration-uri $LAMBDA_ARN \&lt;br&gt;
  --payload-format-version "2.0" \&lt;br&gt;
  --query 'IntegrationId' --output text)&lt;/p&gt;

&lt;p&gt;# POST /chat route&lt;/p&gt;

&lt;p&gt;aws apigatewayv2 create-route \&lt;br&gt;
  --api-id $API_ID \&lt;br&gt;
  --route-key "POST /chat" \&lt;br&gt;
  --target "integrations/$INTEGRATION_ID"&lt;/p&gt;

&lt;p&gt;# prod stage with auto-deploy&lt;/p&gt;

&lt;p&gt;aws apigatewayv2 create-stage \&lt;br&gt;
  --api-id $API_ID \&lt;br&gt;
  --stage-name prod \&lt;br&gt;
  --auto-deploy&lt;/p&gt;

&lt;p&gt;# Allow API Gateway to invoke the Lambda&lt;/p&gt;

&lt;p&gt;aws lambda add-permission \&lt;br&gt;
  --function-name tintorerias-chatbot \&lt;br&gt;
  --statement-id apigateway-invoke \&lt;br&gt;
  --action lambda:InvokeFunction \&lt;br&gt;
  --principal apigateway.amazonaws.com \&lt;br&gt;
  --source-arn "arn:aws:execute-api:us-east-1:${ACCOUNT_ID}:${API_ID}/*/*/chat"&lt;/p&gt;

&lt;p&gt;# Final endpoint URL&lt;/p&gt;

&lt;p&gt;aws apigatewayv2 get-api --api-id $API_ID --query 'ApiEndpoint' --output text&lt;br&gt;
The chat widget (frontend)&lt;br&gt;
The widget is a self-contained JS file that is injected into any HTML page with a single tag:&lt;/p&gt;

&lt;p&gt;&amp;lt;script src="chatbot-wiget.js"&amp;gt;&amp;lt;/script&amp;gt;&lt;/p&gt;

&lt;p&gt;Internally it generates its own DOM, styles and logic without depending on any library:&lt;/p&gt;

&lt;p&gt;// Minimum required configuration&lt;br&gt;
const CONFIG = {&lt;br&gt;
  apiUrl: '&lt;a href="https://TU_DOMINIO.cloudfront.net/chat" rel="noopener noreferrer"&gt;https://TU_DOMINIO.cloudfront.net/chat&lt;/a&gt;',&lt;br&gt;
  requestTimeout: 30000,&lt;br&gt;
  context: 'aquamax',   // identifies which agent/KB the backend should use&lt;br&gt;
};&lt;/p&gt;

&lt;p&gt;// Each browser session has its own UUID&lt;br&gt;
// Bedrock Agent uses this ID to keep the conversation context&lt;br&gt;
state.sessionId = crypto.randomUUID();&lt;/p&gt;

&lt;p&gt;// Call the backend&lt;br&gt;
fetch(CONFIG.apiUrl, {&lt;br&gt;
  method: 'POST',&lt;br&gt;
  headers: { 'Content-Type': 'application/json' },&lt;br&gt;
  body: JSON.stringify({&lt;br&gt;
    sessionId: state.sessionId,&lt;br&gt;
    message: texto,&lt;br&gt;
    context: CONFIG.context,&lt;br&gt;
  }),&lt;br&gt;
});&lt;br&gt;
Test the endpoint in production&lt;/p&gt;

&lt;p&gt;# Simple question&lt;/p&gt;

&lt;p&gt;curl -X POST &lt;a href="https://TU_DOMINIO.cloudfront.net/chat" rel="noopener noreferrer"&gt;https://TU_DOMINIO.cloudfront.net/chat&lt;/a&gt; \&lt;br&gt;
  -H "Content-Type: application/json" \&lt;br&gt;
  -d '{"sessionId":"test-1","message":"What services do you offer?","context":"aquamax"}'&lt;/p&gt;

&lt;p&gt;# Follow-up question (same sessionId — the agent remembers the context)&lt;/p&gt;

&lt;p&gt;curl -X POST &lt;a href="https://TU_DOMINIO.cloudfront.net/chat" rel="noopener noreferrer"&gt;https://TU_DOMINIO.cloudfront.net/chat&lt;/a&gt; \&lt;br&gt;
  -H "Content-Type: application/json" \&lt;br&gt;
  -d '{"sessionId":"test-1","message":"And how long does it take?","context":"aquamax"}'&lt;/p&gt;

&lt;p&gt;# Expected response&lt;/p&gt;

&lt;p&gt;# { "response": "We offer eco-friendly dry cleaning, delicate garment washing..." }&lt;/p&gt;
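&lt;p&gt;The same smoke test can be scripted from Python with only the standard library (a sketch; the URL placeholder must be replaced with your CloudFront domain, and build_chat_payload mirrors the JSON body the curl calls send):&lt;/p&gt;

```python
import json
import urllib.request

CHAT_URL = "https://TU_DOMINIO.cloudfront.net/chat"   # replace with your domain

def build_chat_payload(message: str, session_id: str, context: str = "aquamax") -> bytes:
    """Build the JSON body the /chat endpoint expects."""
    return json.dumps({
        "sessionId": session_id,
        "message": message,
        "context": context,
    }).encode("utf-8")

def post_chat(message: str, session_id: str) -> dict:
    req = urllib.request.Request(
        CHAT_URL,
        data=build_chat_payload(message, session_id),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```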

&lt;p&gt;Update the deployment&lt;/p&gt;

&lt;p&gt;# Backend — rebuild and upload the Lambda&lt;/p&gt;

&lt;p&gt;cd tintorerias-chatbot&lt;br&gt;
pip install flask flask-cors boto3 python-dotenv -t package/&lt;br&gt;
cp app.py bedrock_client.py config.py lambda_handler.py package/&lt;br&gt;
cd package &amp;amp;&amp;amp; zip -r ../lambda_deployment.zip . &amp;amp;&amp;amp; cd ..&lt;br&gt;
aws lambda update-function-code \&lt;br&gt;
  --function-name tintorerias-chatbot \&lt;br&gt;
  --zip-file fileb://lambda_deployment.zip&lt;/p&gt;

&lt;p&gt;# Frontend — upload and invalidate the CloudFront cache&lt;/p&gt;

&lt;p&gt;aws s3 cp index.html       s3://$BUCKET_NAME/&lt;br&gt;
aws s3 cp chatbot-wiget.js s3://$BUCKET_NAME/&lt;br&gt;
aws cloudfront create-invalidation \&lt;br&gt;
  --distribution-id $DISTRIBUTION_ID \&lt;br&gt;
  --paths "/*"&lt;/p&gt;

&lt;p&gt;Why this stack is ideal for local businesses&lt;br&gt;
Almost zero cost at rest — Lambda and API Gateway charge per invocation, not per hour. A chatbot with moderate traffic costs pennies a month.&lt;br&gt;
No server maintenance — no EC2 to patch, no containers to monitor.&lt;br&gt;
Automatic scaling — from 0 to thousands of simultaneous users without additional configuration.&lt;br&gt;
Real business knowledge — the RAG Knowledge Base ensures that the assistant responds with up-to-date business information, not generic model data.&lt;br&gt;
Reusable — the same backend supports multiple clients by changing only the context field in the request.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>aws</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS Observability vs OpenTelemetry</title>
      <dc:creator>YUVARAJ. R</dc:creator>
      <pubDate>Sat, 18 Apr 2026 07:23:02 +0000</pubDate>
      <link>https://dev.to/yuvaraj_r_25388937f9607d/aws-observability-vs-opentelemetry-devops-10oo</link>
      <guid>https://dev.to/yuvaraj_r_25388937f9607d/aws-observability-vs-opentelemetry-devops-10oo</guid>
      <description>&lt;p&gt;After using AWS-native observability (CloudWatch/X-Ray) as my default choice for nearly a decade, I recently implemented an open-source observability stack for a greenfield project. Here's what I learned about when to use each approach.&lt;/p&gt;

&lt;p&gt;Why I Explored OpenTelemetry&lt;br&gt;
For the past 9 years, every AWS project I worked on used CloudWatch and X-Ray. It was automatic — spin up services, observability comes built-in. No complaints.&lt;/p&gt;

&lt;p&gt;Then came a project with a twist: the application needed to run across multiple clouds. AWS-native observability simply wasn't an option.&lt;/p&gt;

&lt;p&gt;That led me to explore alternatives — both paid and open-source. After analyzing several options, we landed on OpenTelemetry. The paid tools were impressive, but we didn't want to trade one vendor lock-in for another.&lt;/p&gt;

&lt;p&gt;What I Still Like About CloudWatch/X-Ray&lt;br&gt;
Let me be clear: CloudWatch and X-Ray are excellent tools. Here's where they shine:&lt;/p&gt;

&lt;p&gt;Zero setup friction. You can get up and running in no time. Almost no code required — everything works out of the box.&lt;/p&gt;

&lt;p&gt;Native integration. CloudWatch talks to Lambda, API Gateway, DynamoDB, and every other AWS service without configuration. It just works.&lt;/p&gt;

&lt;p&gt;Perfect for getting started. When you're building an MVP or early-stage product, you don't need a complex observability pipeline. You need to ship. CloudWatch lets you do that.&lt;/p&gt;

&lt;p&gt;Where CloudWatch Falls Short&lt;br&gt;
After years of using it, I've hit some consistent pain points:&lt;/p&gt;

&lt;p&gt;Customization is hard. The visualization is rigid. Widget limitations and cross-account/cross-region constraints get frustrating as your system grows.&lt;/p&gt;

&lt;p&gt;Connecting the dots is painful. Correlating metrics, logs, and traces in a single view requires significant configuration and code. It's possible, but not seamless.&lt;/p&gt;

&lt;p&gt;These aren't deal-breakers for simple architectures. But when you're running distributed systems across environments, they start to compound.&lt;/p&gt;

&lt;p&gt;Setting Up OpenTelemetry&lt;br&gt;
For our stack, we chose:&lt;/p&gt;

&lt;p&gt;Prometheus for metrics&lt;br&gt;
Jaeger for traces&lt;br&gt;
OpenSearch for logs&lt;br&gt;
Grafana for visualization&lt;br&gt;
OpenTelemetry has become an industry standard with strong community support and integrations with virtually every observability tool on the market.&lt;/p&gt;
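&lt;p&gt;To make the wiring concrete, here is a sketch of an OpenTelemetry Collector config that fans telemetry out to that stack. Exporter names and fields are illustrative and vary by collector-contrib version; treat this as a starting point, not the exact config we run:&lt;/p&gt;

```yaml
receivers:
  otlp:                      # applications send OTLP to the collector
    protocols:
      grpc:
      http:

exporters:
  prometheus:                # metrics endpoint scraped by Prometheus
    endpoint: "0.0.0.0:8889"
  otlp/jaeger:               # traces forwarded to Jaeger over OTLP
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  opensearch:                # logs indexed in OpenSearch (contrib exporter)
    http:
      endpoint: "http://opensearch:9200"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]
    logs:
      receivers: [otlp]
      exporters: [opensearch]
```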

&lt;p&gt;What surprised me: The configuration is simple yet powerful. It covers not just the application layer but the underlying system as well. OpenTelemetry exports data to specialized tools (Prometheus, Jaeger, OpenSearch), and Grafana ties it all together with end-to-end request lifecycle visualization.&lt;/p&gt;

&lt;p&gt;Setup time: A few hours to get a working proof-of-concept. We've since automated the entire setup with Ansible, making it repeatable across environments.&lt;/p&gt;

&lt;p&gt;To be clear: a few hours gets you a PoC. Production-ready deployment — handling high-cardinality metrics, tuning collectors, configuring retention, setting up alerting — is a multi-week effort. Don't underestimate it.&lt;/p&gt;

&lt;p&gt;The Hybrid Approach: Managed OTel on AWS&lt;br&gt;
There's a middle ground worth mentioning: AWS now heavily supports OpenTelemetry.&lt;/p&gt;

&lt;p&gt;AWS Distro for OpenTelemetry (ADOT) lets you instrument with vendor-neutral OTel code, but route telemetry to Amazon Managed Prometheus (AMP) and Amazon Managed Grafana (AMG).&lt;/p&gt;

&lt;p&gt;This gives you:&lt;/p&gt;

&lt;p&gt;→ Vendor-neutral instrumentation (no code lock-in)&lt;br&gt;
→ Managed infrastructure (no self-hosting headaches)&lt;br&gt;
→ AWS-native billing and support&lt;/p&gt;

&lt;p&gt;For teams who want portability at the application layer but don't want to manage Prometheus/OpenSearch clusters, this is the smart middle path.&lt;/p&gt;

&lt;p&gt;We chose full self-hosting because our multi-cloud requirement included non-AWS environments. But if you're AWS-primary with future portability concerns, ADOT + AMP + AMG is worth evaluating.&lt;/p&gt;

&lt;p&gt;The Real Comparison&lt;br&gt;
Here's how the two approaches stack up in practice:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;CloudWatch/X-Ray&lt;/th&gt;&lt;th&gt;OpenTelemetry&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Setup time&lt;/td&gt;&lt;td&gt;Almost none&lt;/td&gt;&lt;td&gt;Few hours (PoC) / weeks (production)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Customization&lt;/td&gt;&lt;td&gt;Hard&lt;/td&gt;&lt;td&gt;Easy&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;SaaS invoice&lt;/td&gt;&lt;td&gt;$$$&lt;/td&gt;&lt;td&gt;$&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Total cost of ownership&lt;/td&gt;&lt;td&gt;$$&lt;/td&gt;&lt;td&gt;$$ (shifts to compute + engineering)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Multi-cloud support&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Yes&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Debugging experience&lt;/td&gt;&lt;td&gt;Easy&lt;/td&gt;&lt;td&gt;Easy&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Team learning curve&lt;/td&gt;&lt;td&gt;Easy&lt;/td&gt;&lt;td&gt;Easy&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;A note on cost: OpenTelemetry software is free, but self-hosting isn't. Running OpenSearch clusters, Prometheus instances, and EBS volumes for retention can get expensive at scale — not to mention engineering hours for index management, patching, and scaling. OTel lowers your SaaS invoice, but shifts the cost to compute and engineering time. It's a strategic reinvestment, not a simple cost-saving.&lt;/p&gt;

&lt;p&gt;Where OpenTelemetry wins: Cloud-agnostic solutions without vendor lock-in. Same monitoring capabilities for on-premises and internal applications. When we needed identical observability for internal applications running on on-prem servers, the OTel stack worked flawlessly.&lt;/p&gt;

&lt;p&gt;Where CloudWatch wins: Quick deployment on AWS when you want an efficient, no-code monitoring solution.&lt;/p&gt;

&lt;p&gt;The Operational Reality&lt;br&gt;
Running your own observability stack isn't free. Here's what I've learned:&lt;/p&gt;

&lt;p&gt;Index management is painful. Managing indices for logs and traces in OpenSearch requires ongoing attention. It's not set-and-forget.&lt;/p&gt;

&lt;p&gt;Reliability requires planning. Early on, Prometheus stopped accepting requests due to high call volume. Once we started batching requests, it stabilized. But it was a reminder: you're now responsible for your monitoring infrastructure.&lt;/p&gt;
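&lt;p&gt;The batching fix generalizes beyond Prometheus; a minimal sketch of the pattern we applied (the names are illustrative, not our actual code — buffer samples and push one request per batch instead of one per sample):&lt;/p&gt;

```python
class BatchingExporter:
    """Buffer samples and forward them downstream in fixed-size batches."""

    def __init__(self, push, batch_size=500):
        self.push = push            # callable that sends one batch downstream
        self.batch_size = batch_size
        self.buffer = []

    def record(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # One downstream request per batch, not per sample
        if self.buffer:
            self.push(self.buffer)
            self.buffer = []
```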

&lt;p&gt;Monitoring the monitor. We use Grafana alerts to notify us of any downtime in the observability pipeline itself. Yes, you need to monitor your monitoring.&lt;/p&gt;

&lt;p&gt;Cost comparison: OpenTelemetry is cheaper than most paid solutions in terms of licensing. But factor in compute, storage, and engineering time. There are no restrictions on application count, call volume, or data retention — retention depends entirely on your needs and infrastructure budget. Maintenance has its overhead, but so does running any production system.&lt;/p&gt;

&lt;p&gt;Team Adaptation&lt;br&gt;
The team was happy. Using the same tooling everywhere meant consistent knowledge across environments. Same dashboards, same queries, same debugging workflows — whether troubleshooting AWS, another cloud, or on-prem.&lt;/p&gt;

&lt;p&gt;Skills required: Prometheus and Grafana experience was important for our team. Jaeger and OpenSearch were easier to pick up.&lt;/p&gt;

&lt;p&gt;Small teams: It depends entirely on the application's architecture and roadmap. A distributed, multi-cloud application in maintenance mode can actually be managed by a small team if the automation is solid. However, for a 2-3 person team building a fresh AWS-only MVP, the overhead of OTel might be a distraction.&lt;/p&gt;

&lt;p&gt;My Decision Framework&lt;br&gt;
When a CTO asks me "CloudWatch or OpenTelemetry?", I ask three questions:&lt;/p&gt;

&lt;p&gt;Where will your applications run? AWS only, or multiple environments?&lt;br&gt;
Is AWS the only cloud you're targeting? Now and in the future?&lt;br&gt;
Are you willing to invest in monitoring infrastructure right now?&lt;br&gt;
My rule of thumb:&lt;/p&gt;

&lt;p&gt;If you're targeting AWS only and it's a new product, the AWS observability stack gets you up and running in no time.&lt;br&gt;
If you want future portability without self-hosting, consider the hybrid approach (ADOT + AMP + AMG).&lt;br&gt;
If you have a mature product with multiple microservices, multi-cloud requirements, and don't want vendor lock-in, choose full OTel.&lt;br&gt;
For my next greenfield project: It depends. For serverless development, AWS observability still suits perfectly. But if I'm building a distributed system with multi-cloud support, OpenTelemetry will be my default choice.&lt;/p&gt;

&lt;p&gt;The Future of Observability&lt;br&gt;
Every major paid monitoring tool now supports OpenTelemetry. That tells you where the industry is heading. The community support is massive and growing.&lt;/p&gt;

&lt;p&gt;OpenTelemetry is becoming the standard — not because it's free, but because it solves real problems around portability and vendor independence.&lt;/p&gt;

&lt;p&gt;The Unbeatable Value of Traces&lt;br&gt;
If I could only have one observability signal — logs, metrics, or traces — I'd choose traces without hesitation.&lt;/p&gt;

&lt;p&gt;Here's why: as systems evolve from simple APIs into distributed orchestration layers (Kubernetes, event-driven pipelines, multi-service workflows), logs lose context rapidly. A log line tells you something happened. A trace tells you why, where, and how long it took across every hop.&lt;/p&gt;

&lt;p&gt;For debugging distributed systems, tracing is irreplaceable.&lt;/p&gt;
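&lt;p&gt;A toy illustration of the point (pure Python, not an OpenTelemetry API): a span records the parent hop and the duration, so the causal chain that three separate log lines would lose stays intact:&lt;/p&gt;

```python
import time
from contextlib import contextmanager

spans = []    # finished spans: (name, parent, duration_ms)
_stack = []   # names of currently open spans

@contextmanager
def span(name):
    """Toy span: records which hop called which, and how long each took."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        _stack.pop()
        spans.append((name, parent, (time.perf_counter() - start) * 1000.0))

# One request crossing three hops keeps its causal chain
with span("POST /chat"):
    with span("invoke_agent"):
        with span("kb_retrieve"):
            time.sleep(0.001)
```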

&lt;p&gt;Final Thoughts&lt;br&gt;
Use CloudWatch/X-Ray when you need to hit the ground running on AWS with zero setup friction. Use OpenTelemetry when you need a mature, cloud-agnostic standard that grows with your multi-cloud or on-prem architecture without vendor lock-in.&lt;/p&gt;

&lt;p&gt;One thing most people get wrong about observability: it's not a silver bullet. It gives you insight, but at the end of the day, it's still a developer's responsibility to write performant code.&lt;/p&gt;

&lt;p&gt;Any regrets going the OpenTelemetry route? None so far.&lt;/p&gt;

&lt;p&gt;What drove your observability strategy? Running OpenTelemetry in production — how are you managing collector infrastructure and reliability? I'd love to hear your experience.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
