Pedro Oliveira

A 100% Serverless WhatsApp Agent: Go + AWS Lambda + EventBridge

Building a WhatsApp Cloud API Agent on AWS Lambda Using Go

In this post I want to share how we designed a WhatsApp Cloud API agent that is 100% serverless, built with Go and AWS. The goal was to keep costs extremely low while still being ready for large-scale message traffic.

The project powers an automated commercial agent for Barx that talks to customers over WhatsApp, with OpenAI Agents doing the heavy lifting behind the scenes.


High-level architecture

Everything runs serverlessly on AWS:

  • API Gateway HTTP API (v2): single entry point for the WhatsApp webhook and auxiliary endpoints (verification + Swagger).
  • Lambda functions (Go, provided.al2023):
    • wpp-verification: handles the Webhook verification challenge from WhatsApp Cloud API (GET /webhook/messages).
    • messages-webhook: receives incoming messages (POST /webhook/messages), validates & normalizes payloads, and schedules answer processing.
    • answer-handler: processes messages in batches, calls the OpenAI Agent workflow, and sends responses back to WhatsApp.
    • swagger-ui / get-swagger: serve API docs for internal use.
  • Amazon EventBridge Scheduler: used as a buffer + delay mechanism to decouple message ingestion from answer generation.
  • DynamoDB: single table (referenced by name from CDK) storing messages, conversation memory, and related entities.
  • AWS Secrets Manager: stores WhatsApp Cloud API credentials and related secrets (token, phone number ID, API version, etc.).
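
To make the Secrets Manager piece concrete, here is a minimal sketch of reading and decoding the WhatsApp credentials with aws-sdk-go-v2. The secret field names below are assumptions for illustration, not the exact keys used in the project:

```go
package whatsappsecrets

import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/secretsmanager"
)

// WhatsAppSecret mirrors the JSON stored in Secrets Manager.
// Field names are illustrative; the real secret may use different keys.
type WhatsAppSecret struct {
	AccessToken   string `json:"access_token"`
	PhoneNumberID string `json:"phone_number_id"`
	APIVersion    string `json:"api_version"`
}

// Load fetches and decodes the WhatsApp secret by name.
func Load(ctx context.Context, secretName string) (*WhatsAppSecret, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, fmt.Errorf("load aws config: %w", err)
	}

	client := secretsmanager.NewFromConfig(cfg)
	out, err := client.GetSecretValue(ctx, &secretsmanager.GetSecretValueInput{
		SecretId: aws.String(secretName),
	})
	if err != nil {
		return nil, fmt.Errorf("get secret %s: %w", secretName, err)
	}

	var secret WhatsAppSecret
	if err := json.Unmarshal([]byte(aws.ToString(out.SecretString)), &secret); err != nil {
		return nil, fmt.Errorf("decode secret: %w", err)
	}
	return &secret, nil
}
```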

The CDK stack (cdk/lib/serverless-go-app-stack.ts) wires all of this:

  • Creates the HTTP API and routes to each Lambda via HttpLambdaIntegration.
  • Injects common environment variables into each Lambda (stage, region, table name, secret name, OpenAI API key).
  • Grants fine-grained IAM permissions:
    • secretsmanager:GetSecretValue only for the configured secret.
    • dynamodb:PutItem, GetItem, UpdateItem, etc. only on the WhatsApp agent table.
    • EventBridge Scheduler permissions for the webhook Lambda to create/update/delete schedules and iam:PassRole for the scheduler role.
  • Reuses log groups with a short retention (3 days) to control CloudWatch costs.
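
As a rough sketch of how a Lambda picks up the environment variables the stack injects (the variable names here are assumptions, not necessarily the ones the CDK code uses):

```go
package appconfig

import (
	"fmt"
	"os"
)

// Config holds the values the CDK stack injects into every Lambda.
// The env var names below are illustrative; check the stack for the real ones.
type Config struct {
	Stage        string
	Region       string
	TableName    string
	SecretName   string
	OpenAIAPIKey string
}

func FromEnv() (*Config, error) {
	cfg := &Config{
		Stage:        os.Getenv("STAGE"),
		Region:       os.Getenv("AWS_REGION"), // set automatically by the Lambda runtime
		TableName:    os.Getenv("TABLE_NAME"),
		SecretName:   os.Getenv("SECRET_NAME"),
		OpenAIAPIKey: os.Getenv("OPENAI_API_KEY"),
	}
	if cfg.TableName == "" || cfg.SecretName == "" {
		return nil, fmt.Errorf("missing required environment variables")
	}
	return cfg, nil
}
```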

Code architecture (clean-ish layers)

The Go service is structured in layers under services/agent:

  • internal/application: use cases (orchestrators) for each business flow.
  • internal/domain: entities and repository interfaces.
  • internal/infra: concrete implementations (WhatsApp client, OpenAI agent, DynamoDB repositories, scheduler, config, logging).
  • internal/interface/http: HTTP/Lambda controllers that adapt API Gateway events to use cases.
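
To show how the layers connect, here is a simplified sketch: the domain layer declares an interface, infra provides the DynamoDB-backed implementation, and the use cases only see the abstraction. The entity fields and method names are assumptions, not copied from the repo:

```go
// internal/domain (illustrative): use cases depend only on interfaces.
package domain

import "context"

type Message struct {
	UserID  string
	Content string
	SentAt  int64
}

// MessageRepository is implemented by a DynamoDB-backed type in internal/infra.
type MessageRepository interface {
	Save(ctx context.Context, msg Message) error
	ListPendingByUser(ctx context.Context, userID string) ([]Message, error)
	DeleteBatch(ctx context.Context, msgs []Message) error
}
```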

Incoming messages flow

  1. WhatsApp Cloud API sends a POST to /webhook/messages.
  2. API Gateway forwards it to the messages-webhook Lambda.
  3. receive_message controller (internal/interface/http/receive_message/controller.go):
    • Logs the raw payload.
    • Unmarshals into a WhatsAppWebhook model.
    • Maps it into one or more ReceiveMessageCommand values.
    • For each command, calls ReceiveMessageUseCase.Execute.
  4. ReceiveMessageUseCase (internal/application/receive_message/use_case.go):
    • Persists the message in a MessageRepository (backed by DynamoDB).
    • Schedules an answer using a SchedulerAnswerRepository that wraps EventBridge Scheduler.
    • If a schedule already exists, it handles ConflictException and updates the schedule instead of failing.

The key idea: every message creates (or updates) a schedule that will later trigger answer-handler with the serialized Message as input. This naturally batches multiple messages from the same user within a short window and avoids calling OpenAI once per individual message.
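
A minimal sketch of that create-or-update behavior with aws-sdk-go-v2's EventBridge Scheduler client could look like this. The schedule naming, the 10-second delay, and the target wiring are assumptions for illustration, not the project's exact implementation:

```go
package schedulerrepo

import (
	"context"
	"errors"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/scheduler"
	"github.com/aws/aws-sdk-go-v2/service/scheduler/types"
)

type SchedulerAnswerRepository struct {
	client    *scheduler.Client
	targetArn string // answer-handler Lambda ARN
	roleArn   string // role EventBridge Scheduler assumes to invoke it
}

// ScheduleAnswer creates a one-time schedule a few seconds in the future,
// carrying the serialized message as the target input. If a schedule with
// the same name already exists, it updates that schedule instead of failing.
func (r *SchedulerAnswerRepository) ScheduleAnswer(ctx context.Context, userID, payload string) error {
	name := fmt.Sprintf("answer-%s", userID)
	at := time.Now().UTC().Add(10 * time.Second).Format("2006-01-02T15:04:05")

	input := &scheduler.CreateScheduleInput{
		Name:               aws.String(name),
		ScheduleExpression: aws.String(fmt.Sprintf("at(%s)", at)),
		FlexibleTimeWindow: &types.FlexibleTimeWindow{Mode: types.FlexibleTimeWindowModeOff},
		Target: &types.Target{
			Arn:     aws.String(r.targetArn),
			RoleArn: aws.String(r.roleArn),
			Input:   aws.String(payload),
		},
	}

	_, err := r.client.CreateSchedule(ctx, input)
	var conflict *types.ConflictException
	if errors.As(err, &conflict) {
		// A schedule already exists for this user: push the trigger forward instead.
		_, err = r.client.UpdateSchedule(ctx, &scheduler.UpdateScheduleInput{
			Name:               input.Name,
			ScheduleExpression: input.ScheduleExpression,
			FlexibleTimeWindow: input.FlexibleTimeWindow,
			Target:             input.Target,
		})
	}
	return err
}
```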

Answer generation flow

  1. EventBridge Scheduler invokes the answer-handler Lambda with the serialized Message.
  2. answerUseCase (internal/application/answer/use_case.go):
    • Sends a typing indicator via WhatsApp Cloud API to improve UX.
    • Loads conversation memory from a MemoryRepository (DynamoDB).
    • Lists pending user messages for that user ID.
    • Concatenates the messages into a single prompt and calls the Agent client (AgentClient) wrapping openai-agents-go.
    • Saves back the new LastResponseID to memory to keep the conversation context alive.
    • Sends one or more WhatsApp text messages with the agent’s answer.
    • Deletes processed user messages in batch.
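
The answer-handler entry point itself is a thin wrapper around that use case. A skeleton using aws-lambda-go, where the types, field names, and Execute signature are assumptions rather than the repo's exact code:

```go
package main

import (
	"context"

	"github.com/aws/aws-lambda-go/lambda"
)

// ScheduledMessage mirrors the serialized Message that EventBridge Scheduler
// passes as the schedule target input (field names are illustrative).
type ScheduledMessage struct {
	UserID  string `json:"user_id"`
	Content string `json:"content"`
}

// AnswerUseCase is the application-layer orchestrator described above.
type AnswerUseCase interface {
	Execute(ctx context.Context, userID string) error
}

func newHandler(uc AnswerUseCase) func(context.Context, ScheduledMessage) error {
	return func(ctx context.Context, msg ScheduledMessage) error {
		// Typing indicator, memory load, agent call, reply, and cleanup
		// all live inside the use case; the handler just adapts the event.
		return uc.Execute(ctx, msg.UserID)
	}
}

func main() {
	// In the real service the use case is wired with the DynamoDB, WhatsApp,
	// and agent clients at startup; omitted here for brevity.
	var uc AnswerUseCase
	lambda.Start(newHandler(uc))
}
```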

WhatsApp Cloud API client

internal/infra/whatsapp/client.go implements CloudAPIRepository:

  • Reads WhatsApp Cloud API settings from Secrets Manager (version, phone number ID, access token).
  • Sends text messages and typing indicators by POSTing JSON payloads to the Graph API.
  • Logs both error states and successful responses, but returns simple Go errors to the application layer.
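
For reference, sending a plain text message boils down to one authenticated POST to the Graph API. A minimal sketch, with the payload shape following the public Cloud API docs and error handling simplified:

```go
package whatsapp

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

type textPayload struct {
	MessagingProduct string `json:"messaging_product"`
	To               string `json:"to"`
	Type             string `json:"type"`
	Text             struct {
		Body string `json:"body"`
	} `json:"text"`
}

// SendText posts a text message to the WhatsApp Cloud API.
// apiVersion, phoneNumberID, and accessToken come from Secrets Manager.
func SendText(ctx context.Context, apiVersion, phoneNumberID, accessToken, to, body string) error {
	payload := textPayload{MessagingProduct: "whatsapp", To: to, Type: "text"}
	payload.Text.Body = body

	buf, err := json.Marshal(payload)
	if err != nil {
		return err
	}

	url := fmt.Sprintf("https://graph.facebook.com/%s/%s/messages", apiVersion, phoneNumberID)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(buf))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+accessToken)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 {
		return fmt.Errorf("whatsapp send failed: status %d", resp.StatusCode)
	}
	return nil
}
```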

OpenAI Agent integration

internal/infra/agent/client.go integrates with openai-agents-go:

  • Defines a typed SchemaOutput describing the structured response (messages[] { content }).
  • Configures the Agent with:
    • Name and instructions (AgentInstructions) tailored to the Barx commercial assistant.
    • Model (gpt-5 in this codebase), reasoning settings, and metadata.
    • Tools: file search (vector store) and web search constrained to barx.com.br.
  • Exposes a RunWorkflow method that returns a generic AgentOutput[SchemaOutput] with both the response and LastResponseID to keep the conversation thread.
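
The structured output boils down to a couple of types along these lines (simplified from the description above, not copied from the repo):

```go
package agent

// SchemaOutput is the structured response the agent is asked to produce:
// a list of messages, each with a content string.
type SchemaOutput struct {
	Messages []struct {
		Content string `json:"content"`
	} `json:"messages"`
}

// AgentOutput pairs the parsed structured response with the
// LastResponseID needed to continue the conversation thread.
type AgentOutput[T any] struct {
	Response       T
	LastResponseID string
}
```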

Why 100% serverless?

This architecture was intentionally designed to be fully serverless:

  • No containers, no servers to manage.
  • Scale-to-zero behavior: when there is no WhatsApp traffic, there are essentially no compute costs.
  • Cloud-native services (API Gateway, Lambda, EventBridge Scheduler, DynamoDB, Secrets Manager) all scale horizontally with traffic.

Because of this, we can support:

  • Spiky traffic (promotions, campaigns, peak hours) without pre-provisioning instances.
  • Long-tail usage across many establishments, each generating low but unpredictable traffic.

Pros of this architecture

  • Cost-efficient: pay-per-invocation Lambdas + DynamoDB on-demand + short log retention keep monthly costs under control, especially for early-stage products.
  • Naturally scalable: API Gateway + Lambda + EventBridge Scheduler handle thousands of concurrent conversations without manual tuning.
  • Resilient conversation pipeline: storing messages and memory in DynamoDB plus scheduled triggers means transient failures usually don’t lose user messages.
  • Clean separation of concerns:
    • interface/http for Lambda controllers
    • application for orchestration
    • domain for entities and interfaces
    • infra for AWS + external integrations (WhatsApp, OpenAI, Secrets Manager, DynamoDB, Scheduler)
  • Vendor-optimized integrations: using EventBridge Scheduler for delayed processing instead of hand-rolled cron jobs, and Secrets Manager for credentials, keeps security and operations simpler.

Cons and trade-offs

  • Cold starts: Go on Lambda is generally fast, but with multiple functions (webhook, answer handler, Swagger) you can still feel cold start latency for very low-traffic tenants.
  • Distributed debugging: tracing a single user interaction across API Gateway, Scheduler, multiple Lambdas, and DynamoDB can be harder than in a monolithic app unless you invest in end-to-end observability.
  • Operational complexity at the cloud layer: IAM, Scheduler permissions (iam:PassRole), secrets, and environment variables must all be configured correctly; CDK helps, but misconfigurations can be subtle.
  • Limited local emulation: fully reproducing API Gateway + Scheduler + WhatsApp Webhooks locally is non-trivial; most realistic testing still happens in the cloud.
  • Tight coupling to AWS: the design uses AWS-native services heavily (Scheduler, Secrets Manager, DynamoDB, Lambda), so moving to another cloud would require a serious rewrite.

What would you improve?

This architecture has worked well for us so far: low cost, simple scaling, and a clean Go codebase with clear boundaries between HTTP, application, domain, and infrastructure.

If you’ve built something similar (WhatsApp bots, agentic backends, or serverless messaging pipelines), I’d love to hear from you:

  • How would you improve this design?
  • What would you change to make it more resilient or easier to operate?
  • Any patterns you recommend for observability or testing in this kind of serverless setup?

Share your thoughts, ideas, or questions in the comments — I’m happy to iterate on this architecture with the community’s feedback.
