Swati Tyagi

Deploying a Serverless AI Agent with AWS Bedrock, Lambda, and API Gateway

This guide walks through building a generative AI question-answering service on AWS. The architecture accepts prompts via HTTP and returns responses generated by Amazon Bedrock, all while keeping costs minimal through serverless infrastructure.

Architecture

Data Flow:

  1. External clients send HTTP requests to API Gateway
  2. API Gateway routes requests to the Lambda function
  3. Lambda invokes Amazon Bedrock's Nova Micro model
  4. ECR stores the Lambda container image (deployment artifact)

The Challenge

When implementing generative AI services, choosing the right architecture matters. This implementation demonstrates a lightweight GenAI solution that can integrate with existing systems or be exposed externally through an API.

Requirements

Functional Goals

| Requirement | Description |
| --- | --- |
| Prompt Processing | Accept prompts and return Nova Micro completions |
| HTTP Endpoint | Expose an endpoint for triggering responses |
| Estimated Volume | ~100 monthly requests (for cost estimation) |

Operational Goals

| Aspect | Requirement |
| --- | --- |
| Automation | Fully automated deployment via GitHub Actions |
| Availability | 99.9%+ monthly uptime |
| Security | IAM-scoped Bedrock access, OpenID Connect auth, HTTPS-only |
| Observability | Structured logging with CloudWatch dashboards |

Intentional Omissions

Authentication, authorization, and input sanitization on the public endpoint are intentionally excluded to keep the focus on the core GenAI implementation (the OpenID Connect auth listed above secures the deployment pipeline rather than the API).

Cost Analysis

Based on an estimated 22 input tokens and 232 output tokens per request:

| Service | Monthly Cost | Notes |
| --- | --- | --- |
| Bedrock (Nova Micro) | ~$0.003 | 2,200 input / 23,200 output tokens |
| Lambda | Free | Within free tier (1M requests, 400K GB-seconds) |
| API Gateway | Free (Year 1) | ~$0.0004/month after |
| ECR | ~$0.01 | 300 MB image, beyond the 500 MB free tier |
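
As a sanity check, the Bedrock figure follows directly from the per-request token estimate, assuming Nova Micro's published on-demand rates at the time of writing (roughly $0.035 per 1M input tokens and $0.14 per 1M output tokens; verify against the current AWS pricing page):

```typescript
// Back-of-the-envelope Bedrock cost at ~100 requests/month.
// The per-token rates are assumptions; check the AWS pricing page.
const requests = 100;
const inputTokens = requests * 22;   // 2,200
const outputTokens = requests * 232; // 23,200

const monthlyUsd = (inputTokens * 0.035 + outputTokens * 0.14) / 1_000_000;
console.log(monthlyUsd.toFixed(4)); // ≈ 0.0033 → the ~$0.003 above
```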

Scaling Projections

| Monthly Requests | Estimated Cost |
| --- | --- |
| 1,000 | ~$0.04 |
| 10,000 | ~$0.39 |
| 100,000 | ~$3.76 |

Building the Agent

Project Setup

```bash
mkdir -p handler terraform
cd handler
pnpm init
pnpm --package=typescript dlx tsc --init
mkdir -p src __tests__
touch src/{app,env,index}.ts

pnpm add -D @types/node tsx typescript
pnpm add ai @ai-sdk/amazon-bedrock zod dotenv
```

Core Components

The implementation has three layers:

```mermaid
flowchart TB
    A["Lambda Handler<br/><i>Parses events, returns responses</i>"] --> B["Application Logic<br/><i>Manages prompts & orchestration</i>"]
    B --> C["Bedrock Integration<br/><i>Model invocation via AI SDK</i>"]
```

Lambda Entry Point

```typescript
// src/index.ts
import { main } from "./app"; // application layer (assumed to live in src/app.ts)

// Lambda entry point: parse the API Gateway event, delegate to the
// application layer, and wrap the result in an HTTP response.
export const handler = async (event: any, context: any) => {
    try {
        const body = event.body ? JSON.parse(event.body) : {};
        const prompt = body.prompt ?? "Welcome from Warike technologies";
        const response = await main(prompt);
        return {
            statusCode: 200,
            body: JSON.stringify({ success: true, data: response }),
        };
    } catch (error) {
        return {
            statusCode: 500,
            body: JSON.stringify({
                success: false,
                error: error instanceof Error ? error.message : 'Unexpected error'
            }),
        };
    }
};
```
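
Before wiring up any infrastructure, the handler can be exercised locally with a fake event. A hypothetical script (run with the `tsx` dev dependency from the setup step and valid AWS credentials in the environment):

```typescript
// scripts/smoke.ts (hypothetical) — invoke the handler without API Gateway
import { handler } from "../src/index";

const event = { body: JSON.stringify({ prompt: "Say hello in one sentence." }) };

handler(event, {})
    .then((res) => console.log(res.statusCode, res.body))
    .catch(console.error);
```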

Bedrock Integration

```typescript
// src/utils/bedrock.ts
import { createAmazonBedrock } from "@ai-sdk/amazon-bedrock";
import { generateText } from "ai";
import { config } from "../env"; // config helper (sketched under Environment Variables below)

export async function generateResponse(prompt: string) {
    const { regionId, modelId } = config({});
    const bedrock = createAmazonBedrock({ region: regionId });

    const { text, usage } = await generateText({
        model: bedrock(modelId),
        system: "You are a helpful assistant.",
        prompt, // a plain string; the system message is passed separately above
    });

    console.log(`model: ${modelId}, response: ${text}, usage: ${JSON.stringify(usage)}`);
    return text;
}
```
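
The middle layer connecting these two files isn't shown in the post. A minimal sketch of `main` in `src/app.ts` (the function the Lambda entry point imports; its exact contents here are an assumption) could be as thin as:

```typescript
// src/app.ts (hypothetical sketch) — orchestration between the
// Lambda handler and the Bedrock integration
import { generateResponse } from "./utils/bedrock";

export async function main(prompt: string): Promise<string> {
    // A natural place to add prompt templating, validation, or retries later.
    return generateResponse(prompt);
}
```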

Environment Variables

```bash
AWS_REGION=us-west-2
AWS_BEDROCK_MODEL='amazon.nova-micro-v1:0'
AWS_BEARER_TOKEN_BEDROCK='aws_bearer_token_bedrock'
```

⚠️ Security Note: Use short-lived Bedrock API keys only, and keep them out of version control.
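
The `config` helper referenced in the Bedrock integration isn't shown in the post either. One minimal sketch of `src/env.ts`, using the `zod` and `dotenv` packages installed earlier (the shape and defaults are assumptions):

```typescript
// src/env.ts (hypothetical sketch) — typed access to the variables above
import "dotenv/config";
import { z } from "zod";

const schema = z.object({
    AWS_REGION: z.string().default("us-west-2"),
    AWS_BEDROCK_MODEL: z.string().default("amazon.nova-micro-v1:0"),
});

export function config(overrides: { regionId?: string; modelId?: string } = {}) {
    const env = schema.parse(process.env);
    return {
        regionId: overrides.regionId ?? env.AWS_REGION,
        modelId: overrides.modelId ?? env.AWS_BEDROCK_MODEL,
    };
}
```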


Infrastructure

Dockerfile

```dockerfile
# Build Stage
FROM node:22-alpine AS builder
WORKDIR /usr/src/app
RUN corepack enable
COPY package.json pnpm-lock.yaml* ./
RUN pnpm install --frozen-lockfile
COPY . .
# Assumes a "build" script (e.g. tsc) is defined in package.json
RUN pnpm run build

# Runtime Stage
FROM public.ecr.aws/lambda/nodejs:22
WORKDIR ${LAMBDA_TASK_ROOT}
COPY --from=builder /usr/src/app/dist/src ./
COPY --from=builder /usr/src/app/node_modules ./node_modules
CMD [ "index.handler" ]
```

Terraform Resources

Key infrastructure components:

  • API Gateway — HTTP protocol with Lambda integration, CORS headers, JSON access logs
  • Bedrock Permissions — Nova Micro inference profile access
  • Lambda Function — 900-second timeout, CloudWatch logging enabled

📝 Note: The ECR seeding resource requires Docker running locally.

CI/CD Pipeline

```mermaid
flowchart LR
    A[Push to Main] --> B[Build & Test]
    B --> C[Build Docker Image]
    C --> D[Push to ECR]
    D --> E[Deploy Lambda]
```

The GitHub Actions workflow handles building, testing, Docker image creation, ECR push, and Lambda deployment—triggered on pushes to main.

Testing

```bash
curl -sS "https://123456.execute-api.us-west-2.amazonaws.com/dev/" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Heeey hoe gaat het?"}' | jq
```

Expected Response:

```json
{
  "success": true,
  "data": "Hoi! Het gaat prima, bedankt voor het vragen..."
}
```
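
The `__tests__` directory created during setup can hold a minimal end-to-end check. A sketch using only Node's built-in `fetch` and `assert` (the endpoint URL is a placeholder, and no test framework is assumed):

```typescript
// __tests__/endpoint.ts (hypothetical) — run with: pnpm tsx __tests__/endpoint.ts
import assert from "node:assert";

const ENDPOINT = "https://123456.execute-api.us-west-2.amazonaws.com/dev/"; // placeholder

async function check() {
    const res = await fetch(ENDPOINT, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ prompt: "Reply with one short sentence." }),
    });
    const json = (await res.json()) as { success: boolean; data: string };

    assert.strictEqual(res.status, 200);
    assert.strictEqual(json.success, true);
    assert.ok(json.data.length > 0);
    console.log("endpoint OK:", json.data);
}

check().catch((err) => {
    console.error(err);
    process.exit(1);
});
```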

Monitoring

CloudWatch dashboards provide visibility into errors and performance metrics.
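
The structured-logging goal from the requirements needs no extra dependencies: CloudWatch Logs Insights can filter and graph JSON fields emitted via `console.log`. A sketch (helper name and field names are illustrative):

```typescript
// Hypothetical structured-log helper for the Lambda handler.
// CloudWatch Logs Insights can query the JSON fields directly.
function logMetric(fields: Record<string, unknown>) {
    console.log(JSON.stringify({ ts: new Date().toISOString(), ...fields }));
}

// e.g. inside generateResponse(), after the model call:
// logMetric({ event: "bedrock_invocation", modelId, usage });
```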


Cleanup

```bash
terraform destroy
```

Takeaways

✅ Serverless GenAI with API Gateway, Lambda, and Bedrock's Nova Micro delivers a functional, cost-effective solution

✅ Pricing remains negligible even at significant scale

✅ Terraform handles infrastructure; GitHub Actions automates deployment

✅ Foundation readily supports more sophisticated generative AI applications
