My CDK Deploy Takes 7 Minutes: My Local Runner Takes 25ms

👋 Hey there, Tech Enthusiasts!

I'm Sarvar, a Cloud Architect who loves turning complex tech problems into simple solutions. I've worked with AWS, Azure, DevOps, Data, Analytics, Generative AI, and Agentic AI, building real systems for real companies. In this article series, I'll share what I've learned in a way that's easy to follow, whether you're experienced or just getting started.

Let's get into it! 🚀


I'm building something I've been excited about for months: hosting a CrewAI-powered AI agent on AWS using Amazon Bedrock AgentCore and exposing it through API Gateway + Lambda so external systems can talk to it over HTTP.

The architecture is straightforward:

Client → API Gateway → Lambda → AgentCore Runtime → CrewAI Agent → Bedrock (Claude)

The Lambda function is the bridge. It receives the HTTP request, formats it for my CrewAI agent running on AgentCore, waits for the response, and sends it back. Standard stuff.

Except the Lambda is where everything is painful.

I'm tweaking the request payload format for AgentCore. cdk deploy. Wait 7 minutes. Hit the endpoint. The response shape is wrong: AgentCore returns nested JSON, and my Lambda isn't unpacking it correctly. Fix one line. cdk deploy. Another 7 minutes. Realize I need to handle streaming responses differently. Another deploy.

By lunch I'd deployed 9 times and shipped maybe 40 lines of actual logic.

The CrewAI agent itself was working fine; I'd already tested it locally with crewai run. The Bedrock model was responding perfectly. But the Lambda layer in between? Every tiny change to the request/response mapping cost me 7 minutes. That's where all my time was going.

Something had to change.


I Tried the Obvious Stuff

SAM CLI

First thing I reached for. sam local start-api: that's literally what it's for, right?

Problem is, SAM wants a template.yaml. My infrastructure is CDK. I'm not maintaining two definitions of the same stack just to test locally. I tried passing cdk.out/MyStack.template.json directly to SAM and it half-worked: some routes loaded, some didn't, and the asset references were broken. I spent 45 minutes debugging SAM instead of building my feature.

Also, every request spins up a Docker container. On my MacBook that's 3-4 seconds of cold start per invocation. When I'm iterating on how my Lambda formats the AgentCore request (testing different prompt structures, response parsing, error handling), that completely kills the feedback loop.

And there's no hot reload. Change a file, stop SAM, run sam build, start SAM again. For five functions that's a 30-second rebuild cycle. Better than 7 minutes but still way too slow.

LocalStack

I've used LocalStack before for integration testing. It's impressive: it mocks basically all of AWS. But for this use case:

It took 20 minutes just to get the Lambda hot reload working. You have to deploy to some magic S3 bucket with a specific naming convention. Then it still wasn't picking up my changes consistently.

The Lambda hot reload features I needed are behind LocalStack's paid tier. I'm not paying monthly to test my API Gateway → Lambda mapping logic.

For a full event-driven system with SQS and Step Functions, sure, LocalStack makes sense. For "I want to hit my API and see what comes back," it's way overkill.


The Realization

I was staring at my cdk.out/ directory one morning (I'd just run cdk synth to verify my stack before yet another deploy) and it hit me:

Everything I need is right here.

The CloudFormation template has every route: POST /agent/invoke, GET /agent/status/{id}, GET /agent/history, the whole thing. It has every Lambda function with its handler path and environment variables. It even has the authorizer config.

When you use CDK's NodejsFunction, esbuild metadata is written alongside the bundled assets in cdk.out/. That metadata traces back to the original TypeScript source file. So the synth output gives us the full route map and the entry points, with no extra config needed.

I don't need SAM to interpret this. I don't need LocalStack to mock it. I just need to read the JSON and wire up an Express server.

So that's what I built.


How It Works

cdk synth → cdk.out/MyStack.template.json
     ↓
  extract (parse the CF template → route manifest)
     ↓
  serve (Express + esbuild bundling + file watcher)

Extract reads your CloudFormation template and figures out which API Gateway routes map to which Lambda functions. It resolves the Fn::GetAtt references, walks the API Gateway resource tree to reconstruct full paths, and traces each handler back to its TypeScript source file.
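To make the extract step concrete, here is a minimal sketch of how such a pass could work. It is not the tool's actual source. It assumes the template shape CDK typically synthesizes for REST APIs: AWS::ApiGateway::Resource entries linked through ParentId, and AWS::ApiGateway::Method entries whose Integration.Uri wraps an Fn::GetAtt on the function's Arn. The names extractRoutes, fullPath, and findFunctionId are hypothetical.

```typescript
// Minimal shapes for the parts of a synthesized template this sketch reads.
interface CfnResource { Type: string; Properties?: Record<string, any> }
interface Template { Resources: Record<string, CfnResource> }
interface Route { method: string; path: string; functionLogicalId: string }

// Search an intrinsic expression (Fn::Join, nested arrays, ...) for the
// first { "Fn::GetAtt": [logicalId, "Arn"] } and return its logical ID.
function findFunctionId(node: unknown): string | undefined {
  if (Array.isArray(node)) {
    for (const item of node) {
      const found = findFunctionId(item);
      if (found) return found;
    }
    return undefined;
  }
  if (node !== null && typeof node === "object") {
    const getAtt = (node as Record<string, unknown>)["Fn::GetAtt"];
    if (Array.isArray(getAtt) && getAtt[1] === "Arn") return String(getAtt[0]);
    return findFunctionId(Object.values(node));
  }
  return undefined;
}

// Follow ParentId references upward. The chain ends at the API root, which
// is referenced with Fn::GetAtt (RootResourceId) rather than Ref.
function fullPath(template: Template, resourceRef: any): string {
  const parts: string[] = [];
  let current = resourceRef;
  while (current && typeof current.Ref === "string") {
    const resource = template.Resources[current.Ref];
    if (!resource || resource.Type !== "AWS::ApiGateway::Resource") break;
    parts.unshift(String(resource.Properties?.PathPart));
    current = resource.Properties?.ParentId;
  }
  return "/" + parts.join("/");
}

// Build the route manifest: one entry per method backed by a Lambda function.
function extractRoutes(template: Template): Route[] {
  const routes: Route[] = [];
  for (const resource of Object.values(template.Resources)) {
    if (resource.Type !== "AWS::ApiGateway::Method") continue;
    const props = resource.Properties ?? {};
    const functionLogicalId = findFunctionId(props.Integration?.Uri);
    if (!functionLogicalId) continue; // e.g. MOCK integrations used for CORS
    routes.push({
      method: props.HttpMethod,
      path: fullPath(template, props.ResourceId),
      functionLogicalId,
    });
  }
  return routes;
}
```

Feeding it a parsed cdk.out/MyStack.template.json would yield entries such as { method: "POST", path: "/agent/invoke", functionLogicalId: "..." }, which is all a local server needs to register routes.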

Entry points are resolved in priority order:

  1. esbuild metadata: if your CDK project uses NodejsFunction, the .esbuild.meta.json in the asset directory traces back to the original source file. This is the zero-config path.
  2. Fallback convention: if metadata isn't available, it falls back to src/handlers/{logicalId}.ts. You can override this with a custom CDK aspect that annotates the template.
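The priority order above can be sketched in a few lines. This is an illustration under assumptions, not the tool's real code: it relies on the documented esbuild metafile format, where each entry under outputs may carry an entryPoint, and it skips reading the file from disk. The function names are hypothetical.

```typescript
// Subset of the esbuild metafile format: each output may record its entry point.
interface EsbuildMetafile {
  outputs: Record<string, { entryPoint?: string }>;
}

// Priority 1: the entry point recorded in esbuild metadata, if there is one.
function entryFromMetafile(meta: EsbuildMetafile | undefined): string | undefined {
  if (!meta) return undefined;
  for (const output of Object.values(meta.outputs)) {
    if (output.entryPoint) return output.entryPoint;
  }
  return undefined;
}

// Priority 2: fall back to the src/handlers/{logicalId}.ts convention.
function resolveEntryPoint(logicalId: string, meta?: EsbuildMetafile): string {
  return entryFromMetafile(meta) ?? `src/handlers/${logicalId}.ts`;
}
```

With metadata present, the recorded source path wins. Without it, a function whose logical ID is InvokeFn would resolve to src/handlers/InvokeFn.ts.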

Serve takes that manifest and boots Express. Each route gets registered. When a request comes in, esbuild bundles the handler on-demand (takes about 20ms the first time), constructs a proper APIGatewayProxyEvent from the Express request, and invokes the handler. Response goes back to the client.
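Here is a rough sketch of the event-construction step, assuming only a subset of the REST API proxy event fields. The IncomingRequest shape is a hypothetical stand-in for the pieces of an Express request a runner would read, and toProxyEvent is an illustrative name rather than the tool's API.

```typescript
// Hypothetical stand-in for the parts of an Express request that matter here.
interface IncomingRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
  query: Record<string, string>;
  params: Record<string, string>;
  body?: unknown;
}

// Subset of the API Gateway REST proxy event that most handlers read.
interface ProxyEvent {
  resource: string;
  path: string;
  httpMethod: string;
  headers: Record<string, string>;
  queryStringParameters: Record<string, string> | null;
  pathParameters: Record<string, string> | null;
  body: string | null;
  isBase64Encoded: boolean;
}

// API Gateway sends null, not {}, when there are no parameters.
const orNull = (obj: Record<string, string>) =>
  Object.keys(obj).length > 0 ? obj : null;

function toProxyEvent(req: IncomingRequest, resource: string): ProxyEvent {
  // Handlers always receive the body as a string (or null), never a parsed object.
  let body: string | null = null;
  if (req.body !== undefined && req.body !== null) {
    body = typeof req.body === "string" ? req.body : JSON.stringify(req.body);
  }
  return {
    resource, // route template, e.g. /agent/status/{id}
    path: req.path, // concrete path, e.g. /agent/status/42
    httpMethod: req.method.toUpperCase(),
    headers: req.headers,
    queryStringParameters: orNull(req.query),
    pathParameters: orNull(req.params),
    body,
    isBase64Encoded: false,
  };
}
```

Keeping the body as a string matters: a handler that calls JSON.parse(event.body) in production should hit the same code path locally.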

That's it. No Docker. No template to maintain. No magic.


The Hot Reload Part

This is where it actually gets good.

I've got chokidar watching my source directories. When I save a file, it checks: is this file a handler entry point? If yes, it invalidates just that one handler's cache. If it's a shared utility file, it clears all caches.

The Express server never restarts. The routes stay registered. The next request to a changed handler re-bundles it with esbuild (about 5ms for a typical handler) and runs the fresh code.
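The invalidation rule is simple enough to sketch without the watcher. The HandlerCache class below is a hypothetical illustration of the idea, not the tool's internals: a changed entry point drops only its own bundle, and any other changed file is treated as shared code and drops everything.

```typescript
// Hypothetical cache of bundled handlers, keyed by entry-point path.
class HandlerCache {
  private bundles = new Map<string, string>();
  private entryPoints: Set<string>;

  constructor(entryPoints: Iterable<string>) {
    this.entryPoints = new Set(entryPoints);
  }

  set(entryPoint: string, bundledCode: string): void {
    this.bundles.set(entryPoint, bundledCode);
  }

  has(entryPoint: string): boolean {
    return this.bundles.has(entryPoint);
  }

  // Returns how many cached handlers were invalidated by this file change.
  onFileChanged(file: string): number {
    if (this.entryPoints.has(file)) {
      // A handler's own source changed: drop just that handler's bundle.
      return this.bundles.delete(file) ? 1 : 0;
    }
    // A shared file changed: any handler might import it, so drop them all.
    const count = this.bundles.size;
    this.bundles.clear();
    return count;
  }
}
```

In a real runner, onFileChanged would be called from the file watcher's change event (with chokidar, the "change" event passes the changed path), and the next request to an invalidated route would trigger a fresh esbuild bundle.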

So my workflow for the AgentCore project became:

  1. Tweak how my Lambda formats the CrewAI request payload
  2. Save
  3. curl -X POST localhost:3001/agent/invoke -d '{"task": "summarize this document"}'
  4. See the result in under 50ms

No deploy. No container spin-up. No rebuild step. Just save and curl.

I went from "is my Lambda correctly parsing the AgentCore response?" being a 7-minute question to a 5-second question. That changed everything.

Here's what I measured on my dev machine:

| Request | Time |
| --- | --- |
| First hit (cold bundle) | ~25ms |
| Subsequent hits (cached) | <2ms |
| After file change (re-bundle) | ~5ms |

Versus 5-10 minutes per cdk deploy. I genuinely can't go back.


The Failure That Made Me Build This

Let me show you the exact moment that pushed me over the edge.

I had my Lambda calling AgentCore's invoke endpoint. The response was coming back as:

{
  "output": {
    "content": [
      {"type": "text", "text": "Here's the analysis..."}
    ]
  },
  "stopReason": "end_turn"
}

But my Lambda was returning:

{
  "statusCode": 200,
  "body": "[object Object]"
}

Classic. I forgot JSON.stringify() on a nested object. One-line fix:

// Before (broken)
body: result.output.content

// After (fixed)  
body: JSON.stringify({ response: result.output.content[0].text })

That one-line fix cost me 7 minutes to discover because I had to deploy to see the output. With the local runner, I'd have seen [object Object] instantly, fixed it, saved, curled again, and been done in 10 seconds.
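One way to make this class of bug harder to write is to route every response through a small helper whose body is typed as a string. This is a sketch, and jsonResponse is a hypothetical helper name, not something from the project:

```typescript
// The proxy response body must be a string; typing it that way lets the
// compiler reject `body: result.output.content` before anything runs.
interface ProxyResult {
  statusCode: number;
  headers?: Record<string, string>;
  body: string;
}

function jsonResponse(statusCode: number, payload: unknown): ProxyResult {
  return {
    statusCode,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Coercing an array of objects to a string is what yields "[object Object]".
const content = [{ type: "text", text: "Here's the analysis..." }];
const broken = String(content);
const fixed = jsonResponse(200, { response: content[0].text });
```

Here broken holds the same "[object Object]" string seen above, while fixed.body is valid JSON that the client can parse.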

After that, I spent a weekend building this tool. Never looked back.


The Comparison Nobody Asked For

| | SAM CLI | LocalStack | This |
| --- | --- | --- | --- |
| Extra config | template.yaml | Docker + deploy setup | None* |
| Docker needed | Yes | Yes | No |
| Hot reload | Nope | Kinda works | Yes |
| Request latency | 3-5s | ~2s | <25ms |
| Monthly cost | Free | Paid for hot reload | Free |
| Drift risk | High (second template) | Medium (mock env) | None (reads cdk synth) |

*Requires NodejsFunction (CDK's standard Lambda construct for TypeScript/JS) for zero-config entry point resolution.


Before/After

BEFORE (my AgentCore project):
  tweak Lambda → cdk deploy (7 min) → curl → wrong response → cdk deploy (7 min) → curl
  1 iteration = 14+ minutes

AFTER:
  tweak Lambda → save → curl (25ms) → wrong response → fix → save → curl (5ms) → done
  1 iteration = 10 seconds

Rough math: 12 fewer deploys × 7 minutes each = ~84 minutes saved per day. Over a five-day week that's about 7 hours, nearly a full workday I got back.


Using It

npm install --save-dev cdk-local-lambda

In your package.json:

{
  "scripts": {
    "synth": "cdk synth",
    "local": "npx cdk-local dev --cdk-out cdk.out --stack MyStack --port 3001",
    "dev": "npm run synth && npm run local"
  }
}

Then:

npm run dev

Output:

Found 5 route(s)
  POST   /agent/invoke       → src/handlers/invokeAgent.ts
  GET    /agent/status/{id}  → src/handlers/getStatus.ts
  GET    /agent/history      → src/handlers/getHistory.ts
  POST   /agent/feedback     → src/handlers/submitFeedback.ts
  GET    /health             → src/handlers/health.ts

🚀 CDK Local Lambda running on http://localhost:3001

Save a file. Hit the endpoint. Done.

Note: If you cloned the repo to contribute, run npm run build first. As an installed dependency, the CLI is ready to use.


Real Example: Testing My AgentCore Handler

Here's what my actual dev loop looks like now. My invoke handler takes a task, calls AgentCore, and returns the result:

# Test the invoke endpoint
curl -s -X POST http://localhost:3001/agent/invoke \
  -H "Content-Type: application/json" \
  -d '{"task": "summarize the Q2 report", "context": "finance"}' | python3 -m json.tool

Response:

{
  "requestId": "a1b2c3d4-e5f6-7890",
  "status": "completed",
  "response": "Here's the Q2 summary: Revenue increased 12% YoY...",
  "model": "claude-3.5-sonnet",
  "tokens": { "input": 847, "output": 234 }
}

Now I change how the response is formatted. Maybe I want to add execution time:

// src/handlers/invokeAgent.ts - add timing
const start = Date.now();
const result = await callAgentCore(task, context);
const duration = Date.now() - start;

return {
  statusCode: 200,
  body: JSON.stringify({ ...result, executionMs: duration }),
};

Save. Curl again. Fresh response with executionMs field. No deploy.


What It Doesn't Do

I'm not going to pretend this replaces a full deployment pipeline. It doesn't.

| Works | Doesn't Work |
| --- | --- |
| API Gateway → Lambda (REST) | SQS / SNS / EventBridge triggers |
| Custom authorizers (TOKEN type) | REQUEST authorizers |
| Path params, query strings, headers | WebSocket APIs |
| JSON request/response bodies | API Gateway request validation |
| Environment variables (literals) | Cross-stack references (Fn::ImportValue) |
| Hot reload | Step Functions |

Your handlers still call real AWS services. If your Lambda talks to AgentCore, it calls real AgentCore. If it writes to DynamoDB, it writes to real DynamoDB. This tool only mocks the API Gateway → Lambda invocation, not the services your code uses.

For my project, that's actually what I want: I need to test with real AgentCore responses to make sure my parsing logic handles the actual response shape. I just don't want to wait 7 minutes to test that parsing.


Who Should NOT Use This

Being honest about where this doesn't fit:

  • You use Serverless Framework → use serverless-offline instead. It reads your serverless.yml directly.
  • You need SQS/EventBridge/Step Functions locally → use LocalStack. This only handles API Gateway → Lambda.
  • You don't use CDK → this reads cdk.out/. No CDK, no use.
  • Your Lambdas are Python/Go/Java → TypeScript/JavaScript only (esbuild is the bundler).
  • You need to mock downstream AWS services → this doesn't mock DynamoDB, S3, Bedrock, etc. Your handlers call real services.

When You Still Deploy

  • Before merging a PR (integration test with real services)
  • Validating IAM permissions actually work
  • Load testing
  • Anything async (SQS consumers, EventBridge rules)
  • First-time AgentCore endpoint validation (does the IAM role have bedrock-agentcore:InvokeAgentRuntime permission?)

I deploy maybe 2-3 times a day now instead of 15+. The local runner handles the "am I returning the right JSON shape" iterations. Deploy handles the "does my IAM policy actually let me call AgentCore" questions.


What's Coming

I'm actively using this daily on the AgentCore project and hitting edges:

  • [ ] HTTP API (v2) support (different event format)
  • [ ] REQUEST authorizer support
  • [ ] Lambda layers resolution
  • [ ] .env.local overrides for the [REF:xxx] placeholders that stand in for non-literal environment variables
  • [ ] Multi-stack support

If any of those are blocking you, open an issue or, better, a PR.
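For context on the first item: REST APIs and HTTP APIs (payload format 2.0) put the same information in different places, so a runner has to build a different event for each. The sketch below uses only a handful of fields from each format, and describeEvent is a hypothetical helper for illustration:

```typescript
// REST API (v1) proxy event: method and path are top-level fields.
interface RestEventV1 {
  httpMethod: string;
  path: string;
  queryStringParameters: Record<string, string> | null;
}

// HTTP API payload format 2.0: the method moves under requestContext.http,
// and the path and query string arrive as rawPath / rawQueryString.
interface HttpEventV2 {
  version: "2.0";
  rawPath: string;
  rawQueryString: string;
  requestContext: { http: { method: string } };
}

function isV2(event: RestEventV1 | HttpEventV2): event is HttpEventV2 {
  return (event as HttpEventV2).version === "2.0";
}

// Read the method and path from either event format.
function describeEvent(event: RestEventV1 | HttpEventV2): { method: string; path: string } {
  if (isV2(event)) {
    return { method: event.requestContext.http.method, path: event.rawPath };
  }
  return { method: event.httpMethod, path: event.path };
}
```

A handler written against one format will read undefined from the other, which is why v2 support means constructing a second event shape rather than reusing the v1 one.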


The Point

Here's what bugged me about both SAM and LocalStack for this use case: they add things. Another template. Another environment. Another service to configure and keep in sync.

CDK already knows everything about your API. It's sitting right there in cdk.out/. Reading it directly means nothing can drift, because there's nothing to drift.

Tools like serverless-offline and Architect's arc sandbox offer something similar for their own frameworks, but nothing that reads CDK synth output directly. That's the key difference: your CDK code remains the single source of truth, locally and in production.

I wish I'd built this before starting the AgentCore project. Would've saved me an entire week of deploy-wait-test cycles.


Try It Yourself

Here's how to try it yourself in under 5 minutes. The complete source code is on GitHub:

simplynadaf / cdk-local-lambda

I Built a Local Lambda Runner for CDK - No Docker, No SAM, Sub-Second Reload

⚡ cdk-local-lambda

Run your CDK Lambda functions locally. No Docker. No SAM. Sub-second hot reload.

License: MIT


┌─────────────────────────────────────────────────────┐
│                                                     │
│   cdk synth → extract → serve                       │
│                                                     │
│   Change code. Save. Hit endpoint. Done.            │
│   No deploy. No Docker. No waiting.                 │
│                                                     │
└─────────────────────────────────────────────────────┘



🤔 The Problem

Every CDK developer knows this loop:

Change 1 line → cdk deploy → wait 5-10 min → test → realize it's wrong → repeat

Existing solutions aren't great for CDK:

| Tool | Issue |
| --- | --- |
| SAM CLI | Requires separate template.yaml, Docker cold starts, no hot reload |
| LocalStack | Heavy, Lambda hot reload behind paid tier, awkward setup |

💡 The Insight

After cdk synth, everything you need is already in cdk.out/ — routes, handlers, env vars, authorizers. Why maintain a second config when the source of truth already exists?

Step 1: Clone the Repository

git clone https://github.com/simplynadaf/cdk-local-lambda.git
cd cdk-local-lambda

This gives you the full project source code, example handlers, and a mock cdk.out/ template that simulates real CDK synth output.

Step 2: Install Dependencies

npm install

This installs Express, esbuild, chokidar, and commander: everything the local runner needs. No global installs, no Docker, no AWS CLI required.

Step 3: Start the Local Server

npx tsx src/cli.ts dev --cdk-out examples/cdk.out --stack MyStack --port 3001

This uses tsx for development. In a published package you'd use npx cdk-local dev ... after building.

You should see the server boot with all routes listed:

Found 3 route(s)
  GET  /users      → examples/src/handlers/listUsers.ts
  GET  /users/{id} → examples/src/handlers/getUser.ts
  POST /users      → examples/src/handlers/createUser.ts

🚀 CDK Local Lambda running on http://localhost:3001

Step 4: Hit the Endpoints

Open a new terminal and test:

List all users:

curl -s http://localhost:3001/users | python3 -m json.tool

Response:

{
  "users": [
    {"id": "1", "name": "Alice", "email": "alice@example.com"},
    {"id": "2", "name": "Bob", "email": "bob@example.com"},
    {"id": "3", "name": "Charlie", "email": "charlie@example.com"}
  ],
  "count": 3
}

Get a specific user:

curl -s http://localhost:3001/users/1 | python3 -m json.tool

Response:

{
  "id": "1",
  "name": "Alice",
  "email": "alice@example.com"
}

Create a new user:

curl -s -X POST http://localhost:3001/users \
  -H "Content-Type: application/json" \
  -d '{"name":"Alice","email":"alice@example.com"}' | python3 -m json.tool

Response:

{
  "id": "1749637842910",
  "name": "Alice",
  "email": "alice@example.com",
  "createdAt": "2026-06-11T09:30:42.910Z"
}

Step 5: Test Hot Reload

This is the magic part. With the server still running:

  1. Open examples/src/handlers/getUser.ts in any editor
  2. Add a new field to the response. For example, add "source": "hot-reloaded!" to the return object
  3. Save the file

Watch the server terminal. You'll see:

  ♻ Changed: examples/src/handlers/getUser.ts - invalidated 1 handler

Now curl the same endpoint again:

curl -s http://localhost:3001/users/1 | python3 -m json.tool

The response now includes your new field. No restart. No rebuild. Sub-second.

Step 6: Try Breaking Things

Test error handling:

# User that doesn't exist - should return 404
curl -s http://localhost:3001/users/999 | python3 -m json.tool

Response:

{
  "error": "User not found"
}


# Missing required field - should return 400
curl -s -X POST http://localhost:3001/users \
  -H "Content-Type: application/json" \
  -d '{}' | python3 -m json.tool

Response:

{
  "error": "name is required"
}


That's it. Six steps, no AWS account needed, no Docker, no deploy. The example uses mock handlers, but in a real project you'd point --cdk-out at your actual cdk synth output and it works the same way, like I'm doing with my AgentCore project right now.


What I'm Building Next

The AgentCore integration is still in progress. Once it's stable, I'm planning a follow-up article showing the full architecture: how I'm hosting a CrewAI multi-agent system on AgentCore and exposing it through the API Gateway + Lambda layer that this tool helped me iterate on 100x faster.

If you're working with AgentCore or any AI agent framework that needs an HTTP API layer, this same pattern applies. The Lambda is just a bridge, and bridges need fast iteration.


If this is useful, star the repo. If something's broken, open an issue. If you want a feature, tell me in the comments. I'm building this for my own workflow anyway, so I might as well make it work for yours too.


📌 Wrapping Up

Thanks for reading! If this was helpful:

  • ❤️ Like if it added value
  • 💾 Save for later
  • 🔄 Share with your team

Follow me for more on: AWS architecture, FinOps, DevOps, and AI Infrastructure.

👉 Visit my website | Connect on LinkedIn | Email: simplynadaf@gmail.com

Happy Learning 🚀