Table of Contents
- Introduction
- Why DynamoDB for Serverless?
- Table Design Fundamentals
- Setting Up the Stack
- CRUD Operations with AWS SDK v3
- Global Secondary Indexes
- Query vs Scan
- Gotchas & Common Pitfalls
- Best Practices
- Cost Considerations
- Conclusion
- References
Introduction
Quick Links:
- 📂 Source Code: GitHub Repository
- 🚀 Live Demo: https://dmcechq7isaw7.cloudfront.net/
Remember that contact form API we built in Part 2? It worked great, but there was a sneaky problem – our data vanished every time Lambda recycled!
Picture this nightmare: A Fortune 500 company submits a partnership inquiry through your contact form. You celebrate. But when you check for the lead later... nothing. The Lambda function recycled, and that million-dollar opportunity evaporated into thin air.
Not ideal.
Today, we're fixing this by adding DynamoDB – a fully managed, serverless NoSQL database that scales from zero to millions of requests per second. Your leads will persist forever (or until you delete them).
In this third part of our AWS Serverless Web Mastery series, you'll learn:
- How to design DynamoDB tables for serverless applications
- CRUD operations using AWS SDK v3
- Global Secondary Indexes (GSIs) for flexible queries
- Production patterns for error handling and performance
Let's make your data bulletproof!
Why DynamoDB for Serverless?
Perfect Serverless Companion
| Feature | Benefit |
|---|---|
| Serverless | No servers to manage, automatic scaling |
| Pay-per-request | $0 when idle, perfect for variable workloads |
| Single-digit millisecond latency | Fast responses regardless of scale |
| Built-in security | Encryption at rest and in transit |
| Event-driven | DynamoDB Streams trigger Lambda functions |
The Competition
| Database | Serverless? | Cold Start Issue? | Scaling |
|---|---|---|---|
| DynamoDB | ✅ Yes | ❌ None | Instant |
| RDS | ❌ No | N/A | Minutes |
| Aurora Serverless v2 | ✅ Yes | Minutes | Seconds |
| MongoDB Atlas | ⚠️ Partial | Minutes | Minutes |
For serverless applications, DynamoDB is the clear winner.
Table Design Fundamentals
Think Access Patterns, Not Entities
Unlike relational databases where you normalize data, DynamoDB is designed around access patterns. Before creating your table, ask:
- How will I retrieve data?
- What queries do I need?
- What filters will I apply?
Our Access Patterns
For our Contact Form, we need:
| Access Pattern | Solution |
|---|---|
| Get lead by ID | Primary key lookup |
| List all leads | Scan (acceptable for admin) |
| Find leads by email | GSI on email |
| Filter by status | GSI on status |
Table Design
┌─────────────────────────────────────────────────────────────┐
│ ContactFormLeads Table │
├─────────────────────────────────────────────────────────────┤
│ │
│ Primary Key: leadId (String) │
│ │
│ Attributes: │
│ ├── name (String) │
│ ├── email (String) │
│ ├── company (String, optional) │
│ ├── subject (String) │
│ ├── message (String) │
│ ├── status (String: new/contacted/qualified/converted) │
│ ├── createdAt (String, ISO timestamp) │
│ └── updatedAt (String, ISO timestamp) │
│ │
│ GSI: email-index │
│ ├── Partition Key: email │
│ └── Sort Key: createdAt │
│ │
│ GSI: status-index │
│ ├── Partition Key: status │
│ └── Sort Key: createdAt │
│ │
└─────────────────────────────────────────────────────────────┘
Why This Design?
- leadId as partition key: Unique IDs ensure even distribution and fast lookups
- No sort key on main table: Simple key structure for CRUD operations
- GSIs for query patterns: Enable efficient queries without scanning
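The handlers later in this post reference Lead, CreateLeadRequest, and UpdateLeadRequest without showing them. Here is a minimal sketch of what those types might look like, inferred from the table design. The exact shapes in the tutorial repository may differ, and LEAD_STATUSES, LeadStatus, and isLeadStatus are names made up for this illustration.

```typescript
// Hypothetical type definitions inferred from the table design.
// The tutorial repository may define these differently.
const LEAD_STATUSES = ["new", "contacted", "qualified", "converted"] as const;
type LeadStatus = (typeof LEAD_STATUSES)[number];

interface Lead {
  leadId: string;
  name: string;
  email: string;
  company?: string; // optional attribute, omitted when blank
  subject: string;
  message: string;
  status: LeadStatus;
  createdAt: string; // ISO 8601 timestamp
  updatedAt: string; // ISO 8601 timestamp
}

// Fields the client supplies; the server fills in the rest.
type CreateLeadRequest = Pick<Lead, "name" | "email" | "subject" | "message"> & {
  company?: string;
};

// Every field is optional so callers can send partial updates.
type UpdateLeadRequest = Partial<Omit<Lead, "leadId" | "createdAt" | "updatedAt">>;

// Runtime guard for status values arriving from untrusted input.
function isLeadStatus(value: unknown): value is LeadStatus {
  return (
    typeof value === "string" &&
    (LEAD_STATUSES as readonly string[]).includes(value)
  );
}
```

Storing createdAt as an ISO 8601 string is what makes it usable as a GSI sort key: these strings sort lexicographically in the same order as the timestamps they represent.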
Setting Up the Stack
CDK Infrastructure
Create lib/dynamodb-stack.ts:
import * as path from "path";
import * as cdk from "aws-cdk-lib";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as lambdaNodejs from "aws-cdk-lib/aws-lambda-nodejs";
import * as logs from "aws-cdk-lib/aws-logs";
import * as apigateway from "aws-cdk-lib/aws-apigateway";
import { Construct } from "constructs";
export class DynamoDbStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
// ============================================
// DynamoDB Table
// ============================================
const leadsTable = new dynamodb.Table(this, "LeadsTable", {
tableName: "ContactFormLeads",
// Partition key only - simple design for CRUD
partitionKey: {
name: "leadId",
type: dynamodb.AttributeType.STRING,
},
// On-demand = auto-scaling, pay-per-request
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
// Point-in-time recovery for data protection
pointInTimeRecovery: true,
// Encryption at rest
encryption: dynamodb.TableEncryption.AWS_MANAGED,
// DESTROY for dev (RETAIN for production!)
removalPolicy: cdk.RemovalPolicy.DESTROY,
});
// GSI for querying by email
leadsTable.addGlobalSecondaryIndex({
indexName: "email-index",
partitionKey: {
name: "email",
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: "createdAt",
type: dynamodb.AttributeType.STRING,
},
projectionType: dynamodb.ProjectionType.ALL,
});
// GSI for filtering by status
leadsTable.addGlobalSecondaryIndex({
indexName: "status-index",
partitionKey: {
name: "status",
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: "createdAt",
type: dynamodb.AttributeType.STRING,
},
projectionType: dynamodb.ProjectionType.ALL,
});
// Lambda Function with DynamoDB access
const leadsHandler = new lambdaNodejs.NodejsFunction(this, "LeadsHandler", {
runtime: lambda.Runtime.NODEJS_22_X,
entry: path.join(__dirname, "../lambda/handlers/leads.ts"),
handler: "handler",
description: "Leads CRUD operations with DynamoDB",
timeout: cdk.Duration.seconds(30),
memorySize: 256,
logRetention: logs.RetentionDays.ONE_WEEK,
environment: {
TABLE_NAME: leadsTable.tableName,
EMAIL_INDEX: "email-index",
STATUS_INDEX: "status-index",
},
bundling: {
minify: true,
sourceMap: true,
},
});
// Grant read/write permissions
leadsTable.grantReadWriteData(leadsHandler);
// ... API Gateway setup (same as Part 2)
}
}
Key Points
- billingMode: PAY_PER_REQUEST: Scales automatically, no capacity planning
- grantReadWriteData(): CDK automatically creates the IAM policy
- Environment variables: Pass table name to Lambda
CRUD Operations with AWS SDK v3
Setting Up the Client
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
PutItemCommand,
GetItemCommand,
UpdateItemCommand,
DeleteItemCommand,
QueryCommand,
ScanCommand,
} from "@aws-sdk/client-dynamodb";
import { marshall, unmarshall } from "@aws-sdk/util-dynamodb";
const dynamoClient = new DynamoDBClient({});
const TABLE_NAME = process.env.TABLE_NAME!;
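The non-null assertion on process.env.TABLE_NAME satisfies the compiler but checks nothing at runtime, so a missing variable only surfaces later as a confusing DynamoDB error. One possible alternative is a small helper that fails fast. requireEnv is a hypothetical name, not part of the SDK or the tutorial code.

```typescript
// Hypothetical helper: fail fast with a clear message when configuration is missing.
// The env parameter defaults to process.env and exists mainly to make testing easy.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with an explicit environment object standing in for Lambda's configuration.
const exampleEnv = { TABLE_NAME: "ContactFormLeads" };
const tableName = requireEnv("TABLE_NAME", exampleEnv);
```

Calling it at module scope means a misconfigured function fails during initialization, where the error is easy to spot in the logs.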
Create (PutItem)
async function createLead(leadData: CreateLeadRequest): Promise<Lead> {
const lead: Lead = {
leadId: `lead_${Date.now()}_${Math.random().toString(36).slice(2)}`,
name: leadData.name.trim(),
email: leadData.email.toLowerCase().trim(),
company: leadData.company?.trim(),
subject: leadData.subject,
message: leadData.message.trim(),
status: "new",
createdAt: new Date().toISOString(),
updatedAt: new Date().toISOString(),
};
const command = new PutItemCommand({
TableName: TABLE_NAME,
Item: marshall(lead, { removeUndefinedValues: true }),
// Prevent accidental overwrites
ConditionExpression: "attribute_not_exists(leadId)",
});
await dynamoClient.send(command);
return lead;
}
Key Points:
- marshall() converts JS objects to DynamoDB format
- removeUndefinedValues: true handles optional fields
- ConditionExpression prevents duplicate IDs (though unlikely)
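To make "DynamoDB format" concrete: each value is wrapped in an object whose key names its type, such as S for strings and N for numbers. The sketch below is a simplified stand-in written only to illustrate that shape. It is not the library's implementation, it ignores sets and binary data, and toAttributeValue and toItem are hypothetical names.

```typescript
// A simplified stand-in for marshall(), written only to illustrate the
// attribute-value format. Use @aws-sdk/util-dynamodb in real code.
type AttributeValueSketch =
  | { S: string }
  | { N: string }
  | { BOOL: boolean }
  | { NULL: true }
  | { L: AttributeValueSketch[] }
  | { M: Record<string, AttributeValueSketch> };

function toAttributeValue(value: unknown): AttributeValueSketch {
  if (value === null) return { NULL: true };
  if (typeof value === "string") return { S: value };
  if (typeof value === "number") return { N: String(value) }; // numbers travel as strings
  if (typeof value === "boolean") return { BOOL: value };
  if (Array.isArray(value)) return { L: value.map((v) => toAttributeValue(v)) };
  if (typeof value === "object") {
    return { M: toItem(value as Record<string, unknown>) };
  }
  throw new Error(`Unsupported value type: ${typeof value}`);
}

// Skips undefined fields, similar in effect to removeUndefinedValues: true.
function toItem(obj: Record<string, unknown>): Record<string, AttributeValueSketch> {
  const item: Record<string, AttributeValueSketch> = {};
  for (const [key, value] of Object.entries(obj)) {
    if (value === undefined) continue;
    item[key] = toAttributeValue(value);
  }
  return item;
}

const sketch = toItem({ leadId: "lead_1", company: undefined, score: 7 });
// sketch is { leadId: { S: "lead_1" }, score: { N: "7" } }
```

The undefined company field disappears from the result, which is why optional fields need removeUndefinedValues: true when using the real marshall().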
Read (GetItem)
async function getLeadById(leadId: string): Promise<Lead | null> {
const command = new GetItemCommand({
TableName: TABLE_NAME,
Key: marshall({ leadId }),
});
const response = await dynamoClient.send(command);
if (!response.Item) {
return null;
}
return unmarshall(response.Item) as Lead;
}
Key Points:
- GetItem is extremely fast (single-digit milliseconds)
- unmarshall() converts DynamoDB format back to JS
Update (UpdateItem)
async function updateLead(
leadId: string,
updates: UpdateLeadRequest,
): Promise<Lead | null> {
// Build update expression dynamically
const updateExpressions: string[] = [];
const expressionAttributeNames: Record<string, string> = {};
const expressionAttributeValues: Record<string, any> = {};
// Always update timestamp
updateExpressions.push("#updatedAt = :updatedAt");
expressionAttributeNames["#updatedAt"] = "updatedAt";
expressionAttributeValues[":updatedAt"] = new Date().toISOString();
// Add fields that have values
if (updates.status !== undefined) {
updateExpressions.push("#status = :status");
expressionAttributeNames["#status"] = "status";
expressionAttributeValues[":status"] = updates.status;
}
// ... add other fields similarly
const command = new UpdateItemCommand({
TableName: TABLE_NAME,
Key: marshall({ leadId }),
UpdateExpression: `SET ${updateExpressions.join(", ")}`,
ExpressionAttributeNames: expressionAttributeNames,
ExpressionAttributeValues: marshall(expressionAttributeValues),
ConditionExpression: "attribute_exists(leadId)",
ReturnValues: "ALL_NEW",
});
try {
const response = await dynamoClient.send(command);
return unmarshall(response.Attributes!) as Lead;
} catch (error: any) {
if (error.name === "ConditionalCheckFailedException") {
return null; // Item doesn't exist
}
throw error;
}
}
Key Points:
- Use expression attribute names (#status) for reserved words
- Dynamic expressions allow partial updates
- ReturnValues: 'ALL_NEW' returns the updated item
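The "add other fields similarly" step can be generalized into a loop. The following is one possible sketch, not the tutorial's own code: buildUpdateParts and UpdateParts are hypothetical names, and the returned values are plain JS that would still need marshall() before being passed to UpdateItemCommand.

```typescript
// Hypothetical helper: turns a partial update object into the three pieces
// UpdateItemCommand expects. Values are returned unmarshalled.
interface UpdateParts {
  UpdateExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, unknown>;
}

function buildUpdateParts(
  updates: Record<string, unknown>,
  now: string = new Date().toISOString(),
): UpdateParts {
  const names: Record<string, string> = {};
  const values: Record<string, unknown> = {};
  const assignments: string[] = [];

  // Always refresh updatedAt, then add every field that was actually provided.
  const fields: Record<string, unknown> = { ...updates, updatedAt: now };
  for (const [field, value] of Object.entries(fields)) {
    if (value === undefined) continue;
    names[`#${field}`] = field; // placeholder sidesteps reserved words
    values[`:${field}`] = value;
    assignments.push(`#${field} = :${field}`);
  }

  return {
    UpdateExpression: `SET ${assignments.join(", ")}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const parts = buildUpdateParts(
  { status: "contacted", company: undefined },
  "2025-01-01T00:00:00.000Z",
);
// parts.UpdateExpression is "SET #status = :status, #updatedAt = :updatedAt"
```

This sketch assumes attribute names contain only letters, digits, and underscores, and it trusts the caller's field list. In a real handler you would likely restrict updates to an allow-list so a request cannot overwrite leadId or createdAt.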
Delete (DeleteItem)
async function deleteLead(leadId: string): Promise<boolean> {
const command = new DeleteItemCommand({
TableName: TABLE_NAME,
Key: marshall({ leadId }),
ConditionExpression: "attribute_exists(leadId)",
});
try {
await dynamoClient.send(command);
return true;
} catch (error: any) {
if (error.name === "ConditionalCheckFailedException") {
return false; // Item doesn't exist
}
throw error;
}
}
Global Secondary Indexes
Query by Email (GSI)
async function getLeadsByEmail(email: string): Promise<Lead[]> {
const command = new QueryCommand({
TableName: TABLE_NAME,
IndexName: "email-index",
KeyConditionExpression: "email = :email",
ExpressionAttributeValues: marshall({
// Normalize the same way createLead does so lookups match stored values
":email": email.toLowerCase().trim(),
}),
// Newest first
ScanIndexForward: false,
});
const response = await dynamoClient.send(command);
return (response.Items || []).map((item) => unmarshall(item) as Lead);
}
Filter by Status (GSI)
async function getLeadsByStatus(status: string): Promise<Lead[]> {
const command = new QueryCommand({
TableName: TABLE_NAME,
IndexName: "status-index",
KeyConditionExpression: "#status = :status",
ExpressionAttributeNames: {
"#status": "status", // 'status' is a reserved word
},
ExpressionAttributeValues: marshall({
":status": status,
}),
ScanIndexForward: false,
});
const response = await dynamoClient.send(command);
return (response.Items || []).map((item) => unmarshall(item) as Lead);
}
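One thing the two queries above leave out is pagination. A single Query call returns at most 1 MB of data, and when more results remain the response includes a LastEvaluatedKey that you pass back as ExclusiveStartKey on the next call. The loop can be sketched generically as follows. collectAllPages and Page are hypothetical names, and fetchPage is a callback you would implement around dynamoClient.send(new QueryCommand(...)).

```typescript
// One page of results, shaped like the relevant parts of a Query response.
interface Page<T, K> {
  items: T[];
  lastEvaluatedKey?: K; // absent on the final page
}

// Hypothetical helper: keeps requesting pages until no continuation key comes back.
async function collectAllPages<T, K>(
  fetchPage: (startKey: K | undefined) => Promise<Page<T, K>>,
  maxItems = 1000, // safety cap so a huge partition cannot exhaust Lambda memory
): Promise<T[]> {
  const results: T[] = [];
  let startKey: K | undefined = undefined;

  do {
    const page: Page<T, K> = await fetchPage(startKey);
    results.push(...page.items);
    startKey = page.lastEvaluatedKey;
  } while (startKey !== undefined && results.length < maxItems);

  return results.slice(0, maxItems);
}

// In-memory stand-in for three Query responses; the key is just the next page index.
const fakePages: Page<string, number>[] = [
  { items: ["lead_a", "lead_b"], lastEvaluatedKey: 1 },
  { items: ["lead_c"], lastEvaluatedKey: 2 },
  { items: ["lead_d"] },
];
const fetchFakePage = async (startKey: number | undefined) => fakePages[startKey ?? 0];
```

For a contact form with a handful of leads per email address, a single page is usually enough. The status-index is where this matters, since one status value can accumulate many items.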
Query vs Scan
The Critical Difference
| Operation | Performance | Use Case |
|---|---|---|
| Query | Proportional to the number of items matching the key condition | Specific key lookups |
| Scan | Proportional to the size of the ENTIRE table | Full table access |
Example: Finding "new" leads
❌ Slow: Scan with Filter
// This reads EVERY item in the table!
const command = new ScanCommand({
TableName: TABLE_NAME,
FilterExpression: "#status = :status",
ExpressionAttributeNames: { "#status": "status" },
ExpressionAttributeValues: marshall({ ":status": "new" }),
});
✅ Fast: Query with GSI
// This reads only matching items
const command = new QueryCommand({
TableName: TABLE_NAME,
IndexName: "status-index",
KeyConditionExpression: "#status = :status",
ExpressionAttributeNames: { "#status": "status" },
ExpressionAttributeValues: marshall({ ":status": "new" }),
});
The difference at scale:
- 1 million leads, 100 are "new"
- Scan: Reads 1,000,000 items → $$$, slow
- Query: Reads 100 items → Pennies, fast
Gotchas & Common Pitfalls
1. Reserved Words
Problem: status, name, data are reserved words.
// ❌ This fails
UpdateExpression: 'SET status = :status'
// ✅ Use expression attribute names
UpdateExpression: 'SET #status = :status',
ExpressionAttributeNames: { '#status': 'status' }
2. Empty Strings
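Problem: DynamoDB rejects empty strings for key attributes, and that includes GSI keys such as email and status. Non-key attributes like company can hold an empty string, but storing blanks makes "missing" ambiguous, so it is cleaner to omit them.
// ❌ This fails: email is the partition key of email-index
Item: {
email: "",
}
// ✅ Map blank optional fields to undefined...
company: leadData.company?.trim() || undefined,
// ...and let marshall drop them
marshall(lead, { removeUndefinedValues: true });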
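If several optional fields need the same treatment, the trim-and-check step can be pulled into a helper. This is a small sketch; blankToUndefined is a hypothetical name rather than anything from the SDK.

```typescript
// Hypothetical helper: trims a value and maps blank results to undefined,
// so marshall(..., { removeUndefinedValues: true }) drops the attribute entirely.
function blankToUndefined(value: string | null | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

const normalizedCompany = blankToUndefined("  Acme Corp ");
const normalizedBlank = blankToUndefined("   ");
```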
3. Conditional Check Failures
Problem: ConditionExpression throws on failure.
try {
await dynamoClient.send(command);
} catch (error: any) {
if (error.name === "ConditionalCheckFailedException") {
// Handle gracefully (item doesn't exist or condition failed)
return null;
}
throw error;
}
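The catch block above types the error as any. If you prefer to keep it as unknown, a small guard can do the narrowing. The sketch below uses a hypothetical isConditionalCheckFailed helper and simulated errors. The SDK also exports a ConditionalCheckFailedException class from @aws-sdk/client-dynamodb, so an instanceof check may be another option.

```typescript
// Hypothetical helper: inspects an unknown error without resorting to `any`.
function isConditionalCheckFailed(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "name" in error &&
    (error as { name: unknown }).name === "ConditionalCheckFailedException"
  );
}

// Simulated errors standing in for what dynamoClient.send() might throw.
const conditionFailure = Object.assign(new Error("The conditional request failed"), {
  name: "ConditionalCheckFailedException",
});
const throttled = Object.assign(new Error("Rate exceeded"), {
  name: "ThrottlingException",
});

function describeFailure(error: unknown): string {
  return isConditionalCheckFailed(error) ? "not-found" : "unexpected";
}
```

Keeping the check in one place means updateLead and deleteLead can share it instead of repeating the string comparison.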
4. Large Items
Problem: DynamoDB has a 400KB item size limit.
Solution: Store large content (like file attachments) in S3, and store only the S3 URL in DynamoDB.
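To decide when content is too big, it helps to have a rough size check before writing. Item size counts attribute names as well as values, measured in UTF-8 bytes for strings. The sketch below only handles string attributes and is an approximation. estimateStringItemBytes and fitsInOneItem are hypothetical names.

```typescript
const MAX_ITEM_BYTES = 400 * 1024; // DynamoDB's 400 KB item limit

// Rough estimate for items made of string attributes only:
// UTF-8 bytes of every attribute name plus UTF-8 bytes of every value.
// Numbers, sets, lists, and maps are sized differently and are not handled here.
function estimateStringItemBytes(item: Record<string, string | undefined>): number {
  const encoder = new TextEncoder();
  let total = 0;
  for (const [name, value] of Object.entries(item)) {
    if (value === undefined) continue;
    total += encoder.encode(name).length + encoder.encode(value).length;
  }
  return total;
}

function fitsInOneItem(item: Record<string, string | undefined>): boolean {
  return estimateStringItemBytes(item) <= MAX_ITEM_BYTES;
}

const smallItemBytes = estimateStringItemBytes({ name: "Ada", note: "é" });
const hugeMessageFits = fitsInOneItem({ message: "x".repeat(500 * 1024) });
```

In practice you would probably set the threshold well below the hard limit, since a contact-form message approaching hundreds of kilobytes is a sign something is wrong anyway.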
Best Practices
1. Use On-Demand for Variable Workloads
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
No capacity planning, automatic scaling, pay only for what you use.
2. Enable Point-in-Time Recovery
pointInTimeRecovery: true,
Protects against accidental deletes. Can restore to any point in the last 35 days.
3. Use Batch Operations for Bulk Actions
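For multiple items, use BatchGetItem and BatchWriteItem. This example uses the document client from @aws-sdk/lib-dynamodb, which accepts plain JS objects, so no marshall() call is needed:
import { DynamoDBDocumentClient, BatchWriteCommand } from "@aws-sdk/lib-dynamodb";
// lib-dynamodb commands are meant to be sent through a document client
const docClient = DynamoDBDocumentClient.from(dynamoClient);
// Write up to 25 items per request
const command = new BatchWriteCommand({
RequestItems: {
[TABLE_NAME]: leads.map((lead) => ({
PutRequest: { Item: lead },
})),
},
});
const response = await docClient.send(command);
// Anything DynamoDB could not write comes back in response.UnprocessedItems; retry those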
Check out my DynamoDB Batch Operations blog for detailed patterns.
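Because a single batch write accepts at most 25 requests, a longer list has to be split first. Here is a minimal sketch of that splitting step. chunk and MAX_BATCH_SIZE are hypothetical names, and the leads are placeholders.

```typescript
const MAX_BATCH_SIZE = 25; // BatchWriteItem accepts at most 25 put/delete requests

// Splits an array into consecutive groups of at most `size` elements.
function chunk<T>(items: T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// 60 placeholder leads become batches of 25, 25, and 10.
const placeholderLeads = Array.from({ length: 60 }, (_, i) => ({ leadId: `lead_${i}` }));
const batches = chunk(placeholderLeads);
```

Each group can then be sent as its own batch write, with any unprocessed items retried.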
4. Design for Even Distribution
Avoid "hot" partitions by choosing well-distributed partition keys:
// ✅ Good - even distribution
partitionKey: { name: 'leadId', type: STRING }
// ❌ Bad as a table key - hot partitions (if most leads are 'new')
partitionKey: { name: 'status', type: STRING }
Cost Considerations
Pricing Model
| Operation | On-Demand Cost |
|---|---|
| Write (1KB) | $0.625 per million |
| Read (4KB) | $0.125 per million |
| Storage | $0.25 per GB/month |
| GSI Writes | Same as table |
| GSI Reads | Same as table |
Real-World Example
For a contact form with:
- 1,000 leads/month (writes)
- 10,000 reads/month (admin dashboard)
- 1GB storage
Monthly Cost:
- Writes: 0.001M × $0.625 = $0.000625
- Reads: 0.01M × $0.125 = $0.00125
- Storage: 1GB × $0.25 = $0.25
- Total: ~$0.25/month (essentially free!)
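The same arithmetic can be wrapped in a small function if you want to plug in your own traffic numbers. This sketch copies the rates from the pricing table in this section. It ignores the extra writes that the two GSIs generate and any free-tier allowance, so treat the result as a rough estimate. estimateMonthlyCost and MonthlyUsage are hypothetical names.

```typescript
// Rates copied from the pricing table in this section (USD); adjust for your region.
const WRITE_COST_PER_MILLION = 0.625;
const READ_COST_PER_MILLION = 0.125;
const STORAGE_COST_PER_GB = 0.25;

interface MonthlyUsage {
  writes: number; // write requests of up to 1 KB each
  reads: number; // read requests of up to 4 KB each
  storageGb: number;
}

function estimateMonthlyCost(usage: MonthlyUsage): number {
  const writeCost = (usage.writes / 1_000_000) * WRITE_COST_PER_MILLION;
  const readCost = (usage.reads / 1_000_000) * READ_COST_PER_MILLION;
  const storageCost = usage.storageGb * STORAGE_COST_PER_GB;
  return writeCost + readCost + storageCost;
}

// The contact form example: 1,000 writes, 10,000 reads, 1 GB stored.
const monthly = estimateMonthlyCost({ writes: 1_000, reads: 10_000, storageGb: 1 });
console.log(`Estimated monthly cost: $${monthly.toFixed(2)}`);
```

Storage dominates at this scale, which is why the request costs barely register in the total.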
Conclusion
Your contact form now has a rock-solid persistence layer! 🎉
In this part, you learned:
✅ DynamoDB table design for serverless
✅ Global Secondary Indexes for flexible queries
✅ CRUD operations with AWS SDK v3
✅ Query vs Scan performance implications
✅ Production-ready error handling
Your leads are now safely stored and can be queried efficiently. No more data disappearing when Lambda recycles!
In Part 4, we'll bring everything together – frontend, API, and database – into a complete, production-ready application with:
- Full frontend-backend integration
- Environment configuration
- Monitoring and alerting
- Production deployment checklist
GitHub Repository: aws-serverless-website-tutorial
See you next time. Happy coding! 🚀