

Persistence Made Easy: AWS DynamoDB Deep Dive for Serverless Applications


Introduction

Remember that contact form API we built in Part 2? It worked great, but there was a sneaky problem – our data vanished every time Lambda recycled!

Picture this nightmare: A Fortune 500 company submits a partnership inquiry through your contact form. You celebrate. But when you check for the lead later... nothing. The Lambda function recycled, and that million-dollar opportunity evaporated into thin air.

Not ideal.

Today, we're fixing this by adding DynamoDB – a fully managed, serverless NoSQL database that scales from zero to millions of requests per second. Your leads will persist forever (or until you delete them).

In this third part of our AWS Serverless Web Mastery series, you'll learn:

  • How to design DynamoDB tables for serverless applications
  • CRUD operations using AWS SDK v3
  • Global Secondary Indexes (GSIs) for flexible queries
  • Production patterns for error handling and performance

Let's make your data bulletproof!


Why DynamoDB for Serverless?

Perfect Serverless Companion

Feature | Benefit
Serverless | No servers to manage, automatic scaling
Pay-per-request | $0 when idle, perfect for variable workloads
Single-digit millisecond latency | Fast responses regardless of scale
Built-in security | Encryption at rest and in transit
Event-driven | DynamoDB Streams trigger Lambda functions

The Competition

Database | Serverless? | Cold Start Issue? | Scaling
DynamoDB | ✅ Yes | ❌ None | Instant
RDS | ❌ No | N/A | Minutes
Aurora Serverless v2 | ✅ Yes | Minutes | Seconds
MongoDB Atlas | ⚠️ Partial | Minutes | Minutes

For serverless applications, DynamoDB is the clear winner.


Table Design Fundamentals

Think Access Patterns, Not Entities

Unlike relational databases where you normalize data, DynamoDB is designed around access patterns. Before creating your table, ask:

  1. How will I retrieve data?
  2. What queries do I need?
  3. What filters will I apply?

Our Access Patterns

For our Contact Form, we need:

Access Pattern | Solution
Get lead by ID | Primary key lookup
List all leads | Scan (acceptable for admin)
Find leads by email | GSI on email
Filter by status | GSI on status

Table Design

┌─────────────────────────────────────────────────────────────┐
│                    ContactFormLeads Table                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Primary Key: leadId (String)                               │
│                                                             │
│  Attributes:                                                │
│  ├── name (String)                                          │
│  ├── email (String)                                         │
│  ├── company (String, optional)                             │
│  ├── subject (String)                                       │
│  ├── message (String)                                       │
│  ├── status (String: new/contacted/qualified/converted)     │
│  ├── createdAt (String, ISO timestamp)                      │
│  └── updatedAt (String, ISO timestamp)                      │
│                                                             │
│  GSI: email-index                                           │
│  ├── Partition Key: email                                   │
│  └── Sort Key: createdAt                                    │
│                                                             │
│  GSI: status-index                                          │
│  ├── Partition Key: status                                  │
│  └── Sort Key: createdAt                                    │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Why This Design?

  1. leadId as partition key: Unique IDs ensure even distribution and fast lookups
  2. No sort key on main table: Simple key structure for CRUD operations
  3. GSIs for query patterns: Enable efficient queries without scanning
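
The handler code later in this post refers to Lead, CreateLeadRequest and UpdateLeadRequest without showing them. Here is a minimal sketch of what those types could look like, derived from the table diagram above. The exact fields of the two request types are an assumption on my part, and isLeadStatus is a hypothetical helper for validating untrusted input.

```typescript
// Allowed values for the `status` attribute, taken from the table diagram.
const LEAD_STATUSES = ["new", "contacted", "qualified", "converted"] as const;
type LeadStatus = (typeof LEAD_STATUSES)[number];

// Shape of one item in the ContactFormLeads table.
interface Lead {
  leadId: string; // partition key
  name: string;
  email: string; // partition key of email-index
  company?: string; // optional, omitted when not provided
  subject: string;
  message: string;
  status: LeadStatus; // partition key of status-index
  createdAt: string; // ISO timestamp, sort key of both GSIs
  updatedAt: string; // ISO timestamp
}

// Assumed request shapes (hypothetical; adjust to your API contract).
type CreateLeadRequest = Pick<Lead, "name" | "email" | "subject" | "message"> & {
  company?: string;
};
type UpdateLeadRequest = Partial<
  Pick<Lead, "name" | "company" | "subject" | "message" | "status">
>;

// Runtime guard for untrusted input such as a query-string parameter.
function isLeadStatus(value: unknown): value is LeadStatus {
  return typeof value === "string" && LEAD_STATUSES.some((s) => s === value);
}
```

Keeping the status values in one constant means the type and the runtime check cannot drift apart.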

Setting Up the Stack

CDK Infrastructure

Create lib/dynamodb-stack.ts:

import * as path from "path";
import * as cdk from "aws-cdk-lib";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as lambdaNodejs from "aws-cdk-lib/aws-lambda-nodejs";
import * as logs from "aws-cdk-lib/aws-logs";
import * as apigateway from "aws-cdk-lib/aws-apigateway";
import { Construct } from "constructs";

export class DynamoDbStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // ============================================
    // DynamoDB Table
    // ============================================
    const leadsTable = new dynamodb.Table(this, "LeadsTable", {
      tableName: "ContactFormLeads",

      // Partition key only - simple design for CRUD
      partitionKey: {
        name: "leadId",
        type: dynamodb.AttributeType.STRING,
      },

      // On-demand = auto-scaling, pay-per-request
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,

      // Point-in-time recovery for data protection
      pointInTimeRecovery: true,

      // Encryption at rest
      encryption: dynamodb.TableEncryption.AWS_MANAGED,

      // DESTROY for dev (RETAIN for production!)
      removalPolicy: cdk.RemovalPolicy.DESTROY,
    });

    // GSI for querying by email
    leadsTable.addGlobalSecondaryIndex({
      indexName: "email-index",
      partitionKey: {
        name: "email",
        type: dynamodb.AttributeType.STRING,
      },
      sortKey: {
        name: "createdAt",
        type: dynamodb.AttributeType.STRING,
      },
      projectionType: dynamodb.ProjectionType.ALL,
    });

    // GSI for filtering by status
    leadsTable.addGlobalSecondaryIndex({
      indexName: "status-index",
      partitionKey: {
        name: "status",
        type: dynamodb.AttributeType.STRING,
      },
      sortKey: {
        name: "createdAt",
        type: dynamodb.AttributeType.STRING,
      },
      projectionType: dynamodb.ProjectionType.ALL,
    });

    // Lambda Function with DynamoDB access
    const leadsHandler = new lambdaNodejs.NodejsFunction(this, "LeadsHandler", {
      runtime: lambda.Runtime.NODEJS_22_X,
      entry: path.join(__dirname, "../lambda/handlers/leads.ts"),
      handler: "handler",
      description: "Leads CRUD operations with DynamoDB",
      timeout: cdk.Duration.seconds(30),
      memorySize: 256,
      logRetention: logs.RetentionDays.ONE_WEEK,
      environment: {
        TABLE_NAME: leadsTable.tableName,
        EMAIL_INDEX: "email-index",
        STATUS_INDEX: "status-index",
      },
      bundling: {
        minify: true,
        sourceMap: true,
      },
    });

    // Grant read/write permissions
    leadsTable.grantReadWriteData(leadsHandler);

    // ... API Gateway setup (same as Part 2)
  }
}

Key Points

  1. billingMode: PAY_PER_REQUEST: Scales automatically, no capacity planning
  2. grantReadWriteData(): CDK automatically creates the IAM policy
  3. Environment variables: Pass table name to Lambda
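
On the Lambda side, it can help to validate those environment variables once instead of sprinkling non-null assertions around. The sketch below is one way to do it. loadConfig and LeadsConfig are hypothetical names, and the three variable names are the ones set in the CDK environment block above.

```typescript
interface LeadsConfig {
  tableName: string;
  emailIndex: string;
  statusIndex: string;
}

// Reads the variables set in the CDK `environment` block and fails fast
// with a clear message when one is missing or empty.
function loadConfig(env: Record<string, string | undefined>): LeadsConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) {
      throw new Error(`Missing required environment variable: ${key}`);
    }
    return value;
  };
  return {
    tableName: required("TABLE_NAME"),
    emailIndex: required("EMAIL_INDEX"),
    statusIndex: required("STATUS_INDEX"),
  };
}
```

Calling loadConfig(process.env) at module scope would surface a misconfigured deployment on the first invocation rather than deep inside a request.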

CRUD Operations with AWS SDK v3

Setting Up the Client

import {
  DynamoDBClient,
  PutItemCommand,
  GetItemCommand,
  UpdateItemCommand,
  DeleteItemCommand,
  QueryCommand,
  ScanCommand,
} from "@aws-sdk/client-dynamodb";
import { marshall, unmarshall } from "@aws-sdk/util-dynamodb";

const dynamoClient = new DynamoDBClient({});
const TABLE_NAME = process.env.TABLE_NAME!;

Create (PutItem)

async function createLead(leadData: CreateLeadRequest): Promise<Lead> {
  const lead: Lead = {
    leadId: `lead_${Date.now()}_${Math.random().toString(36).slice(2)}`,
    name: leadData.name.trim(),
    email: leadData.email.toLowerCase().trim(),
    company: leadData.company?.trim(),
    subject: leadData.subject,
    message: leadData.message.trim(),
    status: "new",
    createdAt: new Date().toISOString(),
    updatedAt: new Date().toISOString(),
  };

  const command = new PutItemCommand({
    TableName: TABLE_NAME,
    Item: marshall(lead, { removeUndefinedValues: true }),
    // Prevent accidental overwrites
    ConditionExpression: "attribute_not_exists(leadId)",
  });

  await dynamoClient.send(command);
  return lead;
}

Key Points:

  • marshall() converts JS objects to DynamoDB format
  • removeUndefinedValues: true handles optional fields
  • ConditionExpression prevents duplicate IDs (though unlikely)
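
If "DynamoDB format" feels abstract, the toy converter below shows the shape that marshall() produces for simple values. To be clear, toyMarshall is a hypothetical illustration and not the SDK function: it only handles strings, numbers and booleans on a flat object, whereas the real marshall() covers lists, maps, sets and more.

```typescript
// DynamoDB's wire format tags every value with its type.
type AttributeValueSketch =
  | { S: string }
  | { N: string } // numbers travel as strings
  | { BOOL: boolean };

type Scalar = string | number | boolean | undefined;

// Toy converter for flat objects with scalar values only. It is NOT the SDK's
// marshall(); it only mimics the output shape for these three types.
function toyMarshall(
  obj: Record<string, Scalar>,
): Record<string, AttributeValueSketch> {
  const out: Record<string, AttributeValueSketch> = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (value === undefined) continue; // same effect as removeUndefinedValues: true
    if (typeof value === "string") out[key] = { S: value };
    else if (typeof value === "number") out[key] = { N: String(value) };
    else out[key] = { BOOL: value };
  }
  return out;
}

console.log(
  JSON.stringify(toyMarshall({ leadId: "lead_1", status: "new", company: undefined })),
);
// prints {"leadId":{"S":"lead_1"},"status":{"S":"new"}}
```

Notice that the undefined company attribute never reaches the output, which is how optional fields stay out of the stored item.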

Read (GetItem)

async function getLeadById(leadId: string): Promise<Lead | null> {
  const command = new GetItemCommand({
    TableName: TABLE_NAME,
    Key: marshall({ leadId }),
  });

  const response = await dynamoClient.send(command);

  if (!response.Item) {
    return null;
  }

  return unmarshall(response.Item) as Lead;
}

Key Points:

  • GetItem is extremely fast (single-digit milliseconds)
  • unmarshall() converts DynamoDB format back to JS

Update (UpdateItem)

async function updateLead(
  leadId: string,
  updates: UpdateLeadRequest,
): Promise<Lead | null> {
  // Build update expression dynamically
  const updateExpressions: string[] = [];
  const expressionAttributeNames: Record<string, string> = {};
  const expressionAttributeValues: Record<string, any> = {};

  // Always update timestamp
  updateExpressions.push("#updatedAt = :updatedAt");
  expressionAttributeNames["#updatedAt"] = "updatedAt";
  expressionAttributeValues[":updatedAt"] = new Date().toISOString();

  // Add fields that have values
  if (updates.status !== undefined) {
    updateExpressions.push("#status = :status");
    expressionAttributeNames["#status"] = "status";
    expressionAttributeValues[":status"] = updates.status;
  }

  // ... add other fields similarly

  const command = new UpdateItemCommand({
    TableName: TABLE_NAME,
    Key: marshall({ leadId }),
    UpdateExpression: `SET ${updateExpressions.join(", ")}`,
    ExpressionAttributeNames: expressionAttributeNames,
    ExpressionAttributeValues: marshall(expressionAttributeValues),
    ConditionExpression: "attribute_exists(leadId)",
    ReturnValues: "ALL_NEW",
  });

  try {
    const response = await dynamoClient.send(command);
    return unmarshall(response.Attributes!) as Lead;
  } catch (error: any) {
    if (error.name === "ConditionalCheckFailedException") {
      return null; // Item doesn't exist
    }
    throw error;
  }
}

Key Points:

  • Use expression attribute names (#status) for reserved words
  • Dynamic expressions allow partial updates
  • ReturnValues: 'ALL_NEW' returns the updated item
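
The "add other fields similarly" step gets repetitive quickly, so here is a sketch of a small helper that builds the expression for any set of string fields. buildUpdate and UpdateParts are hypothetical names. The sketch assumes the caller has already restricted the field names to attributes that are safe to change (never leadId), and it returns plain values that you would still pass through marshall() before sending.

```typescript
interface UpdateParts {
  UpdateExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>; // plain values; marshall() before sending
}

// Builds a SET expression from whichever fields are present.
// Every attribute gets a #name placeholder, so reserved words never collide.
function buildUpdate(
  updates: Record<string, string | undefined>,
  now: string,
): UpdateParts {
  const names: Record<string, string> = { "#updatedAt": "updatedAt" };
  const values: Record<string, string> = { ":updatedAt": now };
  const clauses: string[] = ["#updatedAt = :updatedAt"];

  for (const field of Object.keys(updates)) {
    const value = updates[field];
    // Skip fields the caller did not send; updatedAt is always set from `now`.
    if (value === undefined || field === "updatedAt") continue;
    names[`#${field}`] = field;
    values[`:${field}`] = value;
    clauses.push(`#${field} = :${field}`);
  }

  return {
    UpdateExpression: `SET ${clauses.join(", ")}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}
```

Because this is a pure function, you can unit test the generated expression without touching DynamoDB at all.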

Delete (DeleteItem)

async function deleteLead(leadId: string): Promise<boolean> {
  const command = new DeleteItemCommand({
    TableName: TABLE_NAME,
    Key: marshall({ leadId }),
    ConditionExpression: "attribute_exists(leadId)",
  });

  try {
    await dynamoClient.send(command);
    return true;
  } catch (error: any) {
    if (error.name === "ConditionalCheckFailedException") {
      return false; // Item doesn't exist
    }
    throw error;
  }
}

Global Secondary Indexes

Query by Email (GSI)

async function getLeadsByEmail(email: string): Promise<Lead[]> {
  const command = new QueryCommand({
    TableName: TABLE_NAME,
    IndexName: "email-index",
    KeyConditionExpression: "email = :email",
    ExpressionAttributeValues: marshall({
      ":email": email.toLowerCase(),
    }),
    // Newest first
    ScanIndexForward: false,
  });

  const response = await dynamoClient.send(command);
  return (response.Items || []).map((item) => unmarshall(item) as Lead);
}

Filter by Status (GSI)

async function getLeadsByStatus(status: string): Promise<Lead[]> {
  const command = new QueryCommand({
    TableName: TABLE_NAME,
    IndexName: "status-index",
    KeyConditionExpression: "#status = :status",
    ExpressionAttributeNames: {
      "#status": "status", // 'status' is a reserved word
    },
    ExpressionAttributeValues: marshall({
      ":status": status,
    }),
    ScanIndexForward: false,
  });

  const response = await dynamoClient.send(command);
  return (response.Items || []).map((item) => unmarshall(item) as Lead);
}
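
One thing the two query functions above gloss over: a single Query call returns at most 1 MB of data. When there is more, the response includes a LastEvaluatedKey that you pass back as ExclusiveStartKey to fetch the next page. The sketch below shows that loop in a client-agnostic way. collectAllPages, Page and the maxItems cap are hypothetical names and values, not part of the SDK.

```typescript
// One page of results, mirroring the two Query response fields used here.
interface Page<T, K> {
  Items?: T[];
  LastEvaluatedKey?: K;
}

// Calls fetchPage repeatedly, feeding each LastEvaluatedKey back in as the
// next start key, until a page arrives without one or the cap is reached.
async function collectAllPages<T, K>(
  fetchPage: (startKey: K | undefined) => Promise<Page<T, K>>,
  maxItems = 1000, // safety cap (assumed value) so a huge partition cannot exhaust memory
): Promise<T[]> {
  const items: T[] = [];
  let startKey: K | undefined = undefined;
  do {
    const page: Page<T, K> = await fetchPage(startKey);
    items.push(...(page.Items ?? []));
    startKey = page.LastEvaluatedKey;
  } while (startKey !== undefined && items.length < maxItems);
  return items.slice(0, maxItems);
}
```

With the client from this post, fetchPage would send the same QueryCommand with ExclusiveStartKey set to startKey and return the response. For an admin dashboard you may prefer to return one page plus the key to the caller instead of collecting everything.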

Query vs Scan

The Critical Difference

Operation | Work Performed | Use Case
Query | Proportional to the items matching the key condition | Lookups by a known partition key
Scan | Proportional to the size of the ENTIRE table | Full table access

Example: Finding "new" leads

❌ Slow: Scan with Filter

// This reads EVERY item in the table!
const command = new ScanCommand({
  TableName: TABLE_NAME,
  FilterExpression: "#status = :status",
  ExpressionAttributeNames: { "#status": "status" },
  ExpressionAttributeValues: marshall({ ":status": "new" }),
});

✅ Fast: Query with GSI

// This reads only matching items
const command = new QueryCommand({
  TableName: TABLE_NAME,
  IndexName: "status-index",
  KeyConditionExpression: "#status = :status",
  ExpressionAttributeNames: { "#status": "status" },
  ExpressionAttributeValues: marshall({ ":status": "new" }),
});

The difference at scale:

  • 1 million leads, 100 are "new"
  • Scan: Reads 1,000,000 items → $$$, slow
  • Query: Reads 100 items → Pennies, fast
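
To put rough numbers on that gap, here is a back-of-the-envelope sketch that compares read request units for the two approaches. It assumes an average lead size of 1,000 bytes (my assumption, not a measured figure) and eventually consistent reads, which are the default for Query and Scan. It also ignores the per-page rounding that happens on a paginated Scan, so treat the result as an approximation.

```typescript
const BYTES_PER_READ_UNIT = 4096; // reads are metered in 4 KB increments

// Approximate read request units for one Query or Scan that reads totalBytes.
// Eventually consistent reads cost half a unit per 4 KB.
function readRequestUnits(totalBytes: number, eventuallyConsistent = true): number {
  const units = Math.ceil(totalBytes / BYTES_PER_READ_UNIT);
  return eventuallyConsistent ? units * 0.5 : units;
}

const ITEM_BYTES = 1000; // assumed average lead size
const scanUnits = readRequestUnits(1000000 * ITEM_BYTES); // whole table
const queryUnits = readRequestUnits(100 * ITEM_BYTES); // only the "new" leads

console.log({ scanUnits, queryUnits, ratio: Math.round(scanUnits / queryUnits) });
```

Under these assumptions the Scan consumes on the order of ten thousand times more read units than the Query, and it pays that price on every single call.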

Gotchas & Common Pitfalls

1. Reserved Words

Problem: status, name, and data are all DynamoDB reserved words, so they cannot appear bare in an expression.

// ❌ This fails
UpdateExpression: 'SET status = :status'

// ✅ Use expression attribute names
UpdateExpression: 'SET #status = :status',
ExpressionAttributeNames: { '#status': 'status' }

2. Empty Strings

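Problem: DynamoDB rejects empty strings in key attributes, and that includes GSI keys such as email. Non-key attributes like company may be empty, but storing "" is rarely useful, and removeUndefinedValues does not strip it for you.

// ❌ Rejected: email is the partition key of email-index
Item: {
  email: "",
}

// ✅ Convert blank optional fields to undefined so marshall() omits them
company: leadData.company?.trim() || undefined,

// removeUndefinedValues drops undefined attributes, not empty strings
marshall(lead, { removeUndefinedValues: true });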
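
If you have several optional fields, a pair of tiny helpers keeps this tidy. The sketch below is one possible approach. blankToUndefined and requireNonBlank are hypothetical names I made up for illustration.

```typescript
// Trims a string and returns undefined when nothing is left, so that
// marshall(..., { removeUndefinedValues: true }) omits the attribute entirely.
function blankToUndefined(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Key attributes must be non-empty, so reject them instead of dropping them.
function requireNonBlank(field: string, value: string | undefined): string {
  const cleaned = blankToUndefined(value);
  if (cleaned === undefined) {
    throw new Error(`${field} must not be empty`);
  }
  return cleaned;
}
```

You might use requireNonBlank for email (a GSI key) and blankToUndefined for company, so bad input fails at validation time rather than as a DynamoDB error.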

3. Conditional Check Failures

Problem: When a ConditionExpression evaluates to false, the SDK throws a ConditionalCheckFailedException instead of returning a result.

try {
  await dynamoClient.send(command);
} catch (error: any) {
  if (error.name === "ConditionalCheckFailedException") {
    // Handle gracefully (item doesn't exist or condition failed)
    return null;
  }
  throw error;
}
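
Since the same try/catch shows up in both update and delete, you could factor it out. The following is a sketch under the assumption that the error name is the only thing you need to check. hasErrorName and orOnConditionFailure are hypothetical helper names, and the sketch avoids typing the caught error as any.

```typescript
// Checks an unknown error for a specific `name` without resorting to `any`.
function hasErrorName(error: unknown, name: string): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "name" in error &&
    (error as { name: unknown }).name === name
  );
}

// Runs a conditional write and maps a failed condition to `fallback`.
// Any other error (throttling, validation, network) is rethrown.
async function orOnConditionFailure<T>(
  write: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await write();
  } catch (error) {
    if (hasErrorName(error, "ConditionalCheckFailedException")) {
      return fallback;
    }
    throw error;
  }
}
```

A delete could then read as orOnConditionFailure(async () => { await dynamoClient.send(command); return true; }, false).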

4. Large Items

Problem: DynamoDB has a 400KB item size limit.

Solution: Store large content (like file attachments) in S3, store the S3 URL in DynamoDB.


Best Practices

1. Use On-Demand for Variable Workloads

billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,

No capacity planning, automatic scaling, pay only for what you use.

2. Enable Point-in-Time Recovery

pointInTimeRecovery: true,

Protects against accidental deletes. Can restore to any point in the last 35 days.

3. Use Batch Operations for Bulk Actions

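For multiple items, use BatchGetItem and BatchWriteItem. The BatchWriteCommand below comes from @aws-sdk/lib-dynamodb, which accepts plain JavaScript objects and must be sent through a DynamoDBDocumentClient rather than the raw client:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { BatchWriteCommand, DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";

// The document client marshalls plain objects for you
const docClient = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Write up to 25 items at once
const command = new BatchWriteCommand({
  RequestItems: {
    [TABLE_NAME]: leads.map((lead) => ({
      PutRequest: { Item: lead },
    })),
  },
});

// Anything DynamoDB could not write comes back in UnprocessedItems; retry it
const { UnprocessedItems } = await docClient.send(command);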
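
Because a single batch write accepts at most 25 items, longer lists have to be split first. Here is a minimal sketch of that splitting step. chunk and MAX_BATCH_WRITE_ITEMS are hypothetical names, and the sketch deliberately leaves out sending and retrying.

```typescript
const MAX_BATCH_WRITE_ITEMS = 25; // BatchWriteItem limit per request

// Splits items into consecutive groups of at most `size` elements.
function chunk<T>(items: readonly T[], size: number = MAX_BATCH_WRITE_ITEMS): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

You would then build one batch write per chunk and resend whatever comes back as unprocessed, ideally with a short backoff between attempts.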

Check out my DynamoDB Batch Operations blog for detailed patterns.

4. Design for Even Distribution

Avoid "hot" partitions by choosing well-distributed partition keys:

// ✅ Good - even distribution
partitionKey: { name: 'leadId', type: STRING }

// ❌ Bad - hot partitions (if most leads are 'new')
partitionKey: { name: 'status', type: STRING }

Cost Considerations

Pricing Model

Operation | On-Demand Cost
Write (1KB) | $0.625 per million
Read (4KB) | $0.125 per million
Storage | $0.25 per GB/month
GSI Writes | Same as table
GSI Reads | Same as table

Real-World Example

For a contact form with:

  • 1,000 leads/month (writes)
  • 10,000 reads/month (admin dashboard)
  • 1GB storage

Monthly Cost:

  • Writes: 0.001M × $0.625 = $0.000625
  • Reads: 0.01M × $0.125 = $0.00125
  • Storage: 1GB × $0.25 = $0.25
  • Total: ~$0.25/month (essentially free!)
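
If you want to plug in your own traffic numbers, the arithmetic above fits in a few lines. This is only a sketch: the prices are copied from the pricing table in this section and should be treated as assumptions, since they vary by region and change over time. estimateMonthlyCost and MonthlyUsage are hypothetical names.

```typescript
// On-demand prices from the pricing table in this section (USD).
// Treat them as assumptions: they vary by region and change over time.
const PRICE_PER_MILLION_WRITES = 0.625;
const PRICE_PER_MILLION_READS = 0.125;
const PRICE_PER_GB_MONTH = 0.25;

interface MonthlyUsage {
  writes: number; // write requests per month (items up to 1 KB)
  reads: number; // read requests per month (items up to 4 KB)
  storageGb: number;
}

function estimateMonthlyCost(usage: MonthlyUsage): number {
  const writeCost = (usage.writes / 1000000) * PRICE_PER_MILLION_WRITES;
  const readCost = (usage.reads / 1000000) * PRICE_PER_MILLION_READS;
  const storageCost = usage.storageGb * PRICE_PER_GB_MONTH;
  return writeCost + readCost + storageCost;
}

const total = estimateMonthlyCost({ writes: 1000, reads: 10000, storageGb: 1 });
console.log(total.toFixed(4)); // prints 0.2519
```

Keep in mind that every write to the table is also propagated to both GSIs, so the real write and storage figures will be somewhat higher than this simple estimate. At this volume the total still rounds to pocket change.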

Conclusion

Your contact form now has a rock-solid persistence layer! 🎉

In this part, you learned:

✅ DynamoDB table design for serverless
✅ Global Secondary Indexes for flexible queries
✅ CRUD operations with AWS SDK v3
✅ Query vs Scan performance implications
✅ Production-ready error handling

Your leads are now safely stored and can be queried efficiently. No more data disappearing when Lambda recycles!

In Part 4, we'll bring everything together – frontend, API, and database – into a complete, production-ready application with:

  • Full frontend-backend integration
  • Environment configuration
  • Monitoring and alerting
  • Production deployment checklist

GitHub Repository: aws-serverless-website-tutorial

See you next time. Happy coding! 🚀

