I'm writing this as part of my submission for the H0 Hackathon: Hack the Zero Stack with Vercel and AWS Databases. But the database problem I'm about to describe is real and it took me longer to solve than I'd like to admit.
SlothPost is a tool I built for indie developers who want to build in public but can never find the time to actually write posts. It connects to your GitHub, Vercel, and App Store, captures all your activity, and drafts social media posts for X and Threads that you review and approve before they go live.
The part that runs everything is a cron-based scheduler. And the scheduler has an interesting DynamoDB design problem.
The problem
Every user in SlothPost can have multiple "products" β the things they're building. Each product has a posting schedule: something like "post to X every Tuesday and Thursday at 9am, post to Threads on Mondays."
The scheduler needs to run every minute and answer one question: which products are due for post generation right now?
The naive approach is to scan the entire products table, filter for active schedules, and check if any of them are due. This works fine when you have 10 users. It's a disaster at 10,000.
The sparse GSI approach
DynamoDB doesn't have server-side scheduled jobs. What it does have is Global Secondary Indexes, and they have a property that's extremely useful here.
A GSI only indexes items that have the attribute it's keying on. If an item doesn't have the attribute, it's simply not in the index. This is called a sparse index.
I added an attribute called scheduleStatus to each product. When a product has an active posting schedule, scheduleStatus is set to 'active'. When it's paused or the user hasn't configured one yet, the attribute is simply absent from the item.
I also added nextRunAt β a Unix timestamp for when this product is next due for generation. This gets recomputed every time a post is generated based on the user's schedule configuration.
The GSI is keyed on scheduleStatus (partition key) and nextRunAt (sort key). To find what's due right now, the cron query looks like this:
typescriptconst result = await dynamoDB.query({
TableName: 'slothpost-products',
IndexName: 'scheduleStatus-nextRunAt-index',
KeyConditionExpression: 'scheduleStatus = :active AND nextRunAt <= :now',
ExpressionAttributeValues: {
':active': 'active',
':now': Date.now()
}
});
That's it. The query only touches products that are actively scheduled. No scans, no post-query filtering. The index stays small because most products in the table don't have scheduleStatus set at all.
The bug that took me a day to find
When I first built computeNextRunAt, I had it check daySchedule.enabled to see if a given day was enabled in the user's schedule. This seemed reasonable until I realized that field didn't exist anywhere.
The enabled state in SlothPost isn't stored as a boolean. It's represented by whether time is null or not. If the user has set a time for Monday, Monday is enabled. If the time is null, it's disabled. There's no separate enabled flag.
The function was silently returning null for the next run time on every call, which meant nextRunAt was never written, which meant nothing ever appeared in the GSI, which meant the scheduler ran every minute and found nothing to do.
It took a while to find because the cron itself was working fine β it just had an empty result set on every execution. That's the kind of bug that doesn't throw an error. It just silently doesn't work.
The fix was one line: replace daySchedule.enabled with typeof daySchedule.time === 'string'. But finding that line took longer.
The Vercel webhook routing problem
I used the same sparse GSI pattern for a different problem: routing Vercel deployment webhooks to the right user.
SlothPost is a Vercel Marketplace integration. When a user connects their Vercel account, they select which Vercel project they want to track. That project ID gets written as vercelProjectId on their product record.
When a deployment webhook arrives, I need to find which product it belongs to. The sparse GSI on vercelProjectId makes this a single targeted query instead of a scan. Products without a Vercel project connected don't have the attribute and don't appear in the index.
One thing worth knowing about Vercel marketplace integrations that isn't documented anywhere I could find: you cannot create webhooks programmatically via the Vercel REST API. You get a hard 403. The correct approach is to configure a global webhook in the Integration Console. I lost a full day trying to make the API approach work before I figured this out.
Takeaways
The sparse GSI is genuinely useful for any scheduling or routing problem where the "active" state is a subset of your total item count. It keeps your queries fast, keeps the index small, and avoids the full-table-scan anti-pattern entirely.
The one thing I'd add: removeUndefinedValues: true on the DynamoDB DocumentClient is not optional when you're working with TypeScript. If any attribute in your item is undefined, DynamoDB will reject the write. That config option silently strips undefined fields instead. I spent two hours debugging that before finding it in the docs.
Full app is at https://slothpost.app
Uses Next.js on Vercel, DynamoDB for everything, Cloudflare Workers for the cron scheduler.
Top comments (0)