Anurag Srivastava

The AI feature is the easy part

Adding AI to a product takes an afternoon. An API key, a prompt, a fetch call. Done.

Building the system that runs that AI feature in production is a different problem entirely. How do you isolate data between tenants so Org A never sees Org B's rows? How do you bill for usage and actually block access when the quota runs out? How do you build a dashboard that shows real numbers instead of placeholder charts?

I built TubeDigest, a YouTube video summarizer. Teams paste a video URL, get a summary. The summarization is one OpenAI call. The multi-tenant org structure, subscription billing, usage enforcement, caching layer -- that's where the actual work went.

This post is about all of that.

[Image: TubeDigest landing page with hero text]


What's actually running

On the surface: paste a YouTube URL, get a summary. Behind that:

Postgres Row Level Security handles tenant data isolation. Org A can't query Org B's rows even if the application code has a bug. The database enforces it, not my WHERE clauses.

Billing runs through Dodo Payments with real subscriptions. Free tier has a monthly cap. Hit it, you're blocked. Upgrade, the limit goes up. Webhooks handle the whole lifecycle.

Organizations are the tenant boundary. Invite team members, assign roles (owner or member), manage access. Actual multi-user orgs, not single-user accounts with a team label stapled on.

There's also a caching layer in front of the AI pipeline. If any org has already summarized a particular video, the cached result gets returned. One transcript extraction and one OpenAI call serve every future request for that same video.

[Image: Summarizer page with a YouTube URL pasted, showing a completed summary for a video about Bhangarh Fort with video thumbnail, summary text, copy and open buttons]


Tech stack

Layer      Choice
Frontend   Next.js (App Router) + Tailwind CSS + shadcn/ui
Backend    NestJS (separate service)
Database   PostgreSQL on Neon
ORM        TypeORM
Auth       Clerk
Billing    Dodo Payments
AI         OpenAI API
Monorepo   Turborepo + pnpm

Frontend and backend are separate services, deployed independently. The frontend handles UI and auth. The backend owns the data, billing logic, and AI pipeline. They talk over REST -- 17 endpoints, all documented with Swagger.

I split them because that's how I'd build a client's SaaS product. Not because it looks good on a diagram.

[Image: TubeDigest architecture diagram showing Next.js frontend on Vercel, NestJS backend on Render, PostgreSQL on Neon with RLS, and connections to Clerk, Dodo Payments, OpenAI, and YouTube Captions API]


Data isolation

Most multi-tenant apps put WHERE org_id = ? on every query. That works until someone forgets one, and then a single missed filter is a data leak.

TubeDigest uses Postgres Row Level Security. The database enforces who sees what.

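Every request goes through a middleware that verifies the Clerk JWT, pulls the org_id, and sets it on the connection for the duration of the request's transaction (SET LOCAL only lasts until the transaction ends, and has no effect outside one):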

SET LOCAL app.org_id = 'the-org-id';
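A minimal TypeScript sketch of that step, assuming a client whose query method takes SQL text and bind parameters (a client from the pg package has a compatible method). The SqlClient interface and the withTenant helper are hypothetical names, not TubeDigest's actual code. SET does not accept bind parameters, so the sketch uses set_config with its third argument set to true, which is the transaction-local equivalent of SET LOCAL.

```typescript
// Minimal shape of a database client. The interface name is hypothetical.
interface SqlClient {
  query(text: string, params?: unknown[]): Promise<unknown>;
}

// Run `work` inside a transaction whose app.org_id is set to `orgId`.
// set_config(name, value, true) is scoped to the current transaction, so a
// pooled connection never carries one tenant's id into the next request.
async function withTenant<T>(
  client: SqlClient,
  orgId: string,
  work: (client: SqlClient) => Promise<T>,
): Promise<T> {
  await client.query("BEGIN");
  try {
    await client.query("SELECT set_config('app.org_id', $1, true)", [orgId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    // Undo any partial work, then let the caller see the original error.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Wrapping each request's queries in a helper like this keeps the tenant setting and the queries it protects in the same transaction.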

After that, every query on tenant-scoped tables gets filtered by the database engine. Not by application code. The policy:

CREATE POLICY tenant_isolation ON users
  USING (org_id = current_setting('app.org_id')::uuid);

This covers users, invitations, subscriptions, usage_records, and user_summaries. The videos table has no RLS because it's the shared cache. All orgs read from it.
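Postgres only applies a policy once row level security is enabled on the table, so each of those five tables needs both statements. A small TypeScript sketch that generates them for a migration; the TENANT_TABLES constant and the rlsStatements helper are hypothetical names, and the policy body is the one shown above.

```typescript
// Tenant-scoped tables named in the post. `videos` is deliberately absent
// because it is the shared cache.
const TENANT_TABLES: string[] = [
  "users",
  "invitations",
  "subscriptions",
  "usage_records",
  "user_summaries",
];

// Build the two statements one table needs. Policy names are scoped per
// table, so reusing `tenant_isolation` on every table is fine.
function rlsStatements(table: string): string[] {
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY;`,
    `CREATE POLICY tenant_isolation ON ${table}\n` +
      `  USING (org_id = current_setting('app.org_id')::uuid);`,
  ];
}

// Concatenate the statements for every tenant table into one script.
const statements: string[] = [];
for (const table of TENANT_TABLES) {
  statements.push(...rlsStatements(table));
}
const migrationSql: string = statements.join("\n");
```

Generating the statements from one list makes it harder to add a tenant table and forget its policy.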

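If my code has a bug that skips the org filter, the database catches it anyway. The safety net is in the infrastructure, not in my discipline. The one condition: the application has to connect as a role that RLS applies to. Superusers always bypass policies, and so do table owners unless the table is set to FORCE ROW LEVEL SECURITY.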

[Image: Workspace settings Members tab showing 3 members with roles (1 Owner, 2 Members), an invite form with email input and role selector, and joined dates]


Billing that works end to end

A lot of SaaS demos have a pricing page and a test checkout button that goes nowhere. TubeDigest has a billing system that processes real payments.

Dodo Payments handles subscriptions. Checkout sessions, webhook events, billing portal -- standard payment gateway patterns:

  1. Sign up. Land on Free tier with a monthly summary cap
  2. Use up the quota. API returns 429s
  3. Click upgrade. Dodo checkout. Payment processes. Webhook fires
  4. Backend catches the webhook. Updates the subscription. Raises the limit
  5. User manages billing through the self-service portal

Usage tracking is per org, per billing period. Every summary request increments a counter. The system knows how many summaries each org has used, when the period resets, and what the cap is. The dashboard shows all of this live.
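The enforcement check itself can be small. A TypeScript sketch under stated assumptions: the OrgUsage shape, the checkQuota name, and the retry hint (which a handler could send as a Retry-After header) are illustrative and not taken from the TubeDigest codebase.

```typescript
interface OrgUsage {
  used: number;    // summaries counted so far in this billing period
  limit: number;   // cap for the org's current plan
  periodEnd: Date; // when the current billing period resets
}

type QuotaDecision =
  | { allowed: true; remaining: number }
  | { allowed: false; status: 429; retryAfterSeconds: number };

// Decide whether one more summary request may proceed.
function checkQuota(usage: OrgUsage, now: Date): QuotaDecision {
  if (usage.used < usage.limit) {
    return { allowed: true, remaining: usage.limit - usage.used };
  }
  // Quota exhausted: tell the client how long until the period resets.
  const msLeft = usage.periodEnd.getTime() - now.getTime();
  return {
    allowed: false,
    status: 429,
    retryAfterSeconds: Math.max(0, Math.ceil(msLeft / 1000)),
  };
}
```

An upgrade only has to change limit. The check itself stays the same for every plan.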

[Image: Workspace settings Billing tab showing Free plan, usage this period for May 2026, summaries and seats counts, upgrade to Pro option at $10/mo, and link to Dodo customer portal]


Duplicate payments

This problem doesn't show up in demos. It shows up in production, and it wrecks things.

User clicks checkout twice. Network hiccup makes Dodo fire the same webhook twice. User opens checkout in two browser tabs. All of these cause the same issue: duplicate payment processing. Double charges, phantom subscription upgrades, usage limits applied twice.

TubeDigest processes webhooks idempotently. Every incoming event gets checked against what's already been processed. If the system has seen that event before, it acknowledges it and moves on. No state change. The subscription table stays clean no matter how many times the same event arrives.
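One way to get that behavior, sketched in TypeScript with an in-memory store. The event shape, the kind values, and the WebhookProcessor class are hypothetical and do not reflect Dodo Payments' actual payload. A production version would more likely record the event id in a table with a unique constraint, in the same transaction as the subscription update.

```typescript
type Plan = "free" | "pro";

// Hypothetical event shape, for illustration only.
interface WebhookEvent {
  id: string;                 // unique per event, identical across redeliveries
  kind: "upgrade" | "cancel"; // illustrative event kinds
  orgId: string;
}

class WebhookProcessor {
  private readonly seen = new Set<string>();
  readonly plans = new Map<string, Plan>();
  applied = 0; // number of events that actually changed state

  handle(event: WebhookEvent): "processed" | "duplicate" {
    // Already handled: acknowledge and change nothing.
    if (this.seen.has(event.id)) return "duplicate";
    this.seen.add(event.id);
    this.plans.set(event.orgId, event.kind === "upgrade" ? "pro" : "free");
    this.applied += 1;
    return "processed";
  }
}
```

Both outcomes should be answered with a success status, so the payment provider stops retrying an event that has already been applied.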

Not a lot of code. But this is the kind of thing that separates a demo from a system that handles real money.


The AI pipeline

The summarizer is simple on purpose. I wanted the complexity in the infrastructure, not the AI call. Here's the flow:

  1. User submits a YouTube URL
  2. Backend extracts the video ID, checks the videos cache table
  3. Cache miss: pull transcript via YouTube's captions API
  4. Truncate to about 4000 tokens (roughly 20 minutes of spoken content)
  5. Send to OpenAI, get summary back
  6. Store transcript and summary in the cache, keyed by video ID
  7. Log the request in user_summaries, increment the org's usage counter
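Steps 2 and 4 can be sketched in a few lines of TypeScript. Both helpers are hypothetical. The regex assumes the 11-character ID format YouTube URLs commonly use, and the token budget is approximated at four characters per token, which is a rough heuristic for English text and not a real tokenizer.

```typescript
// Matches watch, short-link, embed, and shorts URLs on YouTube's domains.
const VIDEO_ID =
  /^(?:https?:\/\/)?(?:www\.|m\.)?(?:youtu\.be\/|youtube\.com\/(?:watch\?(?:.*&)?v=|embed\/|shorts\/))([A-Za-z0-9_-]{11})/;

// Return the video ID, or null when the URL is not a recognized YouTube link.
function extractVideoId(url: string): string | null {
  const match = VIDEO_ID.exec(url.trim());
  return match?.[1] ?? null;
}

const MAX_TOKENS = 4000;
const CHARS_PER_TOKEN = 4; // heuristic, not an exact token count

// Cut the transcript to roughly `maxTokens`, ending on a word boundary.
function truncateTranscript(text: string, maxTokens: number = MAX_TOKENS): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  const cut = text.lastIndexOf(" ", maxChars);
  return text.slice(0, cut > 0 ? cut : maxChars);
}
```

Keying the cache on the extracted ID and not on the raw URL means a youtu.be link and a full watch URL for the same video hit the same cache row.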

Cache hit? Skip steps 3 through 6. Return the stored summary immediately. A YouTube video that gets summarized by 500 different orgs costs one API call. Not 500.

No captions on the video? Return a 422 with a clear message. Don't touch the usage counter. Don't charge for failures.
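Put together, the cache-first flow looks roughly like the TypeScript sketch below. The SummaryDeps interface and every function name in it are hypothetical stand-ins for the cache table, the captions fetch, the OpenAI call, and the usage tracking. The truncation step is left out for brevity.

```typescript
interface SummaryDeps {
  getCached(videoId: string): Promise<string | null>;
  fetchTranscript(videoId: string): Promise<string | null>; // null when no captions
  summarize(transcript: string): Promise<string>;           // the one AI call
  saveToCache(videoId: string, transcript: string, summary: string): Promise<void>;
  recordUsage(orgId: string, videoId: string): Promise<void>; // log + counter
}

type SummaryResult =
  | { status: 200; summary: string; cached: boolean }
  | { status: 422; message: string };

async function summarizeVideo(
  deps: SummaryDeps,
  orgId: string,
  videoId: string,
): Promise<SummaryResult> {
  let summary = await deps.getCached(videoId);
  const cached = summary !== null;
  if (summary === null) {
    const transcript = await deps.fetchTranscript(videoId);
    if (transcript === null) {
      // Failure path: return before usage is recorded, so nothing is charged.
      return { status: 422, message: "This video has no captions to summarize." };
    }
    summary = await deps.summarize(transcript);
    await deps.saveToCache(videoId, transcript, summary);
  }
  // Cache hits and misses both count against the org's quota.
  await deps.recordUsage(orgId, videoId);
  return { status: 200, summary, cached };
}
```

Because usage is recorded only after a summary exists, the 422 path cannot increment the counter by accident.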

[Image: Summary history page showing 7 summaries with video thumbnails, titles, preview text, dates, and view/open links for each entry]


Dashboard with real data

Building a dashboard is easy. Having real data to put in it is the harder part, because the data pipeline has to exist first.

TubeDigest tracks:

  • Summaries used this period, out of the plan's limit. Live counter
  • Recent activity. Last 5-10 summaries with video titles and timestamps
  • Daily usage. Bar chart of summaries per day across the billing period

All of that comes from two tables: usage_records and user_summaries. No analytics service. No third party dashboard tool. The same data pipeline that tracks billing also feeds the dashboard.
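The daily usage chart, for example, is a group-by over summary timestamps. A TypeScript sketch, assuming each user_summaries row exposes a creation time; the createdAt field name and the dailyUsage helper are hypothetical. Days are bucketed in UTC, and the period start is expected to be a UTC midnight.

```typescript
interface SummaryRow {
  createdAt: Date; // hypothetical column name
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Count summaries per day across the billing period, including empty days
// so the bar chart has a bar (possibly zero) for every date.
function dailyUsage(
  rows: SummaryRow[],
  periodStart: Date,
  days: number,
): { day: string; count: number }[] {
  const buckets = new Map<string, number>();
  for (let i = 0; i < days; i++) {
    const d = new Date(periodStart.getTime() + i * DAY_MS);
    buckets.set(d.toISOString().slice(0, 10), 0);
  }
  for (const row of rows) {
    const key = row.createdAt.toISOString().slice(0, 10);
    const current = buckets.get(key);
    // Rows outside the period have no bucket and are ignored.
    if (current !== undefined) buckets.set(key, current + 1);
  }
  return Array.from(buckets, ([day, count]) => ({ day, count }));
}
```

At larger volumes the same grouping would more naturally run in SQL. The in-memory version just shows the shape of the result the chart consumes.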

[Image: TubeDigest dashboard showing 7 summaries used out of 10 this month, 3 remaining, Free plan, a daily usage bar chart over 30 days, and recent activity list with 5 YouTube video summaries]


Project structure

Monorepo with Turborepo and pnpm:

tubedigest/
├── apps/
│   ├── web/     # Next.js frontend → Vercel
│   └── api/     # NestJS backend → Render
├── packages/    # Shared types
├── turbo.json
└── pnpm-workspace.yaml

CI runs lint, type-check, and build on every push via GitHub Actions. Vercel and Render auto-deploy when the pipeline passes. Swagger docs live at /api/docs with request and response schemas for every endpoint.


What I'd add at scale

These aren't missing. They're unnecessary at this stage. If TubeDigest had real traffic, they'd be next:

  • Async processing. Summarization is synchronous right now. At scale, a job queue with Bull and Redis would handle long videos without holding HTTP requests open.
  • Chunk and merge for long transcripts. I truncate at about 4000 tokens, roughly 20 minutes. A chunk, summarize, merge pipeline would handle 3-hour lectures but would cost more in API calls.
  • Integration tests for billing webhooks and RLS policies. The two most important paths in the system deserve the most test coverage.
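To make the chunk and merge trade-off concrete, here is a TypeScript sketch. It is not implemented in TubeDigest. The function names are hypothetical, chunking is by word count instead of real tokens, and summarize stands in for the AI call.

```typescript
// Split text into chunks of at most `wordsPerChunk` words.
function chunkWords(text: string, wordsPerChunk: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += wordsPerChunk) {
    chunks.push(words.slice(i, i + wordsPerChunk).join(" "));
  }
  return chunks;
}

// Summarize each chunk, then summarize the joined partial summaries.
async function summarizeLong(
  text: string,
  wordsPerChunk: number,
  summarize: (input: string) => Promise<string>,
): Promise<string> {
  const chunks = chunkWords(text, wordsPerChunk);
  // Short transcripts take the existing single-call path.
  if (chunks.length <= 1) return summarize(text);
  const partials = await Promise.all(chunks.map((chunk) => summarize(chunk)));
  return summarize(partials.join("\n"));
}
```

A transcript split into n chunks costs n + 1 calls instead of one, which is the extra API spend mentioned above.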

Try it

Live demo: tubedigest-web.vercel.app

TubeDigest

A multi-tenant SaaS YouTube video summarizer. Paste a YouTube URL, get a concise AI-generated summary. Built with organization-based multi-tenancy, usage-based billing, and role-based access control.

Tech Stack

Layer          Technology
Monorepo       Turborepo + pnpm workspaces
Frontend       Next.js (App Router) + Tailwind CSS + shadcn/ui
Backend        NestJS
Auth           Clerk
Database       PostgreSQL (Neon) + TypeORM
Multi-tenancy  Row-Level Security (RLS)
Billing        Dodo Payments
AI             OpenAI API

Features

  • AI Summarization — paste a YouTube URL, get a summary powered by OpenAI
  • Multi-Tenancy — organization-based isolation with Postgres RLS
  • Role-Based Access — owner and member roles with backend-enforced permissions
  • Usage Tracking — per-organization usage limits and daily usage charts
  • Billing — subscription management via Dodo Payments
  • Team Management — invite members, manage roles, revoke access
  • Video Caching — same video = reuse cached summary, saving API costs

The AI feature took a day. The system around it took the rest of the week. That ratio tells you where the real engineering is.
