Hamid Karimi

Building Vidiflow: A Production-Grade Video Downloader Backend in TypeScript

I Built a Production-Ready Video Downloader Backend in TypeScript — Here's What I Learned

I just completed V2 of Vidiflow, a production-ready backend API for downloading videos from multiple platforms.

This article is the full story of what I built, why I built it this way, and what I learned along the way.


The Challenge

Most video downloader tutorials online are either:

  • Too simple — basic proof-of-concept projects that don't scale
  • Too vague — "just use this library" without discussing architecture
  • Not production-ready — missing authentication, error handling, queues, testing, or documentation

I wanted to build something different.

I wanted to build a backend that could:

  • Handle thousands of users
  • Support multiple video platforms
  • Be easy to extend
  • Stay maintainable for years

What I Built

Vidiflow Backend is a TypeScript/Node.js API that can:

  • ✅ Accept video URLs from YouTube, TikTok, and more
  • ✅ Extract real metadata (title, formats, quality options)
  • ✅ Return direct download URLs
  • ✅ Track download history
  • ✅ Manage favorite videos
  • ✅ Handle authentication (JWT, refresh tokens, OAuth-ready)
  • ✅ Process downloads asynchronously
  • ✅ Provide real-time download progress via WebSockets
  • ✅ Include complete Swagger/OpenAPI documentation

The Architecture (The Hard Part)

Rather than jumping straight into coding, I designed the architecture first.

That single decision probably saved me more time than anything else during the project.

Feature-Based Modular Structure

src/
└── modules/
    ├── auth/
    ├── downloads/
    ├── users/
    └── providers/

Every feature owns its own code.

For example:

  • Authentication lives inside auth
  • Download logic lives inside downloads
  • Platform implementations live inside providers

A new developer can instantly find where something belongs.


Four-Layer Pattern

Every feature follows the same structure:

Routes
    ↓
Controllers
    ↓
Services
    ↓
Repositories

Routes

Only define HTTP endpoints.

Controllers

Very thin.

They:

  • validate requests
  • call services
  • return responses

Nothing else.

Services

Contain all business logic.

This makes them:

  • reusable
  • testable
  • framework-independent

Repositories

Only communicate with the database.

If I ever switch ORMs or introduce caching, the rest of the application doesn't change.
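
The layering above can be sketched roughly as follows. This is a minimal illustration, not Vidiflow's actual code: `DownloadRecord`, `InMemoryDownloadRepository`, `DownloadService`, and `createDownloadController` are hypothetical names, and the controller takes a plain request object instead of Express's `req`/`res` so the example stays self-contained.

```typescript
// Hypothetical record shape; the real schema is not shown in the article.
interface DownloadRecord {
  id: string;
  userId: string;
  url: string;
}

// Repository: the only layer that knows how records are stored.
interface DownloadRepository {
  create(userId: string, url: string): DownloadRecord;
  findByUser(userId: string): DownloadRecord[];
}

// In-memory stand-in for a database-backed repository (assumption).
class InMemoryDownloadRepository implements DownloadRepository {
  private records: DownloadRecord[] = [];

  create(userId: string, url: string): DownloadRecord {
    const record = { id: String(this.records.length + 1), userId, url };
    this.records.push(record);
    return record;
  }

  findByUser(userId: string): DownloadRecord[] {
    return this.records.filter((r) => r.userId === userId);
  }
}

// Service: business rules only, no HTTP and no storage details.
class DownloadService {
  private repo: DownloadRepository;

  constructor(repo: DownloadRepository) {
    this.repo = repo;
  }

  requestDownload(userId: string, url: string): DownloadRecord {
    if (!/^https?:\/\//.test(url)) {
      throw new Error("URL must start with http:// or https://");
    }
    return this.repo.create(userId, url);
  }
}

// Controller: check the request shape, call the service, build a response.
interface HttpResponse {
  status: number;
  body: unknown;
}

function createDownloadController(service: DownloadService) {
  return (req: { userId: string; body: { url?: string } }): HttpResponse => {
    if (typeof req.body.url !== "string") {
      return { status: 400, body: { message: "url is required" } };
    }
    const record = service.requestDownload(req.userId, req.body.url);
    return { status: 201, body: record };
  };
}
```

Because the service depends only on the `DownloadRepository` interface, swapping the in-memory class for a database-backed or cached implementation would not touch the service or the controller.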


Key Technical Decisions

1. Provider Architecture

Instead of hardcoding YouTube logic into the download service, I created a provider interface.

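// VideoInfo, DownloadResult, and DownloadOptions are placeholder types;
// their fields are not shown in this article.
interface IProvider {
  // Short identifier for the platform
  name: string;

  // Returns true if this provider recognizes the URL
  canHandle(url: string): boolean;

  // Fetches metadata (title, formats, quality options)
  getVideoInfo(url: string): Promise<VideoInfo>;

  // Performs the download with the given options
  download(url: string, options: DownloadOptions): Promise<DownloadResult>;
}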

Now every platform becomes its own provider.

Adding a new platform only requires:

  • one new provider file
  • one registration line

Nothing else changes.

This makes the system extremely easy to extend.
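
To make the "one provider file, one registration line" idea concrete, here is a minimal sketch of a registry that picks a provider by URL. It assumes a simplified `Provider` shape, and `hostMatches`, `ProviderRegistry`, and the two stub providers are hypothetical names for illustration. The project's real registration mechanism may differ.

```typescript
// Simplified provider shape (assumption): only what the registry needs.
interface VideoInfo {
  title: string;
}

interface Provider {
  name: string;
  canHandle(url: string): boolean;
  getVideoInfo(url: string): Promise<VideoInfo>;
}

// True when the URL's hostname equals one of the hosts or is a subdomain of one.
function hostMatches(url: string, hosts: string[]): boolean {
  try {
    const hostname = new URL(url).hostname.toLowerCase();
    return hosts.some((h) => hostname === h || hostname.endsWith("." + h));
  } catch {
    return false; // not a parseable URL
  }
}

// Stub providers: getVideoInfo returns placeholder data, not real metadata.
const youtubeProvider: Provider = {
  name: "youtube",
  canHandle: (url) => hostMatches(url, ["youtube.com", "youtu.be"]),
  getVideoInfo: async (url) => ({ title: `placeholder for ${url}` }),
};

const tiktokProvider: Provider = {
  name: "tiktok",
  canHandle: (url) => hostMatches(url, ["tiktok.com"]),
  getVideoInfo: async (url) => ({ title: `placeholder for ${url}` }),
};

class ProviderRegistry {
  private providers: Provider[] = [];

  register(provider: Provider): void {
    this.providers.push(provider);
  }

  // Returns the first registered provider that claims the URL.
  resolve(url: string): Provider {
    const provider = this.providers.find((p) => p.canHandle(url));
    if (!provider) {
      throw new Error(`No provider can handle: ${url}`);
    }
    return provider;
  }
}

const registry = new ProviderRegistry();
registry.register(youtubeProvider); // the "one registration line"
registry.register(tiktokProvider);
```

The download service only ever calls `registry.resolve(url)`, so it never needs to know which platforms exist.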


2. Video Extraction with yt-dlp

Instead of writing extraction logic myself, I integrated yt-dlp.

The backend:

  1. launches yt-dlp as a subprocess
  2. reads its JSON output
  3. converts it into a clean API response

Why?

Because video extraction is incredibly difficult.

Platforms constantly change.

yt-dlp already solves that problem.

My backend simply provides a stable interface around it.
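
The three steps above might look roughly like this. It is a sketch under assumptions, not the project's code: it uses yt-dlp's `--dump-json` and `--no-playlist` flags, reads only a small subset of the JSON fields (`id`, `title`, and `format_id`, `ext`, `height`, `url` per format), and the `VideoInfo` response shape plus the `toVideoInfo` and `fetchVideoInfo` names are hypothetical.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Subset of the fields yt-dlp prints with --dump-json.
interface YtDlpFormat {
  format_id: string;
  ext: string;
  height?: number | null;
  url?: string;
}

interface YtDlpOutput {
  id: string;
  title: string;
  formats?: YtDlpFormat[];
}

// Hypothetical API response shape.
interface VideoInfo {
  id: string;
  title: string;
  formats: { formatId: string; container: string; quality: string; url: string }[];
}

// Step 3: convert raw yt-dlp JSON text into a clean API response.
function toVideoInfo(rawJson: string): VideoInfo {
  const data = JSON.parse(rawJson) as YtDlpOutput;
  return {
    id: data.id,
    title: data.title,
    formats: (data.formats ?? [])
      .filter((f) => typeof f.url === "string") // skip entries without a direct URL
      .map((f) => ({
        formatId: f.format_id,
        container: f.ext,
        quality: f.height ? `${f.height}p` : "unknown",
        url: f.url as string,
      })),
  };
}

// Steps 1 and 2: launch yt-dlp as a subprocess and read its stdout.
async function fetchVideoInfo(url: string): Promise<VideoInfo> {
  const { stdout } = await execFileAsync(
    "yt-dlp",
    // "--" ends option parsing so a URL starting with "-" is not read as a flag.
    ["--dump-json", "--no-playlist", "--", url],
    { maxBuffer: 32 * 1024 * 1024 } // metadata JSON can be large
  );
  return toVideoInfo(stdout);
}
```

Using `execFile` rather than a shell command keeps the user-supplied URL out of any shell interpolation, and keeping `toVideoInfo` pure means the mapping can be tested without yt-dlp installed.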


3. Asynchronous Downloads with BullMQ + Redis

Downloads are processed in the background.

Instead of waiting for a long download to finish:

POST /downloads

immediately returns:

downloadId

The worker processes the job separately.

Clients receive updates through WebSockets (or polling if needed).

Benefits:

  • instant API responses
  • better scalability
  • no blocked requests
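
The enqueue-then-process flow can be illustrated without any infrastructure. The sketch below deliberately uses an in-memory queue as a stand-in for BullMQ and Redis, and plain callbacks as a stand-in for WebSocket events. `InMemoryDownloadQueue`, `DownloadJob`, and the method names are all hypothetical and do not mirror BullMQ's API.

```typescript
type JobStatus = "queued" | "active" | "completed" | "failed";

interface DownloadJob {
  id: string;
  url: string;
  status: JobStatus;
  progress: number; // 0..100
}

type ProgressListener = (job: DownloadJob) => void;
// Handler run by the worker; it reports progress through the callback.
type JobHandler = (job: DownloadJob, report: (percent: number) => void) => Promise<void>;

class InMemoryDownloadQueue {
  private jobs = new Map<string, DownloadJob>();
  private pending: string[] = [];
  private listeners: ProgressListener[] = [];
  private nextId = 1;

  // What POST /downloads would call: store the job and return right away.
  enqueue(url: string): string {
    const id = `dl_${this.nextId++}`;
    this.jobs.set(id, { id, url, status: "queued", progress: 0 });
    this.pending.push(id);
    return id;
  }

  // What a polling endpoint would call.
  getJob(id: string): DownloadJob | undefined {
    return this.jobs.get(id);
  }

  // What a WebSocket layer would subscribe to.
  onProgress(listener: ProgressListener): void {
    this.listeners.push(listener);
  }

  // Worker loop: process queued jobs one at a time until none are left.
  async runWorker(handler: JobHandler): Promise<void> {
    let id: string | undefined;
    while ((id = this.pending.shift()) !== undefined) {
      const job = this.jobs.get(id)!;
      job.status = "active";
      try {
        await handler(job, (percent) => {
          job.progress = percent;
          this.listeners.forEach((l) => l(job));
        });
        job.status = "completed";
        job.progress = 100;
      } catch {
        job.status = "failed";
      }
      this.listeners.forEach((l) => l(job)); // final state notification
    }
  }
}
```

The request path only touches `enqueue`, which is why the API can respond with a `downloadId` immediately. In a real deployment the queue state would live in Redis so that workers can run in separate processes.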

4. JWT + Refresh Tokens

Authentication follows a common production pattern: short-lived access tokens paired with longer-lived refresh tokens.

Access Token

  • expires in 15 minutes
  • sent in Authorization header

Refresh Token

  • valid for 7 days
  • stored hashed in the database
  • sent via httpOnly cookie

On logout, refresh tokens are revoked.

A stolen access token is therefore usable for at most 15 minutes, while the refresh token lets users stay logged in without re-entering their credentials.
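
Here is a minimal sketch of the refresh-token side: issue a random token, store only its hash, and revoke it on logout. The assumptions are mine, not the article's: the token is an opaque random string hashed with SHA-256, a `Map` stands in for the database table, and every function name is hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface StoredRefreshToken {
  userId: string;
  expiresAt: number; // epoch milliseconds
  revoked: boolean;
}

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// In-memory stand-in for the database table, keyed by token hash.
const tokenStore = new Map<string, StoredRefreshToken>();

function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

// Issue a random opaque token; only its hash is stored.
function issueRefreshToken(userId: string, now: number = Date.now()): string {
  const token = randomBytes(32).toString("hex");
  tokenStore.set(hashToken(token), {
    userId,
    expiresAt: now + REFRESH_TTL_MS,
    revoked: false,
  });
  return token; // would be sent to the client in an httpOnly cookie
}

// Returns the user id for a valid token, or null if unknown, expired, or revoked.
function verifyRefreshToken(token: string, now: number = Date.now()): string | null {
  const record = tokenStore.get(hashToken(token));
  if (!record || record.revoked || record.expiresAt <= now) return null;
  return record.userId;
}

// Logout: mark the stored hash as revoked.
function revokeRefreshToken(token: string): void {
  const record = tokenStore.get(hashToken(token));
  if (record) record.revoked = true;
}
```

Storing only the hash means a leaked database dump does not hand out usable refresh tokens. A fast hash is a reasonable choice here only because the token is long and random. Passwords still need a slow hash such as bcrypt.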


5. Centralized Error Handling

Every service throws typed errors.

Example:

throw new NotFoundError("Video not found");

instead of

throw {
  status: 404,
  message: "Video not found"
};

A single middleware converts every error into a consistent HTTP response.

Controllers stay clean.

No repetitive try/catch blocks.
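
A minimal sketch of how typed errors and a single conversion step can fit together. The `NotFoundError` and `ValidationError` names come from this article. The `AppError` base class, the `statusCode` field, and `toErrorResponse` are assumptions for illustration.

```typescript
// Base class: every expected failure carries its own HTTP status.
class AppError extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
    this.name = new.target.name;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class NotFoundError extends AppError {
  constructor(message = "Resource not found") {
    super(message, 404);
  }
}

class ValidationError extends AppError {
  constructor(message = "Invalid request") {
    super(message, 400);
  }
}

interface ErrorResponse {
  status: number;
  body: { error: string; message: string };
}

// The conversion a central error middleware would perform.
function toErrorResponse(err: unknown): ErrorResponse {
  if (err instanceof AppError) {
    return {
      status: err.statusCode,
      body: { error: err.name, message: err.message },
    };
  }
  // Unknown errors: hide internals from the client.
  return {
    status: 500,
    body: { error: "InternalServerError", message: "Something went wrong" },
  };
}
```

In Express, this function would be called from a four-argument error middleware `(err, req, res, next)` that sends the result with `res.status(status).json(body)`.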


Designing for Future Growth

One of my goals was making sure V2 could naturally grow into V3.

For example:

Want payments?

→ Add a new billing module.

Need caching?

→ Add Redis inside repositories.

Need mobile apps?

→ Reuse the same API.

Need microservices later?

→ Module boundaries already exist.

Need analytics?

→ Add logging around services.

That's what "designing for scale" means.

Not premature optimization.

Just clean boundaries.


V1 → V2 Journey

V1

  • Mock data
  • Placeholder extraction
  • Architecture prototype

V2

  • Real yt-dlp integration
  • WebSocket progress updates
  • Swagger documentation
  • Real download URLs
  • Better error handling
  • TikTok provider implementation

Even though TikTok extraction is less reliable because of platform restrictions, the provider architecture handled that without affecting the rest of the application.


Project Size

Approximately:

  • 2,500 lines of code

Not huge.

Just intentionally written.


What I Learned

1. Architecture First, Code Second

I spent around two hours designing before writing code.

That probably saved ten hours of refactoring later.


2. Modular > DRY

Not every duplicated function deserves extraction.

But every feature deserves its own home.

Organization matters more than chasing perfect DRY.


3. Typed Errors > Ad-Hoc Error Objects

This:

throw new ValidationError(...)

is much better than:

throw {
  status: 400,
  message: "..."
}

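With error classes, TypeScript can check the class name and constructor arguments at compile time, and handlers can branch on instanceof instead of guessing at the shape of a thrown object.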


4. Secrets Belong in .env

Hardcoded configuration eventually becomes technical debt.

Even development values belong in a git-ignored:

.env

with a placeholder for each key (never a real secret) documented in a committed:

.env.example
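
One way to keep configuration out of the code is to read it once at startup and fail fast when something is missing. The sketch below assumes three variables, `PORT`, `DATABASE_URL`, and `JWT_SECRET`. These names and the `loadConfig` helper are hypothetical, not taken from the Vidiflow repository.

```typescript
interface AppConfig {
  port: number;
  databaseUrl: string;
  jwtSecret: string;
}

type Env = Record<string, string | undefined>;

// Returns the variable's value or throws if it is missing or blank.
function requireVar(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Build a typed config object once at startup; fail fast on bad input.
function loadConfig(env: Env): AppConfig {
  const rawPort = env["PORT"] ?? "3000"; // default is an assumption
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got: ${rawPort}`);
  }
  return {
    port,
    databaseUrl: requireVar(env, "DATABASE_URL"),
    jwtSecret: requireVar(env, "JWT_SECRET"),
  };
}

// At startup this would be called as: loadConfig(process.env)
```

Passing the environment in as an argument keeps the function easy to test, and the rest of the application reads from the typed `AppConfig` object rather than from `process.env` directly.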

5. Documentation Isn't Optional

Adding Swagger took less than an hour.

It will save countless hours for anyone using the API.

Documentation is part of the product.


6. Test While You Build

Every endpoint was tested in Postman before moving to the next one.

Small feedback loops catch bugs much earlier.


Tech Stack

  • Language: TypeScript (Strict Mode)
  • Runtime: Node.js
  • Framework: Express.js
  • Database: PostgreSQL + Prisma ORM
  • Authentication: JWT + bcrypt
  • Queues: BullMQ + Redis
  • Real-Time: Socket.IO
  • Security: Helmet, CORS, Express Rate Limit, Express Validator
  • Documentation: Swagger/OpenAPI
  • Video Extraction: yt-dlp

Everything is container-ready and designed for production deployment.


What's Next?

Backend (V3)

  • Unit tests
  • Integration tests
  • Email verification
  • Password reset
  • User profile endpoints
  • More video providers
  • Admin monitoring endpoints

Frontend

I'll be building a React frontend that includes:

  • Authentication
  • Video downloads
  • Download history
  • Favorites
  • Real-time progress tracking

Future Plans

After that:

  • Browser extension
  • Mobile apps
  • Analytics
  • Premium features

Open Source

The project is fully open source on GitHub.

It includes:

  • clean commit history
  • tagged releases
  • proper project structure
  • production architecture

If you're learning backend development, it can serve as a reference for:

  • Structuring a Node.js application
  • Designing extensible systems
  • Implementing authentication
  • Building async workflows
  • Writing clean service layers
  • Handling errors properly

The Biggest Lesson

Building production software isn't about writing more code.

It's about making better decisions.

Things like:

  • Feature-based organization
  • Provider architecture
  • Separation of concerns
  • Typed errors
  • Secure authentication
  • Async processing
  • Real-time updates
  • Documentation

None of these ideas are particularly advanced.

They simply require intentional design.

That's what makes a project maintainable.


Advice for Anyone Building Something Similar

If you're building your own backend, I'd recommend:

  • Start with architecture, not implementation.
  • Write down why you're choosing a pattern.
  • Keep responsibilities separated.
  • Test continuously.
  • Document the reasoning, not just the API.

That's what separates tutorial projects from production systems.


One More Thing

If you're interested in backend architecture, video downloading, or building scalable TypeScript APIs, feel free to check out the project on GitHub.

It's real code with real trade-offs—not tutorial code.

If you find it useful, I'd really appreciate a ⭐ on the repository.

Feedback, suggestions, and discussions are always welcome.

GitHub Repository:

👉 https://github.com/hamidukarimi/Vidiflow-backend


Final Thought

The biggest win isn't that I built a video downloader.

The biggest win is that I understand every architectural decision inside the codebase.

For me, that's what building real software is all about.
