How I Built a Search Engine for my YouTube Channel using Elasticsearch Serverless

As my YouTube channel, Ubcodes, continues to grow, I’ve realized that finding specific technical tutorials in a sea of videos can be a challenge for my subscribers. Whether they are looking for a deep dive into GraphQL or a React Native crash course, the standard search experience doesn't always cut it.

I decided to build a solution: a dedicated YouTube Search Library powered by Elasticsearch Serverless.

View the Live Demo here | Explore the GitHub Repo

The Problem: Why Basic Search Wasn't Enough

When you have a growing library of technical content, simple client-side filtering falls short. Users might:

  • Misspell technical terms ("Nxtjs" instead of "Next.js")
  • Search for concepts that appear in descriptions, not just titles
  • Need to find videos by tags or related topics

I wanted to move beyond basic filtering and implement a professional Search AI experience that rivals what you'd find on major platforms.

The Goal: Three Core Requirements

My requirements were simple but high-impact:

  1. Fuzzy Search: If a user types "Nxtjs" instead of "Next.js," they should still find the right video.
  2. Hit Highlighting: Search terms should "glow" in the results so users immediately see why a video was recommended.
  3. Speed: Results should appear almost instantly as the user types (search-as-you-type with debouncing).

The Tech Stack

  • Frontend: Next.js 14 (App Router) with TypeScript
  • Styling: Tailwind CSS for a modern, responsive UI
  • Search Engine: Elasticsearch Serverless (Elastic Cloud)
  • Client Library: @elastic/elasticsearch for Node.js
  • Icons: Lucide React
  • Deployment: Vercel

Architecture Overview

The application follows a clean separation of concerns:

┌─────────────────┐
│   Next.js UI    │  (React Components + Tailwind)
│  (app/page.tsx) │
└────────┬────────┘
         │ HTTP GET /api/search?q=...
         ▼
┌─────────────────┐
│  API Route      │  (app/api/search/route.ts)
│  Elasticsearch  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Elasticsearch   │
│   Serverless    │
│   (Cloud)       │
└─────────────────┘
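
The HTTP contract in the diagram can be sketched as a pair of small helpers, one for each side of the request. This is a minimal sketch rather than the repository's code: `buildSearchUrl` and `readSearchTerm` are hypothetical names, and only the `/api/search?q=...` path comes from the diagram.

```typescript
// Client side: builds the GET URL from the diagram, escaping the user's input.
function buildSearchUrl(query: string): string {
  return `/api/search?q=${encodeURIComponent(query.trim())}`;
}

// Server side: reads the `q` parameter back out of an absolute request URL.
// Returns null when the parameter is missing or only whitespace.
function readSearchTerm(requestUrl: string): string | null {
  const term = new URL(requestUrl).searchParams.get("q")?.trim() ?? "";
  return term.length > 0 ? term : null;
}
```

Inside a route handler, the absolute URL would typically come from `request.url`, and a `null` result is a natural place to return an empty result list without calling Elasticsearch.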

Step-by-Step Implementation

1. Data Structure & Schema Design

I started by structuring my video metadata into a clean JSON format. Each entry includes all the fields needed for rich search:

{
  "id": "1",
  "title": "Build and Deploy a blog with Subscription & Comments",
  "description": "In this tutorial, we'll build and deploy...",
  "thumbnail_url": "https://...",
  "video_url": "https://youtu.be/...",
  "tags": ["nextjs", "react", "tutorial"],
  "published_date": "2024-03-31"
}
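
The entry above maps directly onto a TypeScript type. The sketch below mirrors the field names from the JSON; the `isVideo` guard is a hypothetical helper (not from the repository) for checking entries loaded from a file before they are indexed.

```typescript
// Field names mirror the JSON entry shown above.
interface Video {
  id: string;
  title: string;
  description: string;
  thumbnail_url: string;
  video_url: string;
  tags: string[];
  published_date: string; // e.g. "2024-03-31"
}

// Hypothetical runtime guard: true only when every field has the expected type.
function isVideo(value: unknown): value is Video {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const stringFields = [
    "id",
    "title",
    "description",
    "thumbnail_url",
    "video_url",
    "published_date",
  ];
  const tags = record["tags"];
  return (
    stringFields.every((field) => typeof record[field] === "string") &&
    Array.isArray(tags) &&
    tags.every((tag) => typeof tag === "string")
  );
}
```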

2. Elasticsearch Index Mapping

The key to powerful search is proper field mapping. I created an index with:

  • Text fields (title, description, tags) with the standard analyzer for full-text search
  • Keyword fields for exact matches (useful for filtering)
  • Date field for temporal queries
  • Multi-field mapping on title and tags to support both text search and exact matching

Here's the mapping configuration:

mappings: {
  properties: {
    id: { type: 'keyword' },
    title: {
      type: 'text',
      analyzer: 'standard',
      fields: {
        keyword: { type: 'keyword' }  // For exact matches
      }
    },
    description: { type: 'text', analyzer: 'standard' },
    tags: {
      type: 'text',
      analyzer: 'standard',
      fields: { keyword: { type: 'keyword' } }
    },
    published_date: { type: 'date' }
  }
}

3. The Search API: Where the Magic Happens

The heart of the application is the API route (app/api/search/route.ts). Instead of a simple term match, I implemented a sophisticated multi_match query with field boosting:

const response = await client.search({
  index: 'youtube-videos',
  body: {
    query: {
      multi_match: {
        query: searchTerm,
        fields: ['title^3', 'description^2', 'tags^2'],
        fuzziness: 'AUTO',
        type: 'best_fields'
      }
    },
    highlight: {
      fields: {
        title: {},
        description: {},
        tags: {}
      },
      fragment_size: 150,
      number_of_fragments: 2
    },
    size: 20
  }
});

Key features:

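  • Field Boosting: title^3 multiplies the score of title matches by 3, while description and tags are multiplied by 2, so titles carry the most weight
  • Fuzziness: 'AUTO': Tolerates typos by allowing a number of character edits that depends on the term's length (none for 1-2 characters, one for 3-5, two for longer terms)
  • Highlighting: Returns up to two fragments per field, about 150 characters each, with <em> tags around matching terms
  • Best Fields: Scores each document by its best-matching field instead of adding up the scores of all fields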
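
To make the fuzziness setting more concrete, here is a small sketch of the idea behind it. It is an illustration, not Elasticsearch's implementation: `autoFuzziness`, `editDistance`, and `fuzzyMatches` are hypothetical names, and the distance is plain Levenshtein, whereas Elasticsearch by default also counts a swap of two adjacent characters as a single edit.

```typescript
// Edit budget that fuzziness "AUTO" allows for a term of the given length
// (default thresholds: 0-2 characters exact, 3-5 one edit, longer two edits).
function autoFuzziness(termLength: number): number {
  if (termLength <= 2) return 0;
  if (termLength <= 5) return 1;
  return 2;
}

// Plain Levenshtein distance: counts inserts, deletes, and substitutions.
function editDistance(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// True when the indexed term is within the budget for the query term.
function fuzzyMatches(queryTerm: string, indexedTerm: string): boolean {
  return editDistance(queryTerm, indexedTerm) <= autoFuzziness(queryTerm.length);
}
```

With these rules, "nxtjs" is five characters long and gets one edit, which is enough to reach the sample tag "nextjs" (one inserted letter). Which indexed terms a typo can reach in practice depends on how the analyzer tokenizes each field.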

4. Frontend: Search-as-You-Type Experience

The frontend implements several UX best practices:

Debouncing for Performance

To avoid hitting the API on every keystroke, I implemented a 300ms debounce:

useEffect(() => {
  const timer = setTimeout(() => {
    setDebouncedQuery(query);
  }, 300);
  return () => clearTimeout(timer);
}, [query]);
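
The same timer-reset idea can be written without React, which makes it easier to unit test. The sketch below is an assumption-laden illustration, not code from the repository: `debounce`, `Scheduler`, and `realTimers` are hypothetical names, and the timer functions are injectable only so that a test can replace them with fakes.

```typescript
// Timer functions are injectable so the behaviour can be checked without real delays.
interface Scheduler {
  set: (callback: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
}

const realTimers: Scheduler = {
  set: (callback, ms) => setTimeout(callback, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Each call cancels the pending timer and starts a new one, so `fn` only
// runs with the latest value once `waitMs` passes without another call.
function debounce<T>(
  fn: (value: T) => void,
  waitMs: number,
  timers: Scheduler = realTimers
) {
  let handle: unknown;
  const cancel = (): void => {
    if (handle !== undefined) timers.clear(handle);
    handle = undefined;
  };
  const call = (value: T): void => {
    cancel();
    handle = timers.set(() => {
      handle = undefined;
      fn(value);
    }, waitMs);
  };
  return { call, cancel };
}
```

The `cancel` function plays the same role as the cleanup returned from the effect above: it drops the pending timer when a newer keystroke arrives or the component goes away.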

Highlight Rendering

Elasticsearch returns highlights wrapped in <em> tags. I created a custom function to convert these to styled <mark> elements:

const highlightText = (text: string, highlights?: string[]): React.ReactNode => {
  if (!highlights || highlights.length === 0) {
    return text;
  }

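  // Only the first fragment is rendered; it stands in for the original text.
  const highlighted = highlights[0];
  // The capture group keeps the <em>...</em> pieces in the split output.
  const parts = highlighted.split(/(<em>.*?<\/em>)/g);

  return parts.map((part, index) => {
    if (part.startsWith('<em>') && part.endsWith('</em>')) {
      const matched = part.replace(/<\/?em>/g, ''); // strip the <em> wrapper
      return (
        <mark key={index} className="bg-yellow-200 dark:bg-yellow-800 px-1 rounded">
          {matched}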
        </mark>
      );
    }
    return part;
  });
};
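
The splitting step can also be pulled out into a function that returns plain data, which is easier to test than JSX. This is a sketch under that assumption: `splitHighlight` and `Segment` are hypothetical names, and the function relies on the default `<em>` tags that the highlighter emits.

```typescript
interface Segment {
  text: string;
  highlighted: boolean;
}

// Splits one highlight fragment into plain and highlighted segments.
function splitHighlight(fragment: string): Segment[] {
  return fragment
    .split(/(<em>.*?<\/em>)/g) // the capture group keeps the tagged pieces
    .filter((part) => part.length > 0) // drop empty strings at the edges
    .map((part) => {
      const match = /^<em>(.*)<\/em>$/.exec(part);
      return match
        ? { text: match[1], highlighted: true }
        : { text: part, highlighted: false };
    });
}
```

A component could then map each segment to either a `<mark>` element or a plain string. Because the text goes through React as a string rather than as raw HTML, any other markup in a description stays escaped.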

Loading & Empty States

I added skeleton loaders and empty state messages to provide visual feedback:

  • Loading: Animated skeleton cards while fetching
  • Empty Results: Friendly message with suggestions
  • Initial State: Invitation to start searching

5. Data Indexing Script

I created a standalone Node.js script (index-data.ts) that:

  1. Validates environment variables before connecting
  2. Deletes existing index for clean re-indexing
  3. Creates index with mappings (serverless-compatible)
  4. Bulk indexes videos with error handling
  5. Normalizes dates automatically (handles formats like "2023-10-7" → "2023-10-07")

The script handles Elasticsearch Serverless constraints (no manual shard/replica configuration) and provides clear console feedback.
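
The bulk step and its error handling can be sketched as two pure helpers. This is an illustration under stated assumptions, not the contents of index-data.ts: `toBulkOperations`, `collectBulkErrors`, and the trimmed-down `VideoDoc` and `BulkItem` types are hypothetical, and the response shape shown is only the subset needed here.

```typescript
interface VideoDoc {
  id: string;
  title: string;
  published_date: string;
}

type BulkAction = { index: { _index: string; _id: string } };

// The bulk API expects alternating entries: an action line, then the document.
function toBulkOperations(
  indexName: string,
  videos: VideoDoc[]
): Array<BulkAction | VideoDoc> {
  const operations: Array<BulkAction | VideoDoc> = [];
  for (const video of videos) {
    operations.push({ index: { _index: indexName, _id: video.id } }, video);
  }
  return operations;
}

// Minimal view of one item in a bulk response (assumed subset of the real shape).
interface BulkItem {
  index?: { _id?: string; status: number; error?: { type: string } };
}

// Lists the documents that failed, so the script can report them.
function collectBulkErrors(items: BulkItem[]): string[] {
  const failures: string[] = [];
  for (const item of items) {
    const result = item.index;
    if (result?.error) {
      failures.push(`${result._id ?? "unknown"}: ${result.error.type}`);
    }
  }
  return failures;
}
```

Assuming the 8.x JavaScript client, the array from `toBulkOperations` would be passed as the `operations` option of `client.bulk`, and the `items` array of the response would go to `collectBulkErrors`.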

Challenges & Solutions

Challenge 1: Serverless Configuration

Problem: Elasticsearch Serverless doesn't allow manual shard/replica settings.

Solution: Conditional configuration based on connection method:

// Only add settings for non-serverless deployments
if (process.env.ELASTIC_CLOUD_ID && !process.env.ELASTIC_ENDPOINT) {
  indexBody.settings = {
    number_of_shards: 1,
    number_of_replicas: 0
  };
}

Challenge 2: Date Format Validation

Problem: Inconsistent date formats in JSON data caused indexing errors.

Solution: Automatic date normalization in the indexing script:

function normalizeDate(dateString: string): string {
  const datePattern = /^(\d{4})-(\d{1,2})-(\d{1,2})$/;
  const match = dateString.match(datePattern);
  if (match) {
    const [, year, month, day] = match;
    return `${year}-${month.padStart(2, '0')}-${day.padStart(2, '0')}`;
  }
  return dateString;
}
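
Note that `normalizeDate` pads the parts it recognizes and returns anything else unchanged, so an impossible date would still reach Elasticsearch. A companion check like the one below could flag such entries before indexing. It is a sketch, and `isValidIsoDate` is a hypothetical name, not part of the repository.

```typescript
// Returns true only for a real calendar date written as YYYY-MM-DD.
function isValidIsoDate(value: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) return false;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  // Date.UTC rolls invalid parts over (Feb 30 becomes Mar 2), so a
  // round-trip comparison exposes dates that do not exist.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```

Running this on the output of `normalizeDate` would let the script skip or report a bad entry instead of failing partway through a bulk request.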

Challenge 3: Environment Variable Loading

Problem: Scripts needed to load .env.local explicitly.

Solution: Explicit path resolution for environment files:

import path from 'node:path';
import dotenv from 'dotenv';

const envPath = path.join(process.cwd(), '.env.local');
dotenv.config({ path: envPath });
dotenv.config(); // Fallback to .env
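
Once the files are loaded, the script can check that the variables it needs are actually present before connecting. The sketch below is one possible shape for that check. `ELASTIC_ENDPOINT` and `ELASTIC_CLOUD_ID` appear earlier in this post; `ELASTIC_API_KEY` and the function names are assumptions made for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Treats undefined, empty, and whitespace-only values as missing.
function isBlank(value: string | undefined): boolean {
  return (value ?? "").trim() === "";
}

// Returns a list of human-readable problems; an empty list means the
// environment looks usable.
function checkElasticEnv(env: Env): string[] {
  const problems: string[] = [];
  if (isBlank(env["ELASTIC_ENDPOINT"]) && isBlank(env["ELASTIC_CLOUD_ID"])) {
    problems.push("Set ELASTIC_ENDPOINT or ELASTIC_CLOUD_ID");
  }
  // ELASTIC_API_KEY is an assumed variable name, not confirmed by the article.
  if (isBlank(env["ELASTIC_API_KEY"])) {
    problems.push("Set ELASTIC_API_KEY");
  }
  return problems;
}
```

In a script this would be called with `process.env` right after the `dotenv.config` calls, printing the problems and exiting when the list is not empty.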

Results & Performance

The implementation delivers:

  • Sub-100ms search latency per request, once the 300ms debounce has elapsed
  • Fuzzy matching handles common typos automatically
  • Visual highlighting makes results immediately clear
  • Responsive design works seamlessly on mobile and desktop
  • Scalable architecture ready to handle thousands of videos

What I Learned

Building this project end-to-end reinforced how much Search AI can improve the Developer Experience (DX). It's not just about finding data; it's about the speed and relevance of the discovery process.

Key takeaways:

  1. Elasticsearch Serverless removes infrastructure complexity while providing enterprise-grade search
  2. Field boosting is crucial for relevance—titles should matter more than descriptions
  3. Highlighting transforms search from functional to delightful
  4. Debouncing is essential for search-as-you-type without overwhelming the API
  5. Type safety with TypeScript caught many potential runtime errors early

Even with a small initial dataset, the scalability of Elasticsearch Serverless means this library can grow alongside my channel.

Conclusion

This project demonstrates that you don't need a massive engineering team to build professional search experiences. With modern tools like Next.js 14, Elasticsearch Serverless, and Vercel, a single developer can create search functionality that rivals major platforms.

Check out the GitHub repository to see the full implementation.


Want to build something similar? The repository includes:

  • Complete TypeScript implementation
  • Environment variable setup guide
  • Indexing scripts with error handling
  • Responsive UI components
  • Deployment configuration

Feel free to fork or star!
