<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matt Engman</title>
    <description>The latest articles on DEV Community by Matt Engman (@matt_engman_a63bbe68e2498).</description>
    <link>https://dev.to/matt_engman_a63bbe68e2498</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2971521%2Fba490e36-85f6-4d68-9c49-dffee80f4eb1.jpg</url>
      <title>DEV Community: Matt Engman</title>
      <link>https://dev.to/matt_engman_a63bbe68e2498</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/matt_engman_a63bbe68e2498"/>
    <language>en</language>
    <item>
      <title>DocuMind - Production-Ready Semantic Document Search with Redis 8 Vector Sets</title>
      <dc:creator>Matt Engman</dc:creator>
      <pubDate>Fri, 01 Aug 2025 17:01:41 +0000</pubDate>
      <link>https://dev.to/matt_engman_a63bbe68e2498/documind-production-ready-semantic-document-search-with-redis-8-vector-sets-3dbi</link>
      <guid>https://dev.to/matt_engman_a63bbe68e2498/documind-production-ready-semantic-document-search-with-redis-8-vector-sets-3dbi</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/redis-2025-07-23"&gt;Redis AI Challenge&lt;/a&gt;: Real-Time AI Innovators&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;DocuMind&lt;/strong&gt; is a production-ready semantic document search system that transforms static document storage into an intelligent, searchable knowledge base. Built with Redis 8 Vector Sets at its core, DocuMind enables natural language queries across entire document collections with sub-second response times.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔍 &lt;strong&gt;Real-time semantic search&lt;/strong&gt; using OpenAI embeddings and Redis Vector Sets&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;Live analytics dashboard&lt;/strong&gt; with search metrics and system health monitoring&lt;/li&gt;
&lt;li&gt;📄 &lt;strong&gt;Intelligent document processing&lt;/strong&gt; with automatic chunking and vector generation&lt;/li&gt;
&lt;li&gt;⚡ &lt;strong&gt;Advanced caching&lt;/strong&gt; with 75% memory reduction through quantized embeddings&lt;/li&gt;
&lt;li&gt;🎯 &lt;strong&gt;Production deployment&lt;/strong&gt; on Google Cloud Run + Vercel with enterprise security&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tech Stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: FastAPI, Redis 8, OpenAI API, Python&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: React, TypeScript, Tailwind CSS, Framer Motion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure&lt;/strong&gt;: Google Cloud Run, Vercel, Redis Cloud&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI/ML&lt;/strong&gt;: OpenAI embeddings, sentence-transformers fallback&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://documind-ruby.vercel.app/" rel="noopener noreferrer"&gt;🚀 &lt;strong&gt;Live Demo&lt;/strong&gt;:&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it yourself:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upload a PDF, DOCX, or TXT document&lt;/li&gt;
&lt;li&gt;Watch real-time processing with vector generation&lt;/li&gt;
&lt;li&gt;Search using natural language (e.g., "artificial intelligence", "business strategy")&lt;/li&gt;
&lt;li&gt;View live analytics showing search performance and system metrics&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/MatthewEngman/documind" rel="noopener noreferrer"&gt;&lt;strong&gt;GitHub Repository&lt;/strong&gt;:&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Used Redis 8
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;DocuMind leverages Redis 8 as the foundation for its entire real-time AI infrastructure:&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🎯 &lt;strong&gt;Redis Vector Sets - The Core Innovation&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native vector storage&lt;/strong&gt; using Redis 8's cutting-edge Vector Sets feature&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized fallback search&lt;/strong&gt; when Redis Stack KNN queries aren't available&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Base64 vector encoding&lt;/strong&gt; for reliable storage and retrieval&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quantized embeddings&lt;/strong&gt; achieving 75% memory reduction vs traditional databases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📊 &lt;strong&gt;Multi-Model Data Architecture&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;DocuMind uses Redis 8's versatility to store multiple data types seamlessly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JSON documents&lt;/strong&gt; for metadata and document information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector Sets&lt;/strong&gt; for semantic embeddings and similarity search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash maps&lt;/strong&gt; for analytics and system metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sets&lt;/strong&gt; for document indexing and relationship management&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ⚡ &lt;strong&gt;Real-Time Performance Features&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic caching&lt;/strong&gt; with intelligent cache invalidation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-second search&lt;/strong&gt; across thousands of document chunks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live analytics&lt;/strong&gt; tracking search patterns and system health&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background processing&lt;/strong&gt; with Redis-based job queues&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔧 &lt;strong&gt;Production-Grade Implementation&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
# Redis Vector Sets integration with fallback search
async def search_vectors(self, query: str, limit: int = 10, 
                        similarity_threshold: float = 0.1):
    # Generate OpenAI embedding
    query_embedding = await embedding_service.generate_embedding(query)
    query_vector = np.array(query_embedding["vector"], dtype=np.float32)

    # Use optimized fallback vector search for Redis Stack compatibility
    results = await self._execute_fallback_search(query_vector, limit)

    # Process and rank results with cosine similarity
    return self._process_search_results(results, query_vector, similarity_threshold)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>redischallenge</category>
      <category>devchallenge</category>
      <category>database</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
