Aman Yadav
Local AI Knowledge Base with Next.js, Ollama, and PostgreSQL

This is a submission for the Open Source AI Challenge with pgai and Ollama

What I Built

I developed a fully local, AI-powered Knowledge Base Management System that enables users to upload documents and interact with them through RAG (Retrieval-Augmented Generation). This system harnesses the capabilities of PostgreSQL’s AI extensions alongside Ollama's local models, creating a privacy-centered solution that operates entirely on your own infrastructure.

Key Features:

  1. Fully Local Operation
    • All AI processing happens directly on your machine, powered by Llama 3.1.
    • No data ever leaves your infrastructure, giving you complete privacy and control.
  2. Intelligent Chat Interface with RAG
    • Uses Retrieval-Augmented Generation to ground responses in your documents.
    • Vector similarity search is driven by pgvector, with efficient querying through pgvectorscale (see the retrieval sketch after this list).
  3. Function Calling
    • The assistant can invoke predefined tools, extending the chat beyond plain question answering (sketched later in this post).
  4. Modern Web Interface
    • A clean, intuitive UI built with shadcn/ui.
    • Real-time chat with integrated file management.

Demo

In this demo, users can select multiple PDFs and receive answers based on the content. Here, the hackathon announcement is used as a source. Additionally, users have the option to manage files within the app.

[Screenshot: file selection]

Below is the chat interface, where users interact with the documents.

[Screenshot: chat interface]

GitHub Repository:

Next.js AI Knowledge-base chatbot

A fully local, open-source AI knowledge base built with Next.js and the AI SDK, powered by Ollama and Postgres.


Features

  • Next.js App Router
    • Advanced routing for seamless navigation and performance
    • React Server Components (RSCs) and Server Actions for server-side rendering and increased performance
  • AI SDK
    • Unified API for generating text, structured objects, and tool calls with LLMs
    • Hooks for building dynamic chat and generative user interfaces (a tool-calling sketch follows this list)
  • shadcn/ui
  • Data Persistence
    • Postgres for saving chat history and user data
    • MinIO for efficient file storage
  • NextAuth.js
    • Simple and secure authentication
  • Ollama for AI model management
    • Easily switch between different AI models and providers
  • pgvector for vector similarity search
    • Efficiently store embeddings for similarity search
  • pgvectorscale for scaling vector similarity search
    • Query embeddings for RAG
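
As an illustration of how function calling can hook into the RAG pipeline, here is a minimal sketch using the AI SDK's `tool()` helper with the community `ollama-ai-provider` package (parameter names follow the AI SDK v3/v4 API). The `searchKnowledgeBase` stub is hypothetical and stands in for the pgvector query shown earlier.

```typescript
import { streamText, tool } from "ai";
import { ollama } from "ollama-ai-provider";
import { z } from "zod";

// Placeholder: the real app would run the pgvector similarity query here.
async function searchKnowledgeBase(query: string): Promise<string[]> {
  return [`(chunks matching "${query}")`];
}

const result = await streamText({
  model: ollama("llama3.1"),
  messages: [
    { role: "user", content: "What does the announcement say about prizes?" },
  ],
  // Allow a follow-up generation step so the model can use the tool's result.
  maxSteps: 2,
  tools: {
    // The model can decide to call this tool to pull document chunks into context.
    searchKnowledgeBase: tool({
      description: "Search the uploaded documents for relevant passages",
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => searchKnowledgeBase(query),
    }),
  },
});

// Stream the final answer as it is generated.
for await (const delta of result.textStream) process.stdout.write(delta);
```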


Tools Used

The project leverages several powerful open-source tools:

AI and Database:

  • PostgreSQL as the primary database
  • pgvector for storing embeddings and running similarity search
  • pgvectorscale for scaling similarity search to larger datasets
  • Ollama running two local models (see the embedding sketch below):
    • Llama 3.1 for chat
    • mxbai-embed-large for embeddings
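
For reference, here is a minimal sketch of generating an embedding through Ollama's local HTTP API; it assumes Ollama is serving on its default port with `mxbai-embed-large` already pulled.

```typescript
// Call Ollama's local /api/embeddings endpoint (default port 11434).
export async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "mxbai-embed-large", prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama embedding request failed: ${res.status}`);
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding;
}
```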

Frontend and Backend:

  • Next.js 14 with App Router
  • Vercel AI SDK for chat interfaces
  • shadcn/ui for component design
  • Tailwind CSS for styling
  • NextAuth.js for authentication
  • MinIO for file storage
  • Drizzle as the ORM (see the schema sketch below)
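
Here is a minimal sketch of what a Drizzle schema for document chunks can look like, assuming a drizzle-orm release with built-in pgvector column support (0.31+); the table and column names are illustrative rather than the project's actual schema.

```typescript
import { index, pgTable, serial, text, vector } from "drizzle-orm/pg-core";

// mxbai-embed-large produces 1024-dimensional embeddings,
// which is why the column is sized that way.
export const chunks = pgTable(
  "chunks",
  {
    id: serial("id").primaryKey(),
    documentId: text("document_id").notNull(),
    content: text("content").notNull(),
    embedding: vector("embedding", { dimensions: 1024 }).notNull(),
  },
  (table) => ({
    // An HNSW index keeps similarity queries from degrading to sequential scans.
    embeddingIndex: index("chunks_embedding_idx").using(
      "hnsw",
      table.embedding.op("vector_cosine_ops"),
    ),
  }),
);
```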

Final Thoughts

Building this project for the Open Source AI Challenge was a fulfilling journey. The most challenging yet rewarding aspect was enabling it to run entirely locally, offering a significant data privacy advantage.

Using pgvector and pgvectorscale made the RAG implementation seamless.
The pgai Vectorizer currently supports only OpenAI embeddings. I initially aimed to integrate it with Ollama, but the absence of a compatible tokenization endpoint made that unfeasible, so I implemented the embedding logic directly (sketched below).
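
Here is a minimal sketch of what that hand-rolled ingestion path can look like: split the document into chunks, embed each chunk locally, and persist the vectors. The fixed-size chunker and the module paths are illustrative assumptions, not the project's actual code.

```typescript
import { db } from "./db";            // hypothetical Drizzle client
import { chunks } from "./schema";    // the schema sketched earlier
import { embed } from "./embeddings"; // the Ollama helper sketched earlier

// Naive fixed-size chunking with overlap; the real splitting strategy may differ.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += size - overlap) {
    out.push(text.slice(i, i + size));
  }
  return out;
}

// Embed every chunk locally and persist it alongside its vector.
export async function ingestDocument(documentId: string, fullText: string) {
  for (const content of chunkText(fullText)) {
    const embedding = await embed(content);
    await db.insert(chunks).values({ documentId, content, embedding });
  }
}
```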

Prize categories:

  • Main Category
  • Open-source Models from Ollama

Future Improvements

  • Fine-tune prompts for better tool interaction
  • Add support for more document formats
  • Implement multi-model support
  • Optimize vector search performance
  • Add batch processing capabilities for large document sets
