MrDoe
Accelerating ClippyAI with Embedding LLM and Vector Database

This is a submission for the Open Source AI Challenge with pgai and Ollama

What I Built

ClippyAI is an innovative, open-source, multi-platform AI project designed to automate and simplify repetitive tasks such as generating email responses and explaining, summarizing, and translating texts. Earlier this year, I posted on DEV.to about ClippyAI, which uses Ollama to automatically generate answers to repetitive emails.

As this was working quite well, I extended it into a multi-purpose application that integrates seamlessly with the Windows or Linux/X11 clipboard and can be used, e.g., to explain, summarize, or translate texts and code. The only bottleneck has been the speed of LLM inference: running larger models in Ollama at an adequate speed requires at least a modern CPU, or better, a dedicated GPU.

This contest gave me the idea to use an embedding LLM together with a vector database. The database could serve as a cache storing the most common answers, so that they would not have to be generated from scratch every time.

In this project, I integrated an embedding LLM (nomic-embed-text) hosted by Ollama, a PostgreSQL vector database, and pgai to create a system that caches embeddings of template answers in the vector database. This setup allows rapid retrieval of templates for similar questions or tasks, reducing response times to only a few seconds on modern CPUs.

Basic Concept

pgai provides the function ollama_embed, which can be used to send text to the embedding LLM nomic-embed-text hosted by Ollama. As a result, it returns a vector with 768 dimensions. Such a vector can be thought of as a compressed semantic description of the text. When you compare the resulting vectors of two different text inputs by their Euclidean distance, the meaning is the decisive factor, not the similarity of words or characters, as it would be with classical string distance functions like the Levenshtein algorithm.
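
For illustration, here is a minimal query (assuming the pgai and pgvector extensions are installed and Ollama is running locally) that embeds two differently worded but semantically similar sentences and measures their distance with pgvector's <-> operator:

-- Two phrasings of the same request should yield a small distance,
-- while an unrelated sentence would yield a much larger one.
SELECT ai.ollama_embed('nomic-embed-text', 'Please translate this text into German.')
   <-> ai.ollama_embed('nomic-embed-text', 'Can you translate this text to German?')
   AS distance;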

Now, instead of passing the clipboard data and the task directly to a generative LLM hosted by Ollama, they are first sent to a PostgreSQL database.

Storing Embeddings

Embeddings are high-dimensional vectors that represent the semantic meaning of text. A vector database stores these embeddings to enable rapid similarity searches.

To use this concept, we must first fill our vector database with data.
Before storing, we concatenate the clipboard data with the task description in the variable @question.
Then we calculate the embedding vector for @question and store it together with the answer generated by the general-purpose LLM:

INSERT INTO clippy (question, answer, embedding_question, embedding_answer)
SELECT
  @question,
  @answer,
  ai.ollama_embed('nomic-embed-text', @question),
  ai.ollama_embed('nomic-embed-text', @answer);
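
For reference, the underlying table could be defined along these lines. This is a hypothetical sketch (the actual ClippyAI schema may differ): nomic-embed-text produces 768-dimensional vectors, and installing the ai extension with CASCADE also pulls in pgvector, which provides the vector type.

CREATE EXTENSION IF NOT EXISTS ai CASCADE;

-- Hypothetical schema sketch; column names follow the queries in this post.
CREATE TABLE clippy (
    id                 bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    question           text NOT NULL,
    answer             text NOT NULL,
    embedding_question vector(768),
    embedding_answer   vector(768)
);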

From the ClippyAI GUI, this statement is executed when the thumbs-up button is clicked or when the Store all responses as embeddings mode is active.

Embeddings that should not serve as templates can also be removed by clicking the thumbs-down button.
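
Under the hood, the removal could be implemented along these lines (a hypothetical sketch; the actual implementation may differ, e.g. by deleting via the row's primary key):

-- Delete the stored template whose question embedding is closest
-- to the currently displayed question.
DELETE FROM clippy
WHERE id = (
    SELECT id
    FROM clippy
    ORDER BY embedding_question <-> ai.ollama_embed('nomic-embed-text', @question)
    LIMIT 1
);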

Retrieving Answers from Embeddings

By using the pgai extension, the database can execute requests to Ollama directly from a SQL query:

SELECT
  answer,
  embedding_question <->
    ai.ollama_embed('nomic-embed-text', @question)
    AS distance
FROM clippy
WHERE embedding_question <->
  ai.ollama_embed('nomic-embed-text', @question) <= @threshold
ORDER BY distance;

The <-> operator calculates the Euclidean distance between two vectors, and we keep only the results whose distance is below the user-specified @threshold.
Finally, we order the result set ascending by distance, so the closest match comes first.
On the GUI side, users can scroll through the answers by pressing the >> button.
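
As a possible refinement (a sketch, not part of the current ClippyAI code), the embedding could be computed once in a common table expression, so that Ollama is called only once per lookup instead of twice:

-- Embed the question once, then reuse the vector for both
-- the filter and the ordering.
WITH q AS (
    SELECT ai.ollama_embed('nomic-embed-text', @question) AS embedding
)
SELECT
    c.answer,
    c.embedding_question <-> q.embedding AS distance
FROM clippy c, q
WHERE c.embedding_question <-> q.embedding <= @threshold
ORDER BY distance;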

Benefits of This Integration

  • Enhanced Data Privacy: All data processing happens locally, ensuring high levels of data privacy.
  • Efficient Text Processing: Using an embedding LLM for text comparisons provides better results than calculating the distance between two strings, because similarity is measured by semantic likeness.
  • Scalable AI Solutions: Combining PostgreSQL with pgai and pgvector allows for scalable and efficient AI solutions.
  • Cross-Platform Support: This setup works seamlessly on both Windows and Linux platforms.

Demo

Download the latest version at https://github.com/MrDoe/ClippyAI.
See the installation instructions for how to set up the PostgreSQL database with pgai.

Before submitting a task:
[Screenshot: ClippyAI before submitting a task]

After submitting a task:
[Screenshot: ClippyAI after submitting a task]

Tools Used

In my project, I utilized several powerful tools to build the AI system:

  • pgvector: This PostgreSQL extension allows for efficient storage and retrieval of high-dimensional vector data, enabling rapid similarity searches.
  • pgai: Provided the ability to execute requests to Ollama directly from SQL queries, making the integration seamless and efficient.
  • pgai Vectorizer: The vectorizer tool from pgai was used to generate embeddings from the text, which are then stored in the vector database.
  • Ollama's nomic-embed-text: This embedding LLM was crucial for transforming text into high-dimensional vectors that capture semantic meaning.
  • Docker: I used the PostgreSQL + pgai container to set up and run the database environment smoothly.
  • .NET SDK 8.0: Microsoft's open-source framework for C# applications.
  • Avalonia: A platform-independent UI framework for .NET.

Final Thoughts

Building this application was an exciting journey. I learned a lot about integrating and using embedding AI models with vector databases. The combination of PostgreSQL, pgai, and Ollama proved to be a powerful setup for text processing tasks.

I believe this project could significantly enhance productivity in various domains by providing quick and relevant responses to common queries. The seamless integration into the clipboard makes it a handy tool for everyday use, and the local data processing ensures that user data remains private and secure.

Prize Categories

This submission may qualify for the following prize categories:

  • Open-source Models from Ollama: For utilizing Ollama with the free nomic-embed-text LLM.
  • Vectorizer Vibe: For integrating pgai Vectorizer and leveraging vector databases.

Team Submissions

This project was a solo effort, so no additional team members need to be credited.
