DEV Community

Oscar
Oscar

Posted on

2 1 1 1

paRAGraph - Narrative RAG chatbot

This is a submission for the Open Source AI Challenge with pgai and Ollama

What I Built

I developed an AI-driven storytelling and conversational application that integrates a chatbot interface powered by Ollama and PostgreSQL with TimescaleDB, designed to deliver responses that are contextually enriched by both past user queries and stored story fragments. By leveraging vector embeddings, the app retrieves relevant past conversations and story fragments to enrich the chatbot's response, creating a dynamic and personalized user experience.

Demo


Image description
To try the app, you can either use a Timescale service to run the provided .sql files or install the necessary extensions locally on your own TimescaleDB instance. Additionally, make sure to set your DATABASE_URL in the .env file to properly configure the database connection.

To use the pgai Vectorizer, your TimescaleDB instance must also have an OpenAI API key set. This is required for generating embeddings with the Vectorizer.

Tools Used

  • TimescaleDB with pgvector: I utilized TimescaleDB to store and manage historical data related to user conversations and story fragments, using pgvector to store vector embeddings for efficient retrieval based on semantic similarity.

  • pgai Vectorizer: The pgai Vectorizer was used to create vector embeddings of story fragments and conversation history. These embeddings allowed for contextually similar data to be fetched based on the user’s current input, enabling rich, relevant responses.

  • pgvectorscale: For even more efficient similarity-based retrieval, I used pgvectorscale to create StreamingDiskANN indexes on both the story fragment embeddings and the conversation history embeddings. This enabled faster and more scalable retrieval of semantically similar fragments from large datasets, improving the performance of the app during chat interactions.

  • Ollama: I integrated Ollama's conversational AI (specifically using the "llama3.2" model) to handle the chatbot’s generation of responses. This model was accessed via API calls and was key to generating coherent, contextually informed replies.

Final Thoughts

This project was both challenging and rewarding, especially in terms of designing an efficient retrieval system for story fragments and past conversations that uses vector embeddings to find semantically similar entries. It was nice to see how AI and database technologies like TimescaleDB and Ollama can come together to build an interactive storytelling experience.

Prize Categories:

  • Open-source Models from Ollama
  • Vectorizer Vibe
  • All the Extensions!

API Trace View

How I Cut 22.3 Seconds Off an API Call with Sentry 🕒

Struggling with slow API calls? Dan Mindru walks through how he used Sentry's new Trace View feature to shave off 22.3 seconds from an API call.

Get a practical walkthrough of how to identify bottlenecks, split tasks into multiple parallel tasks, identify slow AI model calls, and more.

Read more →

Top comments (0)

AWS Security LIVE!

Tune in for AWS Security LIVE!

Join AWS Security LIVE! for expert insights and actionable tips to protect your organization and keep security teams prepared.

Learn More

👋 Kindness is contagious

Dive into an ocean of knowledge with this thought-provoking post, revered deeply within the supportive DEV Community. Developers of all levels are welcome to join and enhance our collective intelligence.

Saying a simple "thank you" can brighten someone's day. Share your gratitude in the comments below!

On DEV, sharing ideas eases our path and fortifies our community connections. Found this helpful? Sending a quick thanks to the author can be profoundly valued.

Okay