Relevant arXiv paper RAG

andrew — Mon, 11 Nov 2024 07:43:01 +0000

This is a submission for the Open Source AI Challenge with pgai and Ollama

What I Built

Imagine having an assistant that, given a research topic or paper, instantly connects you with the papers most relevant to you, saving the hours you would otherwise spend sifting through services like arXiv!

This project is a research paper recommendation system that uses retrieval-augmented generation (RAG) with PostgreSQL, pgvectorscale, and a language model (choose from GPT-4o mini, Claude 3.5, or Llama 3.2 3B). Given a *single* arXiv paper ID, the system finds similar research articles using vector embeddings, letting users dive deeper into related work, spot trends, and explore different approaches to the topic.
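To make the similarity step concrete, here is a minimal sketch of a top-k search by cosine distance in plain Python. It is an illustration only: in the project the search runs inside PostgreSQL, and the paper IDs, the 3-dimensional vectors, and the function names below are all made up for the example.

```python
import math

# Hypothetical toy embeddings; real ones have hundreds or thousands of dimensions.
TOY_EMBEDDINGS = {
    "paper-a": [1.0, 0.0, 0.0],
    "paper-b": [0.9, 0.1, 0.0],
    "paper-c": [0.0, 1.0, 0.0],
    "paper-d": [0.5, 0.5, 0.0],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_similar(query_id, embeddings, k=10):
    """Rank every other paper by cosine distance to the query paper."""
    query = embeddings[query_id]
    scored = [
        (paper_id, cosine_distance(query, vector))
        for paper_id, vector in embeddings.items()
        if paper_id != query_id  # do not recommend the query paper itself
    ]
    scored.sort(key=lambda item: item[1])
    return scored[:k]


if __name__ == "__main__":
    for paper_id, distance in top_k_similar("paper-a", TOY_EMBEDDINGS, k=2):
        print(paper_id, round(distance, 3))
```

A vector index does the same ranking approximately, without comparing the query against every stored paper.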

Demo

Hosted GUI:
https://tinyurl.com/timescalechallengeanyademo
Note: The hosted demo may be unavailable because of Gradio's 72-hour URL limit; if so, see the Colab-hosted, self-runnable source code below. The demo can also be slow when multiple people use it at once, and some recent arXiv URLs/papers don't work.

Top-10 Similar Papers Demo

Top-10 Similar Papers Summary/Analysis Demo

Static Preview

Colab Notebook Source Code (Try it: ~10min):
https://tinyurl.com/timescalechallengeanyanotebook

NOTE: The default configuration uses Ollama, but OpenAI, or Anthropic Claude with Cohere embeddings, is preferred because of context length limitations with pgai and Ollama embeddings. With Ollama, the LLM similarity analysis/question step covers 3 papers instead of 10 (see Final Thoughts).
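One way to picture the 3-versus-10 limit is as a context budget: keep adding the most similar papers until the prompt would overflow. The sketch below is an assumption about how such a cap could be applied, not the notebook's actual logic; it uses character counts as a crude stand-in for tokens, and the function name and `text` field are hypothetical.

```python
def select_papers_for_context(papers, max_chars, per_paper_overhead=0):
    """Keep the leading papers whose combined text fits the budget.

    `papers` is assumed to be sorted from most to least similar, each item a
    dict with a "text" key. `per_paper_overhead` covers any prompt scaffolding
    (titles, separators) added around each paper.
    """
    selected = []
    used = 0
    for paper in papers:
        cost = len(paper["text"]) + per_paper_overhead
        if used + cost > max_chars:
            break  # stop at the first paper that does not fit, preserving rank order
        selected.append(paper)
        used += cost
    return selected


if __name__ == "__main__":
    ranked = [{"id": f"p{i}", "text": "x" * 100} for i in range(10)]
    small = select_papers_for_context(ranked, max_chars=350)
    large = select_papers_for_context(ranked, max_chars=5000)
    print(len(small), len(large))
```

With a small budget only the first few papers survive; a model with a larger context window can take all ten.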

Tools Used

pgvector & pgvectorscale: The backbone for storing and searching vector embeddings of arXiv paper texts; each paper's text is converted into a vector representation. A DiskANN index (from pgvectorscale) or an IVFFlat index (from pgvector) groups and indexes the embeddings.
pgai: Used to generate embeddings and answer questions about the research documents. pgai acts as the gateway to OpenAI, Anthropic, Cohere, and Ollama.
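As a rough sketch of the indexing choice, the helper below builds the `CREATE INDEX` statement for either index type, using cosine distance. The table name `papers`, the column name `embedding`, the index naming scheme, and the helper itself are assumptions for illustration; the notebook's schema may differ.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index_ddl(table, column, method="diskann", lists=100):
    """Return a CREATE INDEX statement for a pgvector/pgvectorscale index.

    Identifiers are validated because they are interpolated into SQL text.
    """
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    index_name = f"{table}_{column}_{method}_idx"
    if method == "diskann":
        # DiskANN index type provided by the pgvectorscale extension.
        return (
            f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} "
            f"USING diskann ({column} vector_cosine_ops);"
        )
    if method == "ivfflat":
        # IVFFlat index type provided by pgvector; `lists` is the cluster count.
        return (
            f"CREATE INDEX IF NOT EXISTS {index_name} ON {table} "
            f"USING ivfflat ({column} vector_cosine_ops) WITH (lists = {int(lists)});"
        )
    raise ValueError(f"unknown index method: {method!r}")


if __name__ == "__main__":
    print(build_index_ddl("papers", "embedding"))
    print(build_index_ddl("papers", "embedding", method="ivfflat", lists=50))
```

IVFFlat clusters the vectors into lists and should be built after the table has data, while DiskANN is the pgvectorscale option aimed at larger collections.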

Final Thoughts

Additional unique aspects of this project:
- Use of Postgres stored functions to call pgai functions in 'function mode', which lets users without any direct access to the database or pgai build a RAG (better security).
- Integration with OpenAI, Anthropic, and local Ollama APIs.
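The stored-function idea in the first bullet could look roughly like the sketch below, which builds the SQL for a wrapper function that embeds the query text with pgai and ranks papers by cosine distance. Everything named here is an assumption: the `papers` table and its columns, the function name `find_similar_papers`, the `nomic-embed-text` model, and the use of `SECURITY DEFINER` are my guesses at one way to do it, not the notebook's code. The `ai.ollama_embed(model, text)` call reflects pgai's API as I understand it and should be checked against the installed version.

```python
import re


def build_search_function_sql(embed_model="nomic-embed-text"):
    """Return SQL that creates a hypothetical similarity-search wrapper."""
    # The model name is interpolated into SQL text, so restrict its characters.
    if not re.fullmatch(r"[A-Za-z0-9_.:\-]+", embed_model):
        raise ValueError(f"unexpected model name: {embed_model!r}")
    return f"""
CREATE OR REPLACE FUNCTION find_similar_papers(query_text text, k int DEFAULT 10)
RETURNS TABLE (arxiv_id text, title text, distance float8)
LANGUAGE sql
SECURITY DEFINER  -- runs with the owner's rights, so callers need no table access
AS $$
    SELECT p.arxiv_id,
           p.title,
           p.embedding <=> ai.ollama_embed('{embed_model}', query_text) AS distance
    FROM papers AS p
    ORDER BY distance
    LIMIT k;
$$;
""".strip()


# Callers only need EXECUTE on the function; the values are bound as parameters.
CALL_SQL = "SELECT arxiv_id, title, distance FROM find_similar_papers(%s, %s);"

if __name__ == "__main__":
    print(build_search_function_sql())
    print(CALL_SQL)
```

A production version would also pin `search_path` on a `SECURITY DEFINER` function and grant `EXECUTE` only to the intended roles.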

Learnings

- The learning curve was moderate, but I learned a lot from exploring Timescale's GitHub documentation and writing stored functions (with ChatGPT's help; I took my database systems course more than a year ago and was a little rusty). In the future, I should add more documentation to the notebook and review it for inefficiencies.

Feedback

- pgvector indexes are limited to 4,000 dimensions with halfvec (2,000 with the full-precision vector type), which falls short of the largest embedding sizes; OpenAI's text-embedding-3-large, for example, returns 3,072 dimensions, above the full-vector limit. I wrote additional code to trim the output, which complicated the implementation.
- pgai's Ollama integration may have context length issues (I saw none when using Ollama's interface directly), which limited the later question-answering function to 3 papers. With Anthropic/Cohere, it could handle more.
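For the trimming step mentioned in the first feedback point, a minimal sketch is to cut the vector to the allowed length and rescale it to unit length so cosine comparisons stay well behaved. This is an assumed approach, not necessarily what the notebook does; the function name and the 2,000 default (the full-vector limit mentioned above) are illustrative. Note that truncation discards information unless the embedding model was trained to support shortened vectors.

```python
import math


def truncate_embedding(vector, max_dims=2000):
    """Keep the first `max_dims` values and L2-normalize the result."""
    trimmed = list(vector[:max_dims])
    norm = math.sqrt(sum(x * x for x in trimmed))
    if norm == 0.0:
        return trimmed  # an all-zero vector cannot be normalized
    return [x / norm for x in trimmed]


if __name__ == "__main__":
    long_vector = [0.01 * (i % 7 + 1) for i in range(3072)]
    short_vector = truncate_embedding(long_vector)
    print(len(long_vector), len(short_vector))
```

Every stored vector and every query vector has to be trimmed the same way, otherwise their dimensions will not match at search time.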

DEV Community: andrew
