Brian Douglas

Posted on Mar 4, 2023 • Edited on Feb 1, 2024

AI powered knowledge search

#ai #openai #github

Some of the best implementations of AI take existing knowledge bases and make them searchable through prompts. While scrolling through Twitter, I saw a tweet about paul-graham-gpt.

Header image was generated using Midjourney.

In this 9 days of OpenAI series, I am looking at AI projects and their code to help demystify how any dev can build AI-generated projects.

Find more AI projects using OpenSauced

Paul Graham is best known for his work on the programming language Lisp and for cofounding the influential startup accelerator and seed capital firm Y Combinator. He also writes many essays that provide a lot of knowledge for current and future startup founders.

McKay Wrigley built a tool called paul-graham-gpt, which you can use live here, to navigate all of PaulG's essays. I am excited to jump in and take a look at how he did this.

mckaywrigley / paul-graham-gpt

RAG on Paul Graham's essays.

How was mckaywrigley/paul-graham-gpt made?

paul-graham-gpt is described as AI search & chat for all of Paul Graham’s essays. When looking closer at the code, it uses the embeddings API from OpenAI.

_OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:_

This is my first look at embeddings. If you read my previous post on aicommits, that project used completions instead. Based on my reading of the docs, embeddings are useful when traversing existing data and looking for relevance. The code samples in the docs use Amazon food reviews as the example. You might be looking for reviews of condiments, and the relevance you care about is negative reviews. As I understand it, the embeddings capture the tone of the review text, which can be used alongside the ratings.

That is my best explanation after a first look, but check out the embeddings use cases for more context.

How does it work?

The project's README does a great job explaining the techniques. The author is looping over all essays and generating embeddings for each text chunk. This is done in the generateEmbeddings function.

All essay content is stored in scripts/pg.json

// scripts/embed.ts

// This request runs inside a loop in the generateEmbeddings function, using the essay content
const embeddingResponse = await openai.createEmbedding({
  model: "text-embedding-ada-002",
  input: content
});

...

// This is parsing the essays from the JSON
(async () => {
  const book: PGJSON = JSON.parse(fs.readFileSync("scripts/pg.json", "utf8"));

  await generateEmbeddings(book.essays);
})();
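
The snippet above does not show how an essay becomes text chunks, so here is a minimal sketch of one way to do it. The `chunkText` helper and its `maxChars` limit are hypothetical names of mine, not taken from the repo, and the project may well split text differently (for example by tokens rather than characters).

```typescript
// Hypothetical helper (not from the repo): group sentences into chunks
// of at most `maxChars` characters. A single sentence longer than
// `maxChars` is kept whole as its own oversized chunk.
function chunkText(text: string, maxChars: number = 200): string[] {
  // Grab runs of text up to and including sentence-ending punctuation.
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length > maxChars && current) {
      // Adding this sentence would overflow, so close the current chunk.
      chunks.push(current);
      current = sentence;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

Each string returned by `chunkText` would then be sent as the `input` of its own embedding request, so that a search can point back at a specific passage instead of a whole essay.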

Then they take the user's search query to generate an embedding and use the result to find the most relevant passages from the essays.

// pages/api/search.ts

const res = await fetch("https://api.openai.com/v1/embeddings", {
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`
  },
  method: "POST",
  body: JSON.stringify({
    model: "text-embedding-ada-002",
    input
  })
});

...
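
To my understanding of the OpenAI API reference, the embeddings endpoint responds with a `data` array whose entries each carry an `embedding` vector. A minimal sketch of pulling the query vector out of that JSON could look like this. The `extractEmbedding` helper and the `EmbeddingResponse` type are hypothetical names for illustration, not code from the repo.

```typescript
// Only the parts of the embeddings response that this sketch relies on.
interface EmbeddingResponse {
  data: { embedding: number[]; index: number }[];
}

// Hypothetical helper (not from the repo): return the first embedding
// vector in the response, or throw if the response carries none.
function extractEmbedding(json: EmbeddingResponse): number[] {
  const first = json.data[0];
  if (!first || first.embedding.length === 0) {
    throw new Error("No embedding returned");
  }
  return first.embedding;
}
```

With the `res` from the fetch call above, this would be used as `extractEmbedding(await res.json())`, and the resulting vector is what gets compared against the stored essay chunks.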

The comparison is done using cosine similarity across our database of vectors.

// pages/api/search.ts

const {
  data: chunks,
  error
} = await supabaseAdmin.rpc("pg_search", {
  query_embedding: embedding,
  similarity_threshold: 0.01, // cosine similarity
  match_count: matches
});

The Postgres database is hosted on Supabase and uses the pgvector extension. Supabase announced support for it just last month.

Results are ranked by similarity score and returned to the user.
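
In the project this comparison happens inside Postgres, and this post does not show the SQL behind `pg_search`. To illustrate the idea, here is an in-memory sketch of a cosine similarity search with a threshold and a match count. The `searchChunks` function, the `Chunk` and `Match` types, and the strict `>` comparison are my assumptions; only the meaning of the parameters mirrors the call above.

```typescript
interface Chunk {
  content: string;
  embedding: number[];
}

interface Match {
  content: string;
  similarity: number;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
// Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory stand-in for the database search: score every
// chunk, drop the ones at or below the threshold, sort best first, and
// keep at most `matchCount` results.
function searchChunks(
  queryEmbedding: number[],
  chunks: Chunk[],
  similarityThreshold: number,
  matchCount: number
): Match[] {
  return chunks
    .map((c) => ({
      content: c.content,
      similarity: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .filter((m) => m.similarity > similarityThreshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}
```

Scanning every chunk like this is fine for a demo, but a database with a vector extension can do the same ranking next to the data, which is presumably why the project leans on pgvector.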

I enjoyed walking through the code and learning how this works. If I need to correct something, or if you have some insight into the code, please comment. Thanks to McKay for sharing this with us, and be sure to give them a follow and check out their other work in AI, Codewand: AI-powered tools to help your team build software faster.

Also, if you have a project leveraging OpenAI, leave a link in the comments. I'd love to take a look and include it in my 9 days of OpenAI series.

Stay saucy.


DEV Community
