Ever wish you could ask your team's knowledge base a question and get instant answers? That's exactly what Notion AI does—and it's pretty magical. But here's the thing: you can build your own version in about 10 minutes.
In this tutorial, we'll create a RAG (Retrieval-Augmented Generation) agent that turns your Notion pages into a personal AI assistant. We're using PostgreSQL with pgvector (yes, Postgres can do vector search!), OpenAI, and Python.
The starter template for this tutorial is available in my GitHub repository: https://github.com/michaelwwn/notion-ai-tutorial. Clone it to get started quickly!
What You'll Need
- Python 3.8 or above
- Docker: For running Postgres with pgvector easily.
- OpenAI API Key: For generating embeddings and responses.
Step 1: Set Up Environment
We'll use Docker to spin up a Postgres instance with the pgvector extension pre-installed.
- Create a docker-compose.yml file in your project directory with the following content:
services:
  postgres:
    image: pgvector/pgvector:pg18
    restart: unless-stopped
    networks:
      - default
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./postgres/schema.sql:/docker-entrypoint-initdb.d/schema.sql
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres

volumes:
  postgres-data:
- Create a postgres directory and add a schema.sql file inside it:
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS runbooks (
    content TEXT,
    embeddings VECTOR(1536)
);

CREATE INDEX ON runbooks USING hnsw (embeddings vector_cosine_ops);
- Start the Docker container:
docker-compose up -d
This will launch Postgres on localhost:5432, create the database, enable pgvector, and set up the runbooks table automatically.
The Magic Behind RAG: What's Actually Happening
Before diving into the code, let's cover the foundational concepts behind transformer models, which power tools like Notion AI. Here is a simplified flow.
Tokenization
Tokenization breaks text into tokens, the basic units a model processes. For "I love playing football.", a simple word-level split produces five tokens: ["I", "love", "playing", "football", "."]. Real tokenizers (like the byte-pair encoding used by OpenAI models) often split words further into subword pieces.
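To make this concrete, here's a toy word-level tokenizer. This is a naive sketch for intuition only; production tokenizers use learned subword vocabularies:

```python
import re

def toy_tokenize(text):
    # Naive split into words and punctuation.
    # Real tokenizers (e.g. BPE) split into learned subword pieces instead.
    return re.findall(r"\w+|[^\w\s]", text)

print(toy_tokenize("I love playing football."))
# ['I', 'love', 'playing', 'football', '.']
```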
Embeddings
Embeddings are dense vectors that capture semantic meaning: texts with similar meanings map to nearby points in a high-dimensional space. Each token is looked up in an embedding table learned during pre-training.
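"Nearby" is usually measured with cosine similarity. Here's a minimal sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions; the values below are illustrative, not real model output):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: "cat" and "kitten" should be closer than "cat" and "car"
cat    = [1.0, 0.9, 0.1]
kitten = [0.95, 1.0, 0.15]
car    = [0.1, 0.2, 1.0]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```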
Positional Encoding
Positional encoding adds position information to the embeddings so the model can take word order into account.
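One common scheme is the sinusoidal encoding from the original transformer paper, sketched below for a small dimension (real models use much larger d_model, and many modern models use learned or rotary encodings instead):

```python
import math

def positional_encoding(pos, d_model=8):
    # Sinusoidal positional encoding from "Attention Is All You Need":
    # alternating sin/cos at geometrically decreasing frequencies
    pe = []
    for i in range(d_model // 2):
        angle = pos / (10000 ** (2 * i / d_model))
        pe.extend([math.sin(angle), math.cos(angle)])
    return pe

print(positional_encoding(0))  # position 0: sin terms are 0.0, cos terms are 1.0
```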
Attention (Self-Attention)
Self-attention is the core of the transformer architecture. It lets the model weigh how relevant every other token is when encoding a given token. In "I ate an apple in the Apple store.", the first "apple" attends strongly to "ate" (the fruit sense), while the second attends to "store" (the company sense).
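At its heart, this is scaled dot-product attention: softmax(QKᵀ/√d)·V. A bare-bones, pure-Python sketch (real implementations are batched matrix operations with learned projection weights):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # how much each position attends to each key
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output row is a weighted blend of the value vectors, with weights determined by query-key similarity.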
Transformer Flow
The encoded embeddings pass through a stack of transformer layers, each combining multi-head self-attention with a feedforward network and normalization. The final layer outputs a contextual embedding vector for each token, which can be used for downstream tasks.
How can we generate a response from a user query?
Notion has documented the process behind their RAG agent. In short, it generates embeddings for each page's content and stores them in a vector database. At query time, the service embeds the user's query, retrieves the most similar content from the database, and finally passes that content to the LLM to generate a response.
Step 2: Start Building
Get your API key from https://platform.openai.com/api-keys. The $5 credit for new users is more than enough for this tutorial. Create a .env file in your project directory:
OPENAI_API_KEY=your-api-key-here
Export the Notion content, including all subpages, as HTML. In this example, I'm using an on-call runbooks table view with the incident details documented in the subpages. You can use the free incident management template from The Stanford Daily if you don't already have one.
Move the exported HTML files to the "pages" folder of your project directory.
Install Dependencies
pip install psycopg pgvector openai beautifulsoup4 python-dotenv
Database Connection Function
A helper to connect to Postgres and register vector support.
import psycopg
from pgvector.psycopg import register_vector
def get_db_connection():
    """Creates a connection to PostgreSQL with vector support."""
    conn = psycopg.connect(
        host="localhost",
        dbname="postgres",
        user="postgres",
        password="postgres",
        port=5432
    )
    register_vector(conn)
    return conn
Extract Text from HTML
Notion exports include lots of markup. We need just the text:
from bs4 import BeautifulSoup
def parse_html_to_text(html_path):
    """Extracts clean text content from HTML file."""
    with open(html_path, 'r', encoding='utf-8') as file:
        soup = BeautifulSoup(file, 'html.parser')
    body = soup.body
    if body:
        return body.get_text(separator='\n', strip=True)
    return ""
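If you'd rather avoid the BeautifulSoup dependency, a rough standard-library equivalent using html.parser could look like this (a minimal sketch; it handles less edge-case HTML than bs4):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    # Feed raw HTML through the parser and join the collected text nodes
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)

print(html_to_text("<body><h1>Runbook</h1><p>Restart the service.</p></body>"))
```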
Ingest Embeddings
This is where we process your Notion pages and store their embeddings:
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
HTML_DIR = "pages"
def ingest():
    """Reads Notion pages and stores them with embeddings in the database."""
    conn = get_db_connection()
    cur = conn.cursor()
    html_files = [f for f in os.listdir(HTML_DIR) if f.endswith('.html')]
    for index, html_file in enumerate(html_files, 1):
        print(f"Processing {index}/{len(html_files)}: {html_file}")
        html_path = os.path.join(HTML_DIR, html_file)
        text = parse_html_to_text(html_path)
        if text:
            # Generate embedding vector for the page content
            response = client.embeddings.create(
                model="text-embedding-3-small",
                input=[text]
            )
            embedding = response.data[0].embedding
            cur.execute("""
                INSERT INTO runbooks (content, embeddings)
                VALUES (%s, %s)
            """, (text, embedding))
    conn.commit()
    cur.close()
    conn.close()
    print(f"✅ Ingested {len(html_files)} pages")
Retrieve Similar Content (Search Embeddings)
Embed the query and find the most similar content using cosine distance (the <=> operator in pgvector).
def retrieve_similar_content(query_text, limit=3):
    """Finds pages most similar to the query using vector search."""
    conn = get_db_connection()
    cur = conn.cursor()
    # Convert query to embedding
    query_embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=[query_text]
    ).data[0].embedding
    # Find similar embeddings using cosine distance
    cur.execute("""
        SELECT content
        FROM runbooks
        ORDER BY embeddings <=> %s::vector
        LIMIT %s
    """, (query_embedding, limit))
    results = cur.fetchall()
    cur.close()
    conn.close()
    return results
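To build intuition for what <=> computes, here's cosine distance in plain Python. This is a toy illustration of the math, not how pgvector implements it internally:

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator returns cosine distance: 1 - cosine similarity.
    # 0.0 means identical direction, 1.0 means orthogonal, 2.0 means opposite.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

print(cosine_distance([1, 0], [1, 0]))  # 0.0 (identical direction)
print(cosine_distance([1, 0], [0, 1]))  # 1.0 (orthogonal)
```

Ordering by this distance ascending, as the SQL above does, puts the most semantically similar pages first.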
Generate Response
def generate_response(query, relevant_contexts):
    """Generates AI answer based on relevant page content."""
    context_str = "\n\n".join([
        f"Knowledge Base {i+1}:\n{content[0]}"
        for i, content in enumerate(relevant_contexts)
    ])
    response = client.responses.create(
        model="gpt-5-mini",
        input=f"""
You are a helpful assistant. Use the following retrieved contexts to answer the user's query.
If the contexts don't contain relevant information, say you don't know.

Retrieved Knowledge Bases:
{context_str}

User Query: {query}

Answer:
""")
    return response.output_text
The Complete Script
Here's everything put together:
import psycopg
from pgvector.psycopg import register_vector
import os
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
HTML_DIR = "pages"

def get_db_connection():
    """Creates a connection to PostgreSQL with vector support."""
    conn = psycopg.connect(
        host="localhost",
        dbname="postgres",
        user="postgres",
        password="postgres",
        port=5432
    )
    register_vector(conn)
    return conn

def parse_html_to_text(html_path):
    """Extracts clean text content from HTML file."""
    with open(html_path, 'r', encoding='utf-8') as file:
        soup = BeautifulSoup(file, 'html.parser')
    body = soup.body
    if body:
        return body.get_text(separator='\n', strip=True)
    return ""

def ingest():
    """Reads Notion pages and stores them with embeddings in the database."""
    conn = get_db_connection()
    cur = conn.cursor()
    html_files = [f for f in os.listdir(HTML_DIR) if f.endswith('.html')]
    for index, html_file in enumerate(html_files, 1):
        print(f"Processing {index}/{len(html_files)}: {html_file}")
        html_path = os.path.join(HTML_DIR, html_file)
        text = parse_html_to_text(html_path)
        if text:
            # Generate embedding vector for the page content
            response = client.embeddings.create(
                model="text-embedding-3-small",
                input=[text]
            )
            embedding = response.data[0].embedding
            cur.execute("""
                INSERT INTO runbooks (content, embeddings)
                VALUES (%s, %s)
            """, (text, embedding))
    conn.commit()
    cur.close()
    conn.close()
    print(f"✅ Ingested {len(html_files)} pages")

def retrieve_similar_content(query_text, limit=3):
    """Finds pages most similar to the query using vector search."""
    conn = get_db_connection()
    cur = conn.cursor()
    # Convert query to embedding
    query_embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=[query_text]
    ).data[0].embedding
    # Find similar embeddings using cosine distance
    cur.execute("""
        SELECT content
        FROM runbooks
        ORDER BY embeddings <=> %s::vector
        LIMIT %s
    """, (query_embedding, limit))
    results = cur.fetchall()
    cur.close()
    conn.close()
    return results

def generate_response(query, relevant_contexts):
    """Generates AI answer based on relevant page content."""
    context_str = "\n\n".join([
        f"Knowledge Base {i+1}:\n{content[0]}"
        for i, content in enumerate(relevant_contexts)
    ])
    response = client.responses.create(
        model="gpt-5-mini",
        input=f"""
You are a helpful assistant. Use the following retrieved contexts to answer the user's query.
If the contexts don't contain relevant information, say you don't know.

Retrieved Knowledge Bases:
{context_str}

User Query: {query}

Answer:
""")
    return response.output_text

if __name__ == "__main__":
    # Step 1: Ingest Notion pages (comment this out after first run)
    ingest()

    # Step 2: Query the knowledge base
    query = "What is our organization email?"
    similar_pages = retrieve_similar_content(query, limit=3)
    response = generate_response(query, similar_pages)
    print(f"\nQuery: {query}")
    print(f"\nAnswer:\n{response}")
You've just built a working RAG agent that can power your team's knowledge base. But this is just the beginning.
Try experimenting with different embedding models, add memory to track conversation history, or implement filtering by document categories. The possibilities are endless once you understand the fundamentals.
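For example, a very rough sketch of conversation memory: keep a running transcript and prepend it to each prompt. The names here (ask, answer_fn, history) are illustrative, not part of the tutorial's code:

```python
# Hypothetical sketch: carry chat history into each prompt so the
# assistant can follow up on earlier turns.
history = []

def ask(query, answer_fn):
    # answer_fn(prompt) -> str; e.g. a thin wrapper around generate_response
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    prompt = f"{transcript}\nUser: {query}" if transcript else f"User: {query}"
    answer = answer_fn(prompt)
    history.append(("User", query))
    history.append(("Assistant", answer))
    return answer
```

In a real system you'd also cap the transcript length (or summarize older turns) to stay within the model's context window.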
Have questions? Drop them in the comments. Thank you for reading! ❤️