Streaming AI responses looks cool. But here's the problem: the AI doesn't know anything about your business. Ask it about your users, orders, or documents and it hallucinates.
Embeddings fix this. They turn text into vectors — mathematical fingerprints that capture meaning. Similar ideas cluster together in vector space. Search stops being keyword matching and starts being concept matching.
This post adds semantic search to your Rails app using pgvector and the neighbor gem. By the end, you'll search your content by meaning, not keywords.
What We're Building
A document search where "cloud computing" finds articles about AWS, Azure, and deployment — even if they never use the words "cloud" or "computing".
Setup
First, get pgvector running. If you're on a VPS (which you should be):
# Ubuntu/Debian
sudo apt install postgresql-16-pgvector
# Or use the Docker image
# postgres:16 with pgvector extension
Enable the extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
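If you'd rather keep that step in version control, a Rails migration can enable the extension for you (the class name here is illustrative):

```ruby
class EnableVectorExtension < ActiveRecord::Migration[7.1]
  def change
    # Equivalent to: CREATE EXTENSION IF NOT EXISTS vector;
    enable_extension "vector"
  end
end
```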
Add the gems:
gem 'pgvector'
gem 'neighbor'
gem 'ruby-openai'
bundle install
The Document Model
bin/rails g model Document title:string content:text embedding:vector
bin/rails db:migrate
Your migration needs the vector type. Check that db/migrate/xxx_create_documents.rb looks like:
create_table :documents do |t|
  t.string :title
  t.text :content
  t.vector :embedding, limit: 1536 # OpenAI's embedding dimension
  t.timestamps
end
Generating Embeddings
When a document is saved, we automatically generate its embedding vector:
class Document < ApplicationRecord
  has_neighbors :embedding

  after_save :generate_embedding, if: :saved_change_to_content?

  private

  def generate_embedding
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: "#{title}\n\n#{content}"
      }
    )
    embedding = response.dig("data", 0, "embedding")
    update_column(:embedding, embedding)
  end
end
This runs async in production (wrap it in a job), but the logic is the same: text goes in, vector comes out.
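If you want the async version, a minimal ActiveJob sketch might look like this (the job name is illustrative; the model callback would call `GenerateEmbeddingJob.perform_later(id)` instead of hitting the API inline):

```ruby
class GenerateEmbeddingJob < ApplicationJob
  queue_as :default

  def perform(document_id)
    document = Document.find(document_id)

    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: "#{document.title}\n\n#{document.content}"
      }
    )
    # update_column skips callbacks, so saving the vector won't re-enqueue the job
    document.update_column(:embedding, response.dig("data", 0, "embedding"))
  end
end
```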
Semantic Search
Now the good part. Searching by meaning:
class DocumentsController < ApplicationController
  def search
    return render json: [] if params[:query].blank?

    # Convert the query to an embedding
    query_embedding = embed_query(params[:query])

    # Find nearest neighbors using cosine similarity
    @documents = Document.nearest_neighbors(
      :embedding,
      query_embedding,
      distance: "cosine"
    ).limit(10)

    render json: @documents.map { |d|
      {
        title: d.title,
        content: d.content.truncate(200),
        similarity: 1 - d.neighbor_distance # Convert distance to similarity
      }
    }
  end

  private

  def embed_query(text)
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: text
      }
    )
    response.dig("data", 0, "embedding")
  end
end
Add the route:
resources :documents do
  collection do
    get :search
  end
end
How It Works
- Embedding generation — OpenAI converts text to a 1536-dimensional vector
- Storage — pgvector stores vectors efficiently with indexing
- Search — neighbor performs cosine similarity in SQL
- Results — Closest vectors = most semantically similar content
The magic is in the similarity calculation. Cosine similarity measures the angle between vectors. Two documents about deployment have similar vectors even if they use different words.
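That calculation is easy to see in plain Ruby. This is just an illustration of the math — pgvector does it in SQL, far faster:

```ruby
# Cosine similarity: dot product divided by the product of magnitudes.
# 1.0 means "same direction" (same meaning), 0.0 means unrelated.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  dot / (mag_a * mag_b)
end

cosine_similarity([1.0, 2.0], [2.0, 4.0]) # => 1.0 (parallel vectors)
cosine_similarity([1.0, 0.0], [0.0, 1.0]) # => 0.0 (orthogonal vectors)
```

Note that `neighbor` reports cosine *distance* (1 − similarity), which is why the controller subtracts from 1.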
Building a Search UI
<%= form_with url: search_documents_path, method: :get, data: { turbo_frame: "results" } do |f| %>
  <%= f.text_field :query, placeholder: "Search by meaning..." %>
  <%= f.submit "Search" %>
<% end %>

<%= turbo_frame_tag "results" do %>
  <% if @documents %>
    <% @documents.each do |doc| %>
      <div class="result">
        <h3><%= doc.title %></h3>
        <p><%= doc.content.truncate(200) %></p>
        <small>Similarity: <%= (1 - doc.neighbor_distance).round(3) %></small>
      </div>
    <% end %>
  <% end %>
<% end %>
Indexing for Performance
Without an index, vector search scans every row. Add an IVFFlat index:
class AddVectorIndexToDocuments < ActiveRecord::Migration[7.1]
  def up
    # add_index doesn't accept IVFFlat's lists parameter, so drop to SQL
    execute <<~SQL
      CREATE INDEX index_documents_on_embedding
      ON documents
      USING ivfflat (embedding vector_cosine_ops)
      WITH (lists = 100);
    SQL
  end

  def down
    remove_index :documents, name: :index_documents_on_embedding
  end
end
For datasets under 10k rows, you might skip this. For 100k+, it's essential. One catch: IVFFlat builds its clusters from the rows that exist at creation time, so run this migration after your documents are loaded, not on an empty table.
Hybrid Search: Keywords + Vectors
Pure semantic search sometimes misses exact matches. Combine both:
def hybrid_search(query)
  # Keyword search with trigram similarity (requires the pg_trgm extension).
  # Bind parameters instead of interpolating — never put raw user input in SQL.
  keyword_results = Document.where(
    "content % :q OR title % :q", q: query
  ).order(
    Arel.sql(Document.sanitize_sql(["similarity(content, ?) DESC", query]))
  ).limit(20)

  # Semantic search
  query_embedding = embed_query(query)
  semantic_results = Document.nearest_neighbors(
    :embedding, query_embedding, distance: "cosine"
  ).limit(20)

  # Reciprocal Rank Fusion — combine both result sets
  all_results = (keyword_results + semantic_results).uniq

  # Simple RRF: score = 1 / (rank + k)
  # Implementation left as exercise, or use a gem
  all_results.first(10)
end
Production Considerations
Pre-compute embeddings for common queries — Cache embeddings for your top 100 search terms.
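A tiny illustration of that idea with a plain in-process Hash — in a real app, `Rails.cache.fetch` keyed on a digest of the query is a better fit:

```ruby
# Hypothetical helper: memoize query embeddings so repeat searches
# skip the OpenAI round trip entirely.
QUERY_EMBEDDINGS = {}

def cached_embedding(query)
  key = query.strip.downcase             # normalize so "Rails" and "rails " share a key
  QUERY_EMBEDDINGS[key] ||= yield(query) # only calls the API on a cache miss
end
```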
Batch embedding generation — OpenAI supports up to 2048 inputs per request:
def batch_embed(texts)
  client = OpenAI::Client.new(
    access_token: Rails.application.credentials.openai[:api_key]
  )
  response = client.embeddings(
    parameters: {
      model: "text-embedding-3-small",
      input: texts # an array of strings
    }
  )
  response["data"].map { |d| d["embedding"] }
end
Monitor vector size — 1536 dimensions × 4 bytes = ~6KB per document. Plan storage accordingly.
Next Up
Now you can search by meaning. In the next post, we'll combine this with streaming responses to build a full RAG system — the AI will actually know your data.
Part of the Ruby for AI series. Building AI-powered Rails apps, one post at a time.