Streaming AI responses looks cool. But here's the problem: the AI doesn't know anything about your business. Ask it about your users, orders, or documents and it hallucinates.
Embeddings fix this. They turn text into vectors — mathematical fingerprints that capture meaning. Similar ideas cluster together in vector space. Search stops being keyword matching and starts being concept matching.
This post adds semantic search to your Rails app using pgvector and the neighbor gem. By the end, you'll search your content by meaning, not keywords.
What We're Building
A document search where "cloud computing" finds articles about AWS, Azure, and deployment — even if they never use the words "cloud" or "computing".
Setup
First, get pgvector running. If you're on a VPS (which you should be):
# Ubuntu/Debian
sudo apt install postgresql-16-pgvector
# Or use the Docker image
# postgres:16 with pgvector extension
Enable the extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
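If you'd rather keep that step in version control, a Rails migration can enable the extension for you (the class name here is illustrative):

```ruby
class EnableVectorExtension < ActiveRecord::Migration[7.1]
  def change
    # Equivalent to: CREATE EXTENSION IF NOT EXISTS vector;
    enable_extension "vector"
  end
end
```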
Add the gems:
gem 'pgvector'
gem 'neighbor'
gem 'ruby-openai'
bundle install
The Document Model
bin/rails g model Document title:string content:text embedding:vector
bin/rails db:migrate
Your migration needs the vector type. Check that db/migrate/xxx_create_documents.rb looks like:
create_table :documents do |t|
  t.string :title
  t.text :content
  t.vector :embedding, limit: 1536 # OpenAI's embedding dimension
  t.timestamps
end
Generating Embeddings
When a document is saved, we automatically generate its embedding vector:
class Document < ApplicationRecord
  has_neighbors :embedding

  after_save :generate_embedding, if: :saved_change_to_content?

  private

  def generate_embedding
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: "#{title}\n\n#{content}"
      }
    )
    embedding = response.dig("data", 0, "embedding")
    update_column(:embedding, embedding)
  end
end
This runs async in production (wrap it in a job), but the logic is the same: text goes in, vector comes out.
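If you want the async version, a minimal ActiveJob sketch might look like this (the job name is illustrative; the model callback would call `GenerateEmbeddingJob.perform_later(id)` instead of hitting the API inline):

```ruby
class GenerateEmbeddingJob < ApplicationJob
  queue_as :default

  def perform(document_id)
    document = Document.find(document_id)

    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: "#{document.title}\n\n#{document.content}"
      }
    )
    # update_column skips callbacks, so saving the vector won't re-enqueue the job
    document.update_column(:embedding, response.dig("data", 0, "embedding"))
  end
end
```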
Semantic Search
Now the good part. Searching by meaning:
class DocumentsController < ApplicationController
  def search
    return render json: [] if params[:query].blank?

    # Convert the query to an embedding
    query_embedding = embed_query(params[:query])

    # Find nearest neighbors using cosine similarity
    @documents = Document.nearest_neighbors(
      :embedding,
      query_embedding,
      distance: "cosine"
    ).limit(10)

    render json: @documents.map { |d|
      {
        title: d.title,
        content: d.content.truncate(200),
        similarity: 1 - d.neighbor_distance # Convert distance to similarity
      }
    }
  end

  private

  def embed_query(text)
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )
    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: text
      }
    )
    response.dig("data", 0, "embedding")
  end
end
Add the route:
resources :documents do
  collection do
    get :search
  end
end
How It Works
- Embedding generation — OpenAI converts text to a 1536-dimensional vector
- Storage — pgvector stores vectors efficiently with indexing
- Search — neighbor performs cosine similarity in SQL
- Results — Closest vectors = most semantically similar content
The magic is in the similarity calculation. Cosine similarity measures the angle between vectors. Two documents about deployment have similar vectors even if they use different words.
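That calculation is easy to see in plain Ruby. This is just an illustration of the math — pgvector does it in SQL, far faster:

```ruby
# Cosine similarity: dot product divided by the product of magnitudes.
# 1.0 means "same direction" (same meaning), 0.0 means unrelated.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  dot / (mag_a * mag_b)
end

cosine_similarity([1.0, 2.0], [2.0, 4.0]) # => 1.0 (parallel vectors)
cosine_similarity([1.0, 0.0], [0.0, 1.0]) # => 0.0 (orthogonal vectors)
```

Note that `neighbor` reports cosine *distance* (1 − similarity), which is why the controller subtracts from 1.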
Building a Search UI
<%= form_with url: search_documents_path, method: :get, data: { turbo_frame: "results" } do |f| %>
  <%= f.text_field :query, placeholder: "Search by meaning..." %>
  <%= f.submit "Search" %>
<% end %>

<%= turbo_frame_tag "results" do %>
  <% if @documents %>
    <% @documents.each do |doc| %>
      <div class="result">
        <h3><%= doc.title %></h3>
        <p><%= doc.content.truncate(200) %></p>
        <small>Similarity: <%= (1 - doc.neighbor_distance).round(3) %></small>
      </div>
    <% end %>
  <% end %>
<% end %>
Indexing for Performance
Without an index, vector search scans every row. Add an IVFFlat index:
class AddVectorIndexToDocuments < ActiveRecord::Migration[7.1]
  def up
    # add_index doesn't accept IVFFlat's lists parameter, so drop to SQL
    execute <<~SQL
      CREATE INDEX index_documents_on_embedding
      ON documents
      USING ivfflat (embedding vector_cosine_ops)
      WITH (lists = 100);
    SQL
  end

  def down
    remove_index :documents, name: :index_documents_on_embedding
  end
end
For datasets under 10k rows, you might skip this. For 100k+, it's essential. One catch: IVFFlat builds its clusters from the rows that exist at creation time, so run this migration after your documents are loaded, not on an empty table.
Hybrid Search: Keywords + Vectors
Pure semantic search sometimes misses exact matches. Combine both:
def hybrid_search(query)
  # Keyword search with trigram similarity (requires the pg_trgm extension).
  # Bind parameters instead of interpolating — never put raw user input in SQL.
  keyword_results = Document.where(
    "content % :q OR title % :q", q: query
  ).order(
    Arel.sql(Document.sanitize_sql(["similarity(content, ?) DESC", query]))
  ).limit(20)

  # Semantic search
  query_embedding = embed_query(query)
  semantic_results = Document.nearest_neighbors(
    :embedding, query_embedding, distance: "cosine"
  ).limit(20)

  # Reciprocal Rank Fusion — combine both result sets
  all_results = (keyword_results + semantic_results).uniq

  # Simple RRF: score = 1 / (rank + k)
  # Implementation left as exercise, or use a gem
  all_results.first(10)
end
Production Considerations
Pre-compute embeddings for common queries — Cache embeddings for your top 100 search terms.
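A tiny illustration of that idea with a plain in-process Hash — in a real app, `Rails.cache.fetch` keyed on a digest of the query is a better fit:

```ruby
# Hypothetical helper: memoize query embeddings so repeat searches
# skip the OpenAI round trip entirely.
QUERY_EMBEDDINGS = {}

def cached_embedding(query)
  key = query.strip.downcase             # normalize so "Rails" and "rails " share a key
  QUERY_EMBEDDINGS[key] ||= yield(query) # only calls the API on a cache miss
end
```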
Batch embedding generation — OpenAI supports up to 2048 inputs per request:
def batch_embed(texts)
  client = OpenAI::Client.new(
    access_token: Rails.application.credentials.openai[:api_key]
  )
  response = client.embeddings(
    parameters: {
      model: "text-embedding-3-small",
      input: texts # an array of strings
    }
  )
  response["data"].map { |d| d["embedding"] }
end
Monitor vector size — 1536 dimensions × 4 bytes = ~6KB per document. Plan storage accordingly.
Next Up
Now you can search by meaning. In the next post, we'll combine this with streaming responses to build a full RAG system — the AI will actually know your data.
Part of the Ruby for AI series. Building AI-powered Rails apps, one post at a time.