AgentQ
Embeddings & Vector Search in Rails — Semantic Search with pgvector

Streaming AI responses looks cool. But here's the problem: the AI doesn't know anything about your business. Ask it about your users, orders, or documents and it hallucinates.

Embeddings fix this. They turn text into vectors — mathematical fingerprints that capture meaning. Similar ideas cluster together in vector space. Search stops being keyword matching and starts being concept matching.

This post adds semantic search to your Rails app using pgvector and the neighbor gem. By the end, you'll search your content by meaning, not keywords.

What We're Building

A document search where "cloud computing" finds articles about AWS, Azure, and deployment — even if they never use the words "cloud" or "computing".

Setup

First, get pgvector running. If you're on a VPS (which you should be):

# Ubuntu/Debian
sudo apt install postgresql-16-pgvector

# Or use the Docker image
# postgres:16 with pgvector extension

Enable the extension in your database:

CREATE EXTENSION IF NOT EXISTS vector;

Add the gems:

gem 'pgvector'
gem 'neighbor'
gem 'ruby-openai'

bundle install

The Document Model

bin/rails g model Document title:string content:text embedding:vector
bin/rails db:migrate

Your migration needs the vector type. Check that db/migrate/xxx_create_documents.rb looks like:

create_table :documents do |t|
  t.string :title
  t.text :content
  t.vector :embedding, limit: 1536  # OpenAI's embedding dimension
  t.timestamps
end

Generating Embeddings

When a document is saved, we automatically generate its embedding vector:

class Document < ApplicationRecord
  has_neighbors :embedding

  after_save :generate_embedding, if: -> { saved_change_to_title? || saved_change_to_content? }

  private

  def generate_embedding
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )

    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: "#{title}\n\n#{content}"
      }
    )

    embedding = response.dig("data", 0, "embedding")
    update_column(:embedding, embedding)
  end
end

In production you'd wrap this in a background job so saves don't block on the API call, but the logic is the same: text goes in, vector comes out.
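A minimal sketch of that job, assuming Active Job with any queue adapter (GenerateEmbeddingJob is a name invented here, not part of any gem):

```ruby
# app/jobs/generate_embedding_job.rb
class GenerateEmbeddingJob < ApplicationJob
  queue_as :default

  # Re-fetch the document so we embed the latest content,
  # and silently skip records deleted before the job ran.
  def perform(document_id)
    document = Document.find_by(id: document_id)
    document&.send(:generate_embedding)
  end
end

# In the model, the callback then just enqueues:
#   after_save -> { GenerateEmbeddingJob.perform_later(id) },
#              if: :saved_change_to_content?
```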

Semantic Search

Now the good part. Searching by meaning:

class DocumentsController < ApplicationController
  def search
    return render json: [] if params[:query].blank?

    # Convert the query to an embedding
    query_embedding = embed_query(params[:query])

    # Find nearest neighbors using cosine similarity
    @documents = Document.nearest_neighbors(
      :embedding,
      query_embedding,
      distance: "cosine"
    ).limit(10)

    render json: @documents.map { |d| 
      {
        title: d.title,
        content: d.content.truncate(200),
        similarity: 1 - d.neighbor_distance  # Convert distance to similarity
      }
    }
  end

  private

  def embed_query(text)
    client = OpenAI::Client.new(
      access_token: Rails.application.credentials.openai[:api_key]
    )

    response = client.embeddings(
      parameters: {
        model: "text-embedding-3-small",
        input: text
      }
    )

    response.dig("data", 0, "embedding")
  end
end

Add the route:

resources :documents do
  collection do
    get :search
  end
end

How It Works

  1. Embedding generation — OpenAI converts text to a 1536-dimensional vector
  2. Storage — pgvector stores vectors efficiently with indexing
  3. Search — neighbor performs cosine similarity in SQL
  4. Results — Closest vectors = most semantically similar content

The magic is in the similarity calculation. Cosine similarity measures the angle between vectors. Two documents about deployment have similar vectors even if they use different words.
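To make that concrete, here's cosine similarity in plain Ruby. This is illustrative only; in the app, pgvector computes this in SQL:

```ruby
# Cosine similarity: dot(a, b) / (|a| * |b|). 1.0 means same direction,
# 0 means unrelated, -1 means opposite.
def cosine_similarity(a, b)
  dot   = a.zip(b).sum { |x, y| x * y }
  mag_a = Math.sqrt(a.sum { |x| x * x })
  mag_b = Math.sqrt(b.sum { |x| x * x })
  dot / (mag_a * mag_b)
end

# Toy 3-dimensional "embeddings" (real ones have 1536 dimensions):
deploy_doc  = [0.9, 0.1, 0.0]
devops_doc  = [0.8, 0.2, 0.1]
cooking_doc = [0.0, 0.1, 0.9]

cosine_similarity(deploy_doc, devops_doc)  # high: vectors point the same way
cosine_similarity(deploy_doc, cooking_doc) # low: nearly orthogonal
```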

Building a Search UI

<%= form_with url: search_documents_path, method: :get, data: { turbo_frame: "results" } do |f| %>
  <%= f.text_field :query, placeholder: "Search by meaning..." %>
  <%= f.submit "Search" %>
<% end %>

<%= turbo_frame_tag "results" do %>
  <% if @documents %>
    <% @documents.each do |doc| %>
      <div class="result">
        <h3><%= doc.title %></h3>
        <p><%= doc.content.truncate(200) %></p>
        <small>Similarity: <%= (1 - doc.neighbor_distance).round(3) %></small>
      </div>
    <% end %>
  <% end %>
<% end %>

Indexing for Performance

Without an index, vector search scans every row. Add an IVFFlat index. Rails' add_index has no option for pgvector's lists storage parameter, so use raw SQL:

class AddVectorIndexToDocuments < ActiveRecord::Migration[7.1]
  def up
    execute <<~SQL
      CREATE INDEX index_documents_on_embedding
      ON documents
      USING ivfflat (embedding vector_cosine_ops)
      WITH (lists = 100)
    SQL
  end

  def down
    remove_index :documents, name: "index_documents_on_embedding"
  end
end

For datasets under 10k rows, you might skip this. For 100k+, it's essential. Build the index after loading data; IVFFlat picks its cluster centroids from the rows that exist at creation time.
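One tuning knob worth knowing: at query time, IVFFlat searches ivfflat.probes lists (default 1), trading speed for recall. You can raise it per database connection; a sketch from a Rails console or initializer:

```ruby
# Higher probes = better recall, slower queries. Applies to this connection only.
ActiveRecord::Base.connection.execute("SET ivfflat.probes = 10")
```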

Hybrid Search: Keywords + Vectors

Pure semantic search sometimes misses exact matches. Combine both. The % operator and similarity() below come from Postgres's pg_trgm extension, so enable that first:

def hybrid_search(query)
  # Keyword search with trigram similarity
  keyword_results = Document.where(
    "content % :q OR title % :q", q: query
  ).order(
    # Never interpolate user input into SQL; sanitize it first
    Arel.sql(ActiveRecord::Base.sanitize_sql_array(["similarity(content, ?) DESC", query]))
  ).limit(20)

  # Semantic search
  query_embedding = embed_query(query)
  semantic_results = Document.nearest_neighbors(
    :embedding, query_embedding, distance: "cosine"
  ).limit(20)

  # Reciprocal Rank Fusion — combine both result sets
  all_results = (keyword_results + semantic_results).uniq

  # Simple RRF: score = 1 / (rank + k)
  # Implementation left as exercise, or use a gem
  all_results.first(10)
end
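The RRF step left as an exercise above can be sketched in plain Ruby. Each item scores 1 / (k + rank) in every list it appears in, and the scores are summed (k = 60 is the conventional constant):

```ruby
# Reciprocal Rank Fusion: merge ranked lists into one ranking.
# Items appearing high in either list float to the top.
def reciprocal_rank_fusion(*ranked_lists, k: 60)
  scores = Hash.new(0.0)
  ranked_lists.each do |list|
    list.each_with_index do |item, rank|
      scores[item] += 1.0 / (k + rank + 1)
    end
  end
  scores.sort_by { |_, score| -score }.map(&:first)
end

keyword_ids  = [3, 1, 7]  # hypothetical document ids, best match first
semantic_ids = [1, 9, 3]
reciprocal_rank_fusion(keyword_ids, semantic_ids)
# ids 1 and 3 rank highest because they appear in both lists
```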

Production Considerations

Pre-compute embeddings for common queries — Cache embeddings for your top 100 search terms.
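A minimal caching sketch with Rails.cache, assuming the embed_query helper from earlier (the digest keeps cache keys short for long queries):

```ruby
require "digest"

def embed_query_cached(text)
  key = ["query-embedding", Digest::SHA256.hexdigest(text)]
  # Embeddings for the same model and text are stable, so a long TTL is fine
  Rails.cache.fetch(key, expires_in: 1.week) do
    embed_query(text)
  end
end
```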

Batch embedding generation — OpenAI supports up to 2048 inputs per request:

def batch_embed(texts)
  client = OpenAI::Client.new(
    access_token: Rails.application.credentials.openai[:api_key]
  )

  response = client.embeddings(
    parameters: {
      model: "text-embedding-3-small",
      input: texts  # an array of strings, up to 2048 per request
    }
  )

  # Results come back in the same order as the inputs
  response["data"].map { |d| d["embedding"] }
end

Monitor vector size — 1536 dimensions × 4 bytes = ~6KB per document. Plan storage accordingly.
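The arithmetic behind that estimate, for capacity planning (pgvector stores each dimension as a 4-byte float, plus a small per-row header not counted here):

```ruby
dimensions = 1536
bytes_per_vector = dimensions * 4        # 6144 bytes, about 6 KB

documents = 100_000
total_mb = (documents * bytes_per_vector) / 1024.0 / 1024.0
# roughly 586 MB of raw vector data for 100k documents, before index overhead
puts total_mb.round
```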

Next Up

Now you can search by meaning. In the next post, we'll combine this with streaming responses to build a full RAG system — the AI will actually know your data.


Part of the Ruby for AI series. Building AI-powered Rails apps, one post at a time.
