Hugo

Posted on • Originally published at itapi.ai

Building AI-Powered Search with Text Embeddings: A Hands-On Tutorial

What Are Embeddings?

Embeddings turn text into dense vectors of floating-point numbers. Two sentences with similar meaning will have vectors that point in nearly the same direction. This is the foundation of semantic search, recommendation engines, and RAG (Retrieval-Augmented Generation).
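
To make this concrete, here is a toy sketch with made-up three-dimensional vectors (real embedding models return hundreds or thousands of dimensions) showing how cosine similarity captures "pointing in nearly the same direction":

import numpy as np

# Made-up 3-D vectors standing in for real embeddings
cat = np.array([0.9, 0.1, 0.2])
kitten = np.array([0.85, 0.15, 0.25])  # similar meaning, similar direction
invoice = np.array([0.1, 0.9, 0.3])    # unrelated meaning

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(cat, kitten))   # close to 1.0
print(cosine(cat, invoice))  # noticeably lower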

Building Semantic Search in 20 Lines

import openai
import numpy as np

client = openai.OpenAI(
    api_key="your-itapi-key",
    base_url="https://api.itapi.ai/v1"
)

def get_embedding(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Semantic search over documents
docs = [
    "How to deploy Flask applications to production",
    "Django vs FastAPI: choosing the right Python framework",
    "Setting up PostgreSQL with Docker Compose"
]
doc_embeddings = [get_embedding(d) for d in docs]

query = "Best Python web framework for APIs"
query_emb = get_embedding(query)

# Cosine similarity via plain dot product (OpenAI embeddings come back unit-normalized)
scores = [np.dot(query_emb, d) for d in doc_embeddings]
best_match = docs[np.argmax(scores)]

print(f"Query: {query}")
print(f"Top result: {best_match} (score: {max(scores):.3f})")

Scaling Up

For production, store embeddings in a vector database like Pinecone, Weaviate, or pgvector. The query pattern stays identical: embed the query, compute similarity against the index, return the top-k matches.
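
As a rough sketch of what that looks like with pgvector (the table name, column names, and connection string below are placeholder assumptions, not a fixed schema):

import psycopg  # requires the pgvector extension enabled in PostgreSQL

def search(query: str, top_k: int = 3) -> list[str]:
    q_emb = get_embedding(query)
    # pgvector accepts a "[...]" literal cast to ::vector;
    # <=> is its cosine-distance operator (smaller distance = more similar)
    literal = "[" + ",".join(map(str, q_emb)) + "]"
    with psycopg.connect("postgresql://localhost/appdb") as conn:
        rows = conn.execute(
            "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT %s",
            (literal, top_k),
        ).fetchall()
    return [content for (content,) in rows]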

RAG Pipeline Overview

def answer_question(question: str, knowledge_base: list[str]) -> str:
    # 1. Retrieve relevant context (embedding docs on the fly here;
    #    in production these would be precomputed and indexed)
    q_emb = get_embedding(question)
    scored = [(np.dot(q_emb, get_embedding(d)), d) for d in knowledge_base]
    top_docs = [d for _, d in sorted(scored, reverse=True)[:3]]
    context = "\n".join(top_docs)

    # 2. Generate an answer grounded in the retrieved context
    prompt = f"Answer based on context:\n{context}\nQuestion: {question}"
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
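
A quick usage sketch, reusing the docs list from the earlier example:

print(answer_question("How do I deploy a Flask app?", docs))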

What's Next?

Have you tried integrating multiple LLM providers in a single project? Share your experience or questions in the comments below.


This guide was written for developers who want practical, no-fluff tutorials. If you are building with AI APIs, check out itapi.ai for a developer-friendly platform with transparent pricing and multi-model support.
