Julien L for WiScale

Posted on Apr 7 • Edited on Apr 9

Building a Real-Time Recommendation Engine with VelesDB

#ai #python #tutorial #database

A visitor lands on the Elden Ring product page. They have never signed up, just browsing. Next to the buy button, a snippet appears: "Players also explored: Dark Souls III, God of War, Cyberpunk 2077." They click God of War. They buy both.

Two days later, Alice, a returning customer who already owns Elden Ring and Dark Souls III, visits the same page. Her snippet reads differently: "Based on your library: The Witcher 3, Cyberpunk 2077, Blasphemous." The engine knows her taste.

Same product page. Two completely different snippets. No external API call. No cloud ML pipeline. Just a 3MB binary running on your server, embedded directly in your application process.

This is what we are going to build in this article.

The problem with most recommendation systems

Classic approaches fall into two traps. Either they call an external service (AWS Personalize, Google Recommendations AI) which introduces latency, cost, and a hard dependency on connectivity. Or they pre-compute nightly batch recommendations that are stale the moment a user's behavior changes.

What I wanted for VelesDB was something different: no external service, the recommendation engine lives inside your deployment, and it reacts to the current session in real time with no nightly batch. It also needs to handle two distinct cases, anonymous prospects and authenticated customers, cleanly.

VelesDB® fits this use case naturally. It is a local-first vector + graph + columnar database, written in Rust (about 3MB binary). For recommendations, we will use two of its three pillars: a vector collection to store product embeddings and find semantically similar games, and a graph collection to store user behavior (purchases, views) and traverse purchase history.

Content-based filtering recommends items similar to what the user is currently looking at, based on item attributes alone, no knowledge of other users is needed. Collaborative filtering recommends items based on the behavior of other users who share similar tastes ("people who bought X also bought Y"). Both approaches complement each other and we will implement all three strategies below.

Architecture overview

                 +--------------------------------------+
                 |           VelesDB (3MB, local)        |
                 |                                       |
  Product page   |  +--------------+  +--------------+  |
  request   ---> |  |   products   |  | user_behavior |  |
                 |  |  collection  |  |    graph      |  |
                 |  | (384D cosine)|  | (edges+nodes) |  |
                 |  +------+-------+  +------+--------+  |
                 |         |                 |           |
                 |         v                 v           |
                 |   vector search    graph traversal    |
                 |   (all users)      (customers only)   |
                 |         |                 |           |
                 |         +---------+-------+           |
                 |                   v                   |
                 |        fused recommendations          |
                 +--------------------------------------+
                                    |
                                    v
                          product snippet (HTML/JSON)

The logic branches on authentication state:

Visitor type	Strategy	Data source
Prospect (anonymous)	Content-based filtering	products collection
Customer (authenticated)	Graph history + vector fusion	behavior graph + products

Step 1 - Storing your product catalog

Install VelesDB first:

pip install velesdb

Each product becomes a vector (also called an embedding) in the products collection.

What is a vector embedding? An embedding is a list of floating-point numbers, for example [0.12, -0.87, 0.45, ...], that positions a piece of content in a multi-dimensional space. Two games that share genre, tags, and gameplay style will end up close to each other in that space; two games that have nothing in common will end up far apart. A text embedding model like sentence-transformers/all-MiniLM-L6-v2 produces vectors of 384 numbers (384 dimensions) for any string you feed it. The more dimensions, the more nuance the model can encode, but also the more memory each vector takes.

In production you would use sentence-transformers/all-MiniLM-L6-v2 (384 dimensions). For local testing without GPU, a deterministic function based on product metadata works fine.

import velesdb
from sentence_transformers import SentenceTransformer

db = velesdb.Database("./gamestore_db")

# 384-dimensional collection, cosine similarity
products = db.get_or_create_collection("products", 384, metric="cosine")

# Initialize the embedding model once at startup
encoder = SentenceTransformer("all-MiniLM-L6-v2")

def embed_game(game: dict) -> list[float]:
    """Turn game metadata into a dense vector."""
    text = f"{game['title']} {game['genre']} {game.get('description', '')} {' '.join(game['tags'])}"
    return encoder.encode(text).tolist()

catalog = [
    {
        "title": "Elden Ring",
        "genre": "RPG",
        "publisher": "FromSoftware",
        "price": 59.99,
        "description": "A vast open-world action RPG set in the Lands Between. Brutal combat, cryptic lore, and interconnected dungeons designed by Hidetaka Miyazaki and George R.R. Martin.",
        "tags": ["soulslike", "open-world", "fantasy", "difficult"],
    },
    {
        "title": "Dark Souls III",
        "genre": "RPG",
        "publisher": "FromSoftware",
        "price": 39.99,
        "description": "The final chapter of the Dark Souls trilogy. Punishing yet rewarding combat, intricate level design, and a dark gothic world on the verge of collapse.",
        "tags": ["soulslike", "fantasy", "hardcore", "gothic"],
    },
    {
        "title": "God of War",
        "genre": "Action-RPG",
        "publisher": "Sony",
        "price": 44.99,
        "description": "Kratos and his son Atreus journey through Norse mythology. A cinematic action RPG combining visceral combat with an emotional story of fatherhood and identity.",
        "tags": ["mythology", "action", "story-rich", "combat"],
    },
    {
        "title": "The Witcher 3",
        "genre": "RPG",
        "publisher": "CD Projekt Red",
        "price": 29.99,
        "description": "An open-world RPG following Geralt of Rivia, a monster hunter for hire. Morally complex quests, a living world, and one of the richest narratives in gaming.",
        "tags": ["open-world", "fantasy", "story-rich", "narrative"],
    },
    {
        "title": "Cyberpunk 2077",
        "genre": "RPG",
        "publisher": "CD Projekt Red",
        "price": 49.99,
        "description": "A first-person open-world RPG set in Night City, a megacity obsessed with power and body modification. Deep story branching, cybernetic upgrades, and a neon-drenched dystopia.",
        "tags": ["sci-fi", "open-world", "story-rich", "action"],
    },
    {
        "title": "FIFA 25",
        "genre": "Sports",
        "publisher": "EA",
        "price": 49.99,
        "description": "The most popular football simulation game. Real teams, real players, Ultimate Team card collecting, and competitive online multiplayer.",
        "tags": ["football", "multiplayer", "simulation"],
    },
    {
        "title": "NBA 2K25",
        "genre": "Sports",
        "publisher": "2K",
        "price": 44.99,
        "description": "The definitive basketball simulation. Authentic player animations, MyCareer story mode, and deep franchise management across all NBA teams.",
        "tags": ["basketball", "multiplayer", "simulation"],
    },
    {
        "title": "Minecraft",
        "genre": "Sandbox",
        "publisher": "Mojang",
        "price": 26.99,
        "description": "A procedurally generated world where you gather resources, build structures, and survive the night. Limitless creativity in survival or creative mode.",
        "tags": ["building", "survival", "creativity", "multiplayer"],
    },
    {
        "title": "Stardew Valley",
        "genre": "Simulation",
        "publisher": "ConcernedApe",
        "price": 13.99,
        "description": "Inherit your grandfather's old farm and build it into something great. Grow crops, raise animals, mine for resources, and build relationships with the villagers.",
        "tags": ["farming", "relaxing", "indie", "rpg-elements"],
    },
    {
        "title": "Hollow Knight",
        "genre": "Metroidvania",
        "publisher": "Team Cherry",
        "price": 14.99,
        "description": "A challenging 2D action adventure set in a vast underground kingdom of insects. Precise combat, hundreds of rooms to explore, and a haunting hand-drawn art style.",
        "tags": ["platformer", "difficult", "exploration", "indie"],
    },
]

# Assign stable integer IDs (database requires int IDs, not strings)
points = []
for i, game in enumerate(catalog, start=1):
    vector = embed_game(game)
    points.append({
        "id": i,
        "vector": vector,
        "payload": {
            "title": game["title"],
            "genre": game["genre"],
            "publisher": game["publisher"],
            "price": game["price"],
            "description": game["description"],
            "tags": game["tags"],
        },
    })

count = products.upsert(points)
print(f"Catalog indexed: {count} games")
# -> Catalog indexed: 10 games

Each game lives as a dense float32 vector alongside its full metadata payload.

Dense vs sparse vector: a dense vector has a non-zero value at almost every position. The 384 floats produced by a language model are all meaningful. A sparse vector, by contrast, has mostly zeros (think TF-IDF word counts). Dense vectors capture semantic meaning; sparse vectors capture exact keyword matches. VelesDB supports both, but for recommendations based on genre and mood, dense embeddings are the right tool.

HNSW (Hierarchical Navigable Small World): the index VelesDB builds automatically on top of your vectors. It organizes vectors into a multi-layer graph so that nearest-neighbor search runs in O(log n) rather than scanning every vector. In practice this means sub-millisecond queries even at catalog scale (hundreds of thousands of products).

Updating the catalog is just another upsert. VelesDB will overwrite the vector and payload for that ID in-place:

# Price drop on Hollow Knight
products.upsert([{
    "id": 10,
    "vector": embed_game(catalog[9]),  # vector unchanged
    "payload": {**catalog[9], "price": 9.99},
}])

Step 2 - Storing user behavior as a graph

User interactions, purchases, views, wishlist additions, live in a graph collection. Each user is a node. Each interaction is a labeled edge pointing to a product node (by product ID).

# Graph collection for behavioral data (no vectors needed here)
behavior = db.create_graph_collection("user_behavior")

Registering a customer

def register_customer(user_id: int, name: str, email: str) -> None:
    """Store customer profile as a graph node payload."""
    behavior.store_node_payload(user_id, {
        "name": name,
        "email": email,
        "type": "customer",
        "member_since": "2024-01-01",
    })

Recording interactions during a session

import time

EDGE_COUNTER = 1000  # simple auto-increment for edge IDs

def record_purchase(user_id: int, product_id: int, price: float) -> None:
    global EDGE_COUNTER
    behavior.add_edge({
        "id": EDGE_COUNTER,
        "source": user_id,
        "target": product_id,
        "label": "purchased",
        "properties": {
            "price_paid": price,
            "timestamp": int(time.time()),
        },
    })
    EDGE_COUNTER += 1

def record_view(user_id: int, product_id: int, duration_s: int) -> None:
    global EDGE_COUNTER
    behavior.add_edge({
        "id": EDGE_COUNTER,
        "source": user_id,
        "target": product_id,
        "label": "viewed",
        "properties": {
            "duration_s": duration_s,
            "timestamp": int(time.time()),
        },
    })
    EDGE_COUNTER += 1

Example: Alice and Bob's histories

# Alice, RPG fan, registered customer
register_customer(1001, "Alice", "alice@example.com")
record_purchase(1001, product_id=1, price=59.99)  # Elden Ring
record_purchase(1001, product_id=2, price=39.99)  # Dark Souls III
record_view(1001, product_id=5, duration_s=12)    # FIFA 25 (glanced at)

# Bob, Sports fan
register_customer(1002, "Bob", "bob@example.com")
record_purchase(1002, product_id=6, price=49.99)  # FIFA 25
record_purchase(1002, product_id=7, price=44.99)  # NBA 2K25

The graph is a persistent adjacency structure.

Graph terminology: a node is an entity (a user, a product). An edge is a directed relationship between two nodes, with a label that describes the relationship type (purchased, viewed, wishlisted). Edges can carry additional data as properties (price paid, timestamp, session duration). get_outgoing(user_id) returns all edges that leave a node, what the user did. get_incoming(product_id) returns all edges that arrive at a node, who interacted with that product. Both run in constant time O(degree), meaning the lookup time depends only on how many edges that node has, not on the total size of the graph.

Step 3 - Generating recommendations

Now the interesting part. Two strategies, one clean branch point.

Strategy A: Prospect, content-based filtering

The visitor is anonymous. We have exactly one signal: the product they are looking at. We use the product's vector to find the closest neighbors in the catalog.

def recommend_for_prospect(
    current_product_id: int,
    top_k: int = 4,
) -> list[dict]:
    """Return games similar to the one being viewed.

    Uses vector similarity search, no user history needed.
    """
    # Retrieve the current product's vector
    points = products.get([current_product_id])
    if not points or points[0] is None:
        return []

    current_vector = points[0]["vector"]

    # Find the nearest neighbors (excluding the product itself)
    results = products.search(vector=current_vector, top_k=top_k + 1)
    return [r for r in results if r["id"] != current_product_id][:top_k]

Usage:

# Visitor is on the Elden Ring page
recs = recommend_for_prospect(current_product_id=1, top_k=4)

print("You might also like:")
for r in recs:
    p = r["payload"]
    print(f"  [{r['score']:.2f}] {p['title']} ({p['genre']}) - EUR {p['price']:.2f}")

You might also like:
  [0.98] Dark Souls III (RPG) - EUR 39.99
  [0.95] God of War (Action-RPG) - EUR 44.99
  [0.91] Hollow Knight (Metroidvania) - EUR 14.99
  [0.87] The Witcher 3 (RPG) - EUR 29.99

The score field is the cosine similarity between the two embedding vectors.

Cosine similarity measures the angle between two vectors, not their distance. If A and B are two embedding vectors:
cosine_similarity(A, B) = (A . B) / (|A| x |B|)
Where A . B is the dot product (sum of element-wise products) and |A| is the norm (length) of the vector. The result is always between -1 and 1. Two vectors pointing in the same direction score 1.0 (identical semantic content). Two vectors at a 90-degree angle score 0.0 (nothing in common). Negative scores mean semantic opposition, rare in practice for game embeddings.

Cosine similarity ignores the magnitude of vectors and only looks at direction, which makes it robust to texts of different lengths. That is why VelesDB uses metric="cosine" by default for text embeddings.

A score of 0.98 means Dark Souls III and Elden Ring point in almost exactly the same direction in the 384-dimensional embedding space. They share soulslike mechanics, FromSoftware aesthetics, and difficulty-focused tags.

Strategy B: Customer, graph traversal + multi-query fusion

Alice is logged in. We know she bought Elden Ring and Dark Souls III. She is now on the God of War page. Rather than ignoring her history, we combine it with the current product to produce a truly personalized snippet.

The approach has three steps. First, traverse the graph to retrieve Alice's purchased product IDs. Then fetch the vectors of those products from the catalog. Finally, run a multi-query search using all those vectors together (including the current product), fused via Reciprocal Rank Fusion (RRF).

def recommend_for_customer(
    user_id: int,
    current_product_id: int,
    top_k: int = 4,
) -> list[dict]:
    """Return personalized recommendations fusing purchase history + current context.

    Falls back to content-based filtering if the user has no purchase history.
    """
    # Check the user exists in the graph
    user_profile = behavior.get_node_payload(user_id)
    if user_profile is None:
        return recommend_for_prospect(current_product_id, top_k)

    # Retrieve purchase history via graph traversal
    outgoing_edges = behavior.get_outgoing(user_id)
    purchased_ids = [
        edge["target"]
        for edge in outgoing_edges
        if edge["label"] == "purchased"
    ]

    if not purchased_ids:
        return recommend_for_prospect(current_product_id, top_k)

    # Fetch vectors for purchased products + current product
    all_ids = purchased_ids + [current_product_id]
    points = products.get(all_ids)
    query_vectors = [p["vector"] for p in points if p is not None]

    # Multi-query search with Reciprocal Rank Fusion (RRF)
    # RRF promotes candidates that rank consistently well across all queries,
    # not just the one with the highest single-query score.
    strategy = velesdb.FusionStrategy.rrf(k=60)  # k=60 is the standard smoothing constant
    results = products.multi_query_search(
        query_vectors,
        top_k=top_k + len(purchased_ids) + 1,
        fusion=strategy,
    )

    # Filter out products already owned and the current product
    exclude_ids = set(purchased_ids) | {current_product_id}
    filtered = [r for r in results if r["id"] not in exclude_ids]
    return filtered[:top_k]

Usage:

# Alice is on the God of War page
recs = recommend_for_customer(user_id=1001, current_product_id=3, top_k=4)

print("Based on your library:")
for r in recs:
    p = r["payload"]
    print(f"  {p['title']} ({p['genre']}) - EUR {p['price']:.2f}")

Based on your library:
  The Witcher 3 (RPG) - EUR 29.99
  Cyberpunk 2077 (RPG) - EUR 49.99
  Hollow Knight (Metroidvania) - EUR 14.99
  Stardew Valley (Simulation) - EUR 13.99

Alice gets story-rich RPGs that her purchase history points to, blended with the God of War context. The Sports games never appear, they are too far in vector space from anything in her history.

How RRF works: instead of averaging raw similarity scores (which are on different scales across queries), RRF combines ranks. For each candidate product, it computes:
RRF_score(product) = sum of  1 / (k + rank_i(product))
                    for each query i
Where rank_i(product) is the position of that product in the result list of query i (1 = best), and k is a smoothing constant (60 by default, from the original 2009 Cormack et al. paper). The sum is taken over all n query vectors (Alice's two purchases + the current product = 3 queries here).

Why k=60? It dampens the advantage of the very top position. Without it, a product ranked 1st in a single query would dominate even if it ranked last in all other queries. With k=60, the difference between rank 1 (score: 1/61 = 0.016) and rank 5 (score: 1/65 = 0.015) is small, rewarding products that appear consistently in the top 10 across all queries rather than explosively first in just one.

Why not average scores directly? Because the raw cosine similarity from query A (say 0.95 for Dark Souls vs Elden Ring) is not comparable to the cosine similarity from query B (say 0.62 for Witcher 3 vs Dark Souls). Rank-based fusion is scale-invariant and works reliably regardless of how concentrated or spread out the individual score distributions are.

Bob, visiting the Minecraft page, gets a completely different result:

recs = recommend_for_customer(user_id=1002, current_product_id=8, top_k=4)
print("Based on your library:")
for r in recs:
    print(f"  {r['payload']['title']} ({r['payload']['genre']})")

Based on your library:
  Stardew Valley (Simulation)
  Terraria (Sandbox)
  NBA 2K25 (Sports)
  FIFA 25 (Sports)

His Sports purchases anchor the recommendations, and Minecraft's sandbox-creative vector pulls in Stardew Valley and Terraria as close neighbors.

Strategy C: Also-bought, collaborative filtering via graph

This one is optional, but the question naturally arises: can we surface products that other users with similar taste have bought? Yes, and the graph makes it straightforward.

The idea: for the product currently being viewed, find everyone who bought it, then aggregate what else they bought. Products that appear frequently across multiple buyers rise to the top.

from collections import Counter

def recommend_collaborative(
    current_product_id: int,
    exclude_ids: set[int] | None = None,
    top_k: int = 4,
) -> list[dict]:
    """Surface products co-purchased by users who also bought the current product.

    This works without any vector math, pure graph traversal.
    """
    if exclude_ids is None:
        exclude_ids = set()
    exclude_ids = exclude_ids | {current_product_id}

    # Who bought this product?
    buyer_edges = behavior.get_incoming(current_product_id)
    buyer_ids = [e["source"] for e in buyer_edges if e["label"] == "purchased"]

    if not buyer_ids:
        return []

    # What else did each buyer purchase?
    co_purchase_counts: Counter = Counter()
    for buyer_id in buyer_ids:
        their_edges = behavior.get_outgoing(buyer_id)
        for edge in their_edges:
            if edge["label"] == "purchased" and edge["target"] not in exclude_ids:
                co_purchase_counts[edge["target"]] += 1

    # Rank by frequency, most co-purchased first
    top_ids = [pid for pid, _ in co_purchase_counts.most_common(top_k)]

    # Fetch full product data
    pts = products.get(top_ids)
    results = [p for p in pts if p is not None]
    for r in results:
        r["co_purchase_count"] = co_purchase_counts[r["id"]]
    return results

Usage, works for both prospects and customers:

# Anyone on the Elden Ring page: what did other buyers also buy?
collab = recommend_collaborative(current_product_id=1, top_k=3)

print("Others who bought this also bought:")
for r in collab:
    p = r["payload"]
    print(f"  [{r['co_purchase_count']} buyers] {p['title']} - EUR {p['price']:.2f}")

Others who bought this also bought:
  [2 buyers] Dark Souls III - EUR 39.99
  [1 buyers] The Witcher 3 - EUR 29.99
  [1 buyers] Hollow Knight - EUR 14.99

For authenticated customers, combine it with the personal list and deduplicate:

def recommend_combined(
    user_id: int,
    current_product_id: int,
    top_k: int = 4,
) -> list[dict]:
    """Merge personalized + collaborative signals, deduplicated."""
    # Get the user's purchase history to exclude
    outgoing = behavior.get_outgoing(user_id)
    owned_ids = {e["target"] for e in outgoing if e["label"] == "purchased"}
    exclude = owned_ids | {current_product_id}

    personal = recommend_for_customer(user_id, current_product_id, top_k=top_k)
    collab = recommend_collaborative(current_product_id, exclude_ids=exclude, top_k=top_k)

    seen_ids: set[int] = set()
    merged: list[dict] = []
    for r in collab + personal:
        if r["id"] not in seen_ids and r["id"] not in exclude:
            seen_ids.add(r["id"])
            merged.append(r)
    return merged[:top_k]

The collaborative signal is particularly valuable for top-selling products (many buyers = rich signal) and for prospects who have no personal history yet. For niche titles with few buyers, fall back gracefully to content-based.## Step 4 - Pushing the recommendation snippet

Now that we have results, we need to push them to the product page. The pattern depends on your stack.

Option A: REST API endpoint (FastAPI example)

from fastapi import FastAPI, Request

app = FastAPI()
db = velesdb.Database("./gamestore_db")
products = db.get_or_create_collection("products", 384, metric="cosine")
behavior = db.get_graph_collection("user_behavior")

@app.get("/api/recommendations/{product_id}")
async def get_recommendations(product_id: int, request: Request) -> dict:
    """Return product recommendations for a given product page.

    Reads 'X-User-Id' header to distinguish prospects from customers.
    """
    user_id_header = request.headers.get("X-User-Id")

    if user_id_header is None:
        # Anonymous visitor, content-based
        recs = recommend_for_prospect(product_id, top_k=4)
        label = "Players also explored"
    else:
        # Authenticated customer, personalized
        user_id = int(user_id_header)
        recs = recommend_for_customer(user_id, product_id, top_k=4)
        label = "Based on your library"

    return {
        "label": label,
        "items": [
            {
                "id": r["id"],
                "title": r["payload"]["title"],
                "genre": r["payload"]["genre"],
                "price": r["payload"]["price"],
                "score": round(r["score"], 3),
            }
            for r in recs
        ],
    }

A frontend call then becomes straightforward:

// On the product page (React/Vue/vanilla JS)
async function loadRecommendations(productId) {
  const headers = userId ? { "X-User-Id": userId } : {};
  const res = await fetch(`/api/recommendations/${productId}`, { headers });
  const data = await res.json();

  renderSnippet(data.label, data.items);
}

Option B: Synchronous Python (Django, Flask, server-rendered)

If your stack renders HTML server-side, call the recommendation functions directly in your view and inject the results into the template context. VelesDB queries are synchronous and fast enough to inline:

# Django view (simplified)
def product_detail(request, product_id):
    user_id = request.user.id if request.user.is_authenticated else None

    if user_id:
        recs = recommend_for_customer(user_id, product_id, top_k=4)
        rec_label = "Based on your library"
    else:
        recs = recommend_for_prospect(product_id, top_k=4)
        rec_label = "Players also explored"

    return render(request, "product.html", {
        "product": get_product(product_id),
        "recommendations": recs,
        "rec_label": rec_label,
    })

The typical latency for a cold vector search on a catalog of 50,000 products is under 5ms on a standard developer machine. The graph traversal (get_outgoing) is O(degree), so the lookup time is proportional only to the number of edges that user has, not to the total number of users in the graph. For a customer with 50 purchases, it is effectively instant.

Keeping data fresh

A recommendation engine is only as good as its data freshness. Two operations keep it current.

Adding a new product to the catalog:

def add_product(product_id: int, game_metadata: dict) -> None:
    vector = embed_game(game_metadata)
    products.upsert([{
        "id": product_id,
        "vector": vector,
        "payload": game_metadata,
    }])
    print(f"Product {game_metadata['title']} indexed.")

Recording a purchase after checkout:

def on_purchase_complete(user_id: int, product_id: int, price: float) -> None:
    """Hook called by the order confirmation handler."""
    record_purchase(user_id, product_id, price)
    # The next recommendation call for this user will immediately
    # reflect this purchase, no nightly batch needed.

Because the graph is persistent (VelesDB writes to disk on each edge insertion), restarting the server does not lose behavioral data. The product collection is also durable, it survives restarts and only needs to be re-indexed when the catalog changes.

Deployment options

"Local-first" does not mean running on the end-user's laptop. For a web store, it means VelesDB lives inside your infrastructure, no third-party ML service involved. Three deployment modes are available depending on your stack.

Mode 1: Embedded (what this article uses)

VelesDB runs in the same process as your Python backend. The pip install velesdb package compiles the engine to a native extension that is imported directly:

[Browser] --> HTTP --> [FastAPI / Django process]
                           |__ velesdb (in-process, no network hop)
                                 |__ products collection
                                 |__ user_behavior graph

Concurrency: the Python GIL (Global Interpreter Lock) is held during search() and upsert() calls. This means a single Python process serializes concurrent VelesDB calls. Parallel threads do not speed up vector search. Each search on a 50k-product catalog takes roughly 0.5ms; at that speed a single process can still handle a few hundred recommendations per second. But for real production concurrency, the correct deployment is multiple worker processes:

# Each worker is a separate OS process with its own VelesDB instance
# Python's GIL does not apply across processes
uvicorn main:app --workers 4 --host 0.0.0.0 --port 8000

Each worker loads the database independently. Writes (new purchases) need to reach all workers, use a message queue (Redis pub/sub, Kafka) to fan out behavioral events, or switch to Mode 2 below.

Best for: Python shops (FastAPI, Django, Flask) with moderate traffic. Multi-worker setup handles hundreds to low thousands of concurrent recommendation requests.

Mode 2: Standalone server (velesdb-server)

VelesDB ships a separate Rust binary that exposes a REST API. You run it as a sidecar container and call it over HTTP from any language.

# Docker
docker run -p 8080:8080 -v velesdb_data:/data ghcr.io/cyberlife-coder/velesdb

# From cargo
cargo install velesdb-server && velesdb-server --port 8080 --data-dir ./data

[Browser] --> HTTP --> [Node.js / Java / Ruby backend]
                           |__ HTTP --> [velesdb-server :8080]
                                           |__ products collection
                                           |__ user_behavior graph

Then the recommendation call from any language becomes a plain HTTP POST:

# Search for similar products (replaces products.search() in Python)
curl -X POST http://localhost:8080/collections/products/search \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.12, -0.87, ...], "top_k": 5}'

Best for: polyglot stacks, microservices, or when multiple backend instances need to share the same vector index.

Mode 3: WebAssembly (client-side)

VelesDB compiles to WASM via the @wiscale/velesdb-wasm npm package. The search runs entirely in the browser. The product catalog is shipped once and queried locally in the session, with no server round-trip at all.

npm install @wiscale/velesdb-wasm

import init, { VectorStore } from '@wiscale/velesdb-wasm';

await init();
const store = new VectorStore(384, 'cosine');

// Catalog loaded once on page init (from your CDN or API)
store.insert(1n, new Float32Array(eldenRingEmbedding));
store.insert(2n, new Float32Array(darkSoulsEmbedding));
// ...

// Search runs in the browser, zero server call
const results = store.search(new Float32Array(currentProductEmbedding), 4);

[Browser]
  |__ velesdb-wasm (in-browser, SIMD-optimized)
        |__ products loaded from CDN on first visit
        |__ search() runs in < 1ms client-side

Best for: progressive enhancement (instant recommendations even before the API responds), offline-capable PWAs, or privacy-first architectures where user profile vectors never leave the device.

Mode	Language	Extra infra	Latency	Use case
Embedded	Python only	None	~1ms in-process	FastAPI/Django monolith
Server	Any	One container	~2-5ms + network	Polyglot / microservices
WASM	JavaScript	None	< 1ms in browser	Offline, privacy-first

For the rest of this article we use the embedded mode (Python bindings), but the recommendation logic is identical regardless of which mode you choose. Only the transport layer changes.

Practical limits to keep in mind

This approach shines for catalogs up to roughly 500,000 products on a single machine (HNSW index fits in RAM). Beyond that, you would need to shard by genre cluster or move to a distributed setup, which is outside the current VelesDB scope.

The quality of the recommendations depends directly on the quality of the embeddings. With generic tag concatenation and a small model like all-MiniLM-L6-v2, you get reasonable clusters. With a domain-specific model fine-tuned on game reviews, results improve significantly. The VelesDB API stays the same regardless of the embedding model you choose.

The behavioral graph is in-memory within a graph collection and persists to disk via flush(). For very high-traffic stores (millions of edge writes per hour), you would want to batch write events and call flush() periodically rather than on every edge insertion.

The complete flow in one diagram

User visits product page
        |
        +-- always ---------> collaborative signal
        |                      behavior.get_incoming(product_id)  [who bought it?]
        |                      behavior.get_outgoing(buyer_id)    [what else they bought?]
        |                      rank by co-purchase frequency
        |
        +-- anonymous? ------> content-based search
        |                      products.search(current_vector, top_k=4)
        |                              |
        +-- authenticated? --> graph traversal + vector fusion
                               behavior.get_outgoing(user_id)    [history]
                               products.get(purchased_ids)       [history vectors]
                               products.multi_query_search(
                                   [*history_vecs, current_vec],
                                   fusion=FusionStrategy.rrf()
                               )
                                       |
                               filter out already-owned
                                       |
                               merge(collab + personal), deduplicate
                                       |
                               return top_k as JSON/HTML snippet

What this architecture leaves out: price as a signal

One thing worth noticing: price is stored in every product's payload, but it plays no role in the recommendations above. That is a deliberate simplification. In practice, a EUR 14.99 indie game and a EUR 59.99 AAA title occupy the same vector space if their descriptions are similar enough, and nothing in the current setup prevents recommending a EUR 60 game to a customer who has only ever bought EUR 10 titles.

There are a few ways to bring price into the picture, depending on the tradeoffs you are willing to make. The simplest is a post-filter: after the vector search returns its top candidates, discard anything outside the customer's observed price range. No change to the index, no change to the embedding, just a filter on the payload. A more ambitious approach would encode price as an additional embedding dimension, either by appending a normalized price feature to the raw text embedding before indexing, or by building a separate "price-tier" collection and fusing its results with the semantic results via RRF. The latter keeps concerns separate and lets you tune the weight of price vs. semantic similarity independently.

Neither of these is hard to add on top of what we have built here. The VelesDB API stays the same either way. Worth thinking about before you go live.

Complete runnable example

Copy and run. Requires only pip install velesdb. For real semantic embeddings, also install sentence-transformers and flip the flag at the top.

"""
Complete demo: recommendation engine with VelesDB 1.12.0
  pip install velesdb
  pip install sentence-transformers  # optional, for real embeddings
"""

import shutil
import hashlib
import time
from collections import Counter

import numpy as np
import velesdb

# Set True if sentence-transformers is installed
USE_REAL_EMBEDDINGS = False
EMBED_DIM = 384
DB_PATH = "./demo_gamestore_db"

if USE_REAL_EMBEDDINGS:
    from sentence_transformers import SentenceTransformer
    encoder = SentenceTransformer("all-MiniLM-L6-v2")

    def embed_game(game: dict) -> list[float]:
        text = f"{game['title']} {game['genre']} {game.get('description', '')} {' '.join(game['tags'])}"
        return encoder.encode(text).tolist()
else:
    def embed_game(game: dict) -> list[float]:
        text = f"{game['title']} {game['genre']} {game.get('description', '')} {' '.join(game['tags'])}"
        seed = int(hashlib.sha256(text.encode()).hexdigest()[:8], 16)
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(EMBED_DIM).astype(np.float32)
        return (v / np.linalg.norm(v)).tolist()

catalog = [
    {"title": "Elden Ring", "genre": "RPG", "publisher": "FromSoftware", "price": 59.99,
     "description": "A vast open-world action RPG set in the Lands Between. Brutal combat, cryptic lore, and interconnected dungeons designed by Hidetaka Miyazaki and George R.R. Martin.",
     "tags": ["soulslike", "open-world", "fantasy", "difficult"]},
    {"title": "Dark Souls III", "genre": "RPG", "publisher": "FromSoftware", "price": 39.99,
     "description": "The final chapter of the Dark Souls trilogy. Punishing yet rewarding combat, intricate level design, and a dark gothic world on the verge of collapse.",
     "tags": ["soulslike", "fantasy", "hardcore", "gothic"]},
    {"title": "God of War", "genre": "Action-RPG", "publisher": "Sony", "price": 44.99,
     "description": "Kratos and his son Atreus journey through Norse mythology. A cinematic action RPG combining visceral combat with an emotional story of fatherhood and identity.",
     "tags": ["mythology", "action", "story-rich", "combat"]},
    {"title": "The Witcher 3", "genre": "RPG", "publisher": "CD Projekt Red", "price": 29.99,
     "description": "An open-world RPG following Geralt of Rivia, a monster hunter for hire. Morally complex quests, a living world, and one of the richest narratives in gaming.",
     "tags": ["open-world", "fantasy", "story-rich", "narrative"]},
    {"title": "Cyberpunk 2077", "genre": "RPG", "publisher": "CD Projekt Red", "price": 49.99,
     "description": "A first-person open-world RPG set in Night City, a megacity obsessed with power and body modification. Deep story branching, cybernetic upgrades, and a neon-drenched dystopia.",
     "tags": ["sci-fi", "open-world", "story-rich", "action"]},
    {"title": "FIFA 25", "genre": "Sports", "publisher": "EA", "price": 49.99,
     "description": "The most popular football simulation game. Real teams, real players, Ultimate Team card collecting, and competitive online multiplayer.",
     "tags": ["football", "multiplayer", "simulation"]},
    {"title": "NBA 2K25", "genre": "Sports", "publisher": "2K", "price": 44.99,
     "description": "The definitive basketball simulation. Authentic player animations, MyCareer story mode, and deep franchise management across all NBA teams.",
     "tags": ["basketball", "multiplayer", "simulation"]},
    {"title": "Minecraft", "genre": "Sandbox", "publisher": "Mojang", "price": 26.99,
     "description": "A procedurally generated world where you gather resources, build structures, and survive the night. Limitless creativity in survival or creative mode.",
     "tags": ["building", "survival", "creativity", "multiplayer"]},
    {"title": "Stardew Valley", "genre": "Simulation", "publisher": "ConcernedApe", "price": 13.99,
     "description": "Inherit your grandfather's old farm and build it into something great. Grow crops, raise animals, mine for resources, and build relationships with the villagers.",
     "tags": ["farming", "relaxing", "indie", "rpg-elements"]},
    {"title": "Hollow Knight", "genre": "Metroidvania", "publisher": "Team Cherry", "price": 14.99,
     "description": "A challenging 2D action adventure set in a vast underground kingdom of insects. Precise combat, hundreds of rooms to explore, and a haunting hand-drawn art style.",
     "tags": ["platformer", "difficult", "exploration", "indie"]},
]

shutil.rmtree(DB_PATH, ignore_errors=True)
db = velesdb.Database(DB_PATH)
products = db.get_or_create_collection("products", EMBED_DIM, metric="cosine")
behavior = db.create_graph_collection("user_behavior")

points = []
for i, game in enumerate(catalog, start=1):
    points.append({
        "id": i,
        "vector": embed_game(game),
        "payload": {k: game[k] for k in ("title", "genre", "publisher", "price", "description", "tags")},
    })
print(f"Catalog indexed: {products.upsert(points)} games")

behavior.store_node_payload(1001, {"name": "Alice"})
behavior.add_edge({"id": 1, "source": 1001, "target": 1, "label": "purchased", "properties": {"price_paid": 59.99, "timestamp": int(time.time())}})
behavior.add_edge({"id": 2, "source": 1001, "target": 2, "label": "purchased", "properties": {"price_paid": 39.99, "timestamp": int(time.time())}})

behavior.store_node_payload(1002, {"name": "Bob"})
behavior.add_edge({"id": 3, "source": 1002, "target": 6, "label": "purchased", "properties": {"price_paid": 49.99, "timestamp": int(time.time())}})
behavior.add_edge({"id": 4, "source": 1002, "target": 7, "label": "purchased", "properties": {"price_paid": 44.99, "timestamp": int(time.time())}})


def recommend_for_prospect(current_product_id: int, top_k: int = 4) -> list[dict]:
    pts = products.get([current_product_id])
    if not pts or pts[0] is None:
        return []
    results = products.search(vector=pts[0]["vector"], top_k=top_k + 1)
    return [r for r in results if r["id"] != current_product_id][:top_k]


def recommend_for_customer(user_id: int, current_product_id: int, top_k: int = 4) -> list[dict]:
    user_profile = behavior.get_node_payload(user_id)
    if user_profile is None:
        return recommend_for_prospect(current_product_id, top_k)
    outgoing_edges = behavior.get_outgoing(user_id)
    purchased_ids = [e["target"] for e in outgoing_edges if e["label"] == "purchased"]
    if not purchased_ids:
        return recommend_for_prospect(current_product_id, top_k)
    all_ids = purchased_ids + [current_product_id]
    query_vectors = [p["vector"] for p in products.get(all_ids) if p is not None]
    strategy = velesdb.FusionStrategy.rrf(k=60)
    results = products.multi_query_search(query_vectors, top_k=top_k + len(purchased_ids) + 1, fusion=strategy)
    exclude_ids = set(purchased_ids) | {current_product_id}
    return [r for r in results if r["id"] not in exclude_ids][:top_k]


def recommend_collaborative(current_product_id: int, exclude_ids: set = None, top_k: int = 4) -> list[dict]:
    if exclude_ids is None:
        exclude_ids = set()
    exclude_ids = exclude_ids | {current_product_id}
    buyer_ids = [e["source"] for e in behavior.get_incoming(current_product_id) if e["label"] == "purchased"]
    if not buyer_ids:
        return []
    counts: Counter = Counter()
    for bid in buyer_ids:
        for e in behavior.get_outgoing(bid):
            if e["label"] == "purchased" and e["target"] not in exclude_ids:
                counts[e["target"]] += 1
    pts = products.get([pid for pid, _ in counts.most_common(top_k)])
    results = [p for p in pts if p is not None]
    for r in results:
        r["co_purchase_count"] = counts[r["id"]]
    return results


def show(label, recs):
    print(f"\n  {label}:")
    for r in recs:
        p = r["payload"]
        score = f"score={r['score']:.3f}" if "score" in r else f"buyers={r.get('co_purchase_count')}"
        print(f"    [{score}] {p['title']} ({p['genre']}) - EUR {p['price']:.2f}")


print("\nStrategy A: Prospect on Elden Ring page")
show("Players also explored", recommend_for_prospect(1))

print("\nStrategy B: Alice on God of War page (owns Elden Ring + Dark Souls III)")
show("Based on your library", recommend_for_customer(1001, 3))

print("\nStrategy B: Bob on Minecraft page (owns FIFA 25 + NBA 2K25)")
show("Based on your library", recommend_for_customer(1002, 8))

print("\nStrategy C: Collaborative filtering on Elden Ring page")
show("Others who bought this also bought", recommend_collaborative(1))

shutil.rmtree(DB_PATH, ignore_errors=True)

With USE_REAL_EMBEDDINGS = False the vectors are deterministic (no model required, scores stay low). Flip it to True after pip install sentence-transformers to get scores above 0.85 and semantically meaningful clusters.

Getting started

VelesDB on GitHub, about 3MB binary, source-available under the VelesDB Core License 1.0 (based on ELv2)
Documentation and examples

The project is still young. A star on GitHub helps other developers discover it, and I am always looking for partners and contributors who want to build on local-first AI data infrastructure. Details on velesdb.com.

What recommendation strategy are you running in production today, pure content-based, collaborative filtering, or something hybrid? Drop a comment below.

DEV Community