We are building a lightweight recommendation layer that uses an LLM to rank items from a small product catalog based on a user's recent history and stated preferences. It is useful for teams that need explainable suggestions without deploying a full embedding pipeline or training a collaborative filtering model.
What you'll need
- Python 3.10 or newer
- The OpenAI SDK: pip install openai
- An Oxlo.ai API key from https://portal.oxlo.ai
Step 1: Define the catalog and user profile
I start with a hardcoded catalog of electronics and a short user history. Keeping everything in plain Python dictionaries makes the tutorial easy to port to your own database later.
CATALOG = [
{"id": "p1", "name": "Sony WH-1000XM5", "category": "headphones", "price": 348, "tags": ["noise-canceling", "wireless", "over-ear"]},
{"id": "p2", "name": "Apple AirPods Pro 2", "category": "headphones", "price": 249, "tags": ["noise-canceling", "wireless", "in-ear"]},
{"id": "p3", "name": "Audio-Technica ATH-M50x", "category": "headphones", "price": 149, "tags": ["studio", "wired", "over-ear"]},
{"id": "p4", "name": "Logitech MX Master 3S", "category": "mouse", "price": 99, "tags": ["wireless", "ergonomic", "productivity"]},
{"id": "p5", "name": "Keychron Q1 Pro", "category": "keyboard", "price": 199, "tags": ["mechanical", "wireless", "hot-swappable"]},
]
USER_PROFILE = {
"recent_views": ["p1", "p4"],
"bought_last_month": ["p5"],
"preferred_categories": ["headphones", "keyboard"],
"budget_hint": "under 300",
}
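Because the profile stores only product IDs, it helps to be able to resolve them back to full records. The following is a minimal sketch, not part of the original script: index_by_id is a hypothetical helper, and the two-item sample simply mirrors the shape of CATALOG.

```python
def index_by_id(catalog):
    """Map each product id to its full record for constant-time lookups."""
    return {item["id"]: item for item in catalog}

# Two records in the same shape as CATALOG (tags omitted for brevity)
sample_catalog = [
    {"id": "p1", "name": "Sony WH-1000XM5", "category": "headphones", "price": 348},
    {"id": "p4", "name": "Logitech MX Master 3S", "category": "mouse", "price": 99},
]

by_id = index_by_id(sample_catalog)
# Resolve the ids in recent_views to names, skipping ids missing from the catalog
viewed_names = [by_id[pid]["name"] for pid in ["p1", "p4"] if pid in by_id]
print(viewed_names)
```

When you port this to a database, the same lookup would typically become a primary-key query.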
Step 2: Build a minimal retrieval layer
Before sending everything to the model, I filter the catalog to the user's preferred categories. This cuts token usage and keeps the context focused.
def retrieve_candidates(catalog, user_profile, max_items=8):
    preferred = set(user_profile.get("preferred_categories", []))
    candidates = [
        item for item in catalog
        if item["category"] in preferred
    ]
    # Cheapest first, then cap the list at max_items
    candidates.sort(key=lambda x: x["price"])
    return candidates[:max_items]
candidates = retrieve_candidates(CATALOG, USER_PROFILE)
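Note that this filter looks only at category, so the keyboard the user already bought (p5) and the over-budget WH-1000XM5 (p1) still reach the model. A stricter variant could drop them in code rather than relying on the prompt. This is a sketch under assumptions: retrieve_candidates_strict and parse_budget are hypothetical names, and budget_hint is assumed to contain a single number that acts as an exclusive upper bound.

```python
import re

def parse_budget(hint):
    """Return the first integer in a hint such as "under 300", or None."""
    match = re.search(r"\d+", hint or "")
    return int(match.group()) if match else None

def retrieve_candidates_strict(catalog, user_profile, max_items=8):
    preferred = set(user_profile.get("preferred_categories", []))
    bought = set(user_profile.get("bought_last_month", []))
    cap = parse_budget(user_profile.get("budget_hint"))  # assumption: upper bound
    candidates = [
        item for item in catalog
        if item["category"] in preferred
        and item["id"] not in bought
        and (cap is None or item["price"] < cap)
    ]
    # Cheapest first, then cap the list at max_items
    candidates.sort(key=lambda x: x["price"])
    return candidates[:max_items]

# Trimmed copies of the tutorial's records (tags omitted for brevity)
sample_catalog = [
    {"id": "p1", "category": "headphones", "price": 348},
    {"id": "p2", "category": "headphones", "price": 249},
    {"id": "p3", "category": "headphones", "price": 149},
    {"id": "p4", "category": "mouse", "price": 99},
    {"id": "p5", "category": "keyboard", "price": 199},
]
sample_profile = {
    "bought_last_month": ["p5"],
    "preferred_categories": ["headphones", "keyboard"],
    "budget_hint": "under 300",
}

strict = retrieve_candidates_strict(sample_catalog, sample_profile)
print([item["id"] for item in strict])  # ['p3', 'p2']
```

With the tutorial's data this leaves only two candidates, so the model would have fewer than three items to rank. Whether a hard filter or a soft hint in the prompt is the better trade-off depends on how strict the budget really is.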
Step 3: Write the system prompt
The system prompt tells the model how to behave, how to format JSON, and what factors to weigh. I keep it strict so the output is predictable.
SYSTEM_PROMPT = """
You are a product recommendation engine.
Given a JSON user profile and a JSON list of candidate products, rank the top 3 products for this user.
Return ONLY a JSON object with this exact structure:
{
"recommendations": [
{
"product_id": "string",
"rank": 1,
"reason": "One sentence explaining why this fits the user."
}
]
}
Consider the user's recent views, purchase history, preferred categories, and budget hint.
Do not recommend items the user already bought.
"""
Step 4: Call Oxlo.ai for ranking and explanation
I format the user message as a JSON string containing both the profile and the filtered candidates. Oxlo.ai's flat per-request pricing means I do not need to worry about prompt length when I add extra context, so I can include full product descriptions without ballooning cost.
import json
import os
from openai import OpenAI
# Read the key from the OXLO_API_KEY environment variable exported in "Run it"
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=os.environ["OXLO_API_KEY"])
user_message = json.dumps({
"user_profile": USER_PROFILE,
"candidates": candidates
}, indent=2)
response = client.chat.completions.create(
model="llama-3.3-70b",
messages=[
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": user_message},
],
temperature=0.2,
)
output = response.choices[0].message.content
recommendations = json.loads(output)
print(json.dumps(recommendations, indent=2))
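Calling json.loads directly on the reply assumes the model returned bare, well-formed JSON with valid product IDs. A more defensive parse might look like the sketch below. parse_recommendations is a hypothetical helper, and the sample reply is made up to exercise the checks; it is not real model output.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker

def parse_recommendations(raw, candidate_ids, excluded_ids=()):
    """Parse the reply and keep only recommendations that pass basic checks."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    allowed = set(candidate_ids) - set(excluded_ids)
    valid = [
        rec for rec in data.get("recommendations", [])
        if rec.get("product_id") in allowed
    ]
    return sorted(valid, key=lambda rec: rec.get("rank", 0))

# Made-up reply: wrapped in a fence, with one purchased item and one unknown id
sample_reply = FENCE + "json\n" + json.dumps({
    "recommendations": [
        {"product_id": "p5", "rank": 1, "reason": "Already bought."},
        {"product_id": "p2", "rank": 2, "reason": "Fits the budget."},
        {"product_id": "p9", "rank": 3, "reason": "Not in the catalog."},
    ]
}) + "\n" + FENCE

clean = parse_recommendations(sample_reply, ["p1", "p2", "p3", "p5"], excluded_ids=["p5"])
print([rec["product_id"] for rec in clean])  # ['p2']
```

Filtering after the call means a hallucinated ID or a repeat of a past purchase never reaches the user, even when the prompt instructions are ignored.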
Step 5: Add conversational follow-up
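A real system should handle follow-up questions. The chat completions API is stateless, so I resend the full message history with the assistant's previous response appended and then ask for a cheaper alternative. Because Oxlo.ai supports multi-turn conversations with no cold starts, the second request is just as fast as the first. The system prompt still applies, so the reply comes back in the same JSON structure.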
messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": user_message},
{"role": "assistant", "content": output},
{"role": "user", "content": "Can you suggest something cheaper? My budget is tighter now."},
]
follow_up = client.chat.completions.create(
model="llama-3.3-70b",
messages=messages,
temperature=0.2,
)
print(follow_up.choices[0].message.content)
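If the conversation runs for more than one follow-up, building the list by hand gets repetitive. A small helper can extend the history one turn at a time. This is a sketch only: add_turn is a hypothetical name, and the placeholder strings stand in for the real prompt and reply.

```python
def add_turn(messages, assistant_reply, user_follow_up):
    """Return a new history with the assistant's reply and the next question appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": user_follow_up},
    ]

# Placeholder history in the same shape as the messages list above
history = [
    {"role": "system", "content": "You are a product recommendation engine."},
    {"role": "user", "content": '{"user_profile": {}, "candidates": []}'},
]
extended = add_turn(history, '{"recommendations": []}', "Can you suggest something cheaper?")
print([m["role"] for m in extended])  # ['system', 'user', 'assistant', 'user']
```

Returning a new list instead of mutating the input keeps earlier turns available for logging or retries.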
Run it
Save the complete script as recs.py, export your key, and run it.
export OXLO_API_KEY="sk-oxlo.ai-..."
python recs.py
Example output from Step 4 (the exact ranking and wording can vary between runs):
{
"recommendations": [
{
"product_id": "p2",
"rank": 1,
"reason": "The user prefers wireless noise-canceling headphones and has not yet purchased the AirPods Pro 2, which fits the under-300 budget."
},
{
"product_id": "p1",
"rank": 2,
"reason": "They recently viewed the WH-1000XM5, indicating strong interest, though it is at the top of their budget."
},
{
"product_id": "p3",
"rank": 3,
"reason": "A solid wired studio option at a lower price point, useful as a secondary pair for their setup."
}
]
}
Next steps
Swap the category filter in Step 2 for semantic retrieval using Oxlo.ai's embeddings endpoint. Send product names and descriptions through bge-large or e5-large, store the vectors in your database, and retrieve candidates by cosine similarity to an embedding of the user's query or history.
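The retrieval half of that idea can be sketched without any API call. The vectors below are made-up three-dimensional stand-ins for real embeddings, and cosine_similarity and top_k_by_similarity are hypothetical helper names; in practice the vectors would come from the embeddings endpoint.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_by_similarity(query_vec, product_vecs, k=3):
    """Return the k product ids whose vectors are most similar to the query."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in product_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [pid for pid, _ in scored[:k]]

# Toy vectors with invented values, standing in for stored product embeddings
product_vecs = {
    "p1": [0.9, 0.1, 0.0],
    "p3": [0.7, 0.0, 0.3],
    "p4": [0.0, 1.0, 0.0],
}
query_vec = [1.0, 0.0, 0.0]

top = top_k_by_similarity(query_vec, product_vecs, k=2)
print(top)  # ['p1', 'p3']
```

A linear scan like this is fine for a small catalog; a larger one would usually move the similarity search into a vector index.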
Alternatively, add an explicit feedback loop. Log each recommendation with a thumbs-up or thumbs-down, append that signal to the user profile JSON, and pass it back in the next request so the model can take those preferences into account without any retraining.
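One way to store that signal is sketched below. The "feedback" key and the record_feedback helper are assumptions made for illustration; the original profile has no such field, and the system prompt would need a line telling the model how to use it.

```python
import copy

def record_feedback(user_profile, product_id, liked):
    """Return a copy of the profile with one thumbs-up or thumbs-down appended."""
    updated = copy.deepcopy(user_profile)
    updated.setdefault("feedback", []).append(
        {"product_id": product_id, "signal": "up" if liked else "down"}
    )
    return updated

# Minimal profile in the same shape as USER_PROFILE
profile = {"recent_views": ["p1", "p4"], "budget_hint": "under 300"}
profile_v2 = record_feedback(profile, "p2", liked=False)
print(profile_v2["feedback"])  # [{'product_id': 'p2', 'signal': 'down'}]
```

Because the updated profile is serialized into the next user message, the feedback influences ranking only through the prompt; nothing about the model itself changes.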