How to get day-one relevance when you don't have data (and probably never did)
Everyone wants an "AI-powered matching engine".
In practice, this usually means one thing:
"We'll train a neural network and let it figure things out."
That sounds reasonable --- until you ask the first uncomfortable question:
Where exactly will the training data come from?
This article is about that gap between ambition and reality.
It's about building matching systems before you have Big Data, feedback loops, or ML infrastructure --- and still delivering relevance from day one.
1. The real business problem: "Where do we get data to train a neural network?"
Let's start with the problem most teams avoid articulating clearly.
Neural networks do not fail because they are bad.
They fail because they need data that doesn't exist yet.
To train a meaningful matching model, you need:
- historical matches
- outcomes (success/failure)
- user behavior (clicks, acceptances, conversions)
- enough volume to avoid overfitting
Early-stage systems have none of that.
This creates a paradox:
- you need good matching to get users
- you need users to get data
- you need data to train matching
Most teams quietly ignore this and ship:
- random relevance
- overconfident AI labels
- or brittle rule engines disguised as "ML"
That's not a technical issue.
That's a product and architecture problem.
2. A concrete use case: choosing the right marketing channel or agency
To make this tangible, let's define a clear use case.
Imagine a company launching a new marketing campaign.
They want to choose the right advertising channel, agency, or influencer.
Their constraints are realistic:
- limited budget
- brand reputation at stake
- unclear expectations about what will work
- no historical performance data in this exact setup
On the supply side (channels, agencies, influencers), you have:
- different levels of reach
- different credibility
- different risk profiles
- different communication styles
The business question is not:
"Which option is statistically similar to this campaign?"
The real question is:
"Which option best fits the expectations and constraints of this campaign?"
That's a compatibility problem, not a similarity problem.
3. Why "just train a neural network" doesn't work here
At this point, someone usually says:
"Let's just embed everything and train a model later."
That works only if:
- you already have outcomes
- you already have labels
- you already have scale
In our use case, you don't.
Trying to use neural networks here leads to one of three failures:
- The model overfits on tiny data
- The model outputs noise that looks confident
- The team disables the model "temporarily" --- permanently
The real issue is not lack of ML talent.
It's that the system has no prior understanding of what "fit" means.
So you need a prior.
4. Reframing the problem: similarity vs compatibility
This is the key conceptual shift.
Most ML tooling is built around similarity:
- cosine similarity
- Euclidean distance
- nearest neighbors
Similarity answers:
"How alike are these two things?"
But matching in business systems rarely asks that question.
Instead, it asks:
"How appropriate is this option for this context?"
That's compatibility.
Compatibility is:
- asymmetric
- expectation-driven
- domain-specific
And it can be expressed explicitly, without pretending to learn it from non-existent data.
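To see the difference in code: cosine similarity is symmetric by construction, while a compatibility lookup is not. A minimal sketch --- the vectors and the two compatibility scores below are made up purely for illustration:

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    # Symmetric by construction: cosine(a, b) == cosine(b, a)
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

# Similarity: order never matters
print(cosine([1.0, 0.2], [0.8, 0.6]) == cosine([0.8, 0.6], [1.0, 0.2]))  # True

# Compatibility: order is the whole point (scores are illustrative)
compat = {
    ('corporate', 'nano'): 0.2,  # a corporate campaign rarely fits a nano voice
    ('nano', 'corporate'): 0.0,  # the reverse isn't even the same question
}
print(compat[('corporate', 'nano')] == compat[('nano', 'corporate')])  # False
```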
5. Solution: Compatibility Matrix (feature matrix, not ML)
Now we get to the core idea.
Instead of trying to learn relevance, we encode domain knowledge as a matrix.
We define two small, stable feature spaces.
Campaign side
blog_type ∈ { corporate, brand_voice, expert, personal }
This captures:
- how formal the communication should be
- how much authority is expected
- how much personal storytelling is acceptable
Supply side (agency / influencer / channel)
social_status ∈ { celebrity, macro, micro, nano }
This captures:
- perceived authority
- reach expectations
- risk tolerance
- credibility
Now we define a compatibility matrix:
compatibility[blog_type][social_status] → score ∈ [0..1]
This matrix answers:
"Given this campaign style, how appropriate is this level of authority?"
It is not a guess.
It is a product hypothesis.
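One way to keep these two feature spaces small and stable is to pin them down as explicit types, so the matrix can't silently drift as the product evolves. A minimal sketch --- a convention, not a requirement; the Enum names simply mirror the values above:

```python
from enum import Enum

class BlogType(Enum):
    CORPORATE = 'corporate'
    BRAND_VOICE = 'brand_voice'
    EXPERT = 'expert'
    PERSONAL = 'personal'

class SocialStatus(Enum):
    CELEBRITY = 'celebrity'
    MACRO = 'macro'
    MICRO = 'micro'
    NANO = 'nano'

# The matrix lookup then becomes impossible to call with a typo:
# compatibility(blog_type: BlogType, status: SocialStatus) -> float
```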
6. Example: a simple 4×4 compatibility matrix
Let's make this concrete.
|             | celebrity | macro | micro | nano |
|-------------|-----------|-------|-------|------|
| corporate   | 1.0       | 0.8   | 0.4   | 0.2  |
| brand_voice | 0.7       | 1.0   | 0.8   | 0.5  |
| expert      | 0.6       | 0.9   | 1.0   | 0.7  |
| personal    | 0.3       | 0.6   | 0.9   | 1.0  |
```python
# Compatibility Matrix lookup (Day 1 matching)
matrix = {
    'corporate':   [1.0, 0.8, 0.4, 0.2],
    'brand_voice': [0.7, 1.0, 0.8, 0.5],
    'expert':      [0.6, 0.9, 1.0, 0.7],
    'personal':    [0.3, 0.6, 0.9, 1.0],
}

def matrix_score(campaign, influencer):
    """O(1) lookup --- handles thousands of RPS with no problem."""
    statuses = ['celebrity', 'macro', 'micro', 'nano']  # column order of the matrix
    idx = statuses.index(influencer)
    return matrix[campaign][idx]

# Production usage
score = matrix_score('corporate', 'macro')  # 0.8 ✅
print(f"Corporate ↔ Macro: {score}")
```
What this represents in business terms:
- Corporate campaigns prioritize authority and low risk
- Personal storytelling thrives with relatable, smaller voices
- Expert campaigns value credibility over raw reach
Important clarification:
- These numbers are relative, not absolute
- They don't predict success
- They define expected fit, not outcomes
7. Why this works without data
At this stage, a reasonable question arises:
"Isn't this just hard-coded logic?"
Yes --- and that's exactly the point.
But it's structured, graded, and explicit, unlike:
- binary rules
- if/else chains
- or fake ML
A compatibility matrix gives you:
- deterministic behavior
- explainable decisions
- controllable bias
- and stable early relevance
Most importantly, it gives the system a worldview before data exists.
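And because each decision is a single lookup, explaining it is free. A sketch of what an explainable response might look like, reusing matrix_score from section 6 --- the wording of the reason string is invented for illustration:

```python
def explain_match(campaign_type, social_status):
    """Return the score together with a human-readable reason."""
    score = matrix_score(campaign_type, social_status)
    reason = (f"A '{campaign_type}' campaign paired with a '{social_status}' "
              f"voice has an expected fit of {score:.1f} by product policy.")
    return {'score': score, 'reason': reason}

print(explain_match('corporate', 'nano')['reason'])
# A 'corporate' campaign paired with a 'nano' voice has an expected fit of 0.2 by product policy.
```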
8. How this evolves into machine learning (without rewrites)
This approach is not anti-ML.
It's pre-ML.
As the system runs, you naturally collect:
- which matches were shortlisted
- which were accepted
- which led to engagement or conversion
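These signals are cheap to capture from day one. A minimal sketch of an append-only event log that later becomes a training set --- field names and the file path are illustrative:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class MatchEvent:
    campaign_type: str
    social_status: str
    matrix_score: float
    outcome: str   # 'shortlisted' | 'accepted' | 'converted'
    ts: float

def log_event(event, path='match_events.jsonl'):
    # Append-only JSONL: trivially replayable as training data later
    with open(path, 'a') as f:
        f.write(json.dumps(asdict(event)) + '\n')

log_event(MatchEvent('expert', 'micro', 1.0, 'accepted', time.time()))
```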
At that point, the transition is incremental.
Phase 1 --- Matrix only

```python
score = compatibility_matrix[blog_type][social_status]
```

Phase 2 --- Hybrid

```python
# Matrix 70% + NN 30%
matrix_score = 0.8
nn_score = nn_model.predict(features)  # e.g. 0.75
score = 0.7 * matrix_score + 0.3 * nn_score  # 0.785
```

Phase 3 --- ML-dominant

```python
score = nn_prediction
```
The matrix never disappears.
It becomes:
- a baseline
- a regularizer
- a fallback for cold start
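The fallback role in particular is a few lines of defensive code. A sketch, reusing matrix_score from section 6 --- nn_model and its predict call are placeholders, not a real library:

```python
def hybrid_score(campaign_type, social_status, features, nn_model=None):
    base = matrix_score(campaign_type, social_status)
    if nn_model is None:
        return base  # cold start: matrix only
    try:
        return 0.7 * base + 0.3 * nn_model.predict(features)
    except Exception:
        return base  # model outage: degrade to the deterministic baseline
```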
This is how production systems actually grow.
9. Why this gives you day-one relevance
The biggest hidden risk in matching systems is irrelevance at launch.
If users see poor matches:
- they don't interact
- you don't collect data
- your ML roadmap dies before it starts
A compatibility matrix avoids that trap.
You get:
- reasonable defaults
- behavior aligned with business expectations
- trust from users
- and data that actually reflects intent
All without pretending you have Big Data.
```python
# Day 1: 100% matrix, no training data needed
def get_matches(request, suppliers, min_score=0.6):
    matches = []
    for supplier in suppliers:
        score = matrix_score(request.campaign_type, supplier.category)
        if score >= min_score:
            matches.append((supplier, score))
    # Sort by score, highest fit first
    return sorted(matches, key=lambda x: x[1], reverse=True)

# Real metrics: 47 suppliers → 12 matches → 3% conversion
# O(n) complexity, 1000s RPS, zero cold start
```
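For completeness, here is what calling the day-one matcher might look like end to end --- the Request and Supplier shapes are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class Request:
    campaign_type: str

@dataclass
class Supplier:
    name: str
    category: str  # a social_status value

suppliers = [
    Supplier('AgencyA', 'celebrity'),
    Supplier('CreatorB', 'micro'),
    Supplier('CreatorC', 'nano'),
]

for supplier, score in get_matches(Request('corporate'), suppliers):
    print(f"{supplier.name}: {score}")
# AgencyA: 1.0  (celebrity fits corporate best in the matrix above)
```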
Final takeaway
If there's one idea worth remembering:
Similarity is a mathematical concept.
Compatibility is a business concept.
Neural networks are excellent at learning similarity ---
after the world gives you data.
Compatibility matrices let you act before that moment arrives.
Matrix first.
Neural nets later.
That's not a compromise. That's how real matching systems survive long enough to learn.
Yurii Lozinskyi
Director of Engineering & CPO @ Verysell AI
Enterprise AI delivery | Regulated industries | 25+ years