How to get day-one relevance when you don't have data (and probably never did)
Everyone wants an "AI-powered matching engine".
In practice, this usually means one thing:
"We'll train a neural network and let it figure things out."
That sounds reasonable --- until you ask the first uncomfortable question:
Where exactly will the training data come from?
This article is about that gap between ambition and reality.
It's about building matching systems before you have Big Data, feedback loops, or ML infrastructure --- and still delivering relevance from day one.
1. The real business problem: "Where do we get data to train a neural network?"
Let's start with the problem most teams avoid articulating clearly.
Neural networks do not fail because they are bad.
They fail because they need data that doesn't exist yet.
To train a meaningful matching model, you need:
- historical matches
- outcomes (success/failure)
- user behavior (clicks, acceptances, conversions)
- enough volume to avoid overfitting
Early-stage systems have none of that.
This creates a paradox:
- you need good matching to get users
- you need users to get data
- you need data to train matching
Most teams quietly ignore this and ship:
- random relevance
- overconfident AI labels
- or brittle rule engines disguised as "ML"
That's not a technical issue.
That's a product and architecture problem.
2. A concrete use case: choosing the right marketing channel or agency
To make this tangible, let's define a clear use case.
Imagine a company launching a new marketing campaign.
They want to choose the right advertising channel, agency, or influencer.
Their constraints are realistic:
- limited budget
- brand reputation at stake
- unclear expectations about what will work
- no historical performance data in this exact setup
On the supply side (channels, agencies, influencers), you have:
- different levels of reach
- different credibility
- different risk profiles
- different communication styles
The business question is not:
"Which option is statistically similar to this campaign?"
The real question is:
"Which option best fits the expectations and constraints of this campaign?"
That's a compatibility problem, not a similarity problem.
3. Why "just train a neural network" doesn't work here
At this point, someone usually says:
"Let's just embed everything and train a model later."
That works only if:
- you already have outcomes
- you already have labels
- you already have scale
In our use case, you don't.
Trying to use neural networks here leads to one of three failures:
- The model overfits on tiny data
- The model outputs noise that looks confident
- The team disables the model "temporarily" --- permanently
The real issue is not lack of ML talent.
It's that the system has no prior understanding of what "fit" means.
So you need a prior.
4. Reframing the problem: similarity vs compatibility
This is the key conceptual shift.
Most ML tooling is built around similarity:
- cosine similarity
- Euclidean distance
- nearest neighbors
Similarity answers:
"How alike are these two things?"
But matching in business systems rarely asks that question.
Instead, it asks:
"How appropriate is this option for this context?"
That's compatibility.
Compatibility is:
- asymmetric
- expectation-driven
- domain-specific
And it can be expressed explicitly, without pretending to learn it from non-existent data.
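To see the difference in code: cosine similarity is symmetric by construction, while a compatibility lookup is not. A minimal sketch --- the vectors and the two compatibility scores below are made up purely for illustration:

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    # Symmetric by construction: cosine(a, b) == cosine(b, a)
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

# Similarity: order never matters
print(cosine([1.0, 0.2], [0.8, 0.6]) == cosine([0.8, 0.6], [1.0, 0.2]))  # True

# Compatibility: order is the whole point (scores are illustrative)
compat = {
    ('corporate', 'nano'): 0.2,  # a corporate campaign rarely fits a nano voice
    ('nano', 'corporate'): 0.0,  # the reverse isn't even the same question
}
print(compat[('corporate', 'nano')] == compat[('nano', 'corporate')])  # False
```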
5. Solution: Compatibility Matrix (feature matrix, not ML)
Now we get to the core idea.
Instead of trying to learn relevance, we encode domain knowledge as a matrix.
We define two small, stable feature spaces.
Campaign side
blog_type ∈ { corporate, brand_voice, expert, personal }
This captures:
- how formal the communication should be
- how much authority is expected
- how much personal storytelling is acceptable
Supply side (agency / influencer / channel)
social_status ∈ { celebrity, macro, micro, nano }
This captures:
- perceived authority
- reach expectations
- risk tolerance
- credibility
Now we define a compatibility matrix:
compatibility[blog_type][social_status] → score ∈ [0..1]
This matrix answers:
"Given this campaign style, how appropriate is this level of authority?"
It is not a guess.
It is a product hypothesis.
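One way to keep these two feature spaces small and stable is to pin them down as explicit types, so the matrix can't silently drift as the product evolves. A minimal sketch --- a convention, not a requirement; the Enum names simply mirror the values above:

```python
from enum import Enum

class BlogType(Enum):
    CORPORATE = 'corporate'
    BRAND_VOICE = 'brand_voice'
    EXPERT = 'expert'
    PERSONAL = 'personal'

class SocialStatus(Enum):
    CELEBRITY = 'celebrity'
    MACRO = 'macro'
    MICRO = 'micro'
    NANO = 'nano'

# The matrix lookup then becomes impossible to call with a typo:
# compatibility(blog_type: BlogType, status: SocialStatus) -> float
```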
6. Example: a simple 4×4 compatibility matrix
Let's make this concrete.
|             | celebrity | macro | micro | nano |
|-------------|-----------|-------|-------|------|
| corporate   | 1.0       | 0.8   | 0.4   | 0.2  |
| brand_voice | 0.7       | 1.0   | 0.8   | 0.5  |
| expert      | 0.6       | 0.9   | 1.0   | 0.7  |
| personal    | 0.3       | 0.6   | 0.9   | 1.0  |
```python
# Compatibility Matrix lookup (Day 1 matching)
matrix = {
    'corporate':   [1.0, 0.8, 0.4, 0.2],
    'brand_voice': [0.7, 1.0, 0.8, 0.5],
    'expert':      [0.6, 0.9, 1.0, 0.7],
    'personal':    [0.3, 0.6, 0.9, 1.0],
}

def matrix_score(campaign, influencer):
    """O(1) lookup --- handles thousands of RPS with no problem."""
    statuses = ['celebrity', 'macro', 'micro', 'nano']  # column order of the matrix
    idx = statuses.index(influencer)
    return matrix[campaign][idx]

# Production usage
score = matrix_score('corporate', 'macro')  # 0.8 ✅
print(f"Corporate ↔ Macro: {score}")
```
What this represents in business terms:
- Corporate campaigns prioritize authority and low risk
- Personal storytelling thrives with relatable, smaller voices
- Expert campaigns value credibility over raw reach
Important clarification:
- These numbers are relative, not absolute
- They don't predict success
- They define expected fit, not outcomes
7. Why this works without data
At this stage, a reasonable question arises:
"Isn't this just hard-coded logic?"
Yes --- and that's exactly the point.
But it's structured, graded, and explicit, unlike:
- binary rules
- if/else chains
- or fake ML
A compatibility matrix gives you:
- deterministic behavior
- explainable decisions
- controllable bias
- and stable early relevance
Most importantly, it gives the system a worldview before data exists.
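And because each decision is a single lookup, explaining it is free. A sketch of what an explainable response might look like, reusing matrix_score from section 6 --- the wording of the reason string is invented for illustration:

```python
def explain_match(campaign_type, social_status):
    """Return the score together with a human-readable reason."""
    score = matrix_score(campaign_type, social_status)
    reason = (f"A '{campaign_type}' campaign paired with a '{social_status}' "
              f"voice has an expected fit of {score:.1f} by product policy.")
    return {'score': score, 'reason': reason}

print(explain_match('corporate', 'nano')['reason'])
# A 'corporate' campaign paired with a 'nano' voice has an expected fit of 0.2 by product policy.
```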
8. How this evolves into machine learning (without rewrites)
This approach is not anti-ML.
It's pre-ML.
As the system runs, you naturally collect:
- which matches were shortlisted
- which were accepted
- which led to engagement or conversion
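These signals are cheap to capture from day one. A minimal sketch of an append-only event log that later becomes a training set --- field names and the file path are illustrative:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class MatchEvent:
    campaign_type: str
    social_status: str
    matrix_score: float
    outcome: str   # 'shortlisted' | 'accepted' | 'converted'
    ts: float

def log_event(event, path='match_events.jsonl'):
    # Append-only JSONL: trivially replayable as training data later
    with open(path, 'a') as f:
        f.write(json.dumps(asdict(event)) + '\n')

log_event(MatchEvent('expert', 'micro', 1.0, 'accepted', time.time()))
```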
At that point, the transition is incremental.
Phase 1 --- Matrix only

```python
score = compatibility_matrix[blog_type][social_status]
```

Phase 2 --- Hybrid

```python
# Matrix 70% + NN 30%
matrix_score = 0.8
nn_score = nn_model.predict(features)  # e.g. 0.75
score = 0.7 * matrix_score + 0.3 * nn_score  # 0.785
```

Phase 3 --- ML-dominant

```python
score = nn_prediction
```
The matrix never disappears.
It becomes:
- a baseline
- a regularizer
- a fallback for cold start
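The fallback role in particular is a few lines of defensive code. A sketch, reusing matrix_score from section 6 --- nn_model and its predict call are placeholders, not a real library:

```python
def hybrid_score(campaign_type, social_status, features, nn_model=None):
    base = matrix_score(campaign_type, social_status)
    if nn_model is None:
        return base  # cold start: matrix only
    try:
        return 0.7 * base + 0.3 * nn_model.predict(features)
    except Exception:
        return base  # model outage: degrade to the deterministic baseline
```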
This is how production systems actually grow.
9. Why this gives you day-one relevance
The biggest hidden risk in matching systems is irrelevance at launch.
If users see poor matches:
- they don't interact
- you don't collect data
- your ML roadmap dies before it starts
A compatibility matrix avoids that trap.
You get:
- reasonable defaults
- behavior aligned with business expectations
- trust from users
- and data that actually reflects intent
All without pretending you have Big Data.
```python
# Day 1: 100% matrix, no training data needed
def get_matches(request, suppliers, min_score=0.6):
    matches = []
    for supplier in suppliers:
        score = matrix_score(request.campaign_type, supplier.category)
        if score >= min_score:
            matches.append((supplier, score))
    # Sort by score, highest fit first
    return sorted(matches, key=lambda x: x[1], reverse=True)

# Real metrics: 47 suppliers → 12 matches → 3% conversion
# O(n) complexity, 1000s RPS, zero cold start
```
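For completeness, here is what calling the day-one matcher might look like end to end --- the Request and Supplier shapes are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class Request:
    campaign_type: str

@dataclass
class Supplier:
    name: str
    category: str  # a social_status value

suppliers = [
    Supplier('AgencyA', 'celebrity'),
    Supplier('CreatorB', 'micro'),
    Supplier('CreatorC', 'nano'),
]

for supplier, score in get_matches(Request('corporate'), suppliers):
    print(f"{supplier.name}: {score}")
# AgencyA: 1.0  (celebrity fits corporate best in the matrix above)
```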
Final takeaway
If there's one idea worth remembering:
Similarity is a mathematical concept.
Compatibility is a business concept.
Neural networks are excellent at learning similarity ---
after the world gives you data.
Compatibility matrices let you act before that moment arrives.
Matrix first.
Neural nets later.
That's not a compromise. That's how real matching systems survive long enough to learn.
Yurii Lozinskyi
Director of Engineering & CPO @ Verysell AI
Enterprise AI delivery | Regulated industries | 25+ years