(A survival guide you didn’t ask for)
Building an ANN system like Faiss is not hard.
Building a fast ANN system like Faiss will make you question every life decision you’ve ever made.
If you’re thinking, “How hard can vector search be?”… congrats - this article is for you.
Act 1: The Innocent Beginning (Python Era)
- You start in Python. Life is good - NumPy works.
- Accuracy looks decent.
- Latency is… acceptable.
You tell yourself:
“I’ll just prototype it. Later I’ll optimize.”
Classic mistake. Rookie energy.
Act 2: “Let’s Rewrite It in C++” (Boss Music Starts)
At some point, queries feel slow.
You say the forbidden words:
"Let’s rewrite it in C++ for speed."
This is where the tutorial ends and the boss fight begins.
Suddenly:
- You’re not debugging logic
- You’re debugging existence
Segfaults.
Undefined behavior.
Memory crashes… for reasons you swear are illegal.
You fix one bug → three new ones spawn.
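A taste of why, hedged appropriately: the snippet below is a hypothetical distance kernel, not code from any real library, and it shows the classic way an ANN inner loop goes wrong - it's only correct if `dim` matches what was actually stored.

```cpp
#include <cstddef>

// Hypothetical L2 kernel: compiles clean, looks innocent, and is undefined
// behavior the moment `dim` is larger than the buffers actually hold.
float l2_distance(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {   // if dim lies, this walks off the end
        const float d = a[i] - b[i];
        sum += d * d;                         // wrong answer today, segfault next week
    }
    return sum;
}
```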
Act 3: Speed Bound → Memory Bound (The Plot Twist)
At first, you’re speed-bound:
- Bad loops
- Bad data layout
- Unoptimized math
You fix those.
Latency drops.
You feel powerful.
Then… nothing improves.
Welcome to the realization:
You are no longer speed-bound.
You are memory-bound.
And memory-bound is where real suffering begins.
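In concrete terms, being memory-bound is mostly about layout. A minimal sketch (type and field names are made up) of the difference between a pointer-chasing store and a flat, contiguous one:

```cpp
#include <cstddef>
#include <vector>

// Pointer-chasing layout: every row is its own heap allocation, so a full
// scan hops around memory and misses the cache constantly.
using RaggedStore = std::vector<std::vector<float>>;   // rows[i][d]

// Cache-friendly layout: one contiguous block; row i starts at i * dim.
// A full scan becomes a streaming read the hardware prefetcher can follow.
struct FlatStore {
    std::size_t dim = 0;
    std::vector<float> data;                            // size == n * dim

    const float* row(std::size_t i) const { return data.data() + i * dim; }
};
```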
Act 4: Milliseconds Matter (You Finally Understand Big Tech)
Seconds were easy.
Milliseconds are war.
You change one file.
Latency spikes.
QPS drops.
Cache misses explode.
Now your life is:
Change code → Build → Benchmark → Cry → Repeat
You learn:
- Cache misses cost hundreds of QPS
- Memory access > CPU speed
- “Fast code” means nothing if data is in the wrong place
You finally understand why every millisecond matters in tech.
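The loop of despair, written down - a minimal benchmark sketch. `Index` and `search()` here are stand-ins for whatever your engine exposes, not a real API:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical index interface - replace with your own.
struct Index {
    std::vector<int> search(const float* query, int k) const;
};

// Time a batch of queries and report mean latency and QPS.
void benchmark(const Index& index,
               const std::vector<std::vector<float>>& queries, int k) {
    std::size_t sink = 0;                      // keeps the calls from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (const auto& q : queries) {
        sink += index.search(q.data(), k).size();
    }
    const auto end = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(end - start).count();
    std::printf("mean latency: %.3f ms | QPS: %.0f | sink=%zu\n",
                1000.0 * seconds / queries.size(),
                queries.size() / seconds, sink);
}
```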
Act 5: SIMD, AVX, OpenMP (False Hope Arc)
You go full tryhard:
- SIMD
- AVX2 / AVX-512
- OpenMP
- BLAS
- Hand-tuned loops
Then reality hits again:
- Small batches → OpenMP overhead > benefit
- Threads fight for cache
- More cores ≠ more speed
Optimizations now need optimization.
Beautiful. Right?
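For the record, here's roughly what that phase produces - a minimal AVX2 L2 kernel plus an OpenMP loop over queries. It assumes `dim` is a multiple of 8, AVX2 + FMA support, and a build with `-mavx2 -mfma -fopenmp`; none of that is guaranteed on your machine:

```cpp
#include <immintrin.h>
#include <cstddef>

// L2 distance, 8 floats per iteration. Assumes dim % 8 == 0 and both
// pointers valid for `dim` floats.
float l2_avx2(const float* a, const float* b, std::size_t dim) {
    __m256 acc = _mm256_setzero_ps();
    for (std::size_t i = 0; i < dim; i += 8) {
        const __m256 va = _mm256_loadu_ps(a + i);
        const __m256 vb = _mm256_loadu_ps(b + i);
        const __m256 d  = _mm256_sub_ps(va, vb);
        acc = _mm256_fmadd_ps(d, d, acc);       // acc += d * d
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);               // horizontal sum of the 8 lanes
    float sum = 0.0f;
    for (float x : lanes) sum += x;
    return sum;
}

// Parallelize over queries, not over base vectors - and remember the lesson
// above: for tiny batches the OpenMP overhead eats the win.
void search_batch(const float* queries, std::size_t nq,
                  const float* base, std::size_t nb,
                  std::size_t dim, float* out /* nq * nb distances */) {
    #pragma omp parallel for schedule(static)
    for (long long q = 0; q < static_cast<long long>(nq); ++q) {
        for (std::size_t i = 0; i < nb; ++i) {
            out[q * nb + i] = l2_avx2(queries + q * dim, base + i * dim, dim);
        }
    }
}
```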
Act 6: Python Bindings (New Boss, Same Pain)
“Fine,” you say,
“I’ll just expose this with Python bindings.”
Welcome to pybind11 + CMake hell.
- CMake can’t find pybind
- pybind exists but CMake denies it
- Errors you didn’t know were possible
- Compiler messages that feel personally insulting
Also:
- Python memory
- C++ memory
- NumPy memory
- Recall drops
- Speed lies
At some point you realize:
NumPy math ≠ C++ speed
And yes, you briefly consider throwing your CPU out the window.
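For what it's worth, the binding itself is the small part. A minimal pybind11 sketch (module and function names are made up) that accepts float32 C-contiguous NumPy arrays zero-copy and converts anything else:

```cpp
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <cstddef>
#include <stdexcept>

namespace py = pybind11;

// The C++ kernel we want to expose (defined elsewhere).
float l2_distance(const float* a, const float* b, std::size_t dim);

// float32 + C-contiguous arrays come through zero-copy; anything else is
// converted (i.e. copied) by forcecast - one of the places "speed lies".
using FloatArray = py::array_t<float, py::array::c_style | py::array::forcecast>;

float l2_py(FloatArray a, FloatArray b) {
    const auto ba = a.request();
    const auto bb = b.request();
    if (ba.size != bb.size) {
        throw std::runtime_error("dimension mismatch");
    }
    return l2_distance(static_cast<const float*>(ba.ptr),
                       static_cast<const float*>(bb.ptr),
                       static_cast<std::size_t>(ba.size));
}

PYBIND11_MODULE(my_ann, m) {
    m.def("l2", &l2_py, "L2 distance between two float32 vectors");
}
```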
Act 7: Scalar C++ Reality Check
You try pure scalar C++.
Surprise:
Well-optimized NumPy / Cython can beat naïve C++
Congrats.
Your ego just segfaulted.
Now you:
- Learn data alignment
- Learn cache lines
- Learn prefetching
- Learn why “just C++” is not enough
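Roughly what those lessons turn into, as a sketch: cache-line-aligned storage plus an explicit prefetch hint. `__builtin_prefetch` is a GCC/Clang builtin, `std::aligned_alloc` is C++17, and whether any of this actually helps is strictly an empirical question:

```cpp
#include <cstddef>
#include <cstdlib>

// Allocate float storage aligned to 64-byte cache lines so SIMD loads
// don't straddle line boundaries. Free with std::free.
float* alloc_aligned(std::size_t n_floats) {
    std::size_t bytes = n_floats * sizeof(float);
    bytes = (bytes + 63) / 64 * 64;            // aligned_alloc wants a multiple of 64
    return static_cast<float*>(std::aligned_alloc(64, bytes));
}

float l2_distance(const float* a, const float* b, std::size_t dim);

// Scan a flat store, hinting the next row into cache while computing the
// current one. Measure it - on some machines this is a pure no-op.
void scan(const float* base, std::size_t nb, std::size_t dim,
          const float* query, float* out) {
    for (std::size_t i = 0; i < nb; ++i) {
        if (i + 1 < nb) {
            __builtin_prefetch(base + (i + 1) * dim);   // read prefetch, default locality
        }
        out[i] = l2_distance(base + i * dim, query, dim);
    }
}
```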
Final Act: The Faiss Reality Check
After all this:
- Memory tuning
- Cache tuning
- Layout tuning
- QPS tuning
- Latency tuning
You benchmark against Faiss.
You are…
nowhere near it.
And that’s when it hits:
Faiss isn’t just algorithms.
It’s years of low-level pain, tuning, and memory mastery.
Advice From Someone Who Survived (Barely)
If you’re starting out:
Step 1: Start in Python
Build the algorithm first.
Validate accuracy.
If it’s good enough - stop here. Be happy.
Step 2: Move to C++ only if:
- You hit real memory limits
- You hit real latency ceilings
- You understand what you’re signing up for
Step 3: Optimization Hell
- SIMD
- AVX
- OpenMP (carefully)
- Cache-aware design
- Memory-first thinking
If you reach this stage…
Congrats.
This is where hating your life officially begins.
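One concrete version of the "OpenMP (carefully)" bullet above: OpenMP's `if` clause lets a batched search stay single-threaded when the batch is too small to amortize thread start-up. The threshold of 64 below is invented - tune it on your own hardware:

```cpp
#include <cstddef>

float l2_distance(const float* a, const float* b, std::size_t dim);

// Parallelize across queries, but only when the batch is big enough to
// pay for the threads; tiny batches run on one core.
void search_batch_guarded(const float* queries, std::size_t nq,
                          const float* base, std::size_t nb,
                          std::size_t dim, float* out /* nq * nb */) {
    #pragma omp parallel for schedule(static) if (nq >= 64)
    for (long long q = 0; q < static_cast<long long>(nq); ++q) {
        for (std::size_t i = 0; i < nb; ++i) {
            out[q * nb + i] = l2_distance(queries + q * dim, base + i * dim, dim);
        }
    }
}
```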
Writing an ANN engine is fun.
Writing a fast ANN engine is pain.
Writing one that competes with Faiss?
That’s not a project.
That’s a boss fight marathon.
If you’re still here - respect. 🫡
If you’re thinking of starting - I warned you.
Now excuse me while I benchmark again and cry over cache misses.
So yeah… it's already halfway there.
There is unfinished business.
The ANN is coming.
It will be open-sourced.
Not “soon™”.
Not “startup soon”.
But soon - the kind of soon where code already exists and pain is already paid for.