I've been looking into facial recognition workflows lately, and it’s a total rabbit hole. Most people think you just throw an image into a model and get a "match."
In reality, the infrastructure is the real headache. Here are three things I've learned while working on the monitoring logic at reverse face search.
The "False Positive" Trap: Setting the similarity threshold too high means you miss genuine matches; setting it too low means you surface people who merely "look similar." Finding that sweet spot in a vector database (like Milvus) is a constant battle.
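That trade-off is easy to see in a toy example. The sketch below (pure Python, made-up 3-d "embeddings" and names; real face embeddings are typically 128-512 dimensions) filters a gallery by cosine similarity at two different thresholds:

```python
import math

def cosine_similarity(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query, gallery, threshold):
    """Return (id, score) pairs that clear the threshold, best first."""
    hits = []
    for face_id, emb in gallery.items():
        score = cosine_similarity(query, emb)
        if score >= threshold:
            hits.append((face_id, score))
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical gallery: one true match, one lookalike, one stranger.
gallery = {
    "same_person": [0.98, 0.15, 0.05],
    "lookalike":   [0.90, 0.40, 0.10],
    "stranger":    [0.10, 0.95, 0.30],
}
query = [1.0, 0.1, 0.0]

strict = search(query, gallery, threshold=0.999)  # misses the true match
loose  = search(query, gallery, threshold=0.80)   # pulls in the lookalike
```

With the strict threshold the true match (similarity ~0.997) is discarded; with the loose one the lookalike (~0.945) rides along. There is no single "right" value, which is why threshold tuning never really ends.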
Speed vs. Accuracy: Moving from a flat (brute-force) index to HNSW (Hierarchical Navigable Small World) graphs was a game changer for us. It's the difference between a ten-second wait and sub-second results.
The Ethical Gray Area: This is the elephant in the room. Scrapers are everywhere. We’ve found that the best way to use this tech isn't for "stalking," but for defensive monitoring—finding where your own data is being leaked before someone else uses it.
If you’re building something similar or struggling with image indexing, I’d love to hear how you handle vector storage. It’s still a bit of a Wild West out there.