DEV Community

AI Look-Alike Search for OF Creators — Need Advice on Better Face Models

I’m currently building an AI-based face similarity (look-alike) search for OF models as part of a real-world side project.

The dataset contains 100,000+ public OF model images, and the goal is to help users discover visually similar OF models based on facial features rather than usernames or text-based search.

This is not identity verification — the focus is purely on visual similarity.


What I’m Building (Quick Overview)

  • Users upload an image (reference photo / celebrity image)
  • The system finds OF models with similar facial characteristics
  • Results are ranked using face embeddings + vector similarity search
  • Everything currently runs on CPU, but I’m considering a move to GPU for scale and experimentation

What I’m Building (More Detail)

The system allows users to upload an image and receive a list of OF models with similar facial characteristics.

The intent is to support visual discovery, where perceived similarity matters more than exact identity matching.


Key Constraints

  • Similarity over identity

    The system ranks faces by perceived similarity (look-alike matching), not by strict identity verification.

  • Low tolerance for false positives

    Returning visually different faces as “similar” is more harmful than missing a potential match.

  • Real-world images

    The dataset consists of non-studio images with varying lighting, poses, resolutions, and overall quality.

  • Scalability

    The solution needs to scale beyond the current 100k+ images without significant drops in accuracy or performance.
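
To make the "low tolerance for false positives" constraint concrete, here is a minimal sketch of a score cutoff applied to ranked candidates. The function name `filter_matches`, the sample scores, and the `min_score` value of 0.5 are all hypothetical placeholders, not values from my system; a real cutoff would need to be tuned on labeled look-alike pairs.

```python
from typing import List, Tuple

def filter_matches(
    candidates: List[Tuple[str, float]],
    min_score: float = 0.5,   # placeholder value, must be tuned on real data
    max_results: int = 10,
) -> List[Tuple[str, float]]:
    """Keep only candidates at or above min_score, best first.

    Returning fewer (or zero) results is preferred over padding the
    list with weak matches.
    """
    kept = [(item_id, score) for item_id, score in candidates if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_results]

# Hypothetical (id, cosine similarity) pairs from a nearest-neighbor query.
candidates = [("a", 0.71), ("b", 0.42), ("c", 0.58), ("d", 0.12)]
strong = filter_matches(candidates, min_score=0.5)
print(strong)
```

Raising `min_score` trades recall for precision, which matches the constraint above: a missed match costs less than a visibly wrong one.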


Current Pipeline (CPU-Based)

At the moment, the entire pipeline runs on CPU only.

The setup looks like this:

  • Face detection and alignment
  • Feature extraction using a pre-trained face model
  • Storing embeddings in a vector index
  • Nearest-neighbor search using cosine similarity

At this scale, the system works reasonably well, but both accuracy and performance are starting to become limiting factors.
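
For reference, the last two steps (stored embeddings plus cosine-similarity search) can be sketched as a brute-force search in NumPy. This is an illustration under assumptions, not my production code: the embeddings are random stand-ins, the 512-dimensional size is assumed, and the helper names `normalize_rows` and `top_k_similar` are made up for the example.

```python
import numpy as np

def normalize_rows(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def top_k_similar(query: np.ndarray, index: np.ndarray, k: int = 5):
    """Return (row ids, cosine scores) of the k rows most similar to `query`.

    `index` is assumed to be row-normalized already.
    """
    q = query / max(float(np.linalg.norm(query)), 1e-12)
    scores = index @ q                          # cosine similarity per row
    k = min(k, scores.shape[0])
    top = np.argpartition(-scores, k - 1)[:k]   # unordered top-k
    top = top[np.argsort(-scores[top])]         # order best first
    return top, scores[top]

# Random stand-in data: 1000 vectors of an assumed dimension of 512.
rng = np.random.default_rng(0)
embeddings = normalize_rows(rng.normal(size=(1000, 512)).astype(np.float32))

# A slightly perturbed copy of row 42 acts as the query.
query = embeddings[42] + 0.01 * rng.normal(size=512).astype(np.float32)
ids, scores = top_k_similar(query, embeddings, k=5)
print(ids, scores)
```

An exact scan like this is a single matrix-vector product, so it stays simple to reason about; an approximate index trades some recall for speed as the collection grows.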


Current Model Setup (InsightFace)

Face embeddings are currently generated using InsightFace, specifically the buffalo_l model bundle.

The pipeline includes:

  • Face detection and alignment via InsightFace
  • Feature extraction using the buffalo_l model
  • Embeddings stored for similarity search
  • Cosine similarity for ranking similar faces

This provides a solid baseline, but for look-alike matching, small inaccuracies are very noticeable.


Where the System Struggles

As the dataset grows, several issues become more apparent:

  • Visually similar faces sometimes rank lower than expected
  • Different individuals with shared facial traits can appear as false positives
  • Lighting, pose, and image quality introduce noise
  • CPU inference becomes a bottleneck during re-indexing and experimentation

Because this is a look-alike use case, even small errors can significantly affect perceived quality.


CPU vs GPU — Is the Move Worth It?

I’m planning to migrate the pipeline to GPU-based inference, but I want to make sure the model choice justifies the move.

Some of the questions I’m evaluating:

  • Which face models provide the best results for visual similarity, not identity recognition?
  • Does GPU inference unlock meaningfully better accuracy, or is it mainly a speed improvement?
  • Are there models that are simply not practical to run on CPU at this scale?

If I’m going to reprocess 100k+ OF model images, I want to do it with the right model.
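
One way to check whether a candidate model justifies the reprocessing is to compare models on a small hand-labeled set before re-indexing everything. The sketch below computes mean precision@k; the query ids, result lists, and judgments are invented for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Set

def precision_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k results that were judged to be look-alikes."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item_id in ranked_ids[:k] if item_id in relevant_ids)
    return hits / k

def mean_precision_at_k(
    results: Dict[str, List[str]],
    judgments: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average precision@k over all queries in `results`."""
    if not results:
        return 0.0
    per_query = [
        precision_at_k(ranked, judgments.get(query_id, set()), k)
        for query_id, ranked in results.items()
    ]
    return sum(per_query) / len(per_query)

# Invented example: human judgments plus ranked output from two models.
judgments = {"q1": {"x", "z"}, "q2": {"v"}}
results_model_a = {"q1": ["x", "y", "z"], "q2": ["u", "v", "w"]}
results_model_b = {"q1": ["x", "z", "y"], "q2": ["v", "u", "w"]}

score_a = mean_precision_at_k(results_model_a, judgments, k=2)
score_b = mean_precision_at_k(results_model_b, judgments, k=2)
print(score_a, score_b)
```

Running the same labeled queries through each model on a small sample gives a like-for-like number before committing to a full re-index.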


What I’m Looking for in a Better Face Model

I’m particularly interested in models that:

  • Produce high-quality embeddings for similarity search
  • Perform well on non-ideal, real-world images
  • Scale efficiently beyond 100k images
  • Benefit from GPU acceleration
  • Can be fine-tuned (or perform well out of the box) for look-alike matching

I’m open to both open-source and commercial solutions.


Real-World Context

This work is part of a discovery platform where users can upload an image and find visually similar OF models using AI-based face similarity.

The project is called Explore.Fans, and face similarity search is one of its core components.

👉 https://explore.fans

(Shared only for technical context.)


Questions for the Community

If you’ve worked with face similarity or face recognition models at scale, I’d really appreciate your input:

  • Which models gave you the best results for look-alike similarity?
  • Did GPU inference improve accuracy, or mostly performance?
  • Any experience fine-tuning models for similarity-based ranking?
  • Anything you’d avoid based on real-world experience?

Thanks in advance — happy to share more details if helpful.
Have a wonderful holiday!

