DEV Community

DetectNix

Posted on Jun 11

How I Built an AI-Powered Adult (Porn) Content Scanner for Windows (And the Engineering Challenges I Didn't Expect)

#ai #dotnet #machinelearning #performance

Building an AI-Powered Content Scanner for Windows: Performance, Multithreading and GPU Acceleration in .NET

Building software always looks straightforward from the outside.

You load a machine learning model, point it at some images, and display the results.

At least that's what I thought when I started building DetectNix Vision, a Windows desktop application that performs local AI-powered image analysis without uploading user data to the cloud.

In reality, the project became a deep dive into performance optimization, memory management, multithreading, GPU acceleration, and user experience.

Writing from the perspective of a senior developer, I cover the engineering challenges I encountered and the architectural decisions I made while building the software.

The Original Goal

The initial goal was simple:

Scan images stored on a Windows PC
Detect potentially explicit or sensitive content
Keep all processing local
Support both CPU and GPU execution
Process large image collections efficiently
Remain responsive while scanning

Privacy was a major requirement.

I didn't want users uploading personal files to third-party services. Everything needed to run locally on the user's machine.

That decision immediately influenced every technical choice that followed.

Challenge #1: Model Loading Performance

One of the first mistakes I made was loading the AI model too frequently.

A modern computer vision model can be hundreds of megabytes in size. Loading it repeatedly adds significant overhead to every analysis and quickly destroys performance.

My initial implementation worked perfectly during testing because I was only processing a handful of images.

Once I started testing larger image collections, the bottleneck became obvious.

The Solution

I moved to a singleton-style architecture where the model is loaded once during application startup and remains resident in memory.

// The session is created once, when the engine is constructed, and reused for every image.
private readonly InferenceSession _session;

public VisionEngine()
{
    _session = CreateSession();
}

This reduced initialization costs dramatically and ensured every image could reuse the same loaded model.

The lesson here is simple:

AI models should usually be treated like databases or connection pools, not disposable objects.

Load them once. Reuse them often.
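As a rough illustration of the load-once idea, here is a minimal sketch using `Lazy<T>`. It is not the DetectNix Vision code: `LoadedModel` and `ModelHost` are hypothetical stand-ins for the real inference session and its owner, and the load counter exists only to make the behavior visible.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for an expensive-to-create inference session.
public sealed class LoadedModel
{
    // Counts how many times a model has been constructed (illustration only).
    public static int LoadCount;

    public LoadedModel()
    {
        Interlocked.Increment(ref LoadCount);
        // A real implementation would read the model file here.
    }
}

public static class ModelHost
{
    // Lazy<T> created with a factory delegate uses ExecutionAndPublication mode,
    // so the factory runs at most once even if several threads read Value together.
    private static readonly Lazy<LoadedModel> Shared =
        new Lazy<LoadedModel>(() => new LoadedModel());

    // Every caller receives the same loaded instance.
    public static LoadedModel Instance => Shared.Value;
}
```

Reading `ModelHost.Instance` from anywhere in the application triggers the load on first access only. If the model should be ready before the first scan, touching `Instance` once during startup would move that cost to application launch.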

Challenge #2: CPU Usage Was Out of Control

The next issue appeared when processing thousands of images.

The obvious approach is:

// Sequential baseline: one image at a time, on a single thread.
foreach (var image in images)
{
    Analyze(image);
}

Unfortunately, this wastes modern hardware.

A single image analysis might only use a fraction of the available CPU resources.

My first instinct was to parallelize everything.

Parallel.ForEach(images, image =>
{
    Analyze(image);
});

This certainly increased throughput.

It also created new problems.

Challenge #3: Too Much Parallelism

Many developers assume that more threads automatically means more performance.

With machine learning workloads, that's often not true.

I discovered that excessive parallelism caused:

Increased memory consumption
Context switching overhead
Reduced GPU efficiency
System responsiveness issues

The application became faster in benchmarks but slower in real-world usage.

The operating system was spending too much time managing threads instead of performing useful work.

The Solution

I implemented a controlled worker model.

Instead of allowing unlimited concurrency, I created a configurable processing pool.

// Cap the worker count at four, or lower on machines with fewer logical processors.
var maxSessions = Math.Min(Environment.ProcessorCount, 4);

This allowed me to tune throughput while keeping resource usage predictable.

In practice, a carefully controlled number of workers consistently outperformed unrestricted parallel execution.

This was one of the most valuable lessons from the project:

The fastest architecture is rarely the one with the most threads.
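One way to express a capped worker count in .NET is `ParallelOptions.MaxDegreeOfParallelism`. The sketch below is an assumption about how such a cap could be applied, not the application's actual pool: `BoundedScanner` and its peak-concurrency counter are hypothetical and exist only to show that the limit holds.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedScanner
{
    // Runs `analyze` over every item with at most `maxWorkers` calls in flight.
    // Returns the highest number of concurrent calls observed (illustration only).
    public static int Run<T>(IEnumerable<T> items, Action<T> analyze, int maxWorkers)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };
        int active = 0;
        int peak = 0;

        Parallel.ForEach(items, options, item =>
        {
            int now = Interlocked.Increment(ref active);

            // Record the peak with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
                // Another thread updated the peak first; retry.
            }

            try
            {
                analyze(item);
            }
            finally
            {
                Interlocked.Decrement(ref active);
            }
        });

        return peak;
    }
}
```

Calling `BoundedScanner.Run(images, Analyze, Math.Min(Environment.ProcessorCount, 4))` would keep the number of simultaneous analyses at or below the same cap used above, which makes memory use per scan easier to reason about.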

Challenge #4: GPU Acceleration Isn't Automatic

Many users assume that installing a graphics card means software automatically becomes faster.

Unfortunately, that's not how machine learning inference works.

Supporting GPU acceleration introduced several challenges:

Detecting available hardware
Selecting execution providers
Handling driver differences
Supporting systems without compatible GPUs
Providing reliable CPU fallback

A failed GPU initialization could not be allowed to crash the application.

The Solution

The startup sequence attempts GPU initialization first.

If that fails, the application transparently falls back to CPU execution.

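try
{
    EnableGpuProvider();
}
catch (Exception)
{
    // GPU initialization failed (for example, no compatible device or driver): use the CPU instead.
    EnableCpuProvider();
}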

This approach ensured the software would run on virtually any Windows machine.

Performance varies significantly between systems, but functionality remains consistent.
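The fallback can be generalized to an ordered list of providers. This is a minimal sketch under assumptions: `ProviderFallback` and the factory delegates are hypothetical, and in a real ONNX Runtime application each factory would build a session configured for one execution provider. Unlike a bare `catch`, it reports why each provider was skipped.

```csharp
using System;
using System.Collections.Generic;

public static class ProviderFallback
{
    // Tries each named factory in order and returns the first one that succeeds.
    // `candidates` is ordered from most to least preferred (for example GPU, then CPU).
    public static (string Name, T Engine) CreateFirstAvailable<T>(
        IEnumerable<(string Name, Func<T> Factory)> candidates,
        Action<string, Exception> onFailure)
    {
        var errors = new List<Exception>();

        foreach (var (name, factory) in candidates)
        {
            try
            {
                return (name, factory());
            }
            catch (Exception ex)
            {
                // Record why this provider was skipped instead of swallowing the error.
                onFailure(name, ex);
                errors.Add(ex);
            }
        }

        // Nothing worked, not even the last-resort provider.
        throw new AggregateException("No execution provider could be initialized.", errors);
    }
}
```

Returning the provider name alongside the engine lets the UI show which path is active, which helps when users report very different scan speeds on different machines.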

Challenge #5: Memory Pressure During Large Scans

Scanning a directory containing 50 images is easy.

Scanning a directory containing 100,000 images is a different problem entirely.

Early versions accumulated too much data in memory.

This resulted in:

Increased garbage collection activity
Higher memory usage
Reduced throughput
Longer scan times

The Solution

I switched to a streaming pipeline.

Instead of loading large batches of files, images are processed incrementally.

// EnumerateFiles yields paths lazily; GetFiles would build the whole array first.
foreach (var file in Directory.EnumerateFiles(path))
{
    Process(file);
}

This dramatically reduced memory consumption and allowed scans of extremely large collections without exhausting system resources.

Sometimes the simplest optimization is to process less data at once.
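A streaming scan usually also needs to recurse into subfolders and skip non-image files. The sketch below shows one way to do that lazily. The extension list is an assumption (the article does not say which formats are supported), and `ImageEnumerator` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ImageEnumerator
{
    // Assumed set of extensions; adjust to whatever the decoder actually supports.
    private static readonly HashSet<string> Extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".jpg", ".jpeg", ".png", ".bmp" };

    // Lazily yields image paths under `root`, one at a time, so the caller never
    // holds the full file list in memory.
    public static IEnumerable<string> Enumerate(string root)
    {
        foreach (var file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            if (Extensions.Contains(Path.GetExtension(file)))
            {
                yield return file;
            }
        }
    }
}
```

Because this is an iterator, no directory access happens until the caller starts the `foreach`. One caveat: a recursive enumeration like this can throw partway through when it reaches a folder the process is not allowed to read, so a production scanner would need to handle that case.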

Challenge #6: Keeping the UI Responsive

Desktop users have very little tolerance for frozen applications.

A scan that takes several minutes is acceptable.

An application that stops responding for several minutes is not.

Initially, image analysis was competing with the user interface thread.

The result was predictable.

Windows marked the application as "Not Responding."

The Solution

I completely separated the scanning pipeline from the UI layer.

The scanner runs on background workers while the UI receives progress updates.

await Task.Run(() =>
{
    StartScan();
});

This allowed users to do the following, even during intensive processing:

Browse results
Pause scans
View progress
Continue interacting with the application

The difference in perceived quality was enormous.
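Progress updates and a stop request can be wired through `IProgress<T>` and `CancellationToken`. The following is a minimal sketch, not the application's scanner: `BackgroundScan` and the `analyze` delegate are hypothetical, and it shows stopping a scan rather than pausing and resuming one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundScan
{
    // Runs the scan on a thread-pool thread, reports the number of files completed
    // after each one, and stops early when cancellation is requested.
    // Returns how many files were analyzed.
    public static Task<int> RunAsync(
        IReadOnlyList<string> files,
        Action<string> analyze,
        IProgress<int> progress,
        CancellationToken token)
    {
        return Task.Run(() =>
        {
            int done = 0;
            foreach (var file in files)
            {
                // Check between files so a stop request takes effect quickly.
                if (token.IsCancellationRequested)
                {
                    break;
                }

                analyze(file);
                done++;
                progress.Report(done);
            }
            return done;
        }, token);
    }
}
```

In a WPF or WinForms application, a `Progress<int>` created on the UI thread captures that thread's synchronization context, so its callback can update a progress bar directly without manual marshalling.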

Challenge #7: Handling Real-World Image Collections

Developers often test with ideal data.

Users never provide ideal data.

Real-world collections contain:

Corrupted files
Unsupported formats
Zero-byte images
Huge images
Tiny images
Invalid metadata

The software needed to continue scanning even when individual files failed.

The Solution

Every image is treated as potentially invalid.

Failures are isolated and logged.

// Runs once per image, so a failure affects only the current file.
try
{
    Analyze(image);
}
catch (Exception ex)
{
    LogError(ex);
}

A single bad file should never stop an entire scan.

This significantly improved reliability.
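Placed inside the scan loop, the same pattern can also collect failures for a summary at the end. This is a sketch under assumptions: `ScanReport` and `ResilientScan` are hypothetical names, and rethrowing `OperationCanceledException` is a design choice made here so that a user-requested stop is not logged as a bad file.

```csharp
using System;
using System.Collections.Generic;

public sealed class ScanReport
{
    public int Succeeded { get; set; }

    // One entry per file that could not be analyzed.
    public List<(string File, string Error)> Failures { get; } =
        new List<(string File, string Error)>();
}

public static class ResilientScan
{
    // Analyzes every file; a failure is recorded and the loop moves on.
    public static ScanReport Run(IEnumerable<string> files, Action<string> analyze)
    {
        var report = new ScanReport();

        foreach (var file in files)
        {
            try
            {
                analyze(file);
                report.Succeeded++;
            }
            catch (OperationCanceledException)
            {
                // A stop request should end the scan rather than count as a bad file.
                throw;
            }
            catch (Exception ex)
            {
                report.Failures.Add((file, ex.Message));
            }
        }

        return report;
    }
}
```

Keeping the file path next to each error makes it possible to show users which files were skipped and why, instead of only a failure count.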

Challenge #8: Finding the Right Balance Between Accuracy and Speed

One of the biggest engineering trade-offs involved balancing:

Scan speed
Detection accuracy
Hardware requirements
User expectations

Larger models generally improve accuracy.

They also increase:

Memory usage
Startup times
Processing times

Smaller models improve responsiveness but may sacrifice precision.

There is no universally correct answer.

The optimal balance depends on the target audience and intended use case.

For DetectNix Vision, I prioritized a solution that delivered strong accuracy while remaining practical on average consumer hardware.

Additional Engineering Decisions Worth Mentioning

Why I Chose ONNX Runtime

After evaluating several options, I standardized on ONNX Runtime.

Benefits included:

Excellent .NET support
GPU acceleration support
Cross-hardware compatibility
Consistent inference performance
Easy deployment with Windows applications

Most importantly, it allowed me to focus on building the product instead of maintaining machine learning infrastructure.

Why I Built a Prediction Engine Pool

Creating inference sessions can be expensive.

Rather than constantly creating and disposing resources, I implemented a reusable engine pool.

Benefits included:

Reduced allocations
Lower startup overhead
Better throughput
More predictable memory usage

This became particularly important when users scanned tens of thousands of files.
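A reusable pool can be sketched with `BlockingCollection<T>`. This is an illustration of the general technique, not the DetectNix Vision implementation: `EnginePool<T>` is a hypothetical type, and the engine type is left generic because the article does not show what a prediction engine contains.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class EnginePool<T> : IDisposable where T : class
{
    private readonly BlockingCollection<T> _idle = new BlockingCollection<T>();

    // Creates all engines up front so the cost is paid once, not per file.
    public EnginePool(int size, Func<T> factory)
    {
        for (int i = 0; i < size; i++)
        {
            _idle.Add(factory());
        }
    }

    // Borrows an engine (blocking until one is free), runs `work`,
    // and always returns the engine to the pool, even if `work` throws.
    public TResult Use<TResult>(Func<T, TResult> work)
    {
        T engine = _idle.Take();
        try
        {
            return work(engine);
        }
        finally
        {
            _idle.Add(engine);
        }
    }

    public void Dispose()
    {
        // Dispose pooled engines that own native resources, then the collection itself.
        while (_idle.TryTake(out var engine))
        {
            (engine as IDisposable)?.Dispose();
        }
        _idle.Dispose();
    }
}
```

Because `Take()` blocks when every engine is busy, the pool size also acts as a concurrency limit: at most `size` analyses can run at once, regardless of how many threads call `Use`.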

Why Privacy Became a Competitive Advantage

Initially, local processing was simply a technical requirement.

Over time it became one of the product's strongest differentiators.

Many competing solutions upload content to cloud services for analysis.

DetectNix Vision performs all scanning locally.

Benefits include:

No image uploads
Faster processing
Improved privacy
No internet dependency
Better suitability for business environments

Sometimes architectural decisions become product features.

This was one of those cases.

What I'd Do Differently

Looking back, there are several things I would prototype earlier:

GPU support
Threading architecture
Memory profiling
Large-scale stress testing

Many performance issues don't appear until the software processes thousands of files under real-world conditions.

Building those tests earlier would have saved significant development time.

Key Takeaways

Building AI-powered desktop software taught me that machine learning is only a small part of the problem.

The real challenges often involve traditional software engineering:

Resource management
Concurrency
Reliability
User experience
Performance optimization

The AI model itself might take months to train.

But creating a fast, stable, user-friendly application around that model can easily take longer.

For me, the most important lesson was this:

Success comes from engineering the entire system, not just the AI.

Users don't care how sophisticated your model is if the application is slow, unstable, or difficult to use.

They care that it works.

And that's where software engineering still matters most.

#dotnet #csharp #machinelearning #ai
