Jashwanth Thatipamula
Why I Built a Dedicated Benchmarking System

Over the past few months, I’ve been working on large-scale benchmarks for SmartKNN.

What I didn’t expect was how frustrating benchmarking itself would become.

  • Not the models.
  • Not the algorithms.
  • But the process.

Every benchmark script turned into a mess:

  • Slightly different preprocessing
  • Hidden data leakage
  • Inconsistent splits
  • Models “available” but failing at runtime
  • DL models requiring totally different environments
  • No clear visibility into what actually runs on my machine

At some point, benchmarking stopped being about models and started being about debugging pipelines.

So I paused.

And instead of writing another benchmark script, I built a benchmarking system.


SmartML (Part of the SmartEco Ecosystem)

SmartML is a benchmarking-only tool I created purely to answer one question:

“If I benchmark models today, can I trust the results tomorrow?”

  • It’s not AutoML.
  • It’s not an optimizer.
  • It’s not a framework trying to be clever.

It’s just a transparent, deterministic, CPU-first benchmarking engine.

  • No innovation claims.
  • No magic.
  • Just honest evaluation.

Why This Exists (Especially for SmartKNN)

I originally built SmartML to benchmark SmartKNN properly.

But once the system was in place, it made sense to support:

  • Classical ML models
  • Tree-based models
  • Optional deep learning models
  • Research models (when available)

Right now, SmartML supports around 20 models, spanning both classical ML and DL.

Some DL models:

  • Require different environments
  • May not install on Windows
  • May silently fail on CPU

So SmartML does not pretend they exist.
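
That honesty mostly comes down to checking dependencies at runtime instead of assuming they exist. Below is a rough, generic sketch of the idea using plain importlib probing. The model names and the available_models set are made up for illustration; this is not SmartML's actual source.

# Generic pattern, not SmartML's internals: only register models whose
# optional dependencies can actually be imported in this environment.
import importlib.util

def backend_available(module_name: str) -> bool:
    # find_spec returns None when the package is not installed here
    return importlib.util.find_spec(module_name) is not None

# Classical CPU models are always on; DL models are added only if
# their backend loads in this environment (names are illustrative).
available_models = {"random_forest", "logistic_regression", "smartknn"}
if backend_available("torch"):
    available_models.add("torch_mlp")
if backend_available("xgboost"):
    available_models.add("xgboost")

print(sorted(available_models))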


Runtime Model Detection (This Part Matters)

SmartML exposes a utility called:

SmartML_Inspect()

This tells you:

  • Which classification models are available right now
  • Which regression models actually work in your environment
  • What metrics SmartML uses

No guessing.
No crashes.
No “works on my machine” nonsense.

If a model can’t run, it simply doesn’t appear.


What SmartML Actually Does

SmartML enforces:

  • Fixed random seeds
  • Deterministic train/test splits
  • Leakage-free encoding
  • Identical preprocessing across models
  • CPU-only execution by default
  • Real inference latency measurement

It measures:

  • Training time
  • Batch inference time
  • Batch throughput
  • Single-sample latency
  • P95 latency
  • Core accuracy / F1 / R² metrics

Same pipeline.
Same rules.
Every model.
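
To make those rules concrete, here is a minimal, hand-written sketch of the same discipline in plain scikit-learn: a fixed seed, a scaler fit on the training fold only, and timing for training, batch inference, throughput, and P95 single-sample latency. It illustrates the guarantees; it is not SmartML's implementation.

# Illustration only: the kind of pipeline discipline SmartML automates,
# written out by hand with scikit-learn.
import time
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

SEED = 42  # fixed seed -> deterministic split, reproducible results

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED, stratify=y
)

# Leakage-free preprocessing: the scaler is fit on the training fold only,
# and the same pipeline shape is reused for every model being compared.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

t0 = time.perf_counter()
model.fit(X_train, y_train)
train_time = time.perf_counter() - t0

# Batch inference time and throughput
t0 = time.perf_counter()
preds = model.predict(X_test)
batch_time = time.perf_counter() - t0
throughput = len(X_test) / batch_time

# Single-sample latency and P95 latency
latencies = []
for row in X_test[:200]:
    t0 = time.perf_counter()
    model.predict(row.reshape(1, -1))
    latencies.append(time.perf_counter() - t0)
p95_ms = float(np.percentile(latencies, 95)) * 1000

print(f"train={train_time:.3f}s  batch={batch_time:.4f}s  "
      f"throughput={throughput:.0f}/s  p95={p95_ms:.2f}ms  "
      f"f1={f1_score(y_test, preds):.3f}")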


What SmartML Is Not

Let me be very clear:

  • No AutoML
  • No hyperparameter tuning
  • No leaderboard optimization
  • No claims of being “state of the art”

This is just a tool for benchmarking.

If you want to:

  • Benchmark models at scale
  • Compare ML vs DL fairly on CPU
  • Run large experiments without rewriting pipelines
  • Trust your results next week

Then this might help.

If not, that’s totally fine too.


Using It (When You Need Scale)

For large benchmarks:

pip install SmartEco

Then explore what’s available in your environment:

from SmartEco.SmartML import SmartML_Inspect
SmartML_Inspect()

What’s Coming Next

  • Huge SmartKNN benchmarks (the original goal)
  • Public benchmark reports on the SmartEco website

Open & Honest

If you:

  • Use it
  • Break it
  • Add a model
  • Find a bug
  • Want something clearer

Open an issue or send a PR.

This is an engineering tool, not a product pitch.


Links

GitHub (SmartML): Repo
Website (benchmarks & docs): SmartEco
