After months of deep engineering, redesigns, and painful benchmarking, SmartKNN v2 (v0.2.0) is finally live.
This release turns SmartKNN from an experimental idea into a high-performance, production-ready nearest-neighbor system - capable of competing with tree-based models on CPU latency, while preserving the interpretability and flexibility of KNN.
SmartKNN v2 is not just an update.
It’s a complete architectural leap.
What Is SmartKNN?
SmartKNN is a modern nearest-neighbor learning algorithm that goes beyond classical KNN by introducing:
- learned feature importance
- adaptive neighbor search
- distance-weighted voting
- scalable backends (brute + ANN)
All while keeping a scikit-learn–compatible API.
It supports both classification and regression, and is designed with low-latency inference as a first-class goal.
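To make the distance-weighted voting idea concrete, here is a minimal NumPy sketch. This is a conceptual illustration only, not SmartKNN's internal code; the `weighted_vote` helper and the inverse-distance weighting scheme are assumptions for demonstration:

```python
import numpy as np

def weighted_vote(distances, labels, eps=1e-12):
    """Distance-weighted KNN vote: closer neighbors get larger weights."""
    weights = 1.0 / (distances + eps)  # inverse-distance weighting (assumed scheme)
    classes = np.unique(labels)
    scores = np.array([weights[labels == c].sum() for c in classes])
    return classes[np.argmax(scores)]

# Three neighbors: one of class 0 very close, two of class 1 far away.
# The single close neighbor outvotes the two distant ones.
print(weighted_vote(np.array([0.1, 2.0, 2.5]), np.array([0, 1, 1])))  # -> 0
```

A plain majority vote would have picked class 1 here; weighting by distance lets the nearest neighbor dominate, which is what makes KNN usable when neighborhoods are uneven.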
Why SmartKNN v2 Is a Big Deal
Classic KNN breaks down when:
- feature importance is uneven
- datasets grow large
- latency matters (single-row prediction)
SmartKNN v2 directly attacks these problems.
What changed in v2?
- A fully vectorized core
- ANN backend for fast neighbor search
- Automatic backend selection
- Numba-accelerated inner loops
- Robust numerical safety
- Production-grade packaging
The result:
SmartKNN achieves state-of-the-art p95 single-inference latency on CPU for non-linear tabular models, while preserving competitive predictive performance.
SmartKNN v2 - Major Capabilities
Full Classification & Regression Support
SmartKNN v2 fully supports:
- classification (binary + multi-class)
- regression
With correct label handling, distance-weighted voting, and robust evaluation.
Automatic Backend Selection
SmartKNN chooses the fastest execution strategy automatically:
Brute-force backend:
- Used for small datasets
- Fully vectorized NumPy
- Extremely fast at low data volumes
ANN backend:
- Designed for medium and large datasets
- Scales to millions of rows
- Optional GPU support (neighbor search only)
You don’t need to decide - SmartKNN does it for you.
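The selection logic can be pictured as a simple size-based switch. The threshold below is a made-up placeholder, not SmartKNN's actual heuristic, which isn't documented here:

```python
# Hypothetical cutoff for illustration; SmartKNN's real heuristic may also
# consider dimensionality, query patterns, or available memory.
BRUTE_FORCE_MAX_ROWS = 50_000

def choose_backend(n_rows: int) -> str:
    """Brute force for small data, approximate search above a size threshold."""
    return "brute" if n_rows <= BRUTE_FORCE_MAX_ROWS else "ann"

print(choose_backend(1_000))      # -> brute
print(choose_backend(2_000_000))  # -> ann
```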
ANN Backend (Approximate Nearest Neighbors)
SmartKNN v2 introduces an ANN backend with safe defaults and expert-level tuning options:
- nlist - number of coarse clusters
- nprobe - number of clusters searched per query
This allows you to trade off:
- accuracy
- speed
- memory
depending on your workload.
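The nlist / nprobe pattern follows the inverted-file (IVF) design popularized by ANN libraries such as FAISS: points are bucketed into nlist coarse clusters at build time, and each query visits only the nprobe nearest clusters. Here is a toy NumPy sketch of that idea (random points stand in for k-means centroids; this is not SmartKNN's backend):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
nlist, nprobe = 16, 4  # coarse clusters / clusters probed per query

# Toy coarse quantizer: random training rows as centroids
# (a real IVF index would train these with k-means).
centroids = X[rng.choice(len(X), nlist, replace=False)]
assign = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)

def ivf_search(q, k=5):
    """Scan only the nprobe closest clusters instead of every row."""
    d_c = ((centroids - q) ** 2).sum(-1)
    probe = np.argsort(d_c)[:nprobe]               # clusters to visit
    cand = np.flatnonzero(np.isin(assign, probe))  # candidate rows
    d = ((X[cand] - q) ** 2).sum(-1)
    return cand[np.argsort(d)[:k]]                 # approximate neighbors

print(ivf_search(X[0]))  # X[0] itself appears among its own neighbors
```

Raising nprobe scans more clusters, recovering accuracy at the cost of speed; raising nlist shrinks each cluster, trading memory and build time for faster probes.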
Learned Feature Weighting
Unlike classical KNN, SmartKNN learns feature importance using data-driven signals:
- MSE-based relevance
- Mutual Information
- Random Forest importance
Weak or noisy features are automatically suppressed, improving both:
- accuracy
- distance quality
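As one illustration of a data-driven weighting signal, the mutual-information route can be sketched with scikit-learn. The sum-to-one normalization and the square-root feature scaling are assumptions for the example, not SmartKNN's documented procedure:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
informative = y + 0.1 * rng.normal(size=500)  # strongly tied to the label
noise = rng.normal(size=500)                  # pure noise
X = np.column_stack([informative, noise])

# Mutual information with the label as an importance signal,
# normalized so the weights sum to 1 (assumed normalization).
mi = mutual_info_classif(X, y, random_state=0)
w = mi / mi.sum()
print(w)  # the informative feature gets nearly all the weight

# Scaling features by sqrt(w) makes plain Euclidean distance
# equivalent to a w-weighted Euclidean distance.
X_weighted = X * np.sqrt(w)
```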
Robust Numerical Handling
SmartKNN v2 is safe by default:
- internal NaN / Inf handling
- sanitized training data
- strict validation for query inputs
- stable distance computation
This matters in real pipelines.
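A minimal sketch of what NaN / Inf sanitization can look like with NumPy; the exact replacement policy here (NaN to 0, infinities clamped to the largest finite float) is an assumption, not SmartKNN's documented behavior:

```python
import numpy as np

def sanitize(X):
    """Replace NaN/Inf with finite values before distance computation."""
    X = np.asarray(X, dtype=np.float64)
    if not np.isfinite(X).all():
        # NaN -> 0.0, +/-Inf -> largest finite float (np.nan_to_num defaults)
        X = np.nan_to_num(X)
    return X

X = np.array([[1.0, np.nan], [np.inf, 2.0]])
print(np.isfinite(sanitize(X)).all())  # -> True
```

Without a step like this, a single NaN propagates through every distance it touches and silently corrupts neighbor rankings.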
Automatic Evaluation Utilities
SmartKNN includes unified evaluation helpers:
- Automatic task detection (classification vs regression)
- Built-in metrics:
  - Regression: MSE, RMSE, MAE, R²
  - Classification: Accuracy, Precision, Recall, F1, Confusion Matrix
- Safe handling of non-numeric labels
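Automatic task detection typically comes down to inspecting the labels. The dtype-based heuristic below is an illustrative guess at how such a helper could work, not SmartKNN's actual rule:

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_squared_error

def detect_task(y):
    """Assumed heuristic: string/integer/bool labels -> classification."""
    y = np.asarray(y)
    if y.dtype.kind in ("U", "S", "O", "b", "i"):
        return "classification"
    return "regression"  # float labels

def evaluate(y_true, y_pred):
    """Dispatch to the right metric family based on the detected task."""
    if detect_task(y_true) == "classification":
        return {"accuracy": accuracy_score(y_true, y_pred)}
    return {"mse": mean_squared_error(y_true, y_pred)}

print(evaluate(np.array(["cat", "dog"]), np.array(["cat", "dog"])))
# -> {'accuracy': 1.0}
```

Note the string labels are handled without any manual encoding, which is the "safe handling of non-numeric labels" point above.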
Performance Highlights
SmartKNN v2 was stress-tested on:
- classification benchmarks
- regression benchmarks
- large, real-world datasets
Key outcomes:
- Significant speedup over v1
- Ultra-low single-prediction latency
- Competitive p95 latency on CPU vs tree-based models
- Faster training compared to v1 due to vectorization
Full benchmark reports will be released soon.
For now, you’re encouraged to install SmartKNN v2 and try it yourself.
Try It Now
pip install smart-knn
Examples are available in the repository:
python examples/classification_example.py
python examples/regression_example.py
Website
https://thatipamula-jashwanth.github.io/SmartEco/
SmartKNN Is Part of a Bigger Vision: SmartEco
SmartKNN is not a standalone experiment.
It’s part of SmartEco - a broader ecosystem focused on:
- high-performance ML systems
- interpretable models
- CPU-efficient inference
What’s coming next?
SmartEco’s next research project is already underway:
An O(1) geometry-based classification model, designed for constant-time prediction, currently in research & development.
Note on Evolution
SmartKNN’s journey so far:
- v1.0 - accuracy-first, experimental, slow but correct
- v1.1 - safety & correctness focused
- v2.0 - scalable, low-latency, production-ready
This release marks the point where SmartKNN becomes a serious tool, not just an idea.
SmartKNN v2 Has Not Reached Its Ceiling
SmartKNN v2 represents a major architectural foundation, not the final form of the system.
While v2 already delivers competitive accuracy and low-latency inference, it is not yet operating at its theoretical limits.
Several high-impact optimizations are actively planned and under development.
What’s Coming Next
Even Faster Inference
- Further kernel-level optimizations for distance computation
- More aggressive batch inference optimizations
- Reduced memory movement in hot paths
- Lower p95 latency for both single and batch predictions
Adaptive-K Neighbor Selection
- Dynamic adjustment of K per query
- Data-dependent neighborhood sizing
- Improved accuracy in sparse or heterogeneous regions
- Better bias–variance tradeoffs compared to fixed-K methods
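Since adaptive-K is still on the roadmap, here is only a toy sketch of one possible per-query heuristic: grow K while neighbors stay about as close as the nearest one, and stop at a distance jump. The `gap` rule and all constants are invented for illustration:

```python
import numpy as np

def adaptive_k(distances_sorted, k_min=3, k_max=25, gap=2.0):
    """Grow K until the next neighbor is much farther than the closest one
    (a toy density heuristic; SmartKNN's planned rule may differ)."""
    d0 = distances_sorted[0] + 1e-12
    k = k_min
    while k < min(k_max, len(distances_sorted)) and distances_sorted[k] < gap * d0:
        k += 1
    return k

# Tight cluster of 5 points, then a jump in distance:
d = np.array([0.1, 0.11, 0.12, 0.13, 0.14, 5.0, 6.0])
print(adaptive_k(d))  # stops at the distance jump
```

In a dense region this keeps K large for variance reduction; near a cluster boundary it shrinks K to avoid pulling in neighbors from the wrong region, which is the bias-variance tradeoff mentioned above.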
SmartKNN-Native ANN Backend
- A custom ANN backend designed specifically for SmartKNN
- Tighter integration with feature weighting and distance kernels
- Reduced dependency on external ANN libraries
- Better control over:
- recall vs latency tradeoffs
- memory layout
- batch query execution
Batch-First Execution Path
- Dedicated optimizations for high-throughput batch prediction
- Smarter cache utilization
- Improved CPU scaling across cores
Looking Ahead
SmartKNN v2 is the starting point, not the finish line.
The roadmap is focused on:
- pushing CPU-only inference to its practical limits
- improving accuracy without sacrificing latency
- building SmartKNN into a first-class engine, not just an algorithm
Future versions will continue to evolve both speed and intelligence, while keeping the API stable and the system interpretable.
Want to Contribute or Experiment?
- Try SmartKNN v2 on your datasets
- Share feedback
- Benchmark it against your current models
- Explore the internals — it’s built to be readable
SmartKNN is open to research and engineering collaboration.
Final Words
SmartKNN v2 proves that nearest-neighbor models don’t have to be slow.
With the right architecture, vectorization, and adaptive execution, KNN can be:
- fast
- scalable
- interpretable
- production-ready
Benchmarks are coming.
The ecosystem is growing.
And this is just the beginning.
Welcome to SmartKNN v2.
For documentation, visit the website:
https://thatipamula-jashwanth.github.io/SmartEco/