SmartKNN v0.2.3 Released - Stability, Performance, and Global Distance Improvements
I’m excited to announce SmartKNN v0.2.3. This release focuses on stability, deterministic behavior, and performance, and introduces a new feature that helps the model capture broader structure within datasets.
SmartKNN is designed as a modern approach to the classic K-Nearest Neighbors algorithm. The goal is to make KNN more practical for real-world tabular machine learning, with better scalability, learned feature weighting, and optimized CPU inference.
What’s New in v0.2.3
One of the key additions in this release is global structure distance integration.
In addition to the standard feature-level distance used by traditional KNN, SmartKNN now supports an optional parameter called global_lambda. This allows the model to incorporate dataset-level structure when ranking neighbors.
In many datasets, this small amount of structural awareness can improve neighbor quality, sometimes yielding 1–3% accuracy gains, while the default behavior remains fully backward compatible.
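To make the idea concrete, here is a minimal NumPy sketch of how a global-structure term could be blended into neighbor ranking. This is an illustration of the concept, not SmartKNN's actual internals: the function name `blended_distances` and the choice of a covariance-whitened (Mahalanobis-style) distance as the "global" term are my assumptions; only the `global_lambda` parameter name comes from the release notes.

```python
import numpy as np

def blended_distances(X_train, x_query, global_lambda=0.1):
    """Hypothetical blend of feature-level and dataset-level distance.

    The "global" term measures distance in a covariance-whitened space,
    so neighbors are ranked with some awareness of the dataset's overall
    correlation structure. global_lambda=0 recovers plain KNN distances.
    """
    # Standard feature-level squared Euclidean distance.
    local = np.sum((X_train - x_query) ** 2, axis=1)

    # Whiten using the training covariance (regularized for stability).
    cov = np.cov(X_train, rowvar=False) + 1e-8 * np.eye(X_train.shape[1])
    W = np.linalg.cholesky(np.linalg.inv(cov))
    diff_w = (X_train - x_query) @ W
    global_term = np.sum(diff_w ** 2, axis=1)  # squared Mahalanobis distance

    # Interpolate between local and global views of the data.
    return (1.0 - global_lambda) * local + global_lambda * global_term

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
q = rng.normal(size=5)
d = blended_distances(X, q, global_lambda=0.2)
nearest = np.argsort(d)[:5]  # indices of the 5 best-ranked neighbors
```

The appeal of this kind of interpolation is that the default (`global_lambda=0`) leaves existing behavior untouched, which is consistent with the backward-compatibility claim above.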
Improvements in This Release
This update also brings several improvements aimed at making SmartKNN more reliable and production-ready, including:
- stronger parameter and input validation
- more robust handling of NaN and infinite values
- deterministic ANN validation for reproducible results
- safer serialization and backend rebuilding
- improved compatibility with scikit-learn tooling
- faster and more memory-efficient distance computations
- improved ANN backend safety and stability
These changes make the system more stable when running on larger datasets or more complex feature spaces.
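As an example of what stricter input validation typically looks like, here is a short sketch of NaN/infinity handling. This is an assumption about the general approach, not SmartKNN's actual code; the function name `validate_features` and the zero-imputation policy are hypothetical.

```python
import numpy as np

def validate_features(X, impute_value=0.0):
    """Hypothetical input-validation step for a KNN-style estimator.

    Rejects non-2D and infinite inputs early with clear errors, and
    replaces NaNs with a fixed value so the distance kernels never
    operate on missing data.
    """
    X = np.asarray(X, dtype=np.float64)
    if X.ndim != 2:
        raise ValueError(f"expected a 2D feature matrix, got {X.ndim}D")
    if np.isinf(X).any():
        raise ValueError("input contains infinite values")
    # Impute NaNs rather than failing, so inference stays robust.
    return np.nan_to_num(X, nan=impute_value)
```

Catching bad values at the boundary like this is what makes later failures (wrong neighbors, NaN distances) surface as clear errors instead of silent degradation.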
Performance and Stability
A major focus of this version was improving numerical stability and memory efficiency.
Distance computations and internal kernels were optimized to reduce temporary memory allocations, resulting in more consistent performance on larger datasets. Several safeguards were also added to ensure that invalid ANN results or numeric edge cases are detected early.
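One common way to reduce temporary allocations in distance kernels is to process queries against the training set in fixed-size chunks. The sketch below illustrates that general technique; it is my assumption about the approach, not SmartKNN's actual kernel code.

```python
import numpy as np

def chunked_sq_distances(X, q, chunk=1024):
    """Illustrative chunked distance kernel (hypothetical, not SmartKNN's).

    Computing block-by-block bounds peak temporary memory at roughly
    O(chunk * n_features) instead of materializing one large
    (n_samples, n_features) difference array in a single shot.
    """
    n = X.shape[0]
    out = np.empty(n)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        diff = X[start:stop] - q                      # small temporary only
        out[start:stop] = np.einsum("ij,ij->i", diff, diff)
    return out
```

The result is identical to the unchunked computation, but the peak working-set size no longer grows with the dataset, which is what gives the more consistent performance on larger datasets described above.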
Overall, this release continues the effort to make SmartKNN fast, stable, and predictable in real-world usage.
What’s Next
Future updates will focus on pushing SmartKNN even further:
- faster neighbor search and improved ANN tuning
- additional performance optimizations
- lower memory usage for large datasets
- further improvements in robustness and reproducibility
- potential improvements in prediction accuracy through better distance modeling
The long-term goal is to make SmartKNN a high-performance, scalable alternative to traditional KNN implementations for tabular machine learning.
Project Repository
If you’d like to explore the project or try it out, you can find SmartKNN here:
https://github.com/thatipamula-jashwanth/smart-knn
Feedback, suggestions, and contributions are always welcome!