SmartKNN v0.2.3 Released - Stability, Performance, and Global Distance Improvements
I’m excited to announce SmartKNN v0.2.3. This release focuses on stability, deterministic behavior, and performance, and introduces a new feature that helps the model capture broader structure within datasets.
SmartKNN is designed as a modern approach to the classic K-Nearest Neighbors algorithm. The goal is to make KNN more practical for real-world tabular machine learning, with better scalability, learned feature weighting, and optimized CPU inference.
What’s New in v0.2.3
One of the key additions in this release is global structure distance integration.
In addition to the standard feature-level distance used by traditional KNN, SmartKNN now supports an optional parameter called global_lambda. This allows the model to incorporate dataset-level structure when ranking neighbors.
In many datasets, this small amount of structural awareness can improve neighbor quality, sometimes yielding 1–3% accuracy gains, while the default behavior remains fully backward compatible.
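To make the idea concrete, here is a minimal NumPy sketch of how a global-structure term could be blended into neighbor ranking. This is an illustration of the concept, not SmartKNN's actual internals: the function name `blended_distances` and the choice of a covariance-whitened (Mahalanobis-style) distance as the "global" term are my assumptions; only the `global_lambda` parameter name comes from the release notes.

```python
import numpy as np

def blended_distances(X_train, x_query, global_lambda=0.1):
    """Hypothetical blend of feature-level and dataset-level distance.

    The "global" term measures distance in a covariance-whitened space,
    so neighbors are ranked with some awareness of the dataset's overall
    correlation structure. global_lambda=0 recovers plain KNN distances.
    """
    # Standard feature-level squared Euclidean distance.
    local = np.sum((X_train - x_query) ** 2, axis=1)

    # Whiten using the training covariance (regularized for stability).
    cov = np.cov(X_train, rowvar=False) + 1e-8 * np.eye(X_train.shape[1])
    W = np.linalg.cholesky(np.linalg.inv(cov))
    diff_w = (X_train - x_query) @ W
    global_term = np.sum(diff_w ** 2, axis=1)  # squared Mahalanobis distance

    # Interpolate between local and global views of the data.
    return (1.0 - global_lambda) * local + global_lambda * global_term

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
q = rng.normal(size=5)
d = blended_distances(X, q, global_lambda=0.2)
nearest = np.argsort(d)[:5]  # indices of the 5 best-ranked neighbors
```

The appeal of this kind of interpolation is that the default (`global_lambda=0`) leaves existing behavior untouched, which is consistent with the backward-compatibility claim above.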
Improvements in This Release
This update also brings several improvements aimed at making SmartKNN more reliable and production-ready, including:
- stronger parameter and input validation
- more robust handling of NaN and infinite values
- deterministic ANN validation for reproducible results
- safer serialization and backend rebuilding
- improved compatibility with scikit-learn tooling
- faster and more memory-efficient distance computations
- improved ANN backend safety and stability
These changes make the system more stable when running on larger datasets or more complex feature spaces.
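As an example of what stricter input validation typically looks like, here is a short sketch of NaN/infinity handling. This is an assumption about the general approach, not SmartKNN's actual code; the function name `validate_features` and the zero-imputation policy are hypothetical.

```python
import numpy as np

def validate_features(X, impute_value=0.0):
    """Hypothetical input-validation step for a KNN-style estimator.

    Rejects non-2D and infinite inputs early with clear errors, and
    replaces NaNs with a fixed value so the distance kernels never
    operate on missing data.
    """
    X = np.asarray(X, dtype=np.float64)
    if X.ndim != 2:
        raise ValueError(f"expected a 2D feature matrix, got {X.ndim}D")
    if np.isinf(X).any():
        raise ValueError("input contains infinite values")
    # Impute NaNs rather than failing, so inference stays robust.
    return np.nan_to_num(X, nan=impute_value)
```

Catching bad values at the boundary like this is what makes later failures (wrong neighbors, NaN distances) surface as clear errors instead of silent degradation.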
Performance and Stability
A major focus of this version was improving numerical stability and memory efficiency.
Distance computations and internal kernels were optimized to reduce temporary memory allocations, resulting in more consistent performance on larger datasets. Several safeguards were also added to ensure that invalid ANN results or numeric edge cases are detected early.
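One common way to reduce temporary allocations in distance kernels is to process queries against the training set in fixed-size chunks. The sketch below illustrates that general technique; it is my assumption about the approach, not SmartKNN's actual kernel code.

```python
import numpy as np

def chunked_sq_distances(X, q, chunk=1024):
    """Illustrative chunked distance kernel (hypothetical, not SmartKNN's).

    Computing block-by-block bounds peak temporary memory at roughly
    O(chunk * n_features) instead of materializing one large
    (n_samples, n_features) difference array in a single shot.
    """
    n = X.shape[0]
    out = np.empty(n)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        diff = X[start:stop] - q                      # small temporary only
        out[start:stop] = np.einsum("ij,ij->i", diff, diff)
    return out
```

The result is identical to the unchunked computation, but the peak working-set size no longer grows with the dataset, which is what gives the more consistent performance on larger datasets described above.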
Overall, this release continues the effort to make SmartKNN fast, stable, and predictable in real-world usage.
What’s Next
Future updates will focus on pushing SmartKNN even further:
- faster neighbor search and improved ANN tuning
- additional performance optimizations
- lower memory usage for large datasets
- further improvements in robustness and reproducibility
- potential improvements in prediction accuracy through better distance modeling
The long-term goal is to make SmartKNN a high-performance, scalable alternative to traditional KNN implementations for tabular machine learning.
Project Repository
If you’d like to explore the project or try it out, you can find SmartKNN here:
https://github.com/thatipamula-jashwanth/smart-knn
Feedback, suggestions, and contributions are always welcome!