
Itay Gabbay

Test suites for machine learning models in Python (New OSS package)

Validating machine learning models and data has never been so easy

What is Deepchecks?

Deepchecks (https://github.com/deepchecks/deepchecks) is an open-source Python package for comprehensively validating your machine learning models and data with minimal effort. This includes checks related to various types of issues, such as model performance, data integrity, distribution mismatches, and more.
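
To make this concrete, here is a minimal sketch of what running Deepchecks typically looks like. It is not taken from the original post: the file names and the "target" label column are hypothetical, and import paths can differ slightly between deepchecks versions, so treat it as an illustration rather than a copy-paste recipe.

```python
# Minimal sketch: wrap pandas DataFrames in deepchecks Datasets and run the
# built-in full suite. File names and the "target" column are hypothetical;
# import paths may differ between deepchecks versions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from deepchecks.tabular import Dataset
from deepchecks.tabular.suites import full_suite

train_df = pd.read_csv("train.csv")  # hypothetical training data
test_df = pd.read_csv("test.csv")    # hypothetical test data

# Tell deepchecks which column is the label so checks can use it.
train_ds = Dataset(train_df, label="target")
test_ds = Dataset(test_df, label="target")

model = RandomForestClassifier().fit(
    train_df.drop(columns=["target"]), train_df["target"]
)

# Run all the built-in checks that apply and collect them into one report.
result = full_suite().run(train_dataset=train_ds, test_dataset=test_ds, model=model)
result.save_as_html("full_suite_report.html")
```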

Validating machine learning models?

ML model validation, especially in production environments, has always been one of the most difficult parts of working with machine learning. A lot of testing is needed before we can trust an ML pipeline to behave as expected in production - similar to software testing in the software engineering world. But writing tests for ML models is a challenging task that must account for many aspects of both the data and the model itself.

[Meme: running from data problems]

When should you use Deepchecks?

Deepchecks has many built-in checks and suites that help you efficiently identify potential problems in your data and models at various points throughout the model-building process. All checks and suites can easily be adapted and extended to fit a specific phase in the pipeline, and can be run separately or together. In addition, it includes built-in suites for typical scenarios (a short sketch of running them follows the list below):

  • When you start working with a new dataset - usually some integrity checks are performed to validate that the data is clean and ready for the model. These validations include, for example, checks for missing values, data duplicates, significant outliers, and more.

  • When splitting the data into smaller portions - in this phase it is important to make sure the splits represent the reality your model is trying to predict. Validations in this phase usually include distribution checks (for detecting data drift), data leakage checks (leakage may cause the model to learn incorrectly), class imbalance, and more.

  • When evaluating a new model - the first step after training a new model is usually to evaluate its performance on the test set and compare it to previous models and benchmarks. Validations in this phase include weak segment detection (finding data segments where the model performs worse than on the rest of the data), performance analysis, and other performance-related checks.

  • And more...
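
As a hedged sketch of the phases above: recent deepchecks versions expose a built-in suite per phase. The suite names below (data_integrity, train_test_validation, model_evaluation) follow recent documentation and may differ in older releases; the datasets and model are the ones from the earlier full-suite sketch.

```python
# Hedged sketch: one built-in suite per phase of the model-building process.
# Suite names follow recent deepchecks versions and may differ in older ones.
# train_ds, test_ds and model come from the earlier full-suite sketch.
from deepchecks.tabular.suites import (
    data_integrity,
    train_test_validation,
    model_evaluation,
)

# 1. New dataset: integrity checks (missing values, duplicates, outliers, ...).
integrity_result = data_integrity().run(train_ds)

# 2. After splitting: drift, leakage and other train/test comparisons.
split_result = train_test_validation().run(train_dataset=train_ds, test_dataset=test_ds)

# 3. After training: performance analysis and weak segment detection.
eval_result = model_evaluation().run(
    train_dataset=train_ds, test_dataset=test_ds, model=model
)
```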

[Meme: model problems vs. Deepchecks]

The state of the project

Deepchecks is still under active development, but it already has a good number of users.

You can already try it - installation instructions are in the last section below.

Contributing

We are finally accepting contributions from the community!
Currently, deepchecks is implemented mainly for tabular data use cases, and in a few weeks our first checks and suites for computer vision will be released. We're looking for feedback and contributions that improve the user experience and make validating ML models an easy task.

Whether it's an annoying bug or a feature request - don't hesitate to submit a contribution and join us!

And if you liked the project, we'll be delighted to count you as one of our stargazers at https://github.com/deepchecks/deepchecks/stargazers!

Where to go next?

Feel free to install and try our package!

```bash
pip install --user deepchecks
```

https://github.com/deepchecks/deepchecks
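
If you prefer to start smaller than a full suite, individual checks can be run on their own. The sketch below is an assumption-labeled example, not from the original post: DataDuplicates is one of the built-in integrity checks, the toy DataFrame is made up, and older releases exposed checks under deepchecks.checks rather than deepchecks.tabular.checks.

```python
# Hedged example: running a single built-in integrity check on a toy dataset.
# Import paths may differ between deepchecks versions.
import pandas as pd
from deepchecks.tabular import Dataset
from deepchecks.tabular.checks import DataDuplicates

df = pd.DataFrame({"feature": [1, 2, 2, 3], "target": [0, 1, 1, 0]})
ds = Dataset(df, label="target")

# Run one check and save its report.
result = DataDuplicates().run(ds)
result.save_as_html("duplicates_report.html")  # or result.show() inside a notebook
```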

And in conclusion, don't be that guy:

[Meme: "We need to add tests to our ML pipeline"]
