DEV Community

Hemanath Kumar J

Machine Learning - Hyperparameter Tuning with GridSearchCV - Complete Tutorial

Introduction

Hyperparameters are settings chosen before training, such as the number of trees in a random forest, and tuning them can noticeably improve a model's performance. One effective tool for this is GridSearchCV from the scikit-learn library, which evaluates every combination in a parameter grid using cross-validation and reports the best one. This tutorial guides intermediate developers through using GridSearchCV for hyperparameter tuning in a practical machine learning workflow.

Prerequisites

  • Basic understanding of Python programming
  • Familiarity with machine learning concepts
  • Experience with the scikit-learn library

Step-by-Step

Step 1: Import Necessary Libraries

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

Step 2: Load Dataset

iris = load_iris()
X, y = iris.data, iris.target
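
Before tuning, it can help to confirm what the dataset looks like. The following is a small sketch, not part of the original steps, that prints the shape of the feature matrix and the number of samples per class; the use of `numpy.bincount` here is just one convenient way to count labels.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# 150 samples with 4 numeric features each
print("Feature matrix shape:", X.shape)

# Human-readable names for the three target classes
print("Class names:", list(iris.target_names))

# Number of samples in each class (labels are the integers 0, 1, 2)
print("Samples per class:", np.bincount(y))
```

The classes are balanced, so plain accuracy, which is what a classifier's default scorer reports, is a reasonable metric for this tutorial.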

Step 3: Define Parameter Grid

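param_grid = {
  'n_estimators': [10, 50, 100],
  # 'auto' was deprecated in scikit-learn 1.1 and removed in 1.3;
  # for classifiers it behaved the same as 'sqrt'. None uses all features.
  'max_features': ['sqrt', 'log2', None],
  'max_depth': [None, 5, 10, 15],
  'criterion': ['gini', 'entropy']
}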
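
A grid grows multiplicatively, so it is worth checking how many models a search will train before starting it. The sketch below uses scikit-learn's `ParameterGrid` to count the combinations; the grid is a copy of the one in this step with `None` standing in for the removed `'auto'` option, and `n_folds` is an illustrative variable that mirrors the `cv=5` setting used in Step 4.

```python
from sklearn.model_selection import ParameterGrid

# Copy of the tutorial's grid (with 'auto' swapped for None, an assumption
# made so the grid is valid on current scikit-learn releases)
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_features': ['sqrt', 'log2', None],
    'max_depth': [None, 5, 10, 15],
    'criterion': ['gini', 'entropy'],
}

# ParameterGrid enumerates every combination: 3 * 3 * 4 * 2
n_candidates = len(ParameterGrid(param_grid))

# Each candidate is trained once per cross-validation fold
n_folds = 5

print("Candidate combinations:", n_candidates)       # 72
print("Total model fits:", n_candidates * n_folds)   # 360
```

With the default `refit=True`, GridSearchCV also trains one more model on the full data using the best combination.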

Step 4: Initialize GridSearchCV

# Fixing random_state makes repeated runs select the same parameters
clf = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
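
GridSearchCV accepts several optional arguments beyond the estimator, the grid, and `cv`. The sketch below spells out a few commonly used ones. The name `small_grid` and its reduced set of values are assumptions chosen only so the example runs quickly; they are not the grid from Step 3.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Deliberately small, hypothetical grid so the sketch finishes fast
small_grid = {'n_estimators': [10, 50], 'max_depth': [None, 5]}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=small_grid,
    cv=5,                # 5-fold cross-validation (stratified for classifiers)
    scoring='accuracy',  # metric used to rank candidates
    n_jobs=1,            # set to -1 to use all available CPU cores
    refit=True,          # retrain the best candidate on all data afterwards
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Mean cross-validated accuracy:", search.best_score_)
```

Because `refit=True`, the fitted `search` object can be used directly for prediction through `search.predict`, which delegates to `search.best_estimator_`.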

Step 5: Fit the Model

clf.fit(X, y)

Step 6: Review Best Parameters and Performance

print("Best parameters found:", clf.best_params_)
print("Best score achieved:", clf.best_score_)
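
The best parameters tell only part of the story: the `cv_results_` attribute holds the scores for every candidate, which shows how close the runners-up were. Here is a minimal sketch of listing the top candidates. It again uses a small hypothetical grid (`small_grid`) to keep the run short, so the printed combinations will differ from those of the full grid.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Hypothetical reduced grid: 2 * 2 = 4 candidates
small_grid = {'n_estimators': [10, 50], 'max_depth': [None, 5]}

search = GridSearchCV(RandomForestClassifier(random_state=0), small_grid, cv=5)
search.fit(X, y)

results = search.cv_results_

# rank_test_score is 1 for the best candidate; sort candidates by rank
order = np.argsort(results['rank_test_score'])

# Show the three best candidates with their mean and spread across folds
for i in order[:3]:
    print(
        results['params'][i],
        "mean=%.3f" % results['mean_test_score'][i],
        "std=%.3f" % results['std_test_score'][i],
    )
```

If several candidates score within one standard deviation of each other, the simpler or cheaper one is often a sensible choice.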

Best Practices

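  • Keep a held-out test set that the search never sees. GridSearchCV already handles validation through cross-validation, but best_score_ is optimistic because it was used to pick the parameters, so report final performance on the test set.
  • Start with a broad range of parameter values to understand their impact, then narrow down to more specific ranges.
  • Consider RandomizedSearchCV for larger parameter grids: it samples a fixed number of combinations instead of trying all of them, which saves time.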
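
Two of the practices above can be sketched together: holding out a test set and sampling the grid with RandomizedSearchCV. The split ratio, `n_iter=10`, and the `random_state` values below are illustrative assumptions, not recommendations from the original tutorial, and the grid uses `None` in place of the removed `'auto'` option.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data; the search only ever sees the training portion
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

param_distributions = {
    'n_estimators': [10, 50, 100],
    'max_features': ['sqrt', 'log2', None],
    'max_depth': [None, 5, 10, 15],
    'criterion': ['gini', 'entropy'],
}

# Sample 10 of the 72 possible combinations instead of trying them all
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=5,
    random_state=0,
)
search.fit(X_train, y_train)

# Accuracy of the refitted best model on data the search never used
test_accuracy = search.score(X_test, y_test)

print("Best parameters found:", search.best_params_)
print("Cross-validated score:", search.best_score_)
print("Held-out test accuracy:", test_accuracy)
```

The held-out accuracy is the number to report; the cross-validated score is mainly useful for comparing candidates against each other.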

Conclusion

This tutorial covered the essentials of hyperparameter tuning with GridSearchCV: defining a parameter grid, running a cross-validated search, and reading off the best parameters and score. By following these steps, developers can improve their model's performance and see how each hyperparameter affects it. Happy tuning!
