DEV Community

MustafaLSailor

Model Selection: GridSearchCV

GridSearchCV is a scikit-learn tool for tuning model hyperparameters. Hyperparameters are settings chosen before training, rather than learned from the data, that control how a machine learning model is trained and can strongly affect its performance. GridSearchCV finds the best-performing set of hyperparameters by exhaustively trying every combination of the candidate values you specify.
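
To make "every combination" concrete, here is a minimal sketch that enumerates a grid using only the standard library's itertools.product. The grid values are illustrative assumptions, and the sketch only mimics the enumeration step; it is not how GridSearchCV is implemented internally.

```python
from itertools import product

# Illustrative grid (assumed values): each key is a hyperparameter name,
# each list holds the candidate values to try.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20, 30],
}

# Build one dict per combination of candidate values.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

print(len(combinations))  # 3 * 4 = 12 combinations
print(combinations[0])    # {'n_estimators': 50, 'max_depth': None}
```

The number of combinations is the product of the list lengths, so adding even one more hyperparameter with a few values multiplies the amount of training work.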

Here's how GridSearchCV works:

1. The hyperparameters to search, and the candidate values for each, are defined as a "grid".
2. GridSearchCV trains the model on every combination of values in the grid and evaluates each one using cross-validation.
3. The combination with the best mean cross-validation score is selected.
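
The steps above can be sketched as a hand-written loop. This is a simplified illustration assuming scikit-learn is installed; it uses cross_val_score to do the per-combination evaluation and a deliberately small grid (assumed values) to keep it fast. It is meant to show the idea, not to reproduce GridSearchCV's actual implementation.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Step 1: define the grid (small, assumed values to keep the sketch quick)
param_grid = {"n_estimators": [10, 50], "max_depth": [None, 3]}

best_score, best_params = -1.0, None
names = list(param_grid)
for values in product(*param_grid.values()):
    params = dict(zip(names, values))
    # Step 2: train and score this combination with 5-fold cross-validation
    model = RandomForestClassifier(random_state=0, **params)
    mean_score = cross_val_score(model, X, y, cv=5).mean()
    # Step 3: keep the combination with the highest mean score
    if mean_score > best_score:
        best_score, best_params = mean_score, params

print("Best parameters:", best_params)
print("Best mean CV accuracy:", round(best_score, 3))
```

GridSearchCV wraps this pattern and, by default, also refits the model on the full dataset with the winning combination.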
Below is a Python example showing how to use GridSearchCV from the scikit-learn library:

from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load iris dataset
data = load_iris()
X = data.data
y = data.target

# Create the model (a fixed random_state makes the results reproducible)
model = RandomForestClassifier(random_state=42)

# Determine the grid of hyperparameters to search
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
}

# Create GridSearchCV
grid_search = GridSearchCV(model, param_grid, cv=5)

# Fit GridSearchCV
grid_search.fit(X, y)

# Print the best hyperparameters and the best mean cross-validation score
print("Best parameters: ", grid_search.best_params_)
print("Best score: ", grid_search.best_score_)

In this example, GridSearchCV tunes the 'n_estimators' and 'max_depth' hyperparameters of a RandomForestClassifier. The grid contains 3 × 4 = 12 combinations, and each one is evaluated with 5-fold cross-validation, so the search fits 60 models before refitting the best combination on the full dataset. The reported best score is the mean cross-validation score of the winning combination, which for a classifier is accuracy by default.
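
Beyond the single best result, it can be useful to look at how every combination scored. The following sketch, which assumes scikit-learn is installed and uses a smaller grid (assumed values) for speed, reads the cv_results_ attribute and reuses the refitted best_estimator_ for prediction.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Smaller grid than the example above, chosen only to keep this sketch fast
param_grid = {"n_estimators": [10, 50], "max_depth": [None, 3]}
grid_search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
grid_search.fit(X, y)

# cv_results_ holds one entry per combination in the grid
results = grid_search.cv_results_
for params, mean, rank in zip(
    results["params"], results["mean_test_score"], results["rank_test_score"]
):
    print(f"rank {rank}: {params} -> mean accuracy {mean:.3f}")

# With the default refit=True, best_estimator_ is already trained on all of X, y
best_model = grid_search.best_estimator_
predictions = best_model.predict(X[:5])
print(predictions)
```

Comparing the scores side by side shows whether the best combination is a clear winner or whether several settings perform about equally well, in which case the simpler or cheaper one may be preferable.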
