Hyperparameter Tuning & Grid Search
"A machine learning model without tuned hyperparameters is like a high-performance sports car running on regular fuel. It works, but it's far from optimal."
Parameters vs. Hyperparameters
In machine learning, parameters are learned automatically by the algorithm during the training process (like the slope and intercept in linear regression). Hyperparameters, on the other hand, are the structural settings of the model that you, the engineer, must specify before training begins (such as the depth of a decision tree or the learning rate).
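The distinction can be seen directly in code. A minimal sketch using Scikit-Learn's Ridge regressor (chosen here for illustration; any estimator shows the same split):

```python
# Parameters are learned from data; hyperparameters are chosen beforehand.
import numpy as np
from sklearn.linear_model import Ridge

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# alpha is a hyperparameter: we pick it before training begins.
model = Ridge(alpha=0.1)
model.fit(X, y)

# coef_ and intercept_ are parameters: learned automatically during fit().
print(model.coef_, model.intercept_)
```

Nothing in `fit()` changes `alpha`; tuning it is our job, which is exactly what Grid Search automates.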
What is Grid Search?
GridSearchCV (Grid Search with Cross-Validation) is a tool provided by Scikit-Learn that lets you specify a dictionary mapping hyperparameter names to the values you want to test. The algorithm then methodically builds and evaluates a model for every possible combination of those values, and reports the best-performing combination within that grid.
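A minimal sketch of the workflow, using a decision tree on the built-in iris dataset (the grid values here are arbitrary examples, not recommendations):

```python
# GridSearchCV trains and scores one model per combination in param_grid.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

param_grid = {
    "max_depth": [2, 3, 4],        # 3 values
    "min_samples_split": [2, 5],   # x 2 values = 6 combinations tested
}

search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)              # winning combination
print(round(search.best_score_, 3))     # its mean cross-validated score
```

After fitting, `search.best_estimator_` is a model refit on the full training data with the winning hyperparameters.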
The Role of Cross-Validation
Tuning a model on your training set directly often leads to overfitting: the model memorizes the data instead of learning general patterns. Grid Search uses Cross-Validation (the 'CV' in GridSearchCV) to split the training data into multiple folds, ensuring the hyperparameter combination performs consistently across different subsets of data.
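The folding can be seen in isolation with `cross_val_score`, which is what Grid Search runs under the hood for each combination. A sketch with 5 folds:

```python
# cv=5 splits the data into 5 folds: the model is trained on 4 folds
# and scored on the held-out fifth, rotating until every fold is used.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

scores = cross_val_score(
    DecisionTreeClassifier(max_depth=3, random_state=0), X, y, cv=5)

print(scores)         # one accuracy score per fold
print(scores.mean())  # the average Grid Search would use for ranking
```

A combination that scores well on every fold, not just one lucky split, is far more likely to generalize.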
Frequently Asked Questions
Grid Search vs. Random Search?
Grid Search tests *every single* combination. Random Search (RandomizedSearchCV) tests a random sample of combinations from the grid. For very large grids, Random Search is faster and often finds a near-optimal solution with drastically less computational cost.
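The API is nearly identical; the key difference is the `n_iter` budget. A sketch sampling 4 of the 9 possible combinations (the grid values are again arbitrary examples):

```python
# RandomizedSearchCV samples n_iter combinations instead of trying them all.
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

param_distributions = {
    "max_depth": [2, 3, 4],
    "min_samples_split": [2, 5, 10],   # 3 x 3 = 9 possible combinations
}

search = RandomizedSearchCV(DecisionTreeClassifier(random_state=0),
                            param_distributions, n_iter=4, cv=5,
                            random_state=0)
search.fit(X, y)

print(len(search.cv_results_["params"]))  # only 4 combinations were tried
print(search.best_params_)
```

Values can also be given as continuous distributions (e.g. from `scipy.stats`) rather than lists, which Grid Search cannot do.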
What are common hyperparameters to tune?
Random Forests: n_estimators, max_depth.
SVM: C, kernel, gamma.
Logistic Regression: C, penalty.
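As dictionaries, these translate into grids like the following; the value ranges below are common starting points for illustration, not definitive recommendations:

```python
# Illustrative param_grid dictionaries for the models listed above.
rf_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 5, 10],
}
svm_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.01, 0.1],
}
logreg_grid = {
    "C": [0.01, 0.1, 1, 10],
    "penalty": ["l2"],  # "l1" needs a compatible solver, e.g. liblinear
}

# Grid size multiplies quickly: even this modest RF grid is 3 x 3 = 9 fits
# per CV fold, which is why Random Search helps as grids grow.
print(len(rf_grid["n_estimators"]) * len(rf_grid["max_depth"]))
```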