Grid Search is a powerful way to optimize the process of fitting an Estimator.
Parameters that describe how the learning process should be performed are vital
to the quality of the resulting model. Unfortunately, it’s often difficult to guess
what the best values for them are.
The Grid Search operation allows us to specify a set of values for each parameter of the input
Estimator. The operation then goes through every combination of parameter values from the
specified sets. For each combination, the Estimator is fitted and the resulting trained model
is evaluated by means of cross-validation.
The goal of
Grid Search is to choose the best combination of parameters, where “best”
is defined as having received the highest grade from the Evaluator.
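The selection procedure described above can be sketched in a few lines of Python. This is only an illustration, not Seahorse's implementation: `grid_search`, `toy_grade`, and the parameter names are hypothetical, and `toy_grade` stands in for the grade that the Evaluator would assign after cross-validation.

```python
from itertools import product


def grid_search(param_grid, grade):
    """Grade every combination in param_grid and return the best one.

    param_grid maps a parameter name to the list of values to try.
    grade is a caller-supplied function (hypothetical here) that returns
    a number for one combination; higher is treated as better.
    """
    names = list(param_grid)
    report = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        report.append((params, grade(params)))
    best_params, best_grade = max(report, key=lambda row: row[1])
    return best_params, best_grade, report


# Toy grade function, assumed purely for illustration:
# it peaks at max_depth=20 and num_trees=50.
def toy_grade(params):
    return -abs(params["max_depth"] - 20) - abs(params["num_trees"] - 50) / 10


grid = {"max_depth": [10, 20, 30], "num_trees": [10, 50, 100]}
best_params, best_grade, report = grid_search(grid, toy_grade)
print(best_params, len(report))
```

The `report` list plays the role of the operation's Report: one row per parameter combination together with its grade.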
In order to grade a particular combination of parameters, the Estimator is fitted as many
times as the number of folds parameter specifies. In each “round” of training, the input
dataset is divided into training and test parts. The model fitted on the training part is used
to score the test part of the dataset, and the Evaluator grades the scored result. The final
grade of the parameter combination is the average of the grades from all folds.
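A minimal sketch of this fold-averaging step is shown below. It assumes contiguous, non-shuffled folds for simplicity (the actual operation may split the data differently), and `fit` and `evaluate` are hypothetical callables standing in for the Estimator and the Evaluator.

```python
def k_fold_indices(n_rows, n_folds):
    """Yield (train, test) index lists for contiguous, near-equal folds."""
    if n_folds < 2:
        raise ValueError("number of folds has to be at least 2")
    if n_folds > n_rows:
        raise ValueError("number of folds cannot exceed the number of rows")
    base, extra = divmod(n_rows, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds get one additional row each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validated_grade(rows, n_folds, fit, evaluate):
    """Fit once per fold and return the average of the per-fold grades."""
    grades = []
    for train_idx, test_idx in k_fold_indices(len(rows), n_folds):
        model = fit([rows[i] for i in train_idx])
        grades.append(evaluate(model, [rows[i] for i in test_idx]))
    return sum(grades) / len(grades)


# Toy stand-ins: the "model" is the training mean, the grade is the
# negative mean absolute error, so a higher grade means a better fit.
def fit_mean(train_rows):
    return sum(train_rows) / len(train_rows)


def negative_mae(model, test_rows):
    return -sum(abs(model - y) for y in test_rows) / len(test_rows)


grade = cross_validated_grade([1.0, 2.0, 3.0, 4.0], 2, fit_mean, negative_mae)
print(grade)
```

Every row lands in the test part exactly once, so each fold's grade is computed on data the model was not fitted on.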
The result of the Grid Search operation is a Report in which
every combination of parameters is graded by the Evaluator.
Parameters of the Grid Search operation mirror the parameters of its input Estimator, but some
of them accept multiple, comma-separated values. These special parameters are marked as such
in the operation's parameter panel.
Note that Grid Search is an expensive operation. Selecting 5 values for each of 5 parameters
results in 5^5 = 3125 models being cross validated.
Since: Seahorse 1.0.0
In the following case, the
Grid Search operation is used to determine the best parameters
for training a Random Forest Regression model.
The number of folds parameter has to be at least 2, but higher values make model evaluation
more accurate.
In the PARAMETERS OF INPUT ESTIMATOR section of Grid Search’s parameters, we specify the
parameter values: we set max depth to 10, 20, 30, max bins to 32, 40, 50, and num trees to
10, 50, 100.
This yields 3 × 3 × 3 = 27 distinct combinations of parameters.
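The count of 27 can be checked with a short Python sketch that expands the comma-separated values used in this example. `parse_values` is a hypothetical helper, and the fold count of 3 is an assumed value chosen only to show how the total number of fits scales.

```python
from itertools import product
from math import prod


def parse_values(text):
    """Turn a comma-separated string such as "10, 20, 30" into integers."""
    return [int(value) for value in text.split(",") if value.strip()]


grid = {
    "max depth": parse_values("10, 20, 30"),
    "max bins": parse_values("32, 40, 50"),
    "num trees": parse_values("10, 50, 100"),
}

# Every combination is one element of the Cartesian product of the value lists.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

# The number of combinations is the product of the list lengths.
n_combinations = prod(len(values) for values in grid.values())

number_of_folds = 3  # assumed value, not taken from the example
total_fits = n_combinations * number_of_folds

print(n_combinations, total_fits)
```

Because each combination is fitted once per fold, the total number of fits is the number of combinations multiplied by the number of folds.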
In the report below, every parameter combination is listed along with its grade.
A property of