In the section 'Hyperparameter tuning by randomized search', different hyperparameters are tuned for histogram gradient-boosting decision trees than in the section 'Hyperparameter tuning with ensemble models'. In the former section, `l2_regularization` and `max_bins` are tuned but not in the latter; in the latter section, `max_depth` is tuned but not in the former. My proposal would be to:
- remove tuning of `max_bins`; this argument only sets the granularity of the optimal split finding in the trees, so I don't think it affects the complexity of the model or its ability to generalize
- add a line on how `l2_regularization` works for gradient-boosted trees, as it is currently not explained, or remove it
- add tuning of `max_depth` in the former section (a rough sketch follows this list)
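A rough sketch of what the updated search space in the randomized-search notebook could look like; the parameter ranges here are placeholders, not the notebook's actual values:

```python
from scipy.stats import loguniform, randint
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

model = HistGradientBoostingRegressor()

# max_bins removed; max_depth added alongside the existing hyperparameters
param_distributions = {
    "learning_rate": loguniform(0.001, 10),
    "max_leaf_nodes": randint(2, 256),
    "max_depth": randint(2, 16),
    "min_samples_leaf": randint(1, 100),
    "l2_regularization": loguniform(1e-6, 1e3),
}

search = RandomizedSearchCV(
    model, param_distributions=param_distributions, n_iter=20, n_jobs=2
)
```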
Please let me know what you think of this. I would be happy to create a PR.
Hi @fritshermans, sorry for taking so long to answer!
I would say that I don't really see a problem with not having consistent hyperparameters between those 2 notebooks; otherwise we might end up being redundant. Instead, the Hyperparameter tuning by randomized search notebook presents some hyperparameters, but the focus is mostly on how one can pass distributions to `RandomizedSearchCV` rather than giving a detailed description of what they do; whereas in the Hyperparameter tuning with ensemble models notebook the emphasis is on the interactions between `learning_rate` and both `max_iter` and `max_leaf_nodes`.
I do agree that the very shallow explanation "l2_regularization: it corresponds to the strength of the regularization" (in the randomized search notebook) is not very informative; maybe we can compare this hyperparameter with `alpha` in `Ridge` and `1/C` in `LogisticRegression`.
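For instance, a minimal sketch of that analogy (the numbers below are purely illustrative): a larger penalty means stronger regularization in all three estimators, except that `LogisticRegression` exposes the inverse, `C`.

```python
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.linear_model import Ridge, LogisticRegression

# Stronger regularization in each case (illustrative values):
hgbdt = HistGradientBoostingRegressor(l2_regularization=10.0)  # larger -> stronger penalty on leaf values
ridge = Ridge(alpha=10.0)            # larger alpha -> stronger penalty on coefficients
logreg = LogisticRegression(C=0.1)   # smaller C (i.e. larger 1/C) -> stronger penalty
```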
We don't really tune `max_depth` in either notebook, but maybe we can make it explicit that we demo the interactions using `max_leaf_nodes` only to keep the discussion simple, and encourage students to modify the code and experiment with what would happen when using `max_depth` instead (see the sketch below).
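Something along these lines could be suggested as an exercise; the values are only an example of what students might try:

```python
from sklearn.ensemble import HistGradientBoostingRegressor

# The notebook demos interactions using max_leaf_nodes; students could
# re-run the same analysis while controlling tree complexity via depth:
model = HistGradientBoostingRegressor(
    learning_rate=0.1,
    max_leaf_nodes=None,  # disable the leaf-count limit...
    max_depth=5,          # ...and limit tree depth instead
)
```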
Then I would possibly modify the `param_distributions` in the Hyperparameter tuning with ensemble models notebook to try fixed values of `learning_rate`, e.g. `[0.01, 0.03, 0.1, 0.3]`, and then add a parallel plot after the Caution message and before the interpretation.
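A sketch of that change, assuming the notebook keeps using `RandomizedSearchCV` and plotly for the parallel coordinates plot; the variable names `data_train`/`target_train` and the `max_iter` range are guesses at what the notebook uses:

```python
import pandas as pd
import plotly.express as px
from scipy.stats import randint
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

model = HistGradientBoostingRegressor()

param_distributions = {
    "learning_rate": [0.01, 0.03, 0.1, 0.3],  # fixed values instead of a distribution
    "max_leaf_nodes": randint(2, 256),
    "max_iter": randint(10, 1000),
}
search = RandomizedSearchCV(
    model, param_distributions=param_distributions, n_iter=20, n_jobs=2
).fit(data_train, target_train)

# Parallel coordinates plot of the search results
columns = ["param_learning_rate", "param_max_leaf_nodes",
           "param_max_iter", "mean_test_score"]
cv_results = pd.DataFrame(search.cv_results_)[columns].apply(pd.to_numeric)
fig = px.parallel_coordinates(cv_results, color="mean_test_score")
fig.show()
```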