Using sample_weight in GridSearchCV
Solution 1:
Just trying to close out this long-hanging question...
You needed to upgrade to the latest version of scikit-learn (at the time of writing) and use the following:
gs.fit(Xtrain, ytrain, fit_params={'sample_weight': sw_train})
However, it is more in line with the documentation to pass fit_params
to the constructor. Note that sample_weight is a fit parameter, not a hyperparameter, so it does not belong in the parameter grid:
gs = GridSearchCV(svm.SVC(C=1), [{'kernel': ['linear'], 'C': [.1, 1, 10], 'probability': [True]}], fit_params={'sample_weight': sw_train})
gs.fit(Xtrain, ytrain)
Solution 2:
The previous answers are now obsolete. The contents of the fit_params dictionary should be passed to the fit method as keyword arguments.
From the documentation for GridSearchCV:
fit_params : dict, optional
Parameters to pass to the fit method.
Deprecated since version 0.19: fit_params as a constructor argument was deprecated in version 0.19 and will be removed in version 0.21. Pass fit parameters to the fit method instead.
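As a minimal sketch of what that looks like in practice, assuming a recent scikit-learn where GridSearchCV lives in sklearn.model_selection: the data set and the weight values below are made up purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data; the weights are arbitrary illustrative values.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
sample_weight = np.where(y == 1, 2.0, 1.0)

gs = GridSearchCV(SVC(kernel="linear"), param_grid={"C": [0.1, 1, 10]}, cv=3)

# Fit parameters are passed to fit() as keyword arguments,
# not to the GridSearchCV constructor.
gs.fit(X, y, sample_weight=sample_weight)
print(gs.best_params_)
```

Because sample_weight has one entry per row of X, GridSearchCV slices it alongside X and y for each training fold.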
Solution 3:
In version 0.16.1, if you use Pipeline, you need to pass the parameter to the GridSearchCV constructor, prefixed with the name of the pipeline step:
clf = pipeline.Pipeline([('svm', svm_model)])
model = grid_search.GridSearchCV(estimator = clf, param_grid=param_grid,
fit_params={'svm__sample_weight': sw_train})
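For newer releases, the same step-prefixed name can be passed to fit instead. The following is only a sketch, assuming a recent scikit-learn with metadata routing left at its default (disabled); the scaler step, the data and the weights are illustrative additions, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data and arbitrary illustrative weights.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
sw_train = np.where(y == 1, 2.0, 1.0)

clf = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="linear"))])
param_grid = {"svm__C": [0.1, 1, 10]}

model = GridSearchCV(estimator=clf, param_grid=param_grid, cv=3)
# "svm__sample_weight" routes the weights to the fit method of the "svm" step.
model.fit(X, y, svm__sample_weight=sw_train)
print(model.best_params_)
```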
Solution 4:
The following works in scikit-learn 0.23.1:
grid_cv = GridSearchCV(clf, param_grid=param_grid,
scoring='recall', n_jobs=-1, cv=10)
grid_cv.fit(x_train_orig, y=y_train_orig,
sample_weight=my_sample_weights)
Solution 5:
OP's edit and the other answers are not entirely correct. While fit_params={'sample_weight': weights}
works for fitting, those weights will not be used to compute the validation loss (see the GitHub issue)!
Consequently, cross-validation reports the unweighted loss, and the hyperparameter tuning may be steered in the wrong direction.
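To see how far the two numbers can drift apart, here is a small sketch on made-up labels: a classifier that always predicts the majority class of a 90/10 split. The label arrays are hypothetical; accuracy_score and compute_sample_weight are the regular scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.utils import compute_sample_weight

# Hypothetical validation fold: 90 negatives, 10 positives.
y_true = np.array([0] * 90 + [1] * 10)
# A degenerate model that always predicts the majority class.
y_pred = np.zeros(100, dtype=int)

unweighted = accuracy_score(y_true, y_pred)

# "balanced" gives each class the same total weight.
weights = compute_sample_weight(class_weight="balanced", y=y_true)
weighted = accuracy_score(y_true, y_pred, sample_weight=weights)

print(round(unweighted, 3), round(weighted, 3))
```

The unweighted accuracy is about 0.9, while the class-balanced accuracy is about 0.5, so a grid search scored on the former can happily prefer a model that ignores the minority class.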
Here is my work-around for cross-validation with class weights, using accuracy as the metric. It should also work with other metrics.
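from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.utils import compute_sample_weight

# make_scorer calls the metric as score_func(y_true, y_pred, **kwargs)
def weighted_accuracy_eval(y_true, y_pred, **kwargs):
    # Balanced class weights computed from the true validation labels
    balanced_class_weights_eval = compute_sample_weight(
        class_weight='balanced',
        y=y_true
    )
    out = accuracy_score(y_true=y_true, y_pred=y_pred, sample_weight=balanced_class_weights_eval, **kwargs)
    return out

weighted_accuracy_eval_skl = make_scorer(weighted_accuracy_eval)

gridsearch = GridSearchCV(
    estimator=model,
    scoring=weighted_accuracy_eval_skl,  # pass the scorer, not the raw metric function
    param_grid=paramGrid,
)
# fit_params is assumed to be a dict such as {'sample_weight': weights};
# it is unpacked into keyword arguments for fit
cv_result = gridsearch.fit(
    X_train,
    y_train,
    **fit_params
)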
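The scorer above recomputes class-balanced weights from the labels, so it does not cover arbitrary per-sample weights. One possible alternative, sketched here rather than taken from the original answer, is a manual cross-validation loop that slices the same weight vector for both fitting and scoring. The helper name weighted_cv_score, the data and the weights are all hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

# Synthetic data and arbitrary illustrative per-sample weights.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 2.0, size=len(y))

def weighted_cv_score(estimator, X, y, weights, n_splits=3):
    """Mean validation accuracy, applying the weights to both fitting and scoring."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = []
    for train_idx, val_idx in cv.split(X, y):
        model = clone(estimator)  # fresh, unfitted copy for every fold
        model.fit(X[train_idx], y[train_idx], sample_weight=weights[train_idx])
        pred = model.predict(X[val_idx])
        scores.append(
            accuracy_score(y[val_idx], pred, sample_weight=weights[val_idx])
        )
    return float(np.mean(scores))

# Hand-rolled grid over C, scored with the weighted validation accuracy.
results = {
    C: weighted_cv_score(SVC(kernel="linear", C=C), X, y, weights)
    for C in (0.1, 1, 10)
}
best_C = max(results, key=results.get)
print(results, best_C)
```

This gives up the conveniences of GridSearchCV (parallelism, refitting, cv_results_) in exchange for full control over how the validation score is weighted.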