Description
What happened:
I was running GridSearchCV with multiple scoring metrics. One of them ("neg_mean_poisson_deviance") failed on some folds because it is undefined when y_hat is 0. This was handled during scoring, but when create_cv_results was called it raised a TypeError: 'float' object is not subscriptable. This is because score would normally return a dictionary when multiple scorers are requested, but in this case it returned the value I had passed as error_score to GridSearchCV, which was np.nan. The issue is between L274 and L297 in methods.py.
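To make the failure mode concrete, here is a minimal pure-Python sketch of what I think is going on. It does not use the actual dask-ml code: normalize_fold_score is a hypothetical helper name, and the numeric scores are made up for illustration. It assumes per-fold scores normally arrive as a dict keyed by scorer name, and that a failed fold arrives as the bare error_score scalar.

```python
import math


def normalize_fold_score(score, scorer_names):
    """Return a {scorer_name: value} dict for one fold.

    Hypothetical helper, not part of dask-ml. If scoring fell back to a
    bare scalar (the error_score), repeat it for every requested metric.
    """
    if isinstance(score, dict):
        return score
    return {name: score for name in scorer_names}


scorer_names = ["neg_mean_squared_error", "neg_mean_poisson_deviance"]

# Illustrative per-fold results: one normal fold, one where a scorer raised
# and the bare error_score (NaN) came back instead of a dict.
fold_scores = [
    {"neg_mean_squared_error": -0.5, "neg_mean_poisson_deviance": -0.1},
    math.nan,
]

# Indexing the bare float by scorer name reproduces the reported TypeError.
try:
    fold_scores[1]["neg_mean_squared_error"]
except TypeError as exc:
    message = str(exc)

# After normalization every fold can be indexed by scorer name.
fixed = [normalize_fold_score(s, scorer_names) for s in fold_scores]
```

This sketch fills every metric of the failed fold with the error score. A finer-grained fix could fill only the metric that actually failed, but that would need the scoring step to report which scorer raised.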
What you expected to happen:
I expected the score to be np.nan for the folds in which the scorer failed, and no error to be raised.
Minimal Complete Verifiable Example:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut
from dask_ml.model_selection import GridSearchCV

X = np.array([[1, 2],
              [2, 1],
              [0, 0]])
y = 3 * X[:, 0] + 4 * X[:, 1]

cv = LeaveOneOut()
ols = LinearRegression(fit_intercept=False)
regr = GridSearchCV(
    ols,
    {"normalize": [False, True]},
    scoring=["neg_mean_squared_error", "neg_mean_poisson_deviance"],
    refit=False,
    cv=cv,
    error_score=np.nan,
    n_jobs=1,
)
regr.fit(X, y)
This raises the TypeError described above.
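For context on why a scorer fails in this example: the fold that holds out the row [0, 0] gets a prediction of exactly 0, because the model has no intercept. The sketch below is a plain-Python version of the mean Poisson deviance formula, 2 * (y * log(y / mu) - y + mu) averaged over samples. It is not scikit-learn's implementation, and mean_poisson_deviance_sketch is a name I made up. The coefficients are assumed values for illustration, since any coefficients give the same prediction at x = 0.

```python
import math


def mean_poisson_deviance_sketch(y_true, y_pred):
    """Mean of 2 * (y * log(y / mu) - y + mu) over all samples.

    Illustrative only. Rejects y_pred <= 0, where log(y / mu) is undefined.
    """
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if y < 0 or mu <= 0:
            raise ValueError("requires y_true >= 0 and y_pred > 0")
        # By convention the y * log(y / mu) term is 0 when y == 0.
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - y + mu)
    return total / len(y_true)


# Without an intercept, the prediction at x = [0, 0] is 0 for any coefficients.
coef = [3.0, 4.0]  # assumed values, for illustration
x_test = [0.0, 0.0]
y_hat = sum(c * x for c, x in zip(coef, x_test))

try:
    mean_poisson_deviance_sketch([0.0], [y_hat])
    scorer_failed = False
except ValueError:
    scorer_failed = True
```

So at least one of the three LeaveOneOut folds hits the undefined case, which is what sends error_score down the path that breaks.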
Anything else we need to know?:
I think this should be a fairly quick fix, so I'm going to give it a try.