Hi all,
I do not know why I am getting the following error: ValueError: feature_names mismatch. This is what I am running.
To get my data:
target=df['status']
train = df.drop(columns=['status'])
x_train, x_valid, y_train, y_valid = train_test_split(train, target, stratify=target, random_state=42, test_size=0.2)
x_train, x_test, y_train, y_test = train_test_split(x_train, y_train, stratify=y_train, random_state=42, test_size=0.2)
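For reference, a rough sketch of what the two chained splits leave in each set, assuming both calls use test_size=0.2 as above. The helper name `two_stage_split_fractions` is made up for illustration, and the real row counts will differ slightly because train_test_split rounds to whole rows.

```python
def two_stage_split_fractions(first_test_size, second_test_size):
    """Approximate share of the original rows in each set after two chained splits.

    The first split carves off the validation set; the second split carves
    the test set out of what remains.
    """
    valid = first_test_size
    remaining = 1.0 - first_test_size
    test = remaining * second_test_size
    train = remaining - test
    return {"train": train, "valid": valid, "test": test}


fractions = two_stage_split_fractions(0.2, 0.2)
for name, share in fractions.items():
    print(f"{name}: {share:.2f}")
```

So the grid search sees roughly 64% of the rows, and the calibration step is fit on the 20% held out by the first split.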
Then I run a grid search and calibrate the best estimator:
kfolds = StratifiedKFold(3)
clf = GridSearchCV(models['XGBOOST'], params['XGBOOST'], cv=kfolds.split(x_train, y_train),
scoring='roc_auc', return_train_score=True)
clf.fit(x_train, y_train)
model = clf.best_estimator_
clf_isotonic = CalibratedClassifierCV(model, cv='prefit', method='isotonic')
clf_isotonic.fit(x_valid, y_valid)
But I get the error above. x_valid and x_train have the same columns, and the error persists even when I align the columns explicitly:
#f_names = model.get_booster().feature_names
f_names = x_train.columns.tolist()
x_valid[f_names]
I do not understand why I still get the error after aligning the columns as above. I have also tried passing x_valid.values, with no luck. Both frames have the same features, so I really do not know what is happening.
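A minimal sketch for narrowing this down, not a confirmed diagnosis. It compares two ordered lists of feature names, for example `model.get_booster().feature_names` against `x_valid.columns.tolist()`. The helper `diff_feature_names` and the column names in the example are hypothetical. Two things are worth keeping in mind while checking: the bare expression `x_valid[f_names]` returns a new DataFrame and leaves `x_valid` unchanged unless the result is assigned back, and `x_valid.values` is a plain NumPy array that carries no column names at all.

```python
def diff_feature_names(expected, actual):
    """Report how two ordered lists of feature names differ.

    expected: names the fitted model was trained with
    actual:   names of the data being passed in afterwards
    """
    expected = list(expected)
    actual = list(actual)
    expected_set = set(expected)
    actual_set = set(actual)
    return {
        # Names the model expects that the new data lacks
        "missing": [name for name in expected if name not in actual_set],
        # Names in the new data that the model has never seen
        "unexpected": [name for name in actual if name not in expected_set],
        # True only when both lists have the same names in the same order
        "same_order": expected == actual,
    }


# Hypothetical column names, for illustration only
trained_on = ["age", "income", "score"]
passed_in = ["income", "age", "score"]

report = diff_feature_names(trained_on, passed_in)
print(report)
```

If `missing` and `unexpected` are both empty but `same_order` is false, the columns only need reordering, for instance with `x_valid = x_valid[f_names]` (note the assignment). If every name shows up as missing, the data has probably lost its column names somewhere along the way.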