Is XGBClassifier incompatible with CalibratedClassifierCV? #5887

@zahs123

All,

I do not know why I am getting the following error: ValueError: feature_names mismatch. This is what I am running.
To get my data:

target=df['status']
train = df.drop(columns=['status'])
x_train, x_valid, y_train, y_valid = train_test_split(train, target, stratify=target, random_state=42, test_size=0.2)
x_train, x_test, y_train, y_test = train_test_split(x_train, y_train, stratify=y_train, random_state=42, test_size=0.2)

Then I run a grid search and calibrate the best estimator on the validation set:

kfolds = StratifiedKFold(3)
clf = GridSearchCV(models['XGBOOST'], params['XGBOOST'], cv=kfolds.split(x_train, y_train),
                       scoring='roc_auc', return_train_score=True)

clf.fit(x_train, y_train)

model = clf.best_estimator_

clf_isotonic = CalibratedClassifierCV(model, cv='prefit', method='isotonic')
clf_isotonic.fit(x_valid, y_valid)

But I get the above error. My x_valid and x_train have the same columns, and the error persists even when I fix the column order using:

#f_names = model.get_booster().feature_names
f_names = x_train.columns.tolist()
x_valid[f_names]

I do not understand why I am getting that error even when I fix the columns as above. I have also tried passing x_valid.values, but that does not help either. Both sets have the same features, so I really do not know what is happening.
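
To narrow this down, the feature names the fitted model expects can be compared with the names actually passed at calibration time. The following is only a minimal sketch: diff_feature_names is a hypothetical helper (not part of xgboost or scikit-learn), and the toy column names are made up for illustration. Note also that in pandas an expression such as x_valid[f_names] on its own returns a reordered copy and leaves x_valid unchanged unless the result is assigned back.

```python
def diff_feature_names(expected, actual):
    """Hypothetical helper: compare two ordered lists of feature names."""
    expected, actual = list(expected), list(actual)
    # Names the model was trained on that are absent from the new data.
    missing = [name for name in expected if name not in actual]
    # Names in the new data that the model has never seen.
    unexpected = [name for name in actual if name not in expected]
    # Identical names in a different order still differ as ordered lists.
    same_order = expected == actual
    return {"missing": missing, "unexpected": unexpected, "same_order": same_order}


# Toy example with made-up column names: same set, different order.
report = diff_feature_names(["age", "income", "score"], ["income", "age", "score"])
print(report)

# With the objects from this issue it might be called like this
# (assumption, not run here):
# diff_feature_names(model.get_booster().feature_names, x_valid.columns.tolist())
```

If the report shows only an ordering difference, assigning the reordered frame back (x_valid = x_valid[f_names]) before calling fit would be the first thing to try. If names are missing or unexpected, the two frames do not actually share the same columns.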
