[BUG]: Transformed test set shape (samples) is not the same input test_data shape (samples) #3428

@aymaneco

pycaret version checks

Issue Description

My X_train has 2552 samples and my X_HO has 851 samples.

  • In the latest version of pycaret, when I pass only test_data=X_HO to the setup() method, the transformed test set has 1702 samples (exactly double the original count), even though X_HO has only 851 samples.
    image
    This issue did not occur in the previous version of pycaret; I tested with version 2.3.10.
    image

Reproducible Example

exp_name = setup(
    session_id=1234,
    data=X_train,
    target='target',
    preprocess=False,
    remove_outliers=False,
    test_data=X_HO,
    data_split_shuffle=True,
    ordinal_features={'column': list(X_train.column.unique())})

Expected Behavior

I expect the results to be the same as in the previous version of pycaret, as shown in this screenshot.
image

Actual Results

The problem is already detailed in the description.
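
As a quick sanity check on the numbers reported above, here is a minimal sketch that classifies how a transformed row count relates to the input row count. The helper name `describe_row_mismatch` is hypothetical and not part of pycaret; it only does arithmetic on two integers. In practice the two counts would be `len(X_HO)` and the row count of the transformed test set shown in the setup() summary.

```python
def describe_row_mismatch(n_input: int, n_transformed: int) -> str:
    """Summarise how a transformed row count relates to the input row count.

    Hypothetical helper for illustration; it is not a pycaret function.
    """
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    if n_transformed == n_input:
        return "match"
    # A larger count that is an exact multiple suggests whole-set duplication.
    if n_transformed > n_input and n_transformed % n_input == 0:
        return f"exact multiple (x{n_transformed // n_input})"
    return f"mismatch ({n_transformed - n_input:+d} rows)"


# Counts reported in this issue: 851 hold-out rows in, 1702 rows out.
print(describe_row_mismatch(851, 1702))  # exact multiple (x2)
```

An exact multiple of 2 is consistent with every hold-out row appearing twice, rather than with extra rows being added from some other source.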

Installed Versions

Pycaret 3.0.0

Labels: bug