Broken datasets due to pandas API changes  #13

@francois-rozet

Hello @gpapamak,

Due to API changes in pandas, the GAS and HEPMASS datasets are no longer usable. Notably, the DataFrame.as_matrix method has been deprecated since pandas 0.23.0 (and was removed in pandas 1.0), and the DataFrame pickling format of pandas<2.0 is not compatible with pandas>=2.0. There is also an issue with Counter.iteritems, which does not exist in Python 3: dict.iteritems was removed in Python 3.0 in favor of items.
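
For reference, here is a minimal sketch of the modern equivalents of the two removed calls. The toy DataFrame and Counter are hypothetical stand-ins, not the actual GAS or HEPMASS data, and this is not a patch for the repository's code.

```python
from collections import Counter

import pandas as pd

# Hypothetical toy frame standing in for a loaded dataset.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Old call: df.as_matrix() (removed in pandas 1.0).
# DataFrame.to_numpy() is available from pandas 0.24 onwards.
data = df.to_numpy()

# Old call: counter.iteritems() (Python 2 only).
# In Python 3, items() returns a view over (key, count) pairs.
counts = Counter(["x", "y", "x"])
pairs = sorted(counts.items())
```

This does not address the pickling incompatibility, which concerns the stored files rather than the calling code.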

I don't think modifying this repository to fix these issues is a good idea, as it could break the existing code. Instead, I made a lightweight fork (francois-rozet/uci-datasets) of the repo's UCI datasets and wrote instructions to generate environment-agnostic .npy files containing the processed data. These .npy files can then be used without relying on the original code and its dependencies. I hope that's OK with you.
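
To illustrate why .npy files avoid the pickling problem, here is a rough sketch of a save/load round trip that needs only NumPy. The array contents and the file name `train.npy` are hypothetical; the actual instructions and file layout are in the fork.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the processed data; in practice this would
# come from the dataset's preprocessing code.
processed = np.arange(12, dtype=np.float32).reshape(4, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.npy")  # hypothetical file name

    # A plain numeric array is written in the .npy binary format,
    # with no pandas objects and no pickling involved.
    np.save(path, processed)

    # Loading needs only NumPy, whatever pandas version is installed.
    restored = np.load(path)
```

Since the file holds only a raw array plus its shape and dtype, reading it back does not depend on pandas at all.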
