@Innixma Innixma commented Jun 5, 2024

Issue #, if available:

Description of changes:

  • Add detailed exception logging to AsTypeFeatureGenerator if the data at test time cannot be successfully converted to the expected dtype.

Example output when the 'age' column is switched from int to string at test time in AdultIncome:

This PR

Exception encountered in AsTypeFeatureGenerator ... Please check if feature data types differ between train and test (via df.dtypes).
Exception breakdown by feature:
           feature      status dtype_input   dtype_to_convert_to                                          exception
0              age  ValueError      object                 int64  invalid literal for int() with base 10: ' Priv...
1        workclass     Success      object                object                                               None
2           fnlwgt     Success       int64                 int64                                               None
3        education     Success      object                object                                               None
4    education-num     Success       int64                 int64                                               None
5   marital-status     Success      object                object                                               None
6       occupation     Success      object                object                                               None
7     relationship     Success      object                object                                               None
8             race     Success      object                object                                               None
9              sex     Success        int8  <class 'numpy.int8'>                                               None
10    capital-gain     Success       int64                 int64                                               None
11    capital-loss     Success       int64                 int64                                               None
12  hours-per-week     Success       int64                 int64                                               None
13  native-country     Success      object                object                                               None

ValueError: invalid literal for int() with base 10: ' Private': Error while type casting for column 'age'

Mainline

ValueError: invalid literal for int() with base 10: ' Private': Error while type casting for column 'age'
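
The per-feature breakdown shown under "This PR" can be approximated outside AutoGluon. The following is a minimal sketch, assuming plain pandas: the helper name `astype_breakdown`, the `type_map` argument, and the toy data are hypothetical and are not the code added to AsTypeFeatureGenerator. It attempts each column's conversion separately and records the outcome, so one failing column does not hide the status of the others.

```python
import pandas as pd


def astype_breakdown(X: pd.DataFrame, type_map: dict) -> pd.DataFrame:
    """Try converting each column on its own and record the outcome.

    Hypothetical helper for illustration; not AutoGluon's implementation.
    """
    rows = []
    for feature, dtype in type_map.items():
        status, exception = "Success", None
        try:
            # Result is discarded; only whether the cast succeeds matters here.
            X[feature].astype(dtype)
        except Exception as e:
            status, exception = type(e).__name__, str(e)
        rows.append(
            {
                "feature": feature,
                "status": status,
                "dtype_input": str(X[feature].dtype),
                "dtype_to_convert_to": str(dtype),
                "exception": exception,
            }
        )
    return pd.DataFrame(rows)


# Toy data (assumed): 'age' was int at train time but arrives as strings at test time.
X_test = pd.DataFrame(
    {
        "age": pd.Series([" Private", " State-gov"], dtype=object),
        "fnlwgt": [77516, 83311],
    }
)
breakdown = astype_breakdown(X_test, {"age": "int64", "fnlwgt": "int64"})
print(breakdown)
```

Running this on a real train/test pair, with `type_map` built from the training frame's `df.dtypes`, should point at the same mismatched column that the new log message reports.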

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Innixma Innixma added the "API & Doc" and "module: tabular" labels Jun 5, 2024
@Innixma Innixma added this to the 1.1.1 Release milestone Jun 5, 2024
@Innixma Innixma requested review from rey-allan and suzhoum June 5, 2024 23:49

yinweisu commented Jun 5, 2024

Previous CI Run Current CI Run


github-actions bot commented Jun 6, 2024

Job PR-4251-7df9c95 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4251/7df9c95/index.html


@rey-allan rey-allan left a comment


LGTM!


@suzhoum suzhoum left a comment


LGTM! Very helpful feature.

@Innixma Innixma merged commit dae0e5c into autogluon:master Jun 6, 2024
@Innixma Innixma deleted the improve_astype_exception branch April 16, 2025 21:20