shchur
Collaborator

@shchur shchur commented May 7, 2024

Tabular forecasting models would occasionally fail when a non-boolean known real covariate was mistakenly interpreted as a boolean covariate.

MWE:

import numpy as np
import pandas as pd
from autogluon.timeseries import TimeSeriesPredictor

N = 30
df = pd.DataFrame(
    {
        "item_id": ["A"] * N,
        "timestamp": pd.date_range("2020-01-01", freq="D", periods=N),
        "target": np.random.normal(size=N),
        # Non-boolean covariate repeating the pattern 5, 0, 0
        "feat": np.tile([5, 0, 0], N // 3),
    }
)
predictor = TimeSeriesPredictor(known_covariates_names=["feat"]).fit(
    df, hyperparameters={"RecursiveTabular": {}}
)

This code fails with the following exception:

	Warning: Exception caused RecursiveTabular to fail during training... Skipping this model.
	"['__scaled_feat'] not in index"

This happens because, during fit(), the feature feat is interpreted as non-boolean, so a scaled copy of the feature (__scaled_feat) is added. At predict time, when known_covariates is transformed (here it contains a single row with only the value 0), the same feature is interpreted as boolean, so no scaled copy is added. The model then fails at prediction time because the column it expects is missing.
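The mismatch can be reproduced in isolation. The sketch below is not AutoGluon's code: `looks_boolean`, `add_scaled_stateless` and `ScaledCovariateAdder` are hypothetical names, and the 0/1 heuristic and mean-absolute-value scaling are assumptions made for illustration. It shows how re-detecting the column type on every call changes the output columns, and how deciding once at fit time would keep them stable.

```python
import numpy as np
import pandas as pd


def looks_boolean(series: pd.Series) -> bool:
    """Hypothetical heuristic: treat a column as boolean if all values are 0 or 1."""
    return bool(series.isin([0, 1]).all())


def add_scaled_stateless(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Re-detects boolean columns on every call, so the output schema depends on the data."""
    out = df.copy()
    for col in columns:
        if not looks_boolean(out[col]):
            out[f"__scaled_{col}"] = out[col] / out[col].abs().mean()
    return out


class ScaledCovariateAdder:
    """Decides at fit time which columns get a scaled copy and reuses that decision later."""

    def fit(self, df: pd.DataFrame, columns: list) -> "ScaledCovariateAdder":
        # Store a scale only for columns that were non-boolean in the training data.
        self.scales_ = {
            col: df[col].abs().mean() for col in columns if not looks_boolean(df[col])
        }
        return self

    def transform(self, df: pd.DataFrame) -> pd.DataFrame:
        out = df.copy()
        for col, scale in self.scales_.items():
            out[f"__scaled_{col}"] = out[col] / scale
        return out


train = pd.DataFrame({"feat": np.tile([5, 0, 0], 10)})  # non-boolean during fit
future = pd.DataFrame({"feat": [0]})  # a single 0 looks boolean on its own

# Stateless version: the scaled column exists for training data but not for future data.
train_cols = set(add_scaled_stateless(train, ["feat"]).columns)
future_cols = set(add_scaled_stateless(future, ["feat"]).columns)
print(train_cols - future_cols)  # {'__scaled_feat'}

# Stateful version: the decision made at fit time is applied unchanged at predict time.
adder = ScaledCovariateAdder().fit(train, ["feat"])
print(list(adder.transform(future).columns))  # ['feat', '__scaled_feat']
```

In the stateless version the set of output columns depends on whichever rows happen to be passed in, which is the kind of inconsistency that produces a "not in index" error downstream.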

Description of changes:

  • This PR fixes the above bug.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur shchur added the bug (Something isn't working) and module: timeseries (related to the timeseries module) labels May 7, 2024
@shchur shchur added this to the 1.1.1 Release milestone May 7, 2024
@shchur shchur requested a review from canerturkmen May 7, 2024 09:08
Contributor

yinweisu commented May 7, 2024

Previous CI Run    Current CI Run
tenacity==8.2.3    tenacity==8.3.0
tenacity==8.2.3    tenacity==8.3.0

Contributor

@canerturkmen canerturkmen left a comment

Thanks! LGTM!

github-actions bot commented May 7, 2024

Job PR-4175-3958cdb is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4175/3958cdb/index.html

@shchur shchur merged commit 84999da into autogluon:master May 7, 2024
@shchur shchur deleted the fix-nonbool-tab-covariates branch May 7, 2024 11:43
LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024