[timeseries] Fix tabular models sometimes failing because of a bug in preprocessing logic #4175

shchur · 2024-05-07T09:07:45Z

Tabular forecasting models would occasionally fail when a non-boolean known real covariate was mistakenly interpreted as a boolean covariate.

MWE:

import numpy as np
import pandas as pd
from autogluon.timeseries import TimeSeriesPredictor

N = 30
df = pd.DataFrame(
    {
        "item_id": ["A"] * N,
        "timestamp": pd.date_range("2020-01-01", freq="D", periods=N),
        "target": np.random.normal(size=N),
        # Non-boolean covariate that takes the values 5 and 0
        "feat": np.tile([5, 0, 0], N // 3),
    }
)
predictor = TimeSeriesPredictor(known_covariates_names=["feat"]).fit(df, hyperparameters={"RecursiveTabular": {}})

This code fails with the following exception:

	Warning: Exception caused RecursiveTabular to fail during training... Skipping this model.
	"['__scaled_feat'] not in index"

This happens because, during fit(), the feature feat is interpreted as non-boolean, so a scaled copy of the feature is added. At prediction time, known_covariates contains a single row in which feat is 0, so the feature is interpreted as boolean and no scaled copy is added. The model then fails because the column __scaled_feat is missing.
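The mismatch can be reproduced in isolation with a small sketch. The helper names looks_boolean and naive_transform and the "all values are 0 or 1" heuristic are assumptions made for illustration, not the actual AutoGluon preprocessing code; only the column name __scaled_feat comes from the error message above.

```python
import numpy as np
import pandas as pd


def looks_boolean(series: pd.Series) -> bool:
    # Hypothetical heuristic: treat a column as boolean if every value is 0 or 1.
    return bool(series.isin([0, 1]).all())


def naive_transform(df: pd.DataFrame, column: str) -> pd.DataFrame:
    # Re-runs the heuristic on whatever data it receives, so the decision
    # can differ between training data and future covariates.
    out = df.copy()
    if not looks_boolean(out[column]):
        out[f"__scaled_{column}"] = out[column] / out[column].abs().max()
    return out


train = pd.DataFrame({"feat": np.tile([5, 0, 0], 10)})  # contains 5, so non-boolean
future = pd.DataFrame({"feat": [0]})  # a single row with value 0 looks boolean

train_cols = list(naive_transform(train, "feat").columns)
future_cols = list(naive_transform(future, "feat").columns)
missing = [c for c in train_cols if c not in future_cols]
print(missing)
```

A model trained on train_cols would look up a column that the transformed future frame does not contain, which matches the "not in index" error reported above.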

Description of changes:

This PR fixes the above bug.
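One way to avoid this class of bug, shown here only as a sketch and not necessarily how this PR implements the fix, is to decide which columns are non-boolean once at fit time and reuse that decision for every later transform. The class CovariateScalerSketch and its attributes are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd


class CovariateScalerSketch:
    """Hypothetical helper, not AutoGluon code."""

    def fit(self, df: pd.DataFrame, columns: list) -> "CovariateScalerSketch":
        # Decide once, using the training data, which columns need a scaled copy.
        self.scaled_columns_ = [c for c in columns if not df[c].isin([0, 1]).all()]
        # Remember the scale computed on the training data.
        self.scale_ = {c: float(df[c].abs().max()) for c in self.scaled_columns_}
        return self

    def transform(self, df: pd.DataFrame) -> pd.DataFrame:
        # Apply the stored decision; never re-inspect the values of df.
        out = df.copy()
        for c in self.scaled_columns_:
            out[f"__scaled_{c}"] = out[c] / self.scale_[c]
        return out


train = pd.DataFrame({"feat": np.tile([5, 0, 0], 10)})
future = pd.DataFrame({"feat": [0]})

scaler = CovariateScalerSketch().fit(train, ["feat"])
train_out = scaler.transform(train)
future_out = scaler.transform(future)
print(list(future_out.columns))
```

Because the set of scaled columns is stored during fit(), the training and prediction frames always end up with the same columns, regardless of which values happen to appear in the future covariates.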

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

yinweisu · 2024-05-07T09:16:06Z

Previous CI Run	Current CI Run
tenacity==8.2.3	tenacity==8.3.0
tenacity==8.2.3	tenacity==8.3.0

canerturkmen

Thanks! LGTM!

github-actions · 2024-05-07T11:32:45Z

Job PR-4175-3958cdb is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4175/3958cdb/index.html

Fix preprocessing for non-boolean real covariates

3958cdb

shchur added the bug (Something isn't working) and module: timeseries (related to the timeseries module) labels May 7, 2024

shchur added this to the 1.1.1 Release milestone May 7, 2024

shchur requested a review from canerturkmen May 7, 2024 09:08

canerturkmen approved these changes May 7, 2024

shchur merged commit 84999da into autogluon:master May 7, 2024

shchur deleted the fix-nonbool-tab-covariates branch May 7, 2024 11:43

LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024

[timeseries] Fix tabular models sometimes failing because of a bug in preprocessing logic (autogluon#4175)

39f157a
