-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[timeseries] Fix bugs in DirectTabularModel #3350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Job PR-3350-64b1eda is done. |
@shchur Can we create a separate issue to track usage of ruff? Sounds useful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
predictions.set_index(data_future.index, inplace=True) | ||
predictions.index = data_future.index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that is very subtle. I'm surprised Series doesn't have .set_index
, seems like a strange difference in API. Good catch!
64b1eda
to
d16a57a
Compare
Job PR-3350-d16a57a is done. |
* 'master' of https://github.com/awslabs/autogluon: 0.8.2 release notes (autogluon#3371) Pin pydantic (autogluon#3370) 0.8.1 Post Release (autogluon#3367) [Merge after v0.8.1][AutoMM] Unpin sentencepiece (autogluon#3368) codespell: action, config + some typos fixed (autogluon#3323) 0.8.1 release notes (autogluon#3363) [AutoMM] Update per gpu batch size for HPO (autogluon#3360) [Multimodal] Fix hpo crashing on reuse_actor (autogluon#3361) Add missing colab buttons to advanced AutoMM tutorials (autogluon#3359) Fix log suppression workaround for print statements in suod (autogluon#3364) Pin dependencies (autogluon#3358) [timeseries] Fix bugs in DirectTabularModel (autogluon#3350) Fix refit crash (autogluon#3348)
Issue #, if available: #3341
Description of changes:
This PR addresses two bugs related to the
DirectTabularModel
introduced in v0.8.DirectTabular
model fails for non-default values ofeval_metric
.Under the hood,
TabularPredictor
is used differently depending on theeval_metric
.eval_metric="mean_wQuantileLoss"
: we useproblem_type="quantile"
, predictor returns apd.DataFrame
.eval_metric!="mean_wQuantileLoss"
: we useproblem_type="regression"
, predictor returns apd.Series
. Unlike DataFrame, Series has no methodset_index
, which leads to an exception.Our smoke tests did not consider the non-default values for
eval_metric
. When doing a sanity check of the leaderboards produced during benchmark runs (to make sure that no model fails on any dataset), I picked the first folder for each dataset, which happened to correspond to the quantile loss eval metric, where the model didn't fail.I modified the smoke test in
tests/smoketests/test_features_and_covariates.py
to run with both quantile & point forecast metrics. Another good idea is to use a static analysis tool like https://github.com/astral-sh/ruff to catch such bugs in the future.AutoARIMA
were not silenced in some cases and flooded the logs (see example below).An interaction between Joblib, Python warning management, and the changed model order in the presets. The timeline looks as follows:
Naive
,SeasonalNaive
,AutoETS
,Theta
). Joblib creates a process pool, where the warnings are silenced usingPYTHONWARNINGS=ignore
environmental variable. This pool is reused by all models in this group, warnings are silenced.DeepAR
,RecursiveTabular
) that don't use Joblib.DirectTabular
. This model uses Joblib but never produces any warnings itself.PYTHONWARNINGS=ignore
flag.AutoARIMA
-> we re-use the process pool used byDirectTabular
. We always set the flagPYTHONWARNINGS=ignore
here, but this may have no effect if the pool was created byDirectTabular
without the flag. In this case, warnings leak into the log.I ran the sanity checks on small datasets, where step (b) took < 300 seconds, so the bug did not occur. Out CI does not check if warnings are raised.
I made sure that the flag
PYTHONWARNINGS=ignore
is set in theDirectTabular
model. Ideally, a larger refactor to standardize the usage of warning filters in the code would be helpful, but I would postpone these changes to v0.9. I don't think that there is a waterproof way to check this automatically in the CI, but we should investigate the logs for large benchmark runs more thoroughly.Example warnings produced by AutoARIMA:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.