Skip to content

Conversation

shchur
Copy link
Collaborator

@shchur shchur commented Jun 26, 2023

Issue #, if available: #3341

Description of changes:
This PR addresses two bugs related to the DirectTabularModel introduced in v0.8.

  1. DirectTabular model fails for non-default values of eval_metric.
    • Reason
      Under the hood, TabularPredictor is used differently depending on the eval_metric.
      • If eval_metric="mean_wQuantileLoss": we use problem_type="quantile", predictor returns a pd.DataFrame.
      • If eval_metric!="mean_wQuantileLoss": we use problem_type="regression", predictor returns a pd.Series. Unlike DataFrame, Series has no method set_index, which leads to an exception.
    • Why didn't we catch this bug before?
      Our smoke tests did not consider the non-default values for eval_metric. When doing a sanity check of the leaderboards produced during benchmark runs (to make sure that no model fails on any dataset), I picked the first folder for each dataset, which happened to correspond to the quantile loss eval metric, where the model didn't fail.
    • How do we avoid such bugs in the future?
      I modified the smoke test in tests/smoketests/test_features_and_covariates.py to run with both quantile & point forecast metrics. Another good idea is to use a static analysis tool like https://github.com/astral-sh/ruff to catch such bugs in the future.
  2. Warnings produced by AutoARIMA were not silenced in some cases and flooded the logs (see example below).
    • Reason
      An interaction between Joblib, Python warning management, and the changed model order in the presets. The timeline looks as follows:
      1. We train most local models that all rely on Joblib (Naive, SeasonalNaive, AutoETS, Theta). Joblib creates a process pool, where the warnings are silenced using PYTHONWARNINGS=ignore environmental variable. This pool is reused by all models in this group, warnings are silenced.
      2. Train other models (e.g., DeepAR, RecursiveTabular) that don't use Joblib.
      3. Train DirectTabular. This model uses Joblib but never produces any warnings itself.
        • If step (b) took < 300 seconds, this model re-uses the existing process pool.
        • If step (b) took > 300 seconds, a new process pool is created WITHOUT the PYTHONWARNINGS=ignore flag.
      4. Train AutoARIMA -> we re-use the process pool used by DirectTabular. We always set the flag PYTHONWARNINGS=ignore here, but this may have no effect if the pool was created by DirectTabular without the flag. In this case, warnings leak into the log.
    • Why didn't we catch this bug before?
      I ran the sanity checks on small datasets, where step (b) took < 300 seconds, so the bug did not occur. Out CI does not check if warnings are raised.
    • How do we avoid such bugs in the future?
      I made sure that the flag PYTHONWARNINGS=ignore is set in the DirectTabular model. Ideally, a larger refactor to standardize the usage of warning filters in the code would be helpful, but I would postpone these changes to v0.9. I don't think that there is a waterproof way to check this automatically in the CI, but we should investigate the logs for large benchmark runs more thoroughly.

Example warnings produced by AutoARIMA:

/home/shchuro/miniconda3/envs/ag/lib/python3.9/site-packages/statsforecast/arima.py:884: UserWarning: possible convergence problem: minimize gave code 2]
  warnings.warn(
/home/shchuro/miniconda3/envs/ag/lib/python3.9/site-packages/statsforecast/arima.py:884: UserWarning: possible convergence problem: minimize gave code 2]
  warnings.warn(
/home/shchuro/miniconda3/envs/ag/lib/python3.9/site-packages/statsforecast/arima.py:884: UserWarning: possible convergence problem: minimize gave code 2]
  warnings.warn(

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Copy link

Job PR-3350-64b1eda is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3350/64b1eda/index.html

@Innixma
Copy link
Contributor

Innixma commented Jun 26, 2023

@shchur Can we create a separate issue to track usage of ruff? Sounds useful

@Innixma Innixma added this to the 0.9 Release milestone Jun 26, 2023
@Innixma Innixma added bug Something isn't working module: timeseries related to the timeseries module labels Jun 26, 2023
Copy link
Contributor

@Innixma Innixma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines -389 to +391
predictions.set_index(data_future.index, inplace=True)
predictions.index = data_future.index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that is very subtle. I'm surprised Series doesn't have .set_index, seems like a strange difference in API. Good catch!

@shchur shchur force-pushed the fix-direct-tabular-bugs branch from 64b1eda to d16a57a Compare June 27, 2023 10:06
@shchur shchur merged commit 174d67c into autogluon:master Jun 27, 2023
@shchur shchur deleted the fix-direct-tabular-bugs branch June 27, 2023 12:21
@github-actions
Copy link

Job PR-3350-d16a57a is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3350/d16a57a/index.html

ddelange added a commit to ddelange/autogluon that referenced this pull request Jul 3, 2023
* 'master' of https://github.com/awslabs/autogluon:
  0.8.2 release notes (autogluon#3371)
  Pin pydantic (autogluon#3370)
  0.8.1 Post Release (autogluon#3367)
  [Merge after v0.8.1][AutoMM]  Unpin sentencepiece (autogluon#3368)
  codespell: action, config + some typos fixed (autogluon#3323)
  0.8.1 release notes (autogluon#3363)
  [AutoMM] Update per gpu batch size for HPO (autogluon#3360)
  [Multimodal] Fix hpo crashing on reuse_actor (autogluon#3361)
  Add missing colab buttons to advanced AutoMM tutorials (autogluon#3359)
  Fix log suppression workaround for print statements in suod (autogluon#3364)
  Pin dependencies (autogluon#3358)
  [timeseries] Fix bugs in DirectTabularModel (autogluon#3350)
  Fix refit crash (autogluon#3348)
@Innixma Innixma modified the milestones: 0.9 Release, 1.0 Release Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working module: timeseries related to the timeseries module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants