[timeseries] Clarify documentation related to `test_data` #4054

shchur · 2024-04-06T09:01:27Z

Issue #, if available: fixes #4050

Description of changes:

Clearly describe that test data must include both historic & future time series values

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

review-notebook-app · 2024-04-06T09:01:32Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

yinweisu · 2024-04-06T09:12:43Z

Previous CI Run	Current CI Run

canerturkmen

LGTM! Thanks!

github-actions · 2024-04-06T11:41:13Z

Job PR-4054-3c7fe15 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4054/3c7fe15/index.html

lleiou · 2024-04-06T12:20:43Z

Thank you so much for moving so quickly to adopt my suggested changes!

Just want to follow up on the reason for requiring historical period in test_data: let's say the prediction_length of my predictor is 30 days, why can't I pass a test_data which also has 30 days of data to .leaderboard() since that is the only data I want to calculate the evaluation score on? Thanks!

shashankumar2812 · 2024-04-10T03:32:24Z

I have the same question: why is it required that data must be at least prediction_length + 1? Why not prediction_length instead of prediction_length+1?

shchur · 2024-04-10T07:42:41Z

@lleiou @shashankumar2812 When we call predictor.evaluate(data), essentially the following happens:

1. We split data into two disjoint parts:
   - future_data (the last prediction_length time steps of each time series)
   - past_data (everything before the last prediction_length time steps of each time series)
2. We call forecast = predictor.predict(past_data).
3. We compare the forecast with future_data: score = eval_metric(future_data, forecast).

If we only provide the last prediction_length observations of each time series to predictor.evaluate, then after the split our past_data will be empty, so the predictor won't be able to generate a forecast for the forecast horizon.
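To make the split concrete, here is a minimal sketch in plain Python. It is not the AutoGluon implementation: the helper name split_past_future and the dict-of-lists data layout are assumptions chosen only for illustration. It shows that a series with exactly prediction_length observations ends up with an empty past.

```python
from typing import Dict, List, Tuple

# Hypothetical layout for illustration: item_id -> observed values, oldest first.
Series = Dict[str, List[float]]


def split_past_future(data: Series, prediction_length: int) -> Tuple[Series, Series]:
    """Split each series into (past, future).

    future holds the last prediction_length steps, past holds everything before them.
    """
    past: Series = {}
    future: Series = {}
    for item_id, values in data.items():
        cut = max(len(values) - prediction_length, 0)
        past[item_id] = values[:cut]
        future[item_id] = values[cut:]
    return past, future


prediction_length = 3
data = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],  # prediction_length + 2 observations
    "B": [10.0, 20.0, 30.0],         # exactly prediction_length observations
}
past_data, future_data = split_past_future(data, prediction_length)

print(past_data)    # {'A': [1.0, 2.0], 'B': []}
print(future_data)  # {'A': [3.0, 4.0, 5.0], 'B': [10.0, 20.0, 30.0]}
```

Series "B" has nothing left in past_data, so there is no history to forecast from. Each series therefore needs at least prediction_length + 1 observations.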

We opted for this more general design (where both past and future data are required when calling evaluate) since it allows us to evaluate the predictor on different time series / different train-test splits. If we went for the alternative approach, where predictor.evaluate accepts just the future data, this would mean that we can only evaluate the predictor on the exact same time series used for training, using only the future values starting from the end of the training data.
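As a rough illustration of the "different train-test splits" point, the sketch below builds several truncated copies of one series, each ending prediction_length steps earlier than the previous one. The helper evaluation_windows is hypothetical and not part of AutoGluon. Under the design described above, each copy could be passed as data to evaluate, because every copy still contains both history and the future values to score against.

```python
from typing import List


def evaluation_windows(
    values: List[float], prediction_length: int, num_windows: int
) -> List[List[float]]:
    """Return truncated copies of a series, newest cutoff first.

    A copy is kept only if it has more than prediction_length observations,
    so that some history remains after the last prediction_length steps are held out.
    """
    windows: List[List[float]] = []
    for k in range(num_windows):
        end = len(values) - k * prediction_length
        if end <= prediction_length:  # no history would remain after the split
            break
        windows.append(values[:end])
    return windows


series = [float(t) for t in range(1, 11)]  # 10 observations
windows = evaluation_windows(series, prediction_length=3, num_windows=4)

for window in windows:
    history, target = window[:-3], window[-3:]
    print(len(history), target)
```

With 10 observations and prediction_length=3, only three of the four requested windows are produced (lengths 10, 7 and 4). The fourth would leave a single observation, which is too short to split.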

shashankumar2812 · 2024-04-16T15:09:37Z

Thanks @shchur for your answer

Clarify docs for test data

3c7fe15

shchur added the API & Doc Improvements or additions to documentation label Apr 6, 2024

shchur mentioned this pull request Apr 6, 2024

Add an explanation of requirements for data input in TimeSeriesPredictor.leaderboard() #4050

Closed

shchur changed the title ~~Clarify docs for test data~~ [timeseries] Clarify documentation related to test_data Apr 6, 2024

shchur requested a review from canerturkmen April 6, 2024 09:02

canerturkmen approved these changes Apr 6, 2024

shchur merged commit 15cf0ed into autogluon:master Apr 6, 2024

shchur deleted the clarify-eval-docs branch April 6, 2024 11:15

LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024

[timeseries] Clarify documentation related to test_data (autogluon#…

d5e335c

…4054)
