Support Dynamic Stacking to Avoid Stacked Overfitting #3616
Conversation
… to ray in the future
Added initial review
raise ValueError("Unsupported validation procedure during dynamic stacking!")

set_logger_verbosity(ag_fit_kwargs["verbosity"])
org_learner = copy.deepcopy(self._learner)
Why are we creating a copy of the learner instead of a copy of the predictor? (both are probably fine, but wanted to know the reasoning)
To my understanding, the learner is the only part of the predictor that is affected by a fit and therefore the only part that needs to be reset to its original state before refitting. I think copying the whole predictor would be more expensive (it would copy unneeded state) than copying just the learner. For safety, though, we could also copy the predictor and reset the object that way.
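The reset-via-deepcopy pattern discussed above can be sketched in isolation. The `Learner` and `Predictor` classes below are stand-ins with hypothetical minimal structure, not AutoGluon's actual classes; only the snapshot/restore mechanics match the discussion:

```python
import copy

class Learner:
    """Stand-in for the learner object (hypothetical minimal structure)."""
    def __init__(self):
        self.models = []

class Predictor:
    """Stand-in for the predictor; here it only holds a learner."""
    def __init__(self):
        self._learner = Learner()

p = Predictor()
org_learner = copy.deepcopy(p._learner)  # snapshot before the trial fit

p._learner.models.append("trial_model")  # the trial fit mutates the learner

p._learner = org_learner                 # restore: the predictor is back to
print(p._learner.models)                 # its pre-fit state -> prints []
```

Because only `_learner` is mutated by a fit in this sketch, swapping the snapshot back in fully restores the predictor without copying any other predictor state.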
Benchmark results look good! Thanks for the amazing contribution @LennartPurucker! |
LGTM!
Rel.: #2779
Description of changes:
This PR adds support for dynamic stacking. The idea of dynamic stacking is to avoid stacked overfitting (a.k.a. stacked information leakage) by fitting AutoGluon at least twice. All fits except the last are done on a subset of the training data, with a holdout set used to determine whether stacked overfitting occurs for the provided data. Afterwards, we run the last fit with or without multi-layer stacking, depending on whether stacked overfitting occurred in a previous fit.
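The control flow described above can be sketched as follows. Note that the function name, the made-up scores, and the detection rule (inner out-of-fold validation says stacking helps while the holdout set says it hurts) are simplified illustrative assumptions, not the PR's exact implementation:

```python
def detect_stacked_overfitting(score_l1_oof, score_l2_oof,
                               score_l1_holdout, score_l2_holdout):
    """Assumed simplified heuristic: the inner (out-of-fold) validation
    claims the stacked layer helps, but the independent holdout set
    shows it performs worse (i.e., the OOF scores leaked)."""
    stacking_looks_better_inner = score_l2_oof > score_l1_oof
    stacking_worse_on_holdout = score_l2_holdout < score_l1_holdout
    return stacking_looks_better_inner and stacking_worse_on_holdout

# Trial fit on a subset of the training data produces these scores
# (made-up illustrative numbers, higher is better):
trial_scores = dict(score_l1_oof=0.80, score_l2_oof=0.83,
                    score_l1_holdout=0.81, score_l2_holdout=0.78)

# The final fit on the full data enables multi-layer stacking only if
# the trial fit showed no stacked overfitting.
use_stacking_in_final_fit = not detect_stacked_overfitting(**trial_scores)
print(use_stacking_in_final_fit)  # -> False (leakage detected here)
```

In this example the L2 models look better on the inner validation (0.83 > 0.80) but worse on the holdout (0.78 < 0.81), so the final fit would run without multi-layer stacking.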
We consider dynamic stacking a baseline solution for avoiding stacked overfitting. Nevertheless, it is so far the best solution we have benchmarked, because it is the most consistent (which is very important for AutoML systems). Moreover, because this approach produces a holdout set, we have an empirical source of truth for determining whether stacked overfitting occurred. While this source of truth may not necessarily generalize to the test data, it has so far been better than any heuristic alternative or adjustment to the out-of-fold predictions.
The default version of the code in this PR correctly determines whether stacked overfitting occurs for each fold of the AutoML benchmark with an accuracy (preliminary numbers!) of ~74% (balanced accuracy: ~69%). This translates into better predictive performance on average.
Code Example
Script (outdated, see tests for newest examples)
TODOs and Open Questions for Merge
Write tests:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.