Avoid Memory Leak by Using a Sequential Version of ParallelFoldFittingStrategy instead of SequentialLocalFoldFittingStrategy If Not Enough Memory to Fit Models in Parallel #3614
Problem that is to be solved:
By default, AutoGluon uses the ParallelFoldFittingStrategy. In edge cases where AutoGluon detects that there is not enough memory to fit folds in parallel, it switches to SequentialLocalFoldFittingStrategy instead.
While this is the most time-efficient alternative, I observed that it leaks memory after training models, shrinking the already scarce memory even further in this edge case. Consequently, certain models will not be fit, and the available memory stays reduced for the rest of the AutoGluon run.
The reason this happens is that Python's garbage collector works in mysterious and unreliable ways. In this case, it makes no difference whether we manually delete objects, wrap the code in sub-functions, or call the garbage collector explicitly. The only reliable solution I have found so far is to execute the memory-consuming code in a subprocess, so that once the subprocess exits, no trace of the unwanted memory remains.
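For illustration, here is a minimal, self-contained sketch of the subprocess trick, using plain multiprocessing instead of ray; fit_fold and its dummy buffer are stand-ins for actual model fitting:

```python
import multiprocessing as mp


def fit_fold(fold_id: int) -> None:
    # Memory-heavy training happens here. Any objects the garbage
    # collector fails to reclaim live only in this child process.
    big_buffer = [bytes(1024) for _ in range(100_000)]  # stand-in for model fitting
    print(f"fold {fold_id}: fitted using {len(big_buffer)} chunks")


if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    for fold_id in range(3):
        p = ctx.Process(target=fit_fold, args=(fold_id,))
        p.start()
        p.join()
        # Once the child exits, the OS reclaims all of its memory,
        # regardless of what Python's garbage collector did or did not free.
```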
Luckily, AutoGluon already has functionality to fit models in subprocesses via ray, as done in the ParallelFoldFittingStrategy. In other words, we usually do not have to worry about memory leakage; but once we switch to SequentialLocalFoldFittingStrategy, we do.
Thus, I propose to adjust ParallelFoldFittingStrategy instead of switching to SequentialLocalFoldFittingStrategy in the edge case where there is not enough memory to fit folds in parallel. In essence, ParallelFoldFittingStrategy is adjusted so that it fits at most one fold at a time, i.e., sequentially; see the code for details and the sketch below.
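A rough sketch of the selection logic this amounts to (all names here are illustrative, not AutoGluon's actual internals):

```python
# Illustrative sketch only: function and parameter names follow the PR's
# prose, not necessarily AutoGluon's real implementation.

def resolve_fold_fitting_config(mem_available_bytes: int,
                                mem_per_fold_bytes: int,
                                num_folds: int) -> dict:
    """Decide how many folds the ray-backed parallel strategy may run at once."""
    folds_in_parallel = max(1, min(num_folds, mem_available_bytes // mem_per_fold_bytes))
    # Even when only one fold fits in memory, we keep the parallel
    # (ray/subprocess) strategy with its parallelism capped at 1, so each
    # fold's memory is fully reclaimed by the OS when its worker exits.
    return {"strategy": "ParallelFoldFittingStrategy",
            "num_folds_in_parallel": folds_in_parallel}


print(resolve_fold_fitting_config(mem_available_bytes=4 * 2**30,
                                  mem_per_fold_bytes=3 * 2**30,
                                  num_folds=8))
# -> {'strategy': 'ParallelFoldFittingStrategy', 'num_folds_in_parallel': 1}
```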
One downside of this PR is that ParallelFoldFittingStrategy introduces a time overhead that makes it less efficient than SequentialLocalFoldFittingStrategy. But considering that the above-mentioned edge case suffers from a memory problem rather than a time problem, this should be a tolerable trade-off. Another downside is debugging, as running everything in subprocesses makes the edge case harder for the user to debug. As the log message explains, the user can still override this behavior by explicitly setting SequentialLocalFoldFittingStrategy for the model in question, for example as shown below.
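Assuming the documented ag_args_ensemble option behaves as I understand it, the override could look like this (the "train.csv" path and "target" label are placeholders):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # placeholder path
predictor = TabularPredictor(label="target").fit(
    train_data,
    # Force sequential fold fitting for bagged models, bypassing the
    # memory-based auto-selection described above.
    ag_args_ensemble={"fold_fitting_strategy": "sequential_local"},
)
```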
Description of changes:
Adjust the reaction to not having enough memory in bagged_ensemble_model.py, and adjust fold_fitting_strategy.py to support a sequential version of ParallelFoldFittingStrategy in a memory-optimal way.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.