[tabular] Stratify regression tasks via binning #4586
Job PR-4586-8070c34 is done.
8070c34 to 28e823d
Benchmark results were inconclusive. I'll benchmark this more thoroughly in the future. For now, regression stratification is not enabled by default, but the user can still enable it.
Looks good! Left a nit comment on docstring.
"""
Parameters
----------
stratify : bool, default False
Shall we add the first few args in the docstring as well?
Good point, added
Issue #, if available:
Resolves #4453
Resolves #4771
Description of changes:
- Stratify regression and quantile tasks via binning when doing cross-validation (such as during bagging)
- Fix bug in dynamic stacking cross-validation mode where stratification was not being done on binary/multiclass, but was being done on other problem_types (opposite of intention).
- Added type hints
- TODO: Benchmark
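For readers unfamiliar with the idea, here is a minimal sketch of stratifying a continuous target via binning. It is not the code in this PR: the function names `bin_target` and `stratified_folds` are hypothetical, and rank-based quantile bins are an assumption, since the description does not say how the bins are chosen.

```python
import random
from collections import defaultdict


def bin_target(y, n_bins=5):
    """Map each continuous target value to a bin label in [0, n_bins).

    Assumption: equal-frequency bins based on rank. Tied values may land
    in different bins; a real implementation may treat ties differently.
    """
    order = sorted(range(len(y)), key=lambda i: y[i])
    labels = [0] * len(y)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(y)
    return labels


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign a fold index to every row so each label is spread evenly."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, label in enumerate(labels):
        by_label[label].append(i)

    folds = [0] * len(labels)
    offset = 0  # carried across labels so overall fold sizes stay balanced
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[i] = (offset + j) % n_splits
        offset += len(idxs)
    return folds


# Usage: bin a regression target, then split as if it were a class label.
y = [float((i * 37 % 100) ** 2) for i in range(100)]
labels = bin_target(y, n_bins=5)
folds = stratified_folds(labels, n_splits=5, seed=0)
```

Each fold then receives roughly the same share of low, middle, and high target values, which is the property stratification gives for binary and multiclass labels.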
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.