[POC] Implement callback for dask. #5612

trivialfis · 2020-04-27T10:07:44Z

Proof of concept implementation for dask training callback. As it is not consistent with single node implementation, this is an early PR for review and discussion.

The rational behind a new style of callback function is that, the original one is modelled after R implementation. The R impl uses language environment for storing states, while Python doesn't have such facility so the original callback creates an artificial one to be consistent with R. I believe the new style is cleaner, native to Python, and easier customize. With the new interface, #5238 and #5495 can be easily implemented.

To provide a short introduction to new callback function, here is its interface:

class Callback:
    def __init__(self):
        self.history = {}
        pass

    def before_training(self, model):
        pass

    def after_training(self, model):
        pass

    def before_iteration(self, model, epoch):
        return False

    def after_iteration(self, model, epoch):
        return False

All the other callbacks is sub-class of this Callback. This interface is suitable for both single node and distributed training. Function return value is whether training should be stopped, False if not. By using a return value, we can deprecate EarlyStopException, which is created for one particular callback.

To-dos:

Make sure no data copy is performed.
Get correct starting rounds: Obtain num boosted rounds directly from booster. #5511 is closed so I need to figure out a way to get correct starting round.
Be compatible with original callback function. The new Callback can be a wrapper of original callback, I need to port the implementation to non-dask training.
Support user defined metric.
Tests.

Future plan

If the new design is approved, I would like to deprecate the following parameters:

In scikit-learn interface fit function eval_set, eval_metric, early_stopping_rounds, verbose, sample_weight_eval_set,
From train function evals, feval, maximize, early_stopping_rounds, evals_result, verbose_eval=True
lastly corresponding parameters in dask.

Codecov Report

Attention: Patch coverage is 82.10227% with 63 lines in your changes missing coverage. Please review.

Project coverage is 82.52%. Comparing base (d3a0efb) to head (421ce2e).
Report is 3137 commits behind head on master.

Files with missing lines	Patch %	Lines
python-package/xgboost/dask.py	72.66%	41 Missing ⚠️
python-package/xgboost/callback.py	87.93%	21 Missing ⚠️
python-package/xgboost/training.py	95.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5612      +/-   ##
==========================================
- Coverage   82.62%   82.52%   -0.11%     
==========================================
  Files          12       12              
  Lines        2728     3015     +287     
==========================================
+ Hits         2254     2488     +234     
- Misses        474      527      +53

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

trivialfis commented Apr 27, 2020

View reviewed changes

python-package/xgboost/dask.py Outdated Show resolved Hide resolved

trivialfis force-pushed the callback-1 branch from c7b4127 to f4e59e6 Compare May 11, 2020 10:13

trivialfis force-pushed the callback-1 branch 3 times, most recently from 487b805 to 7428ef5 Compare June 2, 2020 07:59

hcho3 mentioned this pull request Jun 2, 2020

Reorder includes. #5749

Merged

hcho3 mentioned this pull request Jun 3, 2020

[Roadmap] 1.2.0 Roadmap #5734

Closed

14 tasks

Implement callback for dask.

1b0408a

trivialfis force-pushed the callback-1 branch from 95e09e9 to 1b0408a Compare June 3, 2020 06:48

trivialfis added 2 commits June 3, 2020 18:28

Basic test for single node.

1d59f01

Custom metric tests.

421ce2e

trivialfis mentioned this pull request Aug 18, 2020

[Roadmap] 1.3.0 Roadmap #6031

Closed

14 tasks

trivialfis mentioned this pull request Oct 4, 2020

Rework Python callback functions. #6199

Merged

trivialfis closed this Oct 11, 2020

trivialfis deleted the callback-1 branch October 11, 2020 07:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[POC] Implement callback for dask. #5612

[POC] Implement callback for dask. #5612

Uh oh!

trivialfis commented Apr 27, 2020 •

edited

Loading

Uh oh!

Uh oh!

trivialfis commented Jun 2, 2020

Uh oh!

hcho3 commented Jun 2, 2020

Uh oh!

codecov-commenter commented Jun 2, 2020 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

[POC] Implement callback for dask. #5612

[POC] Implement callback for dask. #5612

Uh oh!

Conversation

trivialfis commented Apr 27, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Future plan

Related:

Uh oh!

Uh oh!

trivialfis commented Jun 2, 2020

Uh oh!

hcho3 commented Jun 2, 2020

Uh oh!

codecov-commenter commented Jun 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

trivialfis commented Apr 27, 2020 •

edited

Loading

codecov-commenter commented Jun 2, 2020 •

edited

Loading