ogrisel
Contributor

@ogrisel ogrisel commented Jun 19, 2018

Make the reusable executor workers timeout configurable, and change the default value to 5 min.

I have noticed that initializing 88 Python workers on a machine with 88 logical CPUs and an NFS filesystem can take 20 to 30 s. Therefore I think that 10 s as the default worker timeout is too low to be useful on large computing servers, and that 5 min is a more useful default value.

I think 5 min should also be the default value for loky itself.
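
To make the trade-off concrete, here is a minimal standard-library sketch of an idle worker timeout. It is not loky's or joblib's implementation: the names `idle_worker` and `worker_survives_gap` are hypothetical, and a thread stands in for a worker process. It only illustrates that a worker whose idle timeout is shorter than the gap between two batches of tasks exits and has to be started again.

```python
import queue
import threading
import time


def idle_worker(tasks, results, idle_timeout):
    """Run tasks until none arrives for `idle_timeout` seconds, then exit."""
    while True:
        try:
            func, arg = tasks.get(timeout=idle_timeout)
        except queue.Empty:
            # No task arrived within the idle window: the worker shuts down.
            return
        results.put(func(arg))


def worker_survives_gap(idle_timeout, gap):
    """Return True if the worker is still alive after `gap` idle seconds."""
    tasks, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(
        target=idle_worker,
        args=(tasks, results, idle_timeout),
        daemon=True,  # do not block interpreter exit on a waiting worker
    )
    worker.start()

    # Submit one task and wait for its result so the worker is warmed up.
    tasks.put((abs, -1))
    results.get()

    # Simulate a pause between two batches of work.
    time.sleep(gap)
    return worker.is_alive()
```

With a short timeout the worker is gone by the time the next batch arrives, which is costly when starting workers is slow (20 to 30 s in the case reported above); a longer timeout keeps it available for reuse.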

@ogrisel ogrisel requested a review from tomMoral June 19, 2018 12:30
@ogrisel ogrisel force-pushed the idle-worker-timeout branch from 76ba643 to 1943efe Compare June 19, 2018 12:30
@codecov
codecov bot commented Jun 19, 2018

Codecov Report

Merging #698 into master will decrease coverage by 7.67%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #698      +/-   ##
==========================================
- Coverage    95.1%   87.42%   -7.68%     
==========================================
  Files          40       40              
  Lines        5694     5694              
==========================================
- Hits         5415     4978     -437     
- Misses        279      716     +437
Impacted Files                      Coverage Δ
joblib/_parallel_backends.py        92.79% <ø> (-4.24%) ⬇️
joblib/executor.py                  100% <100%> (ø) ⬆️
joblib/numpy_pickle_compat.py       44% <0%> (-47%) ⬇️
joblib/_compat.py                   72.72% <0%> (-27.28%) ⬇️
joblib/test/common.py               64.4% <0%> (-23.73%) ⬇️
joblib/test/test_numpy_pickle.py    80.46% <0%> (-17.37%) ⬇️
joblib/pool.py                      75.86% <0%> (-15.52%) ⬇️
joblib/backports.py                 81.25% <0%> (-14.59%) ⬇️
joblib/test/test_hashing.py         85.47% <0%> (-13.41%) ⬇️
joblib/compressor.py                79.93% <0%> (-13.27%) ⬇️
... and 18 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1490995...1943efe.

Contributor

@tomMoral tomMoral left a comment

LGTM!

@ogrisel ogrisel merged commit 9a3e122 into joblib:master Jun 19, 2018
@ogrisel ogrisel deleted the idle-worker-timeout branch June 19, 2018 13:45
yarikoptic added a commit to yarikoptic/joblib that referenced this pull request Jul 28, 2018
* tag '0.12': (116 commits)
  Release 0.12
  typo
  typo
  typo
  ENH add initializer limiting n_threads for C-libs (joblib#701)
  DOC better parallel docstring (joblib#704)
  [MRG] Nested parallel call thread bomb mitigation (joblib#700)
  MTN vendor loky2.1.3 (joblib#699)
  Make it possible to configure the reusable executor workers timeout (joblib#698)
  MAINT increase timeouts to make test more robust on travis
  DOC: use the .joblib extension instead of .pkl (joblib#697)
  [MRG] Fix exception handling in nested parallel calls (joblib#696)
  Fix skip test lz4 not installed (joblib#695)
  [MRG] numpy_pickle:  several enhancements (joblib#626)
  Introduce Parallel.__call__ backend callbacks (joblib#689)
  Add distributed on readthedocs (joblib#686)
  Support registration of external backends (joblib#655)
  [MRG] Add a dask.distributed example (joblib#613)
  ENH use cloudpickle to pickle interactively defined callable (joblib#677)
  CI freeze the version of sklearn0.19.1 and scipy1.0.1 (joblib#685)
  ...
yarikoptic added a commit to yarikoptic/joblib that referenced this pull request Jul 28, 2018
* releases: (121 commits)
  Release 0.12.1
  fix kwonlydefaults key error in filter_args (joblib#715)
  MNT fix some "undefined name" flake8 warnings (joblib#713).
  from importlib import reload for Python 3 (joblib#675)
  MTN vendor loky2.1.4 (joblib#708)
  Release 0.12
  typo
  typo
  typo
  ENH add initializer limiting n_threads for C-libs (joblib#701)
  DOC better parallel docstring (joblib#704)
  [MRG] Nested parallel call thread bomb mitigation (joblib#700)
  MTN vendor loky2.1.3 (joblib#699)
  Make it possible to configure the reusable executor workers timeout (joblib#698)
  MAINT increase timeouts to make test more robust on travis
  DOC: use the .joblib extension instead of .pkl (joblib#697)
  [MRG] Fix exception handling in nested parallel calls (joblib#696)
  Fix skip test lz4 not installed (joblib#695)
  [MRG] numpy_pickle:  several enhancements (joblib#626)
  Introduce Parallel.__call__ backend callbacks (joblib#689)
  ...