examples/serialization_and_wrappers.py: infinite loop with freeze_support() message on macOS #1465

@ogrisel

Reproducer (on current master):

$ python examples/serialization_and_wrappers.py
42
With loky backend and cloudpickle serialization: 0.054s
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/spawn.py", line 129, in _main
    prepare(preparation_data)
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/ogrisel/code/joblib/examples/serialization_and_wrappers.py", line 35, in <module>
    print(Parallel(n_jobs=2)(delayed(func_async)(21) for _ in range(1))[0])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/parallel.py", line 1923, in __call__
    next(output)
  File "/Users/ogrisel/code/joblib/joblib/parallel.py", line 1561, in _get_outputs
    self._start(iterator, pre_dispatch)
  File "/Users/ogrisel/code/joblib/joblib/parallel.py", line 1544, in _start
    if self.dispatch_one_batch(iterator):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/parallel.py", line 1435, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/ogrisel/code/joblib/joblib/parallel.py", line 1357, in _dispatch
    job = self._backend.apply_async(batch, callback=batch_tracker)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/_parallel_backends.py", line 599, in apply_async
    future = self._workers.submit(func)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/reusable_executor.py", line 225, in submit
    return super().submit(fn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/process_executor.py", line 1248, in submit
    self._ensure_executor_running()
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/process_executor.py", line 1220, in _ensure_executor_running
    self._adjust_process_count()
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/process_executor.py", line 1209, in _adjust_process_count
    p.start()
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/backend/process.py", line 45, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/backend/popen_loky_posix.py", line 48, in __init__
    self._launch(process_obj)
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/backend/popen_loky_posix.py", line 99, in _launch
    prep_data = spawn.get_preparation_data(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/backend/spawn.py", line 61, in get_preparation_data
    _check_not_importing_main()
  File "/Users/ogrisel/code/joblib/joblib/externals/loky/backend/spawn.py", line 39, in _check_not_importing_main
    raise RuntimeError(
RuntimeError: An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.

[...]
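
For context, the traceback above shows the spawned worker re-executing the example script through `runpy.run_path`, so any unguarded top-level `Parallel(...)` call runs again inside the worker and tries to start more workers. Below is a minimal sketch of that mechanism using only the standard library. The two script bodies and the `launch()` placeholder are hypothetical stand-ins for the example, not the actual joblib code, and no process is started here.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in scripts: launch() marks the spot where a real script
# would call Parallel(n_jobs=2)(...) and thereby start worker processes.
UNGUARDED = textwrap.dedent("""
    launched = []

    def launch():
        launched.append(__name__)

    launch()  # top-level call: runs on every execution of the file
""")

GUARDED = textwrap.dedent("""
    launched = []

    def launch():
        launched.append(__name__)

    if __name__ == "__main__":
        launch()  # only runs when the file is the real entry point
""")


def run_as_spawned_child(source):
    """Execute `source` the way a spawned worker re-runs the main script."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "script.py"
        path.write_text(source)
        # multiprocessing.spawn uses the run name "__mp_main__" for this step.
        namespace = runpy.run_path(str(path), run_name="__mp_main__")
    return namespace["launched"]


print(run_as_spawned_child(UNGUARDED))  # ['__mp_main__']
print(run_as_spawned_child(GUARDED))    # []
```

With the guard, re-running the script under the name `__mp_main__` defines the functions but skips the launch, which is what the `if __name__ == '__main__':` idiom in the error message is meant to achieve.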

The snippet of the example that causes the crash is:

import sys
import time
from joblib import parallel_backend, Parallel, delayed

large_list = list(range(1000000))

if sys.platform != 'win32':
    def func_async(i, *args):
        return 2 * i

    with parallel_backend('multiprocessing'):
        t_start = time.time()
        Parallel(n_jobs=2)(
            delayed(func_async)(21, large_list) for _ in range(1))
        print("With multiprocessing backend and pickle serialization: {:.3f}s"
              .format(time.time() - t_start))

which causes:

Process SpawnPoolWorker-2:
Traceback (most recent call last):
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/pool.py", line 114, in worker
    task = get()
           ^^^^^
  File "/Users/ogrisel/code/joblib/joblib/pool.py", line 147, in get
    return recv()
           ^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/connection.py", line 250, in recv
    return _ForkingPickler.loads(buf.getbuffer())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get attribute 'func_async' on <module '__main__' (built-in)>

when run alone in an IPython session.
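
My understanding of the `AttributeError` is that the standard `pickle` module serializes a function by reference (module name plus function name), so the worker has to find `func_async` in its own `__main__`, which it cannot do when the function only exists in the interactive session. A minimal sketch of that behavior, with no worker process involved; the module name `parent_main` is a hypothetical stand-in for the parent's `__main__`:

```python
import pickle
import sys
import types

# Hypothetical module standing in for the parent's __main__.
parent_main = types.ModuleType("parent_main")
exec("def func_async(i, *args):\n    return 2 * i\n", parent_main.__dict__)
sys.modules["parent_main"] = parent_main

payload = pickle.dumps(parent_main.func_async)
# The payload holds only the two names, not the function body.
print(b"parent_main" in payload, b"func_async" in payload)  # True True

# Simulate a worker whose module of that name never defined the function.
del parent_main.func_async
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc
print(type(error).__name__)  # AttributeError
```

This would also be consistent with the loky run at the top succeeding on its first call, since cloudpickle serializes functions defined in `__main__` by value rather than by reference.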