make joblib.Parallel return a generator #217

@drhyrum

Description

Often one wants to perform simple operations on the output of a very long sequence of tasks. If the number of outputs is large, it may be inefficient or impossible to store them in a list. Instead, add functionality to joblib.Parallel so that one can do:

parallel_job = (delayed(job)(param) for param in so_many_job_params)  # generator for input
for output in Parallel(n_jobs=10, iterable=parallel_job):  # generator as output
    do_something(output)

In the example above, I've added the job iterable to the constructor of Parallel. The only required change would be to add an __iter__(self) method to Parallel that has almost identical functionality to __call__(self, iterable), but instead uses self.iterable and yields each output as its job completes, rather than returning a list of outputs.
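
To make the proposal concrete, here is a minimal sketch of the idea using only the standard library. It is not joblib's implementation: `LazyParallel` and this `delayed` are hypothetical stand-ins, and threads are assumed instead of joblib's worker processes to keep the example self-contained. At most `n_jobs` tasks are in flight at once, so neither the inputs nor the outputs are ever held in a full list.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice


def delayed(func):
    """Hypothetical stand-in for joblib.delayed: capture a call without running it."""
    def capture(*args, **kwargs):
        return func, args, kwargs
    return capture


class LazyParallel:
    """Hypothetical sketch (not joblib.Parallel): iterating yields outputs lazily."""

    def __init__(self, n_jobs, iterable):
        self.n_jobs = n_jobs
        self.iterable = iterable

    def __iter__(self):
        tasks = iter(self.iterable)
        with ThreadPoolExecutor(max_workers=self.n_jobs) as pool:
            # Submit at most n_jobs tasks up front so the input generator
            # is consumed lazily rather than all at once.
            pending = {
                pool.submit(func, *args, **kwargs)
                for func, args, kwargs in islice(tasks, self.n_jobs)
            }
            while pending:
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                # Refill the pool before yielding, so workers stay busy
                # while the caller processes the finished outputs.
                for func, args, kwargs in islice(tasks, len(done)):
                    pending.add(pool.submit(func, *args, **kwargs))
                for future in done:
                    yield future.result()


def square(x):
    return x * x


# Usage mirroring the example above: generator in, generator out.
jobs = (delayed(square)(i) for i in range(1000))
total = 0
for output in LazyParallel(n_jobs=4, iterable=jobs):
    total += output
```

Note that in this sketch outputs arrive in completion order, not input order. Preserving input order would require buffering results that finish early, which trades away some of the memory savings.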
