Should we have a conventional way to "spawn a process into a nursery"? #1104

@njsmith

We have run_process, which is a convenient way to run a process like it's a regular subroutine, with error checking (check=), cancellation support, automatic waiting, etc. And we have trio.Process, which is a fully general way to spawn a process and then do whatever with it.

In the discussion in #872, @Badg brought up some dissatisfaction with both of these options. I think I mostly get the issue now, so I'll try to rephrase in my own words:

In some ways, when a task spawns a subprocess it's very similar to spawning a new task: the subprocess is a chunk of code that runs simultaneously with the Trio task, and they can potentially interact.

Trio has some strong and convenient conventions for spawning tasks:

  • Trio always makes sure that both tasks are finished before exiting the nursery block
  • if the new task crashes, the exception propagates until caught, and automatically cancels sibling tasks
  • if the original task crashes, the new task is automatically cancelled too
  • if the surrounding context is cancelled, then this automatically cancels both tasks

It would be nice to be able to get this combination of features for subprocesses too. But we don't currently have any simple way to do that.

The closest thing we have is to use async with on a Process object. But that's stuck between two conflicting sets of conventions. The Process API mostly treats processes as independent objects, and our async with process mostly follows the general conventions for async with <resource>: it never cancels the body of the with (even if the process crashes), it doesn't care whether the body of the with block completed with an exception or not, and it doesn't do any particular checks on the subprocess's return code. This is all reasonable, and it's what you'd expect for, like, with file_obj. OTOH, the conventions above are specific to how Trio handles code, and really only show up with nurseries and nursery-like objects. Which a Process... kind of is, and kind of isn't.

I don't think it makes sense to try to wedge all those nursery semantics into async with process_obj. In particular, it would be very weird for async with process_obj to suddenly enable the equivalent of run_process's check=True. And then we'd need some way to allow it to be overridden with check=False on a case-by-case basis, and there's no obvious way to spell that. And if we use check=True and the subprocess fails and the body fails, do we raise a MultiError? In multierror v2, does that mean we need to wrap all outgoing exceptions in MultiError? It would involve a lot of weirdness.

Another idea that's been proposed is to extend run_process so that it can be used like:

process_obj = await nursery.start(run_process, ...)

This has several attractive aspects: run_process's job is to "domesticate" a subprocess so it acts like a subroutine – for example, the check= argument translates error-reporting from subprocess-style to subroutine-style. So here we're reusing that, plus a nursery's general ability to take any subroutine and push it off into a parallel task. You get nursery semantics because you used a nursery. Elegant!

It's also kinda weird. It's effectively a totally different 'mode' of using run_process: normally run_process treats the process_obj as hidden, and the return value as important; this flips that around completely, so you can mess with the process_obj, and never see the return value. Normally, the capture_* arguments are a major part of why people use run_process; here, they're completely useless, and would probably need to be forbidden. The use of start is sorta gratuitous: the actual process startup is synchronous, so you could just as well have a synchronous version. The start is just a trick to start a task and get a return value at the same time. (Never mind that last point about start; see #1109.)

Are there other options to consider? Brainstorming:

async with trio.open_process(...) as process_obj:  # creates a nursery
    ...

process_obj = nursery.process_start_soon(...)
# or start_process_soon or start_soon_process?

async with trio.open_nursery() as nursery:
    await nursery.start(trio.process_as_task, ...)

async with trio.open_nursery() as nursery:
    process_obj = trio.start_process(nursery, ...)
