Description
We have run_process, which is a convenient way to run a process like it's a regular subroutine, with error checking (check=), cancellation support, automatic waiting, etc. And we have trio.Process, which is a fully general way to spawn a process and then do whatever with it.
In the discussion in #872, @Badg brought up some dissatisfaction with both of these options. I think I mostly get the issue now, so I'll try to rephrase in my own words:
In some ways, when a task spawns a subprocess it's very similar to spawning a new task: the subprocess is a chunk of code that runs simultaneously with the Trio task, and they can potentially interact.
Trio has some strong and convenient conventions for spawning tasks:
- Trio always makes sure that both tasks are finished before exiting the nursery block
- if the new task crashes, the exception propagates until caught, and automatically cancels sibling tasks
- if the original task crashes, the new task is automatically cancelled too
- if the surrounding context is cancelled, then this automatically cancels both tasks
It would be nice to be able to get this combination of features for subprocesses too. But we don't currently have any simple way to do that.
The closest thing we have is to use async with on a Process object. But that's stuck between two conflicting sets of conventions. The Process API mostly treats processes as independent objects, and our async with process mostly follows the general conventions for async with <resource>: it never cancels the body of the with (even if the process crashes). It doesn't care whether the body of the with block completed with an exception or not. And it doesn't do any particular checks on the subprocess's return code. This is also reasonable and what you'd expect for, like, with file_obj. OTOH, the conventions above are specific to how Trio handles code, and really only show up with nurseries and nursery-like objects. Which a Process... kind of is, and kind of isn't.
I don't think it makes sense to try to wedge all those nursery semantics into async with process_obj. In particular, it would be very weird for async with process_obj to suddenly enable the equivalent of run_process's check=True. And then we'd need some way to allow it to be overridden with check=False on a case-by-case basis, and there's no obvious way to spell that. And if we use check=True and the subprocess fails and the body fails, do we raise a MultiError? In multierror v2, does that mean we need to wrap all outgoing exceptions in MultiError? It would involve a lot of weirdness.
Another idea that's been proposed is to extend run_process so that it can be used like:
process_obj = await nursery.start(run_process, ...)
This has several attractive aspects: run_process's job is to "domesticate" a subprocess so it acts like a subroutine – for example, the check= argument translates error-reporting from subprocess-style to subroutine-style. So here we're reusing that, plus a nursery's general ability to take any subroutine and push it off into a parallel task. You get nursery semantics because you used a nursery. Elegant!
It's also kinda weird. It's effectively a totally different 'mode' of using run_process: normally run_process treats the process_obj as hidden, and the return value as important; this flips that around completely, so you can mess with the process_obj, and never see the return value. Normally, the capture_* arguments are a major part of why people use run_process; here, they're completely useless, and would probably need to be forbidden. The use of start is sorta gratuitous: the actual process startup is synchronous, so you could just as well have a synchronous version (never mind, see #1109). The start is just a trick to start a task and get a return value at the same time.
Are there other options to consider? Brainstorming:
async with trio.open_process(...) as process_obj:  # creates a nursery
    ...

process_obj = nursery.process_start_soon(...)
# or start_process_soon or start_soon_process?

async with trio.open_nursery() as nursery:
    await nursery.start(trio.process_as_task, ...)

async with trio.open_nursery() as nursery:
    process_obj = trio.start_process(nursery, ...)