jobs: child proc must have a separate process-group #8107
:let id = jobstart('sleep 30 | sleep 30 | sleep 30')
:call jobstop(id)
..finally works as expected after this patch. 👍
src/nvim/os/pty_process_unix.c
{
  // New session and progress-group. #6530
  setsid();
Why a new session instead of a new process group? Would setpgid(0,0); also do the job?
Isn't a new session customary in a terminal emulator?
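For reference, the difference between the two calls in this thread can be sketched in C. This is only an illustration under stated assumptions: the helper name spawn_detached() is hypothetical and not part of Nvim or libuv, and the sketch relies only on the POSIX calls fork(), setsid(), setpgid(), getpgid() and getsid(). It forks a child, lets it detach either way, and reports what the parent then observes.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not Nvim code): forks a child that detaches itself
// with setsid() when use_setsid is nonzero, otherwise with setpgid(0, 0).
// On success returns 0 and reports whether the child leads its own
// process group and whether it left the parent's session.
int spawn_detached(int use_setsid, int *own_group, int *own_session)
{
  assert(own_group != NULL && own_session != NULL);
  int ready[2];
  if (pipe(ready) != 0) {
    return -1;
  }
  pid_t child = fork();
  if (child < 0) {
    close(ready[0]);
    close(ready[1]);
    return -1;
  }
  if (child == 0) {
    close(ready[0]);
    int rc = use_setsid ? (setsid() < 0 ? -1 : 0) : setpgid(0, 0);
    char ok = (rc == 0) ? 'y' : 'n';
    // Tell the parent that the session/group ids are now final.
    if (write(ready[1], &ok, 1) != 1) {
      _exit(1);
    }
    sleep(30);  // Stay alive until the parent has inspected us.
    _exit(0);
  }
  close(ready[1]);
  char ok = 'n';
  ssize_t n = read(ready[0], &ok, 1);
  close(ready[0]);
  int result = -1;
  if (n == 1 && ok == 'y') {
    *own_group = (getpgid(child) == child);
    *own_session = (getsid(child) != getsid(0));
    result = 0;
  }
  kill(child, SIGKILL);
  waitpid(child, NULL, 0);
  return result;
}
```

In both modes the child ends up as leader of a new process group; only setsid() additionally moves it into a new session, which also detaches it from the controlling terminal.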
src/nvim/event/process.c
ILOG("Sending %s to pid %d", sig == SIGTERM ? "SIGTERM" : "SIGKILL",
     proc->pid);
uv_kill(proc->pid, sig);
ILOG("sending %s to pid %d", sig == SIGTERM ? "SIGTERM" : "SIGKILL", pid);
pid -> pgid and -pid?
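The point of passing a negative id can be shown with a small C sketch. It is not the patch itself: kill_group_demo() is a hypothetical name, and plain POSIX kill() stands in for uv_kill(). A child becomes a session (and process-group) leader via setsid() and forks a grandchild; the parent then signals -pgid and uses pipe EOF to confirm that every member of the group is gone.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo (not Nvim code). Returns 1 if signalling -pgid removed
// both the child and its grandchild, 0 if not, -1 on setup failure.
int kill_group_demo(int sig)
{
  assert(sig == SIGTERM || sig == SIGKILL);
  int alive[2], ready[2];
  if (pipe(alive) != 0) {
    return -1;
  }
  if (pipe(ready) != 0) {
    close(alive[0]);
    close(alive[1]);
    return -1;
  }
  pid_t child = fork();
  if (child < 0) {
    close(alive[0]);
    close(alive[1]);
    close(ready[0]);
    close(ready[1]);
    return -1;
  }
  if (child == 0) {
    close(alive[0]);
    close(ready[0]);
    if (setsid() < 0) {  // New session; pgid becomes our own pid.
      _exit(1);
    }
    pid_t grandchild = fork();  // Inherits the new process group.
    if (grandchild < 0) {
      _exit(1);
    }
    if (grandchild > 0 && write(ready[1], "x", 1) != 1) {
      _exit(1);
    }
    sleep(30);  // Both processes linger, like `sh -c 'sleep 30'`.
    _exit(0);
  }
  close(alive[1]);
  close(ready[1]);
  char c;
  int result = -1;
  if (read(ready[0], &c, 1) == 1 && kill(-child, sig) == 0) {
    // EOF arrives only after every holder of the write end has exited.
    result = (read(alive[0], &c, 1) == 0);
  }
  if (result < 0) {
    kill(child, SIGKILL);
  }
  waitpid(child, NULL, 0);
  close(alive[0]);
  close(ready[0]);
  return result;
}
```

With kill(child, sig) instead of kill(-child, sig) only the group leader would receive the signal and the grandchild would keep running, which is the orphan behavior described in #6530.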
bc11a54 to 69be6cb
Will this fix the |
I wouldn't expect this to fix "invalid channel id". And, I don't think it will change behavior of cat.exe unless cat.exe was running in a shell.
We should probably keep that test "pending" then :)
src/nvim/os/process.c
bool exists = false;
size_t p_count = len / sizeof(*p_list);
for (size_t i = 0; i < p_count; i++) {
  exists |= (p_list[i].kp_proc.p_pid == ppid);
Shouldn't this be
exists ||= (p_list[i].ki_pid == ppid);
if (p_list[i].ki_ppid == ppid) {
temp = xrealloc(temp, (*proc_count + 1) * sizeof(*temp));
temp[*proc_count] = p_list[i].ki_pid;
src/nvim/os/process.c
snprintf(proc_p, sizeof(proc_p), "/proc/%d/task/%d/children", ppid, ppid);
FILE *fp = fopen(proc_p, "r");
if (fp == NULL) {
  return 1;  // Process not found.
I'm missing the children entry under proc, because of an unset kernel option. Could one fall back to pgrep -P ppid or ps --ppid ppid in this case?
@oni-link How does ps get this info without the kernel providing an API?
Perhaps by iterating through the entries (processes) under /proc and checking each entry's status for ppid?
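A rough C sketch of that idea, assuming a Linux-style /proc: it reads /proc/<pid>/stat (where the parent pid follows the parenthesized command name) rather than parsing the status file, which is a substitution made here for brevity. The names proc_children_scan() and self_listed_under_parent() are hypothetical and are not what the PR implements.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not Nvim code): collects the pids whose parent is
// `ppid` by scanning every numeric entry under /proc. Stores at most `cap`
// pids in `out` and returns the total number found, or -1 without /proc.
long proc_children_scan(pid_t ppid, pid_t *out, size_t cap)
{
  assert(out != NULL || cap == 0);
  DIR *dir = opendir("/proc");
  if (dir == NULL) {
    return -1;
  }
  size_t found = 0;
  struct dirent *ent;
  while ((ent = readdir(dir)) != NULL) {
    if (!isdigit((unsigned char)ent->d_name[0])) {
      continue;  // Not a process directory.
    }
    char path[300];
    snprintf(path, sizeof(path), "/proc/%s/stat", ent->d_name);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
      continue;  // Process exited in the meantime.
    }
    char buf[512];
    size_t n = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[n] = '\0';
    // Format: "pid (comm) state ppid ..."; comm may itself contain ')'.
    char *rparen = strrchr(buf, ')');
    char state;
    int parent;
    if (rparen != NULL
        && sscanf(rparen + 1, " %c %d", &state, &parent) == 2
        && parent == (int)ppid) {
      if (found < cap) {
        out[found] = (pid_t)atoi(ent->d_name);
      }
      found++;
    }
  }
  closedir(dir);
  return (long)found;
}

// Usage check: this process must show up among its own parent's children.
int self_listed_under_parent(void)
{
  enum { kCap = 4096 };
  static pid_t kids[kCap];
  long n = proc_children_scan(getppid(), kids, kCap);
  for (long i = 0; i < n && i < kCap; i++) {
    if (kids[i] == getpid()) {
      return 1;
    }
  }
  return 0;
}
```

The scan costs one file read per running process, so it is noticeably slower than reading a single children file, but it does not depend on that kernel option.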
@oni-link Added a fallback to pgrep since it seems to be available on Linux/macOS/BSD.
UV_PROCESS_DETACHED compels libuv:uv__process_child_init() to call
setsid() in the child just after fork(). That ensures the process and
its descendants are grouped in a separate session (and process group).

The following jobstart() call correctly groups `sh` and `sleep` in a new
session (and process-group), where `sh` is the "session leader" (and
process-group leader):

    :call jobstart(['sh','-c','sleep 60'])

    SESN  PGRP  PID   PPID  Command
    30383 30383 30383 3620  │ ├─ -bash
    30383 31432 31432 30383 │ │ └─ nvim -u NORC
    30383 31432 31433 30383 │ │ ├─ nvim -u NORC
    8105  8105  8105  31432 │ │ └─ sh -c sleep 60
    8105  8105  8106  8105  │ │ └─ sleep 60

closes neovim#6530

ref: https://stackoverflow.com/q/1046933
ref: https://unix.stackexchange.com/a/404065

Helped-by: Marco Hinz <mh.codebro+github@gmail.com>

Discussion
------------------------------------------------------------------------

On my linux box before this patch, the termclose_spec.lua:'kills job
trapping SIGTERM' test indirectly causes cmake/busted to wait for 60s.
That's because the test spawns a `sleep 60` descendant process which
hangs around even after nvim exits: nvim killed the parent PID, but not
PGID (process-group), so the grandchild "reparented" to init (PID 1).

Session contains processes (and process-groups) which are logically part
of the same "login session". Process-group is a set of
logically/informally-related processes within a session; for example,
shells assign a process group to each "job". Session IDs and PGIDs both
have type pid_t (like PIDs).

These OS-level mechanisms are, as usual, legacy accidents whose purpose
is upheld by convention and folklore. We can use session-level grouping
(setsid), or we could use process-group-level grouping (setpgid). Vim
uses setsid() if available, otherwise setpgid(0,0).

Windows
------------------------------------------------------------------------

UV_PROCESS_DETACHED on win32 sets CREATE_NEW_PROCESS_GROUP flag.
But uv_kill() does not kill the process-group: nodejs/node#3617

Ideas:
- Set UV_PROCESS_DETACHED (CREATE_NEW_PROCESS_GROUP), then call
  GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT, pid)
  - Maybe won't work because MSDN says "Only processes that share the
    same console as the calling process receive the signal."
    https://docs.microsoft.com/en-us/windows/console/generateconsolectrlevent
    But CREATE_NEW_PROCESS_GROUP creates a new console ...
    ref https://stackoverflow.com/q/1453520
- Group processes within a "job". libuv does that *globally* for
  non-detached processes: uv__init_global_job_handle.
- Iterate through CreateToolhelp32Snapshot().
  - https://stackoverflow.com/q/1173342
  - Vim does this, see terminate_all()

XXX: comment at https://stackoverflow.com/q/1173342 :
> Windows recycles PIDs quite fast, you have to be extra careful not
> to kill unrelated processes. These APIs will report PPIDs for long
> dead processes whose PIDs may have been recycled. Check the parent
> start date to make sure it is related to the processes you spawned.
Which |
On WIN32 it uses the WIN32 API. No ps, no tasklist.
489f0f1 to 4c690e5
TODO: "exepath" field (win32: QueryFullProcessImageName())

On unix-likes `ps` is used because the platform-specific APIs are
a nightmare. For reference, below is a (incomplete) attempt:

diff --git a/src/nvim/os/process.c b/src/nvim/os/process.c
index 0976992..99afbbf290c1 100644
--- a/src/nvim/os/process.c
+++ b/src/nvim/os/process.c
@@ -208,3 +210,60 @@ int os_proc_children(int ppid, int **proc_list, size_t *proc_count)
   return 0;
 }
+/// Gets various properties of the process identified by `pid`.
+///
+/// @param pid Process to inspect.
+/// @return Map of process properties, empty on error.
+Dictionary os_proc_info(int pid)
+{
+  Dictionary pinfo = ARRAY_DICT_INIT;
+#ifdef WIN32
+
+#elif defined(__APPLE__)
+  char buf[PROC_PIDPATHINFO_MAXSIZE];
+  if (proc_pidpath(pid, buf, sizeof(buf))) {
+    name = getName(buf);
+    PUT(pinfo, "exepath", STRING_OBJ(cstr_to_string(buf)));
+    return name;
+  } else {
+    ILOG("proc_pidpath() failed for pid: %d", pid);
+  }
+#elif defined(BSD)
+# if defined(__FreeBSD__)
+#  define KP_COMM(o) o.ki_comm
+# else
+#  define KP_COMM(o) o.p_comm
+# endif
+  struct kinfo_proc *proc = kinfo_getproc(pid);
+  if (proc) {
+    PUT(pinfo, "name", cstr_to_string(KP_COMM(proc)));
+    xfree(proc);
+  } else {
+    ILOG("kinfo_getproc() failed for pid: %d", pid);
+  }
+
+#elif defined(__linux__)
+  char fname[256] = { 0 };
+  char buf[MAXPATHL];
+  snprintf(fname, sizeof(fname), "/proc/%d/comm", pid);
+  FILE *fp = fopen(fname, "r");
+  // FileDescriptor *f = file_open_new(&error, fname, kFileReadOnly, 0);
+  // ptrdiff_t file_read(FileDescriptor *const fp, char *const ret_buf,
+  //                     const size_t size)
+  if (fp == NULL) {
+    ILOG("fopen() of /proc/%d/comm failed", pid);
+  } else {
+    size_t n = fread(buf, sizeof(char), sizeof(buf) - 1, fp);
+    if (n == 0) {
+      WLOG("fread() of /proc/%d/comm failed", pid);
+    } else {
+      size_t end = MIN(sizeof(buf) - 1, n);
+      end = (end > 0 && buf[end - 1] == '\n') ? end - 1 : end;
+      buf[end] = '\0';
+      PUT(pinfo, "name", STRING_OBJ(cstr_to_string(buf)));
+    }
+  }
+  fclose(fp);
+#endif
+  return pinfo;
+}
Test correctly fails before 8d90171. ref neovim#6530
Can revert this after neovim#8120.
@justinmk |
@janlazo Gah, I changed that at the last minute to match
|
My bad on that commit. I used ping for <= 1 sec. timeout but any program that exits quickly would have sufficed because of how for loops work on cmd.exe. Powershell has a long startup (2-3 sec. to run |
FEATURES:
3cc7ebf #7234 built-in VimL expression parser
6a7c904 #4419 implement <Cmd> key to invoke command in any mode
b836328 #7679 'startup: treat stdin as text instead of commands'
58b210e :digraphs : highlight with hl-SpecialKey #2690
7a13611 #8276 'startup: Let `-s -` read from stdin'
1e71978 events: VimSuspend, VimResume #8280
1e7d5e8 #6272 'stdpath()'
f96d99a #8247 server: introduce --listen
e8c39f7 #8226 insert-mode: interpret unmapped META as ESC
98e7112 msg: do not scroll entire screen (#8088)
f72630b #8055 let negative 'writedelay' show all redraws
5d2dd2e win: has("wsl") on Windows Subsystem for Linux #7330
a4f6cec cmdline: CmdlineEnter and CmdlineLeave autocommands (#7422)
207b7ca #6844 channels: support buffered output and bytes sockets/stdio

API:
f85cbea #7917 API: buffer updates
418abfc #6743 API: list information about all channels/jobs.
36b2e3f #8375 API: nvim_get_commands
273d2cd #8329 API: Make nvim_set_option() update `:verbose set …`
8d40b36 #8371 API: more reliable/descriptive VimL errors
ebb1acb #8353 API: nvim_call_dict_function
9f994bb #8004 API: nvim_list_uis
3405704 #7520 API/UI: forward option updates to UIs
911b1e4 #7821 API: improve nvim_command_output

WINDOWS OS:
9cefd83 #8084, #8516 build/win: support MSVC
ee4e1fd win: Fix reading content from stdin (#8267)

TUI:
ffb8904 #8309 TUI: add support for mouse release events in urxvt
8d5a46e #8081 TUI: implement "standout" attribute
6071637 TUI: support TERM=konsole-256color
67848c0 #7653 TUI: report TUI info with -V3 ('verbose' >= 3)
3d0ee17 TUI/rxvt: enable focus-reporting
d109f56 #7640 TUI: 'term' option: reflect effective terminal behavior

FIXES:
ed6a113 #8273 'job-control: avoid kill-timer race'
4e02f1a #8107 'jobs: separate process-group'
451c48a terminal: flush vterm output buffer on pty output #8486
5d6732f :checkhealth fixes #8335
53f11dc #8218 'Fix errors reported by PVS'
d05712f inccommand: pause :terminal redraws (#8307)
51af911 inccommand: do not execute trailing commands #8256
84359a4 terminal: resize to the max dimensions (#8249)
d49c1dd #8228 Make vim_fgets() return the same values as in Vim
60e96a4 screen: winhl=Normal:Background should not override syntax (#8093)
0c59ac1 #5908 'shada: Also save numbered marks'
ba87a2c cscope: ignore EINTR while reading the prompt (#8079)
b1412dc #7971 ':terminal Enter/Leave should not increment jumplist'
3a5721e TUI: libtermkey: force CSI driver for mouse input #7948
6ff13d7 #7720 TUI: faster startup
1c6e956 #7862 TUI: fix resize-related segfaults
a58c909 #7676 TUI: always hide cursor when flushing, never flush buffers during unibilium output
303e1df #7624 TUI: disable BCE almost always
249bdb0 #7761 mark: Make sure that jumplist item will not have zero lnum
6f41ce0 #7704 macOS: Set $LANG based on the system locale
a043899 #7633 'Retry fgets on EINTR'

CHANGES:
ad60927 #8304 default to 'nofsync'
f3f1970 #8035 defaults: 'fillchars'
a6052c7 #7984 defaults: sidescroll=1
b69fa86 #7888 defaults: enable cscopeverbose
7c4bb23 defaults: do :filetype stuff unless explicitly "off"
2aa308c #5658 'Apply :lmap in macros'
8ce6393 terminal: Leave 'relativenumber' alone (#8360)
e46534b #4486 refactor: Remove maxmem, maxmemtot options
131aad9 win: defaults: 'shellcmdflag', 'shellxquote' #7343
c57d315 #8031 jobwait(): return -2 on interrupt also with timeout
6452831 clipboard: macOS: fallback to tmux if pbcopy is broken #7940
300d365 #7919 Make 'langnoremap' apply directly after a map
ada1956 #7880 'lua/executor: Remove lightuserdata'

INTERNAL:
de0a954 #7806 internal statistics for list impl
dee78a4 #7708 rewrite internal list impl
jobstart() does NOT run in the same process-group (neovim#8107):
    :call jobstart('ps', {'on_stdout':{j,d,e->append(0,d)}})
closes neovim#8217
closes neovim#8450

system() and :! are expected to run in the same process-group as Nvim.

NB: jobstart() does NOT run in the same process-group (neovim#8107):
    :call jobstart('ps', {'on_stdout':{j,d,e->append(0,d)}})

Background: 8d90171 changed ALL child-spawn utilities to do setsid().

Q: If we don't create a new session/process-group for :! and system(),
   how to avoid zombie descendants (e.g. process_wait() calls
   process_stop(), which only kills the root process)?
A: Send signal to process-group, but ignore the signal in our own
   process (signal_reject_deadly()). Vim does something similar:
   https://github.com/vim/vim/blob/e7499ddc33508d3d341e96f84a0e7b95b2d6927c/src/os_unix.c#L4834-L4841
   https://github.com/vim/vim/blob/e7499ddc33508d3d341e96f84a0e7b95b2d6927c/src/os_unix.c#L5122-L5134

Vim does setsid() in some cases of mch_call_shell_fork() (analogous to
Nvim's os_system()), but check the logic carefully--it's only for some
(irrelevant) GUI scenarios.
(This commit is for reference; the functional change will be reverted.)

ref neovim#8217
ref neovim#8450
ref neovim#8678

In terminal-Vim, system() and :! run in Vim's process-group. But
8d90171 changed all of Nvim's process-spawn utilities to do setsid(),
which conflicts with that expected terminal-Vim behavior.

To "fix" that, this commit defines Process.detach as a TriState, then
handles the kNone case such that system() and :! do not do setsid() in
the spawned child.

But this commit REGRESSES 8d90171 (neovim#8107), so for example the
following code causes orphan processes:
    :echo system('sleep 30|sleep 30|sleep 30')

Q: If we don't create a new session/process-group, how to avoid zombie
   descendants (e.g. process_wait() calls process_stop(), which only
   kills the root process)?
A: Vim's approach in mch_call_shell_fork() is:
   1. BLOCK_SIGNALS (ignores deadly)
   2. fork()
   3. unblock signals in the child
   4. On CTRL-C, send SIGINT to the process-group id: kill(-pid, SIGINT)
   5. Parent (vim) ignores the signal. Child (and descendants) do not.
   https://github.com/vim/vim/blob/e7499ddc33508d3d341e96f84a0e7b95b2d6927c/src/os_unix.c#L4834-L4841
   https://github.com/vim/vim/blob/e7499ddc33508d3d341e96f84a0e7b95b2d6927c/src/os_unix.c#L5122-L5134
   But we can't do that if we want to use the existing (libuv-based)
   form of process_spawn().
closes #6530
is process_teardown() hang #6891 related?
os_proc_tree_kill()
nvim_get_proc_children()
nvim_get_proc()
Didn't fix this common test failure (maybe #7376 ?):
quickbuild failure is unrelated, spell_spec.lua #8102