takase1121
Member

This is a very weird Windows-specific regression.
Currently, when we call read_stdout or read_stderr repeatedly, each call returns one of three distinct results:

  1. "<data>": We read something
  2. "": There is no data available / stream is closed (EOF)
  3. nil: An error occurred or the process has exited and we hit EOF

This is a weird status quo, but killing the process when the stream closes is not good, and I am not here to rethink the design.

On Windows, this is not the case due to how ReadFile and OVERLAPPED IO work.

On Linux:

read() == 2, errno == 0 // read 2
read() == -1, errno == EAGAIN // stream has no data
read() == 2, errno == 0 // read 2
read() == 0, errno == 0 // stream hits EOF
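
The Linux sequence above can be reproduced on a local non-blocking pipe. This is only a minimal sketch, not code from Lite XL: `try_read` and `replay_sequence` are hypothetical helper names, and the pipe stands in for the child's stdout.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: one non-blocking read attempt.
 * Returns read()'s result and sets *would_block when the failure only
 * means "no data yet" (EAGAIN / EWOULDBLOCK). */
static ssize_t try_read(int fd, char *buf, size_t size, int *would_block) {
  ssize_t n = read(fd, buf, size);
  *would_block = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
  return n;
}

/* Hypothetical helper: replays the four steps listed above on a pipe.
 * out[i] receives the read() result of step i, blocked[i] its EAGAIN flag.
 * Returns 0 on success, -1 if the pipe could not be set up or written. */
static int replay_sequence(ssize_t out[4], int blocked[4]) {
  int fds[2];
  char buf[16];
  assert(out != NULL && blocked != NULL);
  if (pipe(fds) != 0) return -1;

  /* Make the read end non-blocking, like the child pipes in process.c. */
  int flags = fcntl(fds[0], F_GETFL);
  if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) goto fail;

  if (write(fds[1], "ab", 2) != 2) goto fail;
  out[0] = try_read(fds[0], buf, sizeof buf, &blocked[0]); /* data */
  out[1] = try_read(fds[0], buf, sizeof buf, &blocked[1]); /* no data yet */
  if (write(fds[1], "cd", 2) != 2) goto fail;
  out[2] = try_read(fds[0], buf, sizeof buf, &blocked[2]); /* data */
  close(fds[1]);                                            /* writer goes away */
  out[3] = try_read(fds[0], buf, sizeof buf, &blocked[3]); /* EOF */
  close(fds[0]);
  return 0;

fail:
  close(fds[0]);
  close(fds[1]);
  return -1;
}
```

Calling `replay_sequence` fills `out` with the four read() results in order, so EAGAIN and EOF can be told apart by `blocked[i]` and a zero return respectively.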

On Windows:

(1) ReadFile() == true, dwBytesTransferred == 2, GetLastError() == ERROR_SUCCESS // read 2
(2) ReadFile() == false, dwBytesTransferred == 0, GetLastError() == ERROR_IO_PENDING // stream has no data
(3) GetOverlappedResult() == true, dwBytesTransferred == 2, overlapped.Internal == ERROR_SUCCESS // read 2
(4) ReadFile() == false, dwBytesTransferred == 0, GetLastError() == ERROR_IO_PENDING // stream has no data
(5) GetOverlappedResult() == true, dwBytesTransferred == 0, GetLastError() == ERROR_HANDLE_EOF // stream hits EOF
(6) ReadFile() == false, dwBytesTransferred == 0, GetLastError() == ERROR_HANDLE_EOF // stream hits EOF

The code is somewhat equivalent, but on Windows OVERLAPPED IO operates on completion, so one can think of GetOverlappedResult() and ReadFile() as a single merged operation here. For that reason, the synchronous read branch is not shown.

The problem lies in (5) and (6). When (5) reports ERROR_HANDLE_EOF, Lite XL ignores the error code and treats it as a 0-size read. That's fine. But when (6) comes around, Lite XL assumes that some other error has occurred and kills the program. This is not the case on Linux:

lite-xl/src/api/process.c

Lines 631 to 639 in 36db156

length = read(self->child_pipes[stream][0], buffer, read_size > READ_BUF_SIZE ? READ_BUF_SIZE : read_size);
if (length == 0 && !poll_process(self, WAIT_NONE))
  return 0;
else if (length < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
  length = 0;
if (length < 0) {
  signal_process(self, SIGNAL_TERM);
  return 0;
}

On Linux, we only kill the process if we hit an actual pipe error; we never kill it just because we hit EOF. When we hit EOF, we keep returning an empty string until the process dies.
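
To make the branches explicit, the POSIX snippet above can be restated as a pure decision function. This is a sketch for illustration only: the `read_outcome` enum and `classify_posix_read` are hypothetical names that do not exist in process.c, and `running` stands in for the result of `poll_process(self, WAIT_NONE)`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical outcome labels for what the Lua caller ends up seeing. */
typedef enum {
  READ_DATA,     /* "<data>": something was read */
  READ_EMPTY,    /* "": no data yet, or EOF while the process still runs */
  READ_NIL,      /* nil: EOF and the process has exited */
  READ_NIL_KILL  /* nil: real pipe error, the process gets SIGNAL_TERM */
} read_outcome;

/* Mirrors the order of the checks in the quoted snippet.
 * length/err are read()'s return value and errno. */
static read_outcome classify_posix_read(ssize_t length, int err, bool running) {
  if (length == 0 && !running)
    return READ_NIL;                       /* EOF after the process died */
  if (length < 0 && (err == EAGAIN || err == EWOULDBLOCK))
    return READ_EMPTY;                     /* stream has no data right now */
  if (length < 0)
    return READ_NIL_KILL;                  /* the only case that kills */
  return length > 0 ? READ_DATA : READ_EMPTY;
}
```

Note that EOF with a live process lands in `READ_EMPTY`, which is exactly the "keep returning an empty string" behavior described above.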

On Windows:

lite-xl/src/api/process.c

Lines 609 to 626 in 36db156

if (self->reading[writable_stream_idx] || !ReadFile(self->child_pipes[stream][0], self->buffer[writable_stream_idx], read_size > READ_BUF_SIZE ? READ_BUF_SIZE : read_size, NULL, &self->overlapped[writable_stream_idx])) {
  if (self->reading[writable_stream_idx] || GetLastError() == ERROR_IO_PENDING) {
    self->reading[writable_stream_idx] = true;
    DWORD bytesTransferred = 0;
    if (GetOverlappedResult(self->child_pipes[stream][0], &self->overlapped[writable_stream_idx], &bytesTransferred, false)) {
      self->reading[writable_stream_idx] = false;
      length = bytesTransferred;
      memset(&self->overlapped[writable_stream_idx], 0, sizeof(self->overlapped[writable_stream_idx]));
    }
  } else {
    signal_process(self, SIGNAL_TERM);
    return 0;
  }
} else {
  length = self->overlapped[writable_stream_idx].InternalHigh;
  memset(&self->overlapped[writable_stream_idx], 0, sizeof(self->overlapped[writable_stream_idx]));
}
lua_pushlstring(L, self->buffer[writable_stream_idx], length);

The Windows code is really terse and hard to read, but the takeaway is that it kills the process whenever ReadFile() fails with anything other than ERROR_IO_PENDING, and that includes EOF. The function also doesn't return an empty string until the process actually dies, and it doesn't call poll_process(), which causes an edge case where calling process:returncode() right after read_stdout() fails does not work.

This PR emulates the POSIX behavior of calling poll_process when we hit EOF. This fixes the aforementioned edge case and makes the function return an empty string until the process actually exits.
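
The intended Windows behavior can be sketched as a decision on the error code of a failed read. This is not the actual patch, only an illustration of the idea under stated assumptions: `win_read_action` and `classify_win_read_failure` are hypothetical names, `running` stands in for the result of `poll_process(self, WAIT_NONE)`, and the error codes are redefined locally only so the sketch builds without `<windows.h>`.

```c
#include <assert.h>
#include <stdbool.h>

/* Win32 error codes; defined here only when <windows.h> is not included. */
#ifndef ERROR_HANDLE_EOF
#define ERROR_HANDLE_EOF 38L
#endif
#ifndef ERROR_IO_PENDING
#define ERROR_IO_PENDING 997L
#endif

/* Hypothetical labels for what the read function should do next. */
typedef enum {
  WIN_READ_PENDING,   /* no data yet: return "" and poll the overlapped op */
  WIN_READ_EOF_EMPTY, /* EOF, process still running: return "" */
  WIN_READ_EOF_NIL,   /* EOF, process has exited: return nil */
  WIN_READ_KILL       /* any other error: terminate the process, return nil */
} win_read_action;

/* err is the value GetLastError() would report after a failed read. */
static win_read_action classify_win_read_failure(unsigned long err, bool running) {
  switch (err) {
    case ERROR_IO_PENDING:
      return WIN_READ_PENDING;
    case ERROR_HANDLE_EOF:
      /* Assumed fix: treat EOF like POSIX does and consult the process
       * state instead of killing the process. */
      return running ? WIN_READ_EOF_EMPTY : WIN_READ_EOF_NIL;
    default:
      return WIN_READ_KILL;
  }
}
```

With this split, only the `default` branch corresponds to the old kill path; EOF maps onto the same empty-string-then-nil progression as on Linux.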

@takase1121
Member Author

takase1121 commented Dec 3, 2024

As for a better stream design, I have been cooking up the streams API, and I propose the following convention:

local output, err = read()
  1. output == "<data>", err == nil: data is read
  2. output == "", err == nil: no data can be read, EAGAIN
  3. output == nil, err == nil: stream is closed, EOF
  4. output == nil, err == "<error message>": actual stream error

Based on this convention, most users would write something like:

while true do
    local read, err = process:read()
    if err then return error(err) end
    if not read and not process:running() then break end -- if you just want to read data, no need to check process:running()
end

@Guldoman
Member

Guldoman commented Dec 3, 2024

As for a better stream design, I have been cooking up the streams API, and I propose the following convention:

Yeah, sounds like the way to go.

@takase1121 takase1121 merged commit 9f39e1c into lite-xl:master Dec 3, 2024
9 checks passed
@takase1121 takase1121 deleted the process/eof-weirdness-on-windows branch March 23, 2025 05:00