Skip to content

Conversation

laurazard
Copy link
Collaborator

Changes were made in #4599 to provide a mechanism for the CLI to notify running plugin processes that they should exit, in order to improve the general CLI/plugin UX. The current implementation boils down to:

  1. The CLI creates a socket
  2. The CLI executes the plugin
  3. The plugin connects to the socket
  4. (When) the CLI receives a termination signal, it uses the socket to notify the plugin that it should exit
  5. The plugin's gets notified via the socket, and cancels it's cmd.Context, which then gets handled appropriately

This change works in most cases and fixes the issue it sets out to solve (see: docker/compose#11292) however, in the case where the user has a TTY attached and the plugin is not already handling received signals, steps 4+ changes:
4. (When) the CLI receives a termination signal, before it can use the socket to notify the plugin that it should exit, the plugin process also receives a signal due to sharing the pgid with the CLI

Since we now have a proper "job control" mechanism, we can simplify the scenarios by executing the plugins with their own process group id, thereby removing the "double notification" issue and making it so that plugins can handle the same whether attached to a TTY or not.

In order to make this change "plugin-binary" backwards-compatible, in the case that a plugin does not connect to the socket, the CLI passes the signal to the plugin process.


- What I did

Execute plugins with their own process group id. If a(n older) plugin does not connect to the CLI socket, pass received signals onto the plugin process, so as to preserve compatibility with older plugin binaries.

- How to verify it

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

IMG_2069

@laurazard laurazard force-pushed the signal-handling-fix-tty branch 2 times, most recently from 0891316 to c719dfb Compare January 10, 2024 14:08
@codecov-commenter
Copy link

codecov-commenter commented Jan 10, 2024

Codecov Report

Merging #4769 (c719dfb) into master (7d92573) will decrease coverage by 0.08%.
Report is 123 commits behind head on master.
The diff coverage is 0.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4769      +/-   ##
==========================================
- Coverage   59.68%   59.60%   -0.08%     
==========================================
  Files         287      288       +1     
  Lines       24865    24785      -80     
==========================================
- Hits        14841    14774      -67     
+ Misses       9138     9125      -13     
  Partials      886      886              

@laurazard laurazard self-assigned this Jan 10, 2024
laurazard added a commit to laurazard/buildx that referenced this pull request Jan 10, 2024
See docker/cli#4599 and
docker/cli#4769.

When running through the CLI, signal handling + the "3
SIGINTS = exit" behavior is handled by the CLI, and the
CLI will signal buildx through the plugin socket to cancel
it's context.

To deal with the standalone case, this commit introduces
`cobrautil.HandleContextCancellation` which checks if the
context being executed is already cancellable/plugin is
running through the CLI and sets up signal handling if
necessary.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
@thaJeztah thaJeztah added status/2-code-review area/plugins kind/refactor PR's that refactor, or clean-up code labels Jan 10, 2024
laurazard added a commit to laurazard/buildx that referenced this pull request Jan 11, 2024
See docker/cli#4599 and
docker/cli#4769.

When running through the CLI, signal handling + the "3
SIGINTS = exit" behavior is handled by the CLI, and the
CLI will signal buildx through the plugin socket to cancel
it's context.

To deal with the standalone case, this commit introduces
`cobrautil.HandleContextCancellation` which checks if the
context being executed is already cancellable/plugin is
running through the CLI and sets up signal handling if
necessary.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
@laurazard laurazard marked this pull request as ready for review January 11, 2024 12:44
@thaJeztah
Copy link
Member

(GitHub STILL doesn't allow commenting on commit messages, so putting it here);

Screenshot 2024-01-11 at 14 07 55

It's worth considering to refer to the commit (1554ac3b5f38147301c518c60da16f3f80c1717b) in the commit message, instead of the PR. Doing so prevents some noise if commits are cherry-picked or merged into other branches (which can be in any fork), plus it allows finding back the related changes without requiring access to GitHub.

(I usually refer to the commit in the commit message, and add a link to the PR in the GitHub description of the PR).

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code changes look good afaics; left some comments / suggestions 😅


func configureOSSpecificCommand(cmd *exec.Cmd) {
cmd.SysProcAttr = &syscall.SysProcAttr{
Setpgid: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping this was abstracted in the syscall package (and that to be the reason to use syscall instead of golang.org/x/sys), but looks like it's not 😢 (otherwise we could've dropped the abstraction)

Well.. it is what it is 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth adding a comment why we're doing this (writing down the intent; it looks like you already have a good description in the commit message)? Or do you think it's clear from the code alone?

(A good way to judge: would you understand (without looking in history) 1 Year from now when looking at the code?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm too used to having to do dig through many years old commits to understand code that I don't even notice anymore 😞 I'll add some more comments!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I'm pretty sure we'd be able to trace it back if needed. I started to pay more attention to these things (sorry for being the victim of that), for all those times I had to spend "too much time" finding back the reason.

Having a comment also can help deciding making changes ("oh! thanks for the description; this is soooooo last year and really no longer needed. I can confidently remove this part").

@laurazard
Copy link
Collaborator Author

It's worth considering to refer to the commit (1554ac3) in the commit message, instead of the PR. Doing so prevents some noise if commits are cherry-picked or merged into other branches (which can be in any fork), plus it allows finding back the related changes without requiring access to GitHub.

That's funny, I usually do this because almost every time I'm doing Github archaeology, if I have to find the commit I want to go to the PR to get additional context. However, the noise and accessibility outside of Github are good arguments, and Github makes it pretty easy now to find the PR a commit came from, so I'll start doing what you're suggesting!

Copy link
Member

@neersighted neersighted left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks excellent to me, especially with the addition of explicit signal-forwarding. Like @thaJeztah, my only suggestion is that we add comments explaining why we set the pgid/why we forward the signal in some cases, but not all.

@thaJeztah
Copy link
Member

Yup; I'm good with merging after the commit message is updated and a comment added 👍 (thought I'd wait for you to look at that)

Changes were made in 1554ac3 to provide
a mechanism for the CLI to notify running plugin processes that they
should exit, in order to improve the general CLI/plugin UX. The current
implementation boils down to:
1. The CLI creates a socket
2. The CLI executes the plugin
3. The plugin connects to the socket
4. (When) the CLI receives a termination signal, it uses the socket to
   notify the plugin that it should exit
5. The plugin's gets notified via the socket, and cancels it's `cmd.Context`,
   which then gets handled appropriately

This change works in most cases and fixes the issue it sets out to solve
(see: docker/compose#11292) however, in the case
where the user has a TTY attached and the plugin is not already handling
received signals, steps 4+ changes:
4. (When) the CLI receives a termination signal, before it can use the
   socket to notify the plugin that it should exit, the plugin process
   also receives a signal due to sharing the pgid with the CLI

Since we now have a proper "job control" mechanism, we can simplify the
scenarios by executing the plugins with their own process group id,
thereby removing the "double notification" issue and making it so that
plugins can handle the same whether attached to a TTY or not.

In order to make this change "plugin-binary" backwards-compatible, in
the case that a plugin does not connect to the socket, the CLI passes
the signal to the plugin process.

Co-authored-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Bjorn Neergaard <bjorn.neergaard@docker.com>
@neersighted neersighted force-pushed the signal-handling-fix-tty branch from c719dfb to ef5e5fa Compare January 12, 2024 20:53
@neersighted neersighted self-assigned this Jan 12, 2024
@neersighted
Copy link
Member

I've carried this with the requested comments and commit message change.

@neersighted neersighted requested a review from thaJeztah January 12, 2024 20:59
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah thaJeztah merged commit 688de6d into docker:master Jan 12, 2024
@neersighted neersighted deleted the signal-handling-fix-tty branch January 12, 2024 21:06
@thaJeztah
Copy link
Member

Looks like this introduced a regression with buildx running with BUILDX_EXPERIMENTAL enabled;

laurazard added a commit to laurazard/buildx that referenced this pull request Jan 17, 2024
See docker/cli#4599 and
docker/cli#4769.

Since we switched to using the cobra command context instead of
`appcontext`, we need to set up the signal handling that was being
provided by `appcontext`, as well as configuring the context with
the OTEL tracing utilities also used by `appcontext`.

This commit introduces `cobrautil.ConfigureContext` which implements
the pre-existing signal handling logic from `appcontext` and cancels
the command's context when a signal is received, as well as doing
the relevant OTEL config.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
laurazard added a commit to laurazard/buildx that referenced this pull request Jan 17, 2024
See docker/cli#4599 and
docker/cli#4769.

Since we switched to using the cobra command context instead of
`appcontext`, we need to set up the signal handling that was being
provided by `appcontext`, as well as configuring the context with
the OTEL tracing utilities also used by `appcontext`.

This commit introduces `cobrautil.ConfigureContext` which implements
the pre-existing signal handling logic from `appcontext` and cancels
the command's context when a signal is received, as well as doing
the relevant OTEL config.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
tonistiigi pushed a commit to laurazard/buildx that referenced this pull request Jan 20, 2024
See docker/cli#4599 and
docker/cli#4769.

Since we switched to using the cobra command context instead of
`appcontext`, we need to set up the signal handling that was being
provided by `appcontext`, as well as configuring the context with
the OTEL tracing utilities also used by `appcontext`.

This commit introduces `cobrautil.ConfigureContext` which implements
the pre-existing signal handling logic from `appcontext` and cancels
the command's context when a signal is received, as well as doing
the relevant OTEL config.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
@thaJeztah thaJeztah added this to the 25.0.0 milestone Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/plugins kind/refactor PR's that refactor, or clean-up code status/2-code-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants