Skip to content

Conversation

jsternberg
Copy link
Collaborator

Invoking the shell will cause it to persist across the entire build and
to re-execute whenever the builder pauses at another location again.

This still requires using exec to launch the shell. Launching by frame
id is also removed since it no longer applies to this version.

Copy link
Collaborator

@ktock ktock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@jsternberg jsternberg force-pushed the dap-persistent-exec branch 5 times, most recently from 20a1155 to c65f08b Compare August 4, 2025 14:46
@crazy-max
Copy link
Member

crazy-max commented Aug 4, 2025

Looks good on Linux and macOS but I don't have the prompt on Windows:

2025-07-31T14:31:44.161Z [debug] {"message":"DAP message will be received from the editor","dapMessage":{"command":"evaluate","arguments":{"expression":"exec","frameId":3,"context":"repl"},"type":"request","seq":11},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.181Z [debug] {"message":"DAP message has been sent to the editor","dapMessage":{"seq":34,"type":"request","command":"runInTerminal","arguments":{"kind":"integrated","cwd":"","args":["docker","buildx","dap","attach","C:\\Users\\crazy\\AppData\\Local\\Temp\\buildx-dap-exec4118734859\\s.sock"],"env":{"BUILDX_EXPERIMENTAL":"1"}}},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.191Z [debug] {"message":"DAP message will be received from the editor","dapMessage":{"type":"response","seq":12,"command":"runInTerminal","request_seq":34,"success":true,"body":{"shellProcessId":30624}},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.192Z [debug] {"message":"DAP message has been sent to the editor","dapMessage":{"seq":35,"type":"response","request_seq":11,"success":true,"command":"evaluate","body":{"result":"Started process attached to C:\\Users\\crazy\\AppData\\Local\\Temp\\buildx-dap-exec4118734859\\s.sock.","variablesReference":0}},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.192Z [debug] {"message":"DAP message will be received from the editor","dapMessage":{"command":"scopes","arguments":{"frameId":3},"type":"request","seq":13},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.192Z [debug] {"message":"DAP message has been sent to the editor","dapMessage":{"seq":36,"type":"response","request_seq":13,"success":true,"command":"scopes","body":{"scopes":[{"name":"Arguments","presentationHint":"arguments","variablesReference":16777217,"expensive":false}]}},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.193Z [debug] {"message":"DAP message will be received from the editor","dapMessage":{"command":"variables","arguments":{"variablesReference":16777217},"type":"request","seq":14},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
2025-07-31T14:31:44.193Z [debug] {"message":"DAP message has been sent to the editor","dapMessage":{"seq":37,"type":"response","request_seq":14,"success":true,"command":"variables","body":{"variables":[{"name":"platform","value":"linux/amd64","variablesReference":16777218},{"name":"exec","value":"/bin/sh -c echo foo","variablesReference":16777219}]}},"id":"3d979d6e-9864-444a-b86b-5121aeeb166e","name":"Docker: Build"}
capture-2025-07-31.163005.mp4

Works fine in Buildx 0.26.1:

image

@jsternberg jsternberg force-pushed the dap-persistent-exec branch 2 times, most recently from 3a66bc4 to 869fc6b Compare August 5, 2025 17:59
@jsternberg
Copy link
Collaborator Author

There's a bit of a weird thing that happens with this PR that didn't for the original version and I have an idea to fix it, but it might be difficult.

At the current moment, it's not possible to cancel the exec without terminating the build. And if you exit the container process from within the terminal, it only closes the container and doesn't close the attachment which will just hang until a new process comes up.

I think it might be good to differentiate these. If the container is the one who exits before the program is resumed, I think we should terminate the connection to the client so the attach command exits. If the program is resumed and the reason for the terminal dying is because the container was canceled, we should restore it on the next breakpoint. Basically, try to differentiate from a user initiated exit and a debugger initiated exit.

Might be difficult to do but I'll give it a try. I think it'll likely make a better user experience.

@jsternberg jsternberg force-pushed the dap-persistent-exec branch from 869fc6b to 885ab21 Compare August 7, 2025 16:51
@jsternberg
Copy link
Collaborator Author

I've made some changes to this. If the exec exits with a zero status, the exec is considered "complete" and it will release control of the attached shell. If the exec exits with a non-zero status or exits because it was canceled (such as resuming the build), it will restore itself after the next pause.

I've also changed it so the exec can be done again. If you try and exec while a shell is still attached, it will give the same error. If the previous shell exited and you do exec again, it will re-use the socket and connect again.

I also moved some of the logging messages to somewhere more appropriate. I added a "reason" to the message when it doesn't invoke a build container for a paused step. This should be an improved workflow.

@jsternberg jsternberg requested a review from rcjsuen August 7, 2025 16:54
@jsternberg jsternberg force-pushed the dap-persistent-exec branch from 885ab21 to eaf083d Compare August 7, 2025 17:23
@jsternberg
Copy link
Collaborator Author

Testing this out more and it might not work properly when execing on an error state. Moving this to draft for now.

@jsternberg jsternberg marked this pull request as draft August 7, 2025 17:49
@jsternberg jsternberg marked this pull request as ready for review August 7, 2025 19:47
@jsternberg
Copy link
Collaborator Author

This is ready for review. The pausing on exception was broken on something unrelated to this so I'll fix that separately.

@crazy-max
Copy link
Member

When exiting the shell and doing another exec:

image

I can't get the prompt:

image

@jsternberg jsternberg force-pushed the dap-persistent-exec branch from eaf083d to f210e12 Compare August 8, 2025 13:11
@jsternberg
Copy link
Collaborator Author

Fixed. This is because a paused a build step was only attempting to attach to the shell once. So when you exit and try to re-enter it didn't get the shell again.

I've now modified it so it attempts to reattach on a clean exit and it won't reattach on an exit with an error which includes the context being canceled.

Comment on lines +148 to +150
if len(cfg.Entrypoint) == 0 && len(cfg.Cmd) == 0 {
cfg.Entrypoint = []string{"/bin/sh"} // launch shell by default
cfg.Cmd = []string{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If /bin/sh is not there (such as scratch), I think we should not attach:

image

If we exec it returns instantly atm:

image

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would presumably happen for the WCOW case also I suppose.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's supposed to wait. I must have broken that with the last change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crazy-max this is the output I get when trying to exec on a scratch container.

Attached to build process.
Build container is not executable. (reason: stat error: lstat bin/sh: no such file or directory)

This should prevent trying to run a container when the shell doesn't exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build container is not executable. (reason: stat error: lstat bin/sh: no such file or directory)

@jsternberg Hum I don't have this message when testing on my side:

capture-2025-08-11.191637.mp4

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now be fixed. There was an error in StatFile where it would attempt to use the ref from the scratch volume and it resulted in a nil reference. Now it checks for that so the canInvoke method works properly without crashing.

@crazy-max crazy-max added this to the v0.27.0 milestone Aug 8, 2025
Invoking the shell will cause it to persist across the entire build and
to re-execute whenever the builder pauses at another location again.

This still requires using `exec` to launch the shell. Launching by frame
id is also removed since it no longer applies to this version.

Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
@tonistiigi tonistiigi merged commit 4f9f47d into docker:master Aug 13, 2025
138 checks passed
@jsternberg jsternberg deleted the dap-persistent-exec branch August 13, 2025 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants