-
Notifications
You must be signed in to change notification settings - Fork 572
dap: make exec shell persistent across the build #3341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
20a1155
to
c65f08b
Compare
3a66bc4
to
869fc6b
Compare
There's a bit of a weird thing that happens with this PR that didn't for the original version and I have an idea to fix it, but it might be difficult. At the current moment, it's not possible to cancel the exec without terminating the build. And if you exit the container process from within the terminal, it only closes the container and doesn't close the attachment which will just hang until a new process comes up. I think it might be good to differentiate these. If the container is the one who exits before the program is resumed, I think we should terminate the connection to the client so the attach command exits. If the program is resumed and the reason for the terminal dying is because the container was canceled, we should restore it on the next breakpoint. Basically, try to differentiate from a user initiated exit and a debugger initiated exit. Might be difficult to do but I'll give it a try. I think it'll likely make a better user experience. |
869fc6b
to
885ab21
Compare
I've made some changes to this. If the exec exits with a zero status, the exec is considered "complete" and it will release control of the attached shell. If the exec exits with a non-zero status or exits because it was canceled (such as resuming the build), it will restore itself after the next pause. I've also changed it so the exec can be done again. If you try and exec while a shell is still attached, it will give the same error. If the previous shell exited and you do exec again, it will re-use the socket and connect again. I also moved some of the logging messages to somewhere more appropriate. I added a "reason" to the message when it doesn't invoke a build container for a paused step. This should be an improved workflow. |
885ab21
to
eaf083d
Compare
Testing this out more and it might not work properly when execing on an error state. Moving this to draft for now. |
This is ready for review. The pausing on exception was broken on something unrelated to this so I'll fix that separately. |
eaf083d
to
f210e12
Compare
Fixed. This is because a paused a build step was only attempting to attach to the shell once. So when you exit and try to re-enter it didn't get the shell again. I've now modified it so it attempts to reattach on a clean exit and it won't reattach on an exit with an error which includes the context being canceled. |
if len(cfg.Entrypoint) == 0 && len(cfg.Cmd) == 0 { | ||
cfg.Entrypoint = []string{"/bin/sh"} // launch shell by default | ||
cfg.Cmd = []string{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would presumably happen for the WCOW case also I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's supposed to wait. I must have broken that with the last change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crazy-max this is the output I get when trying to exec on a scratch container.
Attached to build process.
Build container is not executable. (reason: stat error: lstat bin/sh: no such file or directory)
This should prevent trying to run a container when the shell doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build container is not executable. (reason: stat error: lstat bin/sh: no such file or directory)
@jsternberg Hum I don't have this message when testing on my side:
capture-2025-08-11.191637.mp4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should now be fixed. There was an error in StatFile
where it would attempt to use the ref from the scratch volume and it resulted in a nil reference. Now it checks for that so the canInvoke
method works properly without crashing.
Invoking the shell will cause it to persist across the entire build and to re-execute whenever the builder pauses at another location again. This still requires using `exec` to launch the shell. Launching by frame id is also removed since it no longer applies to this version. Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
f210e12
to
dbda218
Compare
Invoking the shell will cause it to persist across the entire build and
to re-execute whenever the builder pauses at another location again.
This still requires using
exec
to launch the shell. Launching by frameid is also removed since it no longer applies to this version.