@int128 int128 commented Jan 24, 2025

Problem to solve

Currently, when the pod is terminated unexpectedly, the job hangs for about 20 minutes. It eventually fails with a job timeout or a "lost communication" error.

This changes the RUNNER_MANUALLY_TRAP_SIG environment variable to the default value of the upstream image. With this change, the job is canceled shortly after the pod starts terminating.

Issue

💡 If you need a new version, create a new release after merge.

@int128 int128 changed the title from "Cancel the current job when the pod is deleting" to "Cancel current job shortly when pod is terminating" Feb 5, 2025
@int128 int128 changed the title from "Cancel current job shortly when pod is terminating" to "Remove RUNNER_MANUALLY_TRAP_SIG" Feb 5, 2025
@int128 int128 changed the title from "Remove RUNNER_MANUALLY_TRAP_SIG" to "Use default RUNNER_MANUALLY_TRAP_SIG" Feb 5, 2025
@@ -42,14 +42,11 @@ COPY entrypoint.sh /

VOLUME /var/lib/docker

# docker-init sends the signal to children
ENV RUNNER_MANUALLY_TRAP_SIG=

int128 (Member, Author) commented on the line above:

In the upstream image, RUNNER_MANUALLY_TRAP_SIG is set to 1 by default:
https://github.com/actions/runner/blob/3486c54ccbb8181b3bb46e1a3f22c85f79aa5414/images/Dockerfile#L38

When RUNNER_MANUALLY_TRAP_SIG is set, run.sh traps the incoming signal and waits for the listener to exit:
https://github.com/actions/runner/blob/3486c54ccbb8181b3bb46e1a3f22c85f79aa5414/src/Misc/layoutroot/run.sh#L29
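
The trap-and-wait behavior described above can be sketched roughly as follows. This is a simplified illustration, not the upstream run.sh. The names `fake_listener` and `run_with_manual_trap` are hypothetical, and using SIGTERM as the forwarded signal is an assumption made to keep the demo simple. It only shows the general pattern: the wrapper traps the signal, forwards it to the listener, and keeps waiting until the listener has finished cancelling the job.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT the upstream run.sh.

# Hypothetical stand-in for the runner listener: on SIGTERM it takes a
# moment to cancel the current job, then exits.
fake_listener() {
  trap 'echo "listener: cancelling job"; sleep 1; echo "listener: done"; exit 0' TERM
  while :; do sleep 1; done
}

# Manual-trap mode: the wrapper traps the signal itself, forwards it to the
# listener, and keeps waiting until the listener has exited.
run_with_manual_trap() {
  fake_listener &
  listener_pid=$!
  trap 'echo "wrapper: forwarding signal"; kill -TERM "$listener_pid"' TERM
  # `wait` returns early when a trapped signal arrives, so keep waiting
  # until the listener process is really gone.
  while kill -0 "$listener_pid" 2>/dev/null; do
    wait "$listener_pid"
  done
  echo "wrapper: listener exited"
}

# Demo: start the wrapper in the background, then send it SIGTERM, roughly
# as would happen to the container's main process when a pod terminates.
log=$(mktemp)
( run_with_manual_trap ) >"$log" 2>&1 &
wrapper_pid=$!
sleep 1
kill -TERM "$wrapper_pid"
wait "$wrapper_pid"
cat "$log"
```

In this sketch the wrapper's final message is printed only after the listener reports that it is done, which is the ordering the comment above relies on: the signal reaches the listener, and the wrapper does not exit before the listener does.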

@int128 int128 marked this pull request as ready for review February 12, 2025 00:16
@int128 int128 requested a review from a team as a code owner February 12, 2025 00:16
@int128 int128 merged commit 6158596 into main Feb 12, 2025
11 checks passed
@int128 int128 deleted the int128/Cancel-the-current-job-when-the-pod-is-deleting branch February 12, 2025 00:17