xtimer: Fix race condition in xtimer_msg_receive_timeout [backport 2021.04] #16376
Backport of #16374
Contribution description
This PR fixes a rare race condition in xtimer_msg_receive_timeout which can lead to corruption of the timer list and subsequent hard faults.
The race condition is triggered when:
1. A message is sent to the thread with msg_send.
2. The xtimer which sends the timeout message expires and executes before xtimer_remove(t) is called. This causes the message queue for the thread to contain first a real message and second the timeout message. The timer is no longer queued, but the timeout message is still in the queue.
3. The xtimer_msg_receive_timeout function is called again. This queues a new xtimer while the timeout message of the previous timer is still in the buffer.
4. _msg_wait sees the old timeout message, concludes that the current xtimer has already expired, and does not remove the timer. When xtimer_msg_receive_timeout returns, the timer is still queued. However, as it is allocated on the stack, it is no longer valid. This causes timer_list_head to point to invalid memory. Crashes ensue.
Testing procedure
This bug was found, validated, and fixed using a proprietary application. I have not written a separate example application that exhibits the problem and that I could publish.
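As a rough illustration only, the sequence from the contribution description can be replayed in a small host-side C model. This is not RIOT code: queue_model_t, timer_model_t, replay_race, and the always_remove switch are hypothetical names invented for this sketch, and the "always remove the timer" branch is an assumed mitigation shown for contrast, not necessarily the change made in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message types for the model (not RIOT constants). */
enum { MSG_REAL = 1, MSG_TIMEOUT = 2 };

#define QUEUE_LEN 4

/* Minimal FIFO standing in for a thread's message queue. */
typedef struct {
    int slots[QUEUE_LEN];
    size_t count;
} queue_model_t;

/* Stands in for a stack-allocated timer; only tracks list membership. */
typedef struct {
    bool queued;
} timer_model_t;

static void queue_put(queue_model_t *q, int type)
{
    assert(q->count < QUEUE_LEN);
    q->slots[q->count++] = type;
}

static int queue_get(queue_model_t *q)
{
    assert(q->count > 0);
    int type = q->slots[0];
    for (size_t i = 1; i < q->count; i++) {
        q->slots[i - 1] = q->slots[i];
    }
    q->count--;
    return type;
}

/*
 * Replays the four steps. Returns true if the second call's timer is
 * still queued when the call returns, i.e. the dangling-timer state.
 * always_remove selects the assumed mitigation: remove the timer
 * regardless of which message was received.
 */
static bool replay_race(bool always_remove)
{
    queue_model_t q = { .count = 0 };

    /* Step 1: a real message is sent to the thread. */
    queue_put(&q, MSG_REAL);

    /* Step 2: the first call's timer fires before it is removed,
     * so a timeout message lands behind the real message. */
    queue_put(&q, MSG_TIMEOUT);
    (void)queue_get(&q); /* the first call returns the real message */

    /* Step 3: the function is called again and arms a new timer. */
    timer_model_t t = { .queued = true };

    /* Step 4: the wait sees the stale timeout message first. */
    int type = queue_get(&q);
    if (always_remove || type != MSG_TIMEOUT) {
        t.queued = false; /* stands in for removing the timer */
    }

    return t.queued;
}
```

In this model, replay_race(false) returns true (the timer is left queued after the call returns), while replay_race(true) returns false. The model treats the stale timeout message as indistinguishable from a current one, which matches the description above but simplifies how the real code identifies its own timeout message.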