-
Notifications
You must be signed in to change notification settings - Fork 2.1k
xtimer: Fix race condition in xtimer_msg_receive_timeout #16374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This issue seems to be related to #13504 . |
I can confirm that with this #13345 (comment) no longer crashes. |
Description also makes total sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
@MichelRottleuthner do you want to give this a second look?
(ztimer also removes in any case) |
I now re-read the discussion in #13345 and I'm not sure if there's still a race when the timeout triggers between So I'd say let's go with this fix already (which was first discussed more than a year ago...). |
The unconditional remove makes sense to me.
Yes, AIUI the unpleasant queuing of the timer message could still happen. But I also agree with your reasoning that a copied message is preferable over a pointer to invalid stack memory...
Yep. |
Backport provided in #16376 |
Contribution description
This PR fixes a rare race-condition in
xtimer_msg_receive_timeout
which can lead to corruption of the timer list and subsequent hard faults.The race condition is triggered when:
msg_send
.xtimer
which sents the timeout message expires and executes beforextimer_remove(t)
is called. This will cause the message queue for the thread to contain first a real message and second the timeout message. The timer will not be queued anymore, but the timeout message will still be in the queue.xtimer_msg_receive_timeout
function is called again. This will queue a new xtimer while the timeout message of the previous timer is still in the buffer._msg_wait
will see the old timeout message, think that the current xtimer has already expired and will not remove the timer. Whenxtimer_msg_receive_timeout
returns, the timer will still be queued. However, as it is allocated on the stack it is no longer valid. This causes thetimer_list_head
to now point to invalid memory. Crashes ensue.Testing procedure
This bug was found, validated, and fixed using a proprietary application. I have not written a separate example application which exhibits the problem which I could publish.