Skip to content

Conversation

kaspar030
Copy link
Contributor

@kaspar030 kaspar030 commented Feb 28, 2020

Contribution description

Previously, xtimer_msg_receive_timeout() would always set a timer to send a timer message after timeout, then jump into msg_receive().
This PR returns early if there's already a message waiting.

Fixes #13345.

Testing procedure

Issues/PRs references

@kaspar030 kaspar030 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Area: timers Area: timer subsystems labels Feb 28, 2020
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that an early out makes sense but together with #13500 it seems like we have two racey proposals now 😆

@@ -163,6 +167,9 @@ int _xtimer_msg_receive_timeout64(msg_t *m, uint64_t timeout_ticks) {

int _xtimer_msg_receive_timeout(msg_t *msg, uint32_t timeout_ticks)
{
if (msg_try_receive(msg) == 1) {
return 1;
}
msg_t tmsg;
xtimer_t t;
_setup_timer_msg(&tmsg, &t);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't fix the problem. There is still a race if the timeout callback is triggered directly after msg_receive(); in _msg_wait but before xtimer_remove(t);. I'll update the code snippet in my other PR to show the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a race if the timeout callback is triggered directly after msg_receive(); in _msg_wait but before xtimer_remove(t);

Why?

This fix ensures that there are no queued messages (function doesn't pass the early out). Now, if any received message was not a timeout message, the xtimer is removed.

In case that removed xtimer has sent a message in between (after a "real" message, and with a queue present), that could now still be queued. But the early out prevents the function from misinterpreting that message as its own (which lead to the initial issue).

I'll update the code snippet in my other PR to show the problem.

Which one? I'm getting confused. :)

Copy link
Contributor

@MichelRottleuthner MichelRottleuthner Feb 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, if any received message was not a timeout message, the xtimer is removed.

Yes, but directly before the remove is executed, the timeout callback may still trigger.
So when receiving a proper msg after msg_receive(m);, but before the xtimer could be removed.
Basically directly before this line.

Which one? I'm getting confused. :)

#13500 (the collapsed code snippet)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but directly before the remove is executed, the timeout callback may still trigger.
So when receiving a proper msg after msg_receive(m);, but before the xtimer could be removed.

If another "proper" message is received, no harm done. It'll queue (or let the sender wait).
If xtimer triggers before it's "remove", the remove will run empty, but that is ok.

Worst case is that the message from that timer could get queued.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not about another proper message but indeed the timeout message.
One could argue that it is okay to enqueue the timeout message but I think this is nowhere in the documentation and IMO not to be preferred.

Copy link
Contributor

@MichelRottleuthner MichelRottleuthner Feb 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wouldn't it be much more consequent to just always return a timeout message on timeout? With a timeout message waiting in the queue the timing behavior of following calls gets very "interesting" ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a timeout message waiting in the queue the timing behavior of following calls gets very "interesting" ;)

It would receive the timeout message, but not return the timeout return value.
Hopefully, all message handlers are written like:

switch(msg->type):
case EXPECTED:
//...
default:
  DEBUG("unexpected message type %i\n", msg->type);

But wouldn't it be much more consequent to just always return a timeout message on timeout?

You mean when a "proper" message has been received bit also the timeout triggered?
This is difficult to model without a messge queue, as the "proper" message needs to go somewhere. Also, technically, the message was received before the timeout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would receive the timeout message, but not return the timeout return value.

No. First, it would return 1, indicating a message was received. In the same step the timeout message is added to the queue. On the next call it will immediately return the timeout message without waiting for a timeout.

You mean when a "proper" message has been received bit also the timeout triggered?

I meant that if we use the returned message to signal a timeout, this should be always used for indicating a timeout and not either the return value or the timeout message indicates a timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. First, it would return 1, indicating a message was received. In the same step the timeout message is added to the queue. On the next call it will immediately return the timeout message without waiting for a timeout.

Yes, but in the second call, it would return "1", indicating that a message was received, vs. returning "-1". The received message would be the timeout message, though.

I meant that if we use the returned message to signal a timeout,

Ah, I didn't get that if. Why change the API? ALso, doesn't solve the problem, as then the former queued message would be returned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The received message would be the timeout message, though.

Yes, but it will return the timeout message immediately, that s the problem. And also it will return the timeout message for another call of msg_recv_timeout() i.e. when another fresh timeout should actually start.

Why change the API?

I don't want to^^ but using the message as timeout indication would be just that...

ALso, doesn't solve the problem, as then the former queued message would be returned.

So you agree it doesn't work as is, right?

@kaspar030 kaspar030 force-pushed the xtimer_msg_rx_timeout_rx_first branch from 037b269 to 021e516 Compare April 5, 2020 19:22
@kaspar030 kaspar030 added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Apr 5, 2020
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 21, 2021
@MrKevinWeiss
Copy link
Contributor

ping for status update.

@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@kaspar030
Copy link
Contributor Author

closing as xtimer is deprecated

@kaspar030 kaspar030 closed this May 31, 2022
@kaspar030 kaspar030 deleted the xtimer_msg_rx_timeout_rx_first branch May 31, 2022 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: timers Area: timer subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sys/xtimer: segmentation fault: in function xtimer_msg_received_timeout
3 participants