-
Notifications
You must be signed in to change notification settings - Fork 2.1k
core/sched_switch: fix crash with no active thread #20878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The function `sched_switch()` was implemented with the assumption that there will always be an active thread. This was true until threadless idle was implemented for Cortex M MCUs: If no thread is runnable and the running thread exists, there will be no active thread. If from ISR a thread is then unblocked, `sched_switch()` will be called without an active thread. This handles the corner case explicitly when the module `core_idle_thread` is not used (in other words: for threadless idle). Fixes RIOT-OS#20812
/* If a thread exists and no other thread is runnable, we may end up in | ||
* a situation with no active thread. This can not occur if there is an | ||
* idle thread, which is always runnable, though */ | ||
if (!IS_USED(MODULE_CORE_IDLE_THREAD) && unlikely(active_thread == NULL)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is correct, bit I'm not sure this is holding it right. thread_yield_higher()
can always be used and is actually the preferred choice. sched_switch()
was an optimization path for cases like unlocking a mutex or waking up another thread, from some thread, to only invoke the full scheduler if needed (as the target thread / priority is basically known).
If this is actually called from ISRs, that is the only way there is no active_thread
. It might make more sense to encode this properly. As in, if there is no active thread, the irq_is_in()
below becomes redundant, that is implicit. And in that case, sched_context_switch_request = 1
might be the correct thing to do.
Or, if irq_is_in()
, always trigger scheduler, otherwise, assume not in ISR and that there is an active_thread
.
(In theory, this function can be used to speed up task switching by quite a bit for some cases, b/c we could directly switch to the receiving thread, skipping register storing for the calling thread. we never implemented that, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sched_switch()
was an optimization path for cases like unlocking a mutex or waking up another thread, from some thread, to only invoke the full scheduler if needed (as the target thread / priority is basically known).
This makes it sound like an ugly hack / premature optimization. In normal workloads contention over a mutex is pretty unlikely and the hot path is the case a mutex_unlock()
will not find any waiters. And for real time behavior: Adding a sched_switch()
prior to the call to thread_yield_higher()
increases the worst case latency at the hope to occasionally omit the call to thread_yield_higher()
.
I wonder if just getting rid of sched_switch()
is the better approach here. See #20890 for removing it for mutex_unlock()
. There are a few other calls to sched_switch()
that I think might happen from IRQ context to also address to actually fix the issue.
I suppose we can close this now? |
If I recall correctly, the same issue could be triggered when calling So, not yet. Maybe calling |
Contribution description
The function
sched_switch()
was implemented with the assumption that there will always be an active thread. This was true until threadless idle was implemented for Cortex M MCUs: If no thread is runnable and the running thread exists, there will be no active thread. If from ISR a thread is then unblocked,sched_switch()
will be called without an active thread.This handles the corner case explicitly when the module
core_idle_thread
is not used (in other words: for threadless idle).Implementation choice
I first wanted to avoid adding the check for this very unlikely corner case to this hot path, and rather have in
cpu_switch_context_exit()
of the only offender to make sure to set the active thread to something reasonable. However, switching to a non-runnable thread without running it is something that I have no elegant, simple and maintainable solution for. So I decided to rather add a simple and maintainable check into a hot path.Testing procedure
See #20812
Issues/PRs references
Fixes #20812