Skip to content

Conversation

JulianHolzwarth
Copy link
Contributor

@JulianHolzwarth JulianHolzwarth commented Aug 9, 2019

HAS "REMOVE ME" COMMIT TO SHOW BUG
This includes pr: #10872
Tracking: pr #11660

Contribution description

Fixes BUG.

The function _mutex_timeout removes a thread from the waiting list of a mutex and unblocks the thread (sets status to STATUS_PENDING) when the thread is waiting for the mutex. This is used for xtimer_mutex_lock_timeout a mutex lock function with timeout. This pr fixes a bug that happens because the function _mutex_timeout never checks if the mutex has a waiting list. This pr adds this check.

When the Timer triggers after the mutex is acquired but before the timer is removed.. You can see the bug with the "REMOVE ME" commit (asserts when the mutex is acquired and the timer triggers after).
The Function (_mutex_timeout) does nothing when the thread got the mutex (and no thread is waiting for it mutex->queue.next == MUTEX_LOCKED ) and when the mutex is free (mutex->queue.next == NULL).

list_remove should not be called with a mutex where mutex->queue.next == MUTEX_LOCKED.

Testing procedure

BOARD=native make -C tests/xtimer_mutex_lock_timeout/ flash test
output:

...
> mutex_timeout_long_unlocked
mutex_timeout_long_unlocked
starting test: xtimer mutex lock timeout
OK

> mutex_timeout_long_locked
mutex_timeout_long_locked
starting test: xtimer mutex lock timeout
OK

> mutex_timeout_long_locked_low
mutex_timeout_long_locked_low
starting test: xtimer mutex lock timeout with thread
threads = 2
THREAD low prio: start
MAIN THREAD: calling xtimer_mutex_lock_timeout
OK
threads = 3
MAIN THREAD: waiting for created thread to end
THREAD low prio: exiting low
threads = 2


Revert the Revert "REMOVE ME" commit to see it work after fix.
output:

...
> mutex_timeout_long_unlocked
mutex_timeout_long_unlocked
starting test: xtimer mutex lock timeout
OK

> mutex_timeout_long_locked
mutex_timeout_long_locked
starting test: xtimer mutex lock timeout
OK

> mutex_timeout_long_locked_low
mutex_timeout_long_locked_low
starting test: xtimer mutex lock timeout with thread
threads = 2
THREAD low prio: start
MAIN THREAD: calling xtimer_mutex_lock_timeout
OK
threads = 3
MAIN THREAD: waiting for created thread to end
THREAD low prio: exiting low
threads = 2

Revert to REMOVE ME commit and run the test to see the bug.
output:

...
> mutex_timeout_long_unlocked
mutex_timeout_long_unlocked
starting test: xtimer mutex lock timeout
sys/xtimer/xtimer.c:252 => 0x565b8323
*** RIOT kernel panic:
FAILED ASSERTION.

*** halted.

Issues/PRs references

Depends on PR #11807
Tracking PR #11660
Includes PR #10872
my old pr fixing same bug: #11543

@JulianHolzwarth JulianHolzwarth marked this pull request as ready for review August 9, 2019 17:35
@JulianHolzwarth JulianHolzwarth force-pushed the pr/xtimer_mutex_lock_timeout/first_fix branch from de321cb to e6e0346 Compare August 9, 2019 17:37
@JulianHolzwarth JulianHolzwarth changed the title Pr/xtimer mutex lock timeout/first fix xtimer/xtimer.c: xtimer_mutex_lock_timeout fix Aug 9, 2019
@JulianHolzwarth JulianHolzwarth changed the title xtimer/xtimer.c: xtimer_mutex_lock_timeout fix sys/xtimer/xtimer.c: _mutex_timeout() bug fix Aug 9, 2019
@miri64 miri64 added Area: timers Area: timer subsystems Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) labels Sep 10, 2019
@JulianHolzwarth
Copy link
Contributor Author

rebase because of #12008 got merged

@JulianHolzwarth JulianHolzwarth force-pushed the pr/xtimer_mutex_lock_timeout/first_fix branch from e6e0346 to 0a3827d Compare October 11, 2019 16:33
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally found the time to test this one. I could reproduce the bug and verify that the fix is working. Nice one.
After struggling a bit to understand how the mutex/queue handling works here and what is implied by that, I wonder If we should just add some comments explaining what happens there to make it easier for the next one that looks at this code.

@MichelRottleuthner MichelRottleuthner added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines labels Oct 29, 2019
@JulianHolzwarth
Copy link
Contributor Author

@MichelRottleuthner I will create a function for removing a thread from a mutex list ( this is what the _mutex_timeout() function is doing now) and I will add more comments in there.
Can I rebase and remove the "remove me" commits?

@MichelRottleuthner
Copy link
Contributor

Can I rebase and remove the "remove me" commits?

Sure go ahead.

Pieter Willemsen and others added 2 commits November 27, 2019 14:48
Prevent a possible race condition when _mutex_timeout fires just after the
mutex was locked but before the xtimer was removed

The flow

int xtimer_mutex_lock_timeout(mutex_t *mutex, uint64_t timeout) {
   ...
    mutex_lock(mutex);
    /* mutex locked */
    /* _mutex_timeout fires and tries to remove thread from mutex queue */
    /* DEBUG: simulate callback call between lock and remove */
    xtimer_spin(xtimer_ticks_from_usec(timeout*2));
    xtimer_remove(&t);
    ...
}
no longer into the "if" when the mutex is free
@JulianHolzwarth JulianHolzwarth force-pushed the pr/xtimer_mutex_lock_timeout/first_fix branch from 0a3827d to 7cd3e8b Compare November 27, 2019 13:50
@JulianHolzwarth
Copy link
Contributor Author

@MichelRottleuthner I rebased. And removed REMOVE_ME commits

@MichelRottleuthner MichelRottleuthner added the Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines label Dec 4, 2019
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR was tested and changes are valid -> ACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: timers Area: timer subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants