Skip to content

sys/newlib_syscalls_default: fix race condition in __sinit() #20392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

maribu
Copy link
Member

@maribu maribu commented Feb 16, 2024

Contribution description

This eagerly calls __sinit() instead of lazy initialization upon the first call to stdio (e.g. puts(), printf()). The issue is that without locking (as is currently the case for all RIOT platforms but ESP) two concurrent "first calls" may result in concurrent initialization of the same structure and data corruption.

Testing procedure

Have two threads concurrently use stdio and the boot message disabled. This may result in data corruptions and crashes.

See #20067 for details on a reproducible crashing app.

Issues/PRs references

Fixes #20067

@github-actions github-actions bot added the Area: sys Area: System label Feb 16, 2024
@maribu maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR CI: full build disable CI build filter labels Feb 16, 2024
@maribu
Copy link
Member Author

maribu commented Feb 16, 2024

@mchesser: Could you give this a try?

I'm aware that it still is a pain in the ass that RIOT doesn't support (at least optional) reentrancy for platforms other than ESP. That is in issue that stills needs to be addressed.

However, even people desperate for saving RAM don't like crashes, and this hopefully makes the concurrent use of stdio crash-free.

@riot-ci
Copy link

riot-ci commented Feb 16, 2024

Murdock results

✔️ PASSED

375aed1 sys/newlib_syscalls_default: fix race condition in __sinit()

Success Failures Total Runtime
142848 0 142848 02h:14m:59s

Artifacts

This eagerly calls `__sinit()` instead of lazy initialization upon the
first call to stdio (e.g. `puts()`, `printf()`). The issue is that
without locking (as is currently the case for all RIOT platforms but
ESP) two concurrent "first calls" may result in concurrent
initialization of the same structure and data corruption.

Fixes RIOT-OS#20067
@maribu maribu force-pushed the sys/newlib_syscalls_default branch from ea98597 to 375aed1 Compare February 16, 2024 06:21
/* Also, make an exception for riotboot, which does not use stdio
* at all. This would pull in stdio and increase .text size
* significantly there */
if (!IS_USED(MODULE_RIOTBOOT)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that riotboot would already do

DISBALE_MODULE += core_thread

but apparently not. That would save us some special casing in other places too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you open a PR to just do that? I could wait and rebase on top of that.

Copy link
Contributor

@benpicco benpicco Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

riotboot_dfu uses a thread (and thread flags), so might not be so easy.
But how about

Suggested change
if (!IS_USED(MODULE_RIOTBOOT)) {
if (!IS_USED(MODULE_STDIO_NULL)) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly that would not work: stdio_null just throws away the output after it has gone the whole way through newlib. So there is no way for newlib to know whether the output is actually printed or just thrown away.

It would be possible to implement stdio_null to wrap all calls to printf() and friends to a mockup implementation. (And if we assume that callers of printf() don't care about the return value, those could be lean.) But as of know, we still need to init that with stdio_null.

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like hard-coding riotboot, but let's get this in since we don't have a better solution.

@benpicco benpicco added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Feb 19, 2024
@maribu maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Feb 19, 2024
@maribu
Copy link
Member Author

maribu commented Feb 20, 2024

OK, full CI run with tests passed. I guess this is OK to merge.

Note: The CI run failed twice before due to:

  1. No USB connection to a samr21-xpro. (I'd say that is safe to ignore, either the USB cable was disconnected, flaky, or the issue is with the eDBG and not with the firmware on the MCU.)
  2. A test on native failed because some wake up was too slow. This could be due to load, or due to the general flakiness of native.

@benpicco benpicco added this pull request to the merge queue Feb 20, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 20, 2024
@maribu maribu added this pull request to the merge queue Feb 20, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 20, 2024
@benpicco benpicco added this pull request to the merge queue Feb 20, 2024
Merged via the queue into RIOT-OS:master with commit 9504e07 Feb 20, 2024
@maribu maribu deleted the sys/newlib_syscalls_default branch February 20, 2024 16:12
@maribu
Copy link
Member Author

maribu commented Feb 20, 2024

Thx :)

@MrKevinWeiss MrKevinWeiss added this to the Release 2024.04 milestone Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: sys Area: System CI: full build disable CI build filter CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CONFIG_SKIP_BOOT_MSG makes newlib stdio races more likely
4 participants