-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Suppress false positive warning about uninitialized entropy buffers #17627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I think in general, initializing everything unless it's deliberately not done for performance reasons, is a good thing. One company I've worked for even had a strict "initialize everything" policy, no exceptions. Fixing random buffers because of valgrind false positives has some very bad precedent though, please review carefully. |
@laanwj To be fair this is not a Valgrind false positive? :) Valgrind false positives are very rare in my experience. |
It's a false positive because the buffer is actually filled in by the syscall but
Oh you're right, wrong tool. |
@laanwj Yes, but the false positive is in MemorySanitizer -- not Valgrind :) |
Concept ACK. Still requires close up review but I think it's good to keep that in mind |
Agree that it needs careful review. I can replace the Also note that the code this touches is relatively new (added by @sipa in #15250 and #17270), and deserves extra scrutiny anyway now that we've dropped the OpenSSL training wheels. The actual initialization before this commit took place in |
Concept ACK
Agree. I think it is worth doing. Some background reading that might be of interest regarding the pros/cons of default initialization in general:
Some quotes:
Note that there is a |
@Sjors Which subset of these changes are required to make MemorySanitizer happy? Not all these are technically needed, right? (I'm not opposing fixing it generally, but asking to get the full picture :)) |
src/random.cpp
Outdated
@@ -451,7 +451,7 @@ static void SeedFast(CSHA512& hasher) noexcept | |||
|
|||
static void SeedSlow(CSHA512& hasher) noexcept | |||
{ | |||
unsigned char buffer[32]; | |||
unsigned char buffer[32] = {0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@practicalswift once I fixed this one, it ran into unrelated use-of-uninitialized-value
problems. That could mean this is the only problem, or the other ones just didn't get a chance to get called before the sanitizer bailed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. I feel unqualified to ACK here, but I read the changes and the first 2 links provided by @practicalswift, built ccfed8a without sanitizer (--enable-debug only) with gcc version 8.3.0 (Debian 8.3.0-6) on Debian 4.19.37-5+deb10u2 (2019-08-08), ran tests, and am running bitcoind without issues and seeing the Feeding n bytes of dynamic environment data into RNG
logs with typical values for n (~16700) on this installation.
Initializing this would conceal from valgrind if changes in the code cased the random variable to not get filled which would be a moderately serious problem: Zero is not an acceptable random value. :) Since msan is actually incorrect due to a known and documented limitation in msan, might it not be better to use a suppression? I'm not a fan of suppressing actual errors-- but this isn't an error, it's a tool limit, and suppressing-by-initializing will reduce the effectiveness other tools without the same limit. |
That's a good point. does valgrind support syscalls? |
Maybe I can add |
Clarification regarding my Concept ACK: a suppression would be totally fine too. As long as we make it easier for everyone to use MemorySanitizer I'm happy: we need more MSAN testing :) |
605991f
to
a69bb00
Compare
I added asserts to ensure these buffers are actually populated on first use. I also added @gmaxwell's comment about valgrind. I'm not very familiar with how valgrind works. If it substitutes calls to I could pick a magic value instead, initialize with that and use == in the assert. |
Yes, valgrind doesn't have this issue.
But what if the input is legitly 0? These are not random values being fed in, but just some OS data, which may or may not consist of zeroes. So NACK on the asserts. |
src/random.cpp
Outdated
@@ -389,6 +391,8 @@ class RNGState { | |||
++m_counter; | |||
// Finalize the hasher | |||
hasher.Finalize(buf); | |||
// Ensure Finalize() populated buf | |||
assert(buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: asserting the address of the buffer, on the stack, like you do here does nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future reference, something like this works (?), though it's hideous, probably needs a macro, which adds even more complexity:
unsigned char zeros[sizeof(buf)] = {0};
assert(!memcmp(fbuf, zeros, sizeof(buf)));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny who you run into while googling that: https://rusty.ozlabs.org/?p=560 :-)
a69bb00
to
4cab23d
Compare
I've thought about it a bit and I'm principally opposed to adding asserts here. Now you're changing run-time behavior of the application to work around a problem in an analysis tool. I'm sorry to say this but : please fix the tool instead |
Echoing the NACK on the asserts :) |
4cab23d
to
605991f
Compare
@Sjors Have you considered taking the suppressions route instead? Perhaps that is less intrusive. |
Reverted to the original approach, which just initializes to zero. I'm not sure if intentially leaving them uninitialized and having the warnings suppressed if the right approach either. |
605991f
to
1143353
Compare
Why? :) |
It seems like we're choosing between two tools here:
There's gotta be a solution that involves defined behavior. That would make the code easier to reason about, and prevents folks on different operating systems and compiler optimization flags not noticing bugs (as happened with #17568 and #17449). |
In that case, let's at least not penalize the tool that actually gets things right (valgrind). So I'd vote for adding a MSAN surpression and just be done with it. It can be removed when the tool is fixed. |
Memory sanitizer cannot see through syscall. This caused a false positive use-of-uninitialized-value error for SYS_getrandom. After discussion in bitcoin#17627 it was decided not to initialize these variables to 0, but instead to suppress the warning.
4777647
to
53460b4
Compare
Memory sanitizer cannot see through syscall. This caused a false positive use-of-uninitialized-value error for SYS_getrandom. After discussion in bitcoin#17627 it was decided not to initialize these variables to 0, but instead to suppress the warning.
I added the suppression, as narrowly as possible. I also added comments to the uninitialized variables, pointing back to this discussion so it doesn't get repeated. |
Memory sanitizer cannot see through syscall. This caused a false positive use-of-uninitialized-value error for SYS_getrandom. After discussion in bitcoin#17627 it was decided not to initialize these variables to 0, but instead to suppress the warning.
53460b4
to
527f3ec
Compare
Again, this is a bug in the tool. These variables are not uninitialized when read. Initializing them with the wrong value will cause bugs in the future because tools that properly detect uninitialized reads will no longer yell at you. This point is repeated every couple of months. Might be a good candidate to add to the dev notes. |
# if __has_feature(memory_sanitizer) | ||
__attribute__((no_sanitize("memory"))) | ||
# endif | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a clang bug, right? Should link to the commit that fixed it. And since you are compiling clang from source you might as well use that commit and then remove this code blob here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memory sanitizer appears to be a Google project, and they don't expect syscall
support to be fixed anytime soon. Only a more narrow case of getrandom
was fixed, which is of no use to us: google/sanitizers#852 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fine. I'd still prefer a suppression in a suppressions file if possible. Modifying the code like this makes it harder to read and potentially impossible to compile on some environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@practicalswift any idea how to achieve such a suppressions file? I could also put it in a macro, or we can do that if we ever need it again.
The outer #if defined(__has_feature)
is there to make sure other environments don't trip over it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sjors Sure! :) You need to use the sanitizer special case list functionality by passing -fsanitize-blacklist=test/sanitizer_suppressions/msan
as part of CFLAGS
/CXXFLAGS
. You need to create the file test/sanitizer_suppressions/msan
following the file format specified here: https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format
Unfortunately MemorySanitizer only supports suppression lists supplied at compile-time. Some of the other sanitizers allow suppression lists supplied at run-time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to punt on this until we need it in more places...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to punt on this until we need it in more places...
So this is the only place where MSAN special treatment is needed? That's pretty good, I had assumed it would be the first of many.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know yet. The binary exits once it finds a single problem. The next problem in line is in a different file, but who know what happens after that.
There's only one place in our codebase where we use syscall
, so that's a reason for optimism.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
@Sjors Thanks for using a suppression instead. Is this the only suppression needed to make MSAN happy in your setup? |
@practicalswift no, it fails on other things, but those appear unrelated. I'll update #17620 later. |
It's not clear to me that everyone understands what's going on here. And maybe I caused some confusion by say that zero is not an acceptable random value. No constant is an acceptable random value. The variable is not undefined. It gets defined by the SYS_getrandom. Memcheck has a flaw/shortcoming where it doesn't know or assume that sys_getrandom initializes it so it falsely reports it as uninitialized. If it were actually undefined then that would mean that the getrandom call wasn't doing anything which would be a really serious problem because it would mean that the software getting random numbers it was expecting. Ignoring the other belt-and-suspenders randomness inputs, a failure to read random data into that variable would be a total break resulting in predictable keys. With the code as it currently is, should some bug be added so that the variable is not initialized at least valgrind will start yelling. If the code is changed to initialize the value with anything except a truly random value (such a zero or some random hash) then if a flaw that skips the randomness call is introduced the software will be just as vulnerable but in that case the flaw will be undetectable by valgrind. Similarly, compilers can often warn when they're certain a value will be undefined, and pre-emptive dummy initialization will break those warnings. An assertion would not be appropriate because any value is acceptable random value, so long as it's actually random and not some predictable constant. Another potential tool in the debate between "pre-initilize variables and avoid the risk of undefined values" and "don't pre-init and preserve valgrind/memcheck sensitivity to coding errors that cause the required "real initialization" is that valgrind.h has special macros that can be used to mark memory as undefined in valgrind's tracking although they're defined for language purposes. (VALGRIND_MAKE_MEM_UNDEFINED(x,y), see secp256k1/tests.c). I'm not sure if anything similar exists for memcheck. In some rare cases pre-initilization is a performance concern. But in most-- where the variable isn't in some inner loop and doesn't involve a bunch of malloc-- it's not. It is, however, very often a testing sensitivity concern. Of course, in some cases the code can be re-factored to do the proper 'final' initialization at declaration time then that's best. If there were a big swing in the project towards performing dummy initialization I think it would be really useful if it were always done via a macro that would allow disabling it for testing (or valgrind annotating it). |
You mean MemorySanitizer (MSAN) and not Memcheck (Valgrind), right? :)
From a security perspective doesn't it boil down to a trade-off between maximising testing sensitivity (achieved by not pre-initializing) vs minimising the risk of leaking non-dummy memory content (achieved by pre-initializing)? Is that a fair summary? :) Background context for others: Depending on code generation, a programmer who runs into undefined behavior by using an uninialized automatic variable may observe any previous value (including program secrets), or any value which the compiler saw fit to materialize on the stack or in a register (this could be to synthesize an immediate, to refer to code or data locations, to generate cookies, etc). |
Memory sanitizer cannot see through syscall. This caused a false positive use-of-uninitialized-value error for SYS_getrandom. After discussion in bitcoin#17627 it was decided not to initialize these variables to 0, but instead to suppress the warning.
527f3ec
to
31408ed
Compare
There is no risk of leaking anything here, and of the randomness is hashed before being used. The difference between initializing and not is that if a bug is introduced: With initializing a bug that hobbles the random input will not be detectable (except via review), without initializing the bug would be detectable by memcheck but there would also be undefined behaviour (however, on currently supported compilers/systems the undefined behaviour would be harmless esp compared to the no-randomness bug) until someone noticed the error report in memcheck and fixed it. |
FWIW I attempted to summarise this discussion in the review club notes as a resource and will probably update it with the points made here. Don't hesitate to suggest corrections or improvements in the repository or to me via IRC. https://bitcoincore.reviews/17639.html#what-can-we-do-to-mitigate-uninitialized-variables |
I was talking about the pros and cons of pre-initializing generally: in this specific case there is obviously no risk of leaking sensitive data :) Recent discussion on Twitter about pre-initialization in general with some people from Apple working on iPhone security and LLVM respectively that might be of interest for readers of this thread: "If you’ve got a security-sensitive codebase, you should be using -ftrivial-auto-init=pattern in Clang. In 2020, there’s no good reason for uninitialized variables to be exploitable." I would assume that they do developer builds without |
Closing in favor of the more precise fix in #18288. We should still document a policy around initialisation, perhaps requiring either initialisation or a comment that's intentionally not initialised. |
The memory sanitizer cannot see through
syscall
(see google/sanitizers#852). This caused a falsepositive
use-of-uninitialized-value
error forSYS_getrandom
.Given recent problematic real issues with uninitialized variables, I think it's worth getting rid of the false positives so developers, and maybe Travis, can run the sanitizer.
See #17620 for how to use the memory sanitizer.