-
Notifications
You must be signed in to change notification settings - Fork 37.7k
Fix various things -fsanitize complains about #9512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -13,7 +13,7 @@ | |||
BOOST_FIXTURE_TEST_SUITE(scriptnum_tests, BasicTestingSetup) | |||
|
|||
static const int64_t values[] = \ | |||
{ 0, 1, CHAR_MIN, CHAR_MAX, UCHAR_MAX, SHRT_MIN, USHRT_MAX, INT_MIN, INT_MAX, UINT_MAX, LONG_MIN, LONG_MAX }; | |||
{ 0, 1, CHAR_MIN, CHAR_MAX, UCHAR_MAX, SHRT_MIN, USHRT_MAX, INT_MIN, INT_MAX, UINT_MAX, (1LL << 40), (1LL << 40) - 1, -(1LL << 40), 1 - (1LL << 40)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe define these constants ourselves rather than inlining them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just replaced them with integer constants and added a comment.
} | ||
|
||
uint64_t static inline ReadLE64(const unsigned char* ptr) | ||
{ | ||
return le64toh(*((uint64_t*)ptr)); | ||
return ((uint64_t)(ptr[0])) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. This would also have failed on CPUs that don't support non-aligned reads (though I know of no such instances reported).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems ARM supports unaligned reads pretty much since the architecture we support.
@@ -13,7 +13,7 @@ | |||
BOOST_FIXTURE_TEST_SUITE(scriptnum_tests, BasicTestingSetup) | |||
|
|||
static const int64_t values[] = \ | |||
{ 0, 1, CHAR_MIN, CHAR_MAX, UCHAR_MAX, SHRT_MIN, USHRT_MAX, INT_MIN, INT_MAX, UINT_MAX, LONG_MIN, LONG_MAX }; | |||
{ 0, 1, CHAR_MIN, CHAR_MAX, UCHAR_MAX, SHRT_MIN, USHRT_MAX, INT_MIN, INT_MAX, UINT_MAX, (1LL << 40), (1LL << 40) - 1, -(1LL << 40), 1 - (1LL << 40)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the problem that sanitize reports here? I don't understand the rationale for using "40" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overflow of signed integers is undefined behaviour. These numbers are added/subtracted from eachother, so that behaviour triggers. However, it doesn't matter in practice, as we only use scriptnums up to 4 bytes (which can turn into 5 bytes after arithmetic).
I measure a 1.5% slowdown from these changes for the SHA256_32b benchmark (i7-6820HQ CPU @ 2.70GHz). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK.
It seems there are some extra bitmasks in the write procedures; I've marked where they are. Maybe the optimized missed these and that's your 1.5% ;)
} | ||
|
||
void static inline WriteLE16(unsigned char* ptr, uint16_t x) | ||
{ | ||
*((uint16_t*)ptr) = htole16(x); | ||
ptr[0] = x & 0xFF; | ||
ptr[1] = (x >> 8) & 0xFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the & 0xFF is redundant here, no? Unless you think there is some pipelining advantage...
ptr[0] = x & 0xFF; | ||
ptr[1] = (x >> 8) & 0xFF; | ||
ptr[2] = (x >> 16) & 0xFF; | ||
ptr[3] = (x >> 24) & 0xFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the & 0xFF is redundant here, no? Unless you think there is some pipelining advantage...
ptr[4] = (x >> 32) & 0xFF; | ||
ptr[5] = (x >> 40) & 0xFF; | ||
ptr[6] = (x >> 48) & 0xFF; | ||
ptr[7] = (x >> 56) & 0xFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the & 0xFF is redundant here, no? Unless you think there is some pipelining advantage...
} | ||
|
||
void static inline WriteBE32(unsigned char* ptr, uint32_t x) | ||
{ | ||
*((uint32_t*)ptr) = htobe32(x); | ||
ptr[0] = (x >> 24) & 0xFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the & 0xFF is redundant here, no? Unless you think there is some pipelining advantage...
} | ||
|
||
void static inline WriteBE64(unsigned char* ptr, uint64_t x) | ||
{ | ||
*((uint64_t*)ptr) = htobe64(x); | ||
ptr[0] = (x >> 56) & 0xFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the & 0xFF is redundant here, no? Unless you think there is some pipelining advantage...
@sipa it's frustrating that the compiler doesn't recognize that pattern and peephole optimize it on platforms that support unaligned reads. At the same time, we really should be using platform optimized SIMD sha256 on our primary platforms, so the performance of this code won't matter once we do. |
Specifying &0xff for an unsigned char operation will just be ignored by the compiler (this is a no-brainer optimization). I doubt that is the cause of the slowdown. I wonder, isn't there some compiler intrinsic for "non-aligned read of larger type" that could be detected by autoconf and used if available? |
Updated to use a different approach to avoiding unaligned access, which seems as fast as master for the SHA256_32b benchmark (or even faster, I consistently across different runs see a minimum duration speedup of 0.02-0.025%). I've also added a commit to remove the use of boost::dynamic_bitset (which ubsan complains about, though I can't see exactly how or why). Now all unit tests and rpc tests run succesfully with |
utACK. |
ACK. |
utACK 82e8baa |
82e8baa Avoid boost dynamic_bitset in rest_getutxos (Pieter Wuille) 99f001e Fix memory leak in multiUserAuthorized (Pieter Wuille) 5a0b7e4 Fix memory leak in net_tests (Pieter Wuille) 6b03bfb Fix memory leak in wallet tests (Pieter Wuille) f94f3e0 Avoid integer overflows in scriptnum tests (Pieter Wuille) 843c560 Avoid unaligned access in crypto i/o (Pieter Wuille)
82e8baa Avoid boost dynamic_bitset in rest_getutxos (Pieter Wuille) 99f001e Fix memory leak in multiUserAuthorized (Pieter Wuille) 5a0b7e4 Fix memory leak in net_tests (Pieter Wuille) 6b03bfb Fix memory leak in wallet tests (Pieter Wuille) f94f3e0 Avoid integer overflows in scriptnum tests (Pieter Wuille) 843c560 Avoid unaligned access in crypto i/o (Pieter Wuille)
Closes #1249 - Least Authority Issue C Backports the first commit from bitcoin/bitcoin#9512 to avoid unaligned access in crypto i/o.
82e8baa Avoid boost dynamic_bitset in rest_getutxos (Pieter Wuille) 99f001e Fix memory leak in multiUserAuthorized (Pieter Wuille) 5a0b7e4 Fix memory leak in net_tests (Pieter Wuille) 6b03bfb Fix memory leak in wallet tests (Pieter Wuille) f94f3e0 Avoid integer overflows in scriptnum tests (Pieter Wuille) 843c560 Avoid unaligned access in crypto i/o (Pieter Wuille)
…dablock committed on Jan 20, 2018 Use version 2 blocks for miner_tests … @codablock codablock committed on Jan 20, 2018 Merge bitcoin#7871: Manual block file pruning. … @laanwj @codablock laanwj authored and codablock committed on Jan 11, 2017 Merge bitcoin#9507: Fix use-after-free in CTxMemPool::removeConflicts() … @sipa @codablock sipa authored and codablock committed on Jan 11, 2017 Merge bitcoin#9297: Various RPC help outputs updated … @MarcoFalke @codablock MarcoFalke authored and codablock committed on Jan 12, 2017 Merge bitcoin#9416: travis: make distdir before make … @MarcoFalke @codablock MarcoFalke authored and codablock committed on Jan 12, 2017 Merge bitcoin#9520: Deprecate non-txindex getrawtransaction and bette… … @MarcoFalke @codablock MarcoFalke authored and codablock committed on Jan 12, 2017 Merge bitcoin#9518: Return height of last block pruned by pruneblockc… … @MarcoFalke @codablock MarcoFalke authored and codablock committed on Jan 12, 2017 Merge bitcoin#9472: Disentangle progress estimation from checkpoints … … @laanwj @codablock laanwj authored and codablock committed on Jan 12, 2017 Merge bitcoin#8883: Add all standard TXO types to bitcoin-tx … @laanwj @codablock laanwj authored and codablock committed on Jan 12, 2017 Merge bitcoin#9261: Add unstored orphans with rejected parents to rec… … @laanwj @codablock laanwj authored and codablock committed on Jan 12, 2017 Merge bitcoin#9468: [Depends] Dependency updates for 0.14.0 … @laanwj @codablock laanwj authored and codablock committed on Jan 12, 2017 Merge bitcoin#9222: Add 'subtractFeeFromAmount' option to 'fundrawtra… … @laanwj @codablock laanwj authored and codablock committed on Jan 12, 2017 Merge bitcoin#9490: Replace FindLatestBefore used by importmuti with … … @sipa @codablock sipa authored and codablock committed on Jan 13, 2017 Merge bitcoin#9469: [depends] Qt 5.7.1 … @laanwj @codablock laanwj authored and codablock committed on Jan 15, 2017 Merge bitcoin#9380: Separate different uses of minimum fees … @laanwj @codablock laanwj authored and codablock committed on Jan 16, 2017 Remove SegWit related code in dash-tx @codablock codablock committed on Sep 21, 2017 Merge bitcoin#9561: Wake message handling thread when we receive a ne… … @sipa @codablock sipa authored and codablock committed on Jan 17, 2017 Merge bitcoin#9508: Remove unused Python imports … @MarcoFalke @codablock MarcoFalke authored and codablock committed on Jan 18, 2017 Merge bitcoin#9512: Fix various things -fsanitize complains about
921c4a4 script: prevent UB when computing abs value for num opcode serialize (pierrenn) 58ebadb Fix memory leak in net_tests (Pieter Wuille) 786f396 Fix memory leak in wallet tests (random-zebra) d8b125d Avoid integer overflows in scriptnum tests (Pieter Wuille) Pull request description: Fix `scriptnum_tests` currently failing with debug enabled, due to integer overflow, backporting bitcoin#9512. Last commit coming from bitcoin#18413. ACKs for top commit: Fuzzbawls: ACK 921c4a4 furszy: utACK 921c4a4 and merging.. Tree-SHA512: 07827bfda430e0704b427e373700ff37049bddaa59a56d415618441cb63fb8500bdb19f9df834574ade4618f0825853e44255f3a304486b4bbcb60dbbb382938
memcpy()
ing into a local variable first (as accessing a type's representation through a char* is always allowed).Now all unit tests and rpc tests run succesfully with -fsanitize=address -fsanitize=undefined -fsanitize=leak in GCC 6.2.0.
Thanks to @kcc for pointing me to these tools.