-
Notifications
You must be signed in to change notification settings - Fork 37.7k
prevector: enforce is_trivially_copyable_v #24962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The prevector implementation currently can't be used with types that are not trivially copyable, due to the use of memmove. Trivially copyable implies that it is trivially destructible, see https://eel.is/c++draft/class.prop#1.3 That means that the checks for std::is_trivially_destructible are not necessary, and in fact where used it wouldn't be enough. E.g. in `erase(iterator, iterator)` the elements in range first-last are destructed, but it does not destruct elements left after `memmove`. This commit removes the checks for `std::is_trivially_destructible` and instead adds a `static_assert(std::is_trivially_copyable_v<T>);` to make sure `prevector` is only used with supported types.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
src/prevector.h
Outdated
memmove(&(*first), &(*last), endp - ((char*)(&(*last)))); | ||
char* endp = reinterpret_cast<char*>(end().ptr); | ||
_size -= last - p; | ||
memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr)); | |
std::memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr)); |
Maybe switch to std::memmove
, which comes with the documentation:
https://en.cppreference.com/w/cpp/string/byte/memmove
If the objects are potentially-overlapping or not TriviallyCopyable, the behavior of memmove is not specified and may be undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change all occurrences of memmove in that file, plus memcpy as well, right? Then I'll include cstring
instead of string.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, if the changes are more expensive, then maybe do it in another commit or pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the UB section of the documentation in https://en.cppreference.com/w/cpp/string/byte/memmove (C++ memmove) apply to https://en.cppreference.com/w/c/string/byte/memmove (C memmove)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the C version:
The behavior is undefined if access occurs beyond the end of the dest array.
The behavior is undefined if either dest or src is an invalid or null pointer. <-- also specified in the C++ memmove doc
The behavior is undefined if the size of the character array pointed to by dest < count <= destsz; in other words, an erroneous value of destsz does not expose the impending buffer overflow.
In the C++ version:
If the objects are potentially-overlapping or not TriviallyCopyable, the behavior of memmove is not specified and may be undefined. <-- not stated in the C version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with C, but maybe every class is trivially copyable in C?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C doesn't have classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right (interesting historic context here). So IIUC the premise in the pull description only applies to the C++ memmove and it may make sense to change to it at the same time (or not do this change at all).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the premise in the pull description only applies to the C++ memmove and it may make sense to change to it at the same time (or not do this change at all).
I still think it is necessary to do the change, even when we stay with memmove
. The current implementation just does not work with complex objects. E.g. here is a small example:
BOOST_AUTO_TEST_CASE(Erase) {
prevector<8, std::string> pv;
pv.push_back("a");
pv.push_back("b");
pv.erase(pv.begin());
pv.push_back("X");
std::cout << pv.front() << std::endl;
}
One would expect that this prints "b", as "a" was deleted and "b" should now be the first element. But l get this output:
Running 1 test case...
X
free(): invalid pointer
unknown location(0): fatal error: in "prevector_tests/Erase": signal: SIGABRT (application abort requested)
So suddenly the first element is not "b" but "X". This is due to to small string optimization, the strings point to member data, and this is simply copied. Complex objects have to be moved for this to work.
here is another motivating example with std::swap
:
BOOST_AUTO_TEST_CASE(NonTrivial) {
prevector<8, std::string> a;
prevector<8, std::string> b;
a.push_back("a");
b.push_back("b");
std::cout << "a=" << a.front() << ", b=" << b.front() << std::endl;
std::swap(a, b);
std::cout << "a=" << a.front() << ", b=" << b.front() << std::endl;
}
Running that test produces this output:
Running 1 test case...
a=a, b=b
a=a, b=b
free(): invalid pointer
unknown location(0): fatal error: in "prevector_tests/NonTrivial": signal: SIGABRT (application abort requested)
So swap doesn't seem to do anything for these objects, and destructing will cause a SIGABRT.
I think we should either restrict prevector to only work with supported types, or to correctly implement it for nontrivial types. Fixing this is quite a bit more work though, and would need a lot more testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat followed up in #25172.
Light Concept ACK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 1ed610d
I haven't reviewed the second commit and I wonder why no sanitizer complains about them. I presume the reason is that this is no stdlib iterator? See also: #21817 (comment) and #21869 |
I was wondering about that too, and thought this might be because the optimizer is just optimizing that away. After some googling though it seems this is actually allowed and no UB:
TL;DR: I was most likely wrong and it is fine to do |
Ok, maybe drop the second commit or move it to a follow-up pull? |
1ed610d
to
11e7908
Compare
is_trivially_copyable_v
, don't dereference iteratorsis_trivially_copyable_v
is_trivially_copyable_v
re-ACK modulo question in #24962 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 11e7908
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review ACK 11e7908 🏯
I checked that the code is unused at run-time. Also, in light of silent merge
conflicts, the added static_assert should ensure that any code that tried to
use it fails to compile. According to https://eel.is/c++draft/class#prop-1.3
is_trivially_copyable_v is false if is_trivially_destructible_v is false.
I have not verified that this is actually UB, but the provided example seems to
imply so. Regardless, removing code because it is unused seems good enough.
Show signature
Signature:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
review ACK 11e79084845a78e2421ea3abafe0de5a54ca2bde 🏯
I checked that the code is unused at run-time. Also, in light of silent merge
conflicts, the added static_assert should ensure that any code that tried to
use it fails to compile. According to https://eel.is/c++draft/class#prop-1.3
is_trivially_copyable_v is false if is_trivially_destructible_v is false.
I have not verified that this is actually UB, but the provided example seems to
imply so. Regardless, removing code because it is unused seems good enough.
-----BEGIN PGP SIGNATURE-----
iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUhd7gv/X+b9gxeWmyrk43wUYLfe9bXgwHPM3JRI068e7fVFeml8flhkQIYaY4eP
Km+G8XGwEf0TaTY2Y1ExmSxsBE3EKF9sYRLLga/FOeBamKpQwBHefNOQyPEdP9wr
KQSYv7mB27U6xt8tFv+p1i9Hr4LWQVEEWX2dYCl8qio+kObOQQFI4xhManXnp06d
+8Vp+7IbS2H9jLIPK+1byiozLWaT8pr15PRRgQL1XuvlBwEdoBmF/usMP/tV4MaC
Ns8CXIXB5guiA96HDSsK0dsDzQgSqrX8ZrqSP0007lkW4kOW/TqsxMQhtTbEppOO
BadS8y0QIBcrOl2WGICJlfSqm3BKb7/YCbR+jdAaS5LHT5YxM6/k4a0KjPt1ijpM
VibEmNkimCin3cpVbxAouptTdP1KyNoeTCrDU1yGM3o/tMAy0axjPhos9Iv2avvn
xtBGzMFe6XzRQR25mvXBsEd/v7Dbe/zBJPLRFGFr7NhR+2XBSq5B7ncaj9PwK5uy
r4uqloDb
=xRUi
-----END PGP SIGNATURE-----
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 11e7908 -- code review only
The prevector implementation currently can't be used with types that are not trivially copyable, due to the use of memmove. Trivially copyable implies that it is trivially destructible, see https://eel.is/c++draft/class.prop#1.3 That means that the checks for std::is_trivially_destructible are not necessary, and in fact where used it wouldn't be enough. E.g. in `erase(iterator, iterator)` the elements in range first-last are destructed, but it does not destruct elements left after `memmove`. This commit removes the checks for `std::is_trivially_destructible` and instead adds a `static_assert(std::is_trivially_copyable_v<T>);` to make sure `prevector` is only used with supported types. Github-Pull: bitcoin#24962 Rebased-From: 11e7908
11e7908 prevector: only allow trivially copyable types (Martin Leitner-Ankerl) Pull request description: prevector uses `memmove` to move around data, that means it can only be used with types that are trivially copyable. That implies that the types are trivially destructible, thus the checks for `is_trivially_destructible` are not needed. ACKs for top commit: w0xlt: ACK bitcoin@11e7908 MarcoFalke: review ACK 11e7908 🏯 ajtowns: ACK 11e7908 -- code review only Tree-SHA512: cbb4d8bfa095100677874b552d92c324c7d6354fcf7adab2ed52f57bd1793762871798b5288064ed1af2d2903a0ec9dbfec48d99955fc428f18cc28d6840dccc
prevector uses
memmove
to move around data, that means it can only be used with types that are trivially copyable. That implies that the types are trivially destructible, thus the checks foris_trivially_destructible
are not needed.