Skip to content

Conversation

martinus
Copy link
Contributor

@martinus martinus commented Apr 24, 2022

prevector uses memmove to move around data, that means it can only be used with types that are trivially copyable. That implies that the types are trivially destructible, thus the checks for is_trivially_destructible are not needed.

The prevector implementation currently can't be used with types that are
not trivially copyable, due to the use of memmove. Trivially copyable
implies that it is trivially destructible, see
https://eel.is/c++draft/class.prop#1.3

That means that the checks for std::is_trivially_destructible are not
necessary, and in fact where used it wouldn't be enough. E.g. in
`erase(iterator, iterator)` the elements in range first-last are destructed,
but it does not destruct elements left after `memmove`.

This commit removes the checks for `std::is_trivially_destructible`
and instead adds a `static_assert(std::is_trivially_copyable_v<T>);` to
make sure `prevector` is only used with supported types.
@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 25, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

src/prevector.h Outdated
memmove(&(*first), &(*last), endp - ((char*)(&(*last))));
char* endp = reinterpret_cast<char*>(end().ptr);
_size -= last - p;
memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr));
std::memmove(first.ptr, last.ptr, endp - reinterpret_cast<char*>(last.ptr));

Maybe switch to std::memmove, which comes with the documentation:

https://en.cppreference.com/w/cpp/string/byte/memmove

If the objects are potentially-overlapping or not TriviallyCopyable, the behavior of memmove is not specified and may be undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change all occurrences of memmove in that file, plus memcpy as well, right? Then I'll include cstring instead of string.h

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, if the changes are more expensive, then maybe do it in another commit or pull request.

Copy link
Member

@jonatack jonatack Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the UB section of the documentation in https://en.cppreference.com/w/cpp/string/byte/memmove (C++ memmove) apply to https://en.cppreference.com/w/c/string/byte/memmove (C memmove)?

Copy link
Member

@jonatack jonatack Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the C version:

The behavior is undefined if access occurs beyond the end of the dest array.

The behavior is undefined if either dest or src is an invalid or null pointer. <-- also specified in the C++ memmove doc

The behavior is undefined if the size of the character array pointed to by dest < count <= destsz; in other words, an erroneous value of destsz does not expose the impending buffer overflow.

In the C++ version:

If the objects are potentially-overlapping or not TriviallyCopyable, the behavior of memmove is not specified and may be undefined. <-- not stated in the C version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with C, but maybe every class is trivially copyable in C?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C doesn't have classes.

Copy link
Member

@jonatack jonatack Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right (interesting historic context here). So IIUC the premise in the pull description only applies to the C++ memmove and it may make sense to change to it at the same time (or not do this change at all).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the premise in the pull description only applies to the C++ memmove and it may make sense to change to it at the same time (or not do this change at all).

I still think it is necessary to do the change, even when we stay with memmove. The current implementation just does not work with complex objects. E.g. here is a small example:

BOOST_AUTO_TEST_CASE(Erase) {
    prevector<8, std::string> pv;
    pv.push_back("a");
    pv.push_back("b");
    pv.erase(pv.begin());
    pv.push_back("X");
    std::cout << pv.front() << std::endl;
}

One would expect that this prints "b", as "a" was deleted and "b" should now be the first element. But l get this output:

Running 1 test case...
X
free(): invalid pointer
unknown location(0): fatal error: in "prevector_tests/Erase": signal: SIGABRT (application abort requested)

So suddenly the first element is not "b" but "X". This is due to to small string optimization, the strings point to member data, and this is simply copied. Complex objects have to be moved for this to work.

here is another motivating example with std::swap:

BOOST_AUTO_TEST_CASE(NonTrivial) {
    prevector<8, std::string> a;
    prevector<8, std::string> b;
    a.push_back("a");
    b.push_back("b");

    std::cout << "a=" << a.front() << ", b=" << b.front() << std::endl;
    std::swap(a, b);
    std::cout << "a=" << a.front() << ", b=" << b.front() << std::endl;
}

Running that test produces this output:

Running 1 test case...
a=a, b=b
a=a, b=b
free(): invalid pointer
unknown location(0): fatal error: in "prevector_tests/NonTrivial": signal: SIGABRT (application abort requested)

So swap doesn't seem to do anything for these objects, and destructing will cause a SIGABRT.

I think we should either restrict prevector to only work with supported types, or to correctly implement it for nontrivial types. Fixing this is quite a bit more work though, and would need a lot more testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat followed up in #25172.

@theStack
Copy link
Contributor

Light Concept ACK
(I'm not deeply familiar with templates, type traits, etc. but from a higher level perspective, removing unused code and fixing UB seems like a good idea!)

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 1ed610d

@maflcko
Copy link
Member

maflcko commented Apr 29, 2022

I haven't reviewed the second commit and I wonder why no sanitizer complains about them. I presume the reason is that this is no stdlib iterator? See also: #21817 (comment) and #21869

@martinus
Copy link
Contributor Author

I wonder why no sanitizer complains about them

I was wondering about that too, and thought this might be because the optimizer is just optimizing that away. After some googling though it seems this is actually allowed and no UB:

TL;DR: I was most likely wrong and it is fine to do &*end()

@maflcko
Copy link
Member

maflcko commented Apr 29, 2022

Ok, maybe drop the second commit or move it to a follow-up pull?

@martinus martinus force-pushed the 2022-04_prevector_fixes branch from 1ed610d to 11e7908 Compare April 29, 2022 15:43
@martinus martinus changed the title prevector: enforce is_trivially_copyable_v, don't dereference iterators prevector: enforce is_trivially_copyable_v Apr 29, 2022
@martinus martinus changed the title prevector: enforce is_trivially_copyable_v prevector: enforce is_trivially_copyable_v Apr 29, 2022
@jonatack
Copy link
Member

re-ACK modulo question in #24962 (comment)

Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 11e7908

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review ACK 11e7908 🏯

I checked that the code is unused at run-time. Also, in light of silent merge
conflicts, the added static_assert should ensure that any code that tried to
use it fails to compile. According to https://eel.is/c++draft/class#prop-1.3
is_trivially_copyable_v is false if is_trivially_destructible_v is false.

I have not verified that this is actually UB, but the provided example seems to
imply so. Regardless, removing code because it is unused seems good enough.

Show signature

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

review ACK 11e79084845a78e2421ea3abafe0de5a54ca2bde 🏯

I checked that the code is unused at run-time. Also, in light of silent merge
conflicts, the added static_assert should ensure that any code that tried to
use it fails to compile. According to https://eel.is/c++draft/class#prop-1.3
is_trivially_copyable_v is false if is_trivially_destructible_v is false.

I have not verified that this is actually UB, but the provided example seems to
imply so. Regardless, removing code because it is unused seems good enough.
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUhd7gv/X+b9gxeWmyrk43wUYLfe9bXgwHPM3JRI068e7fVFeml8flhkQIYaY4eP
Km+G8XGwEf0TaTY2Y1ExmSxsBE3EKF9sYRLLga/FOeBamKpQwBHefNOQyPEdP9wr
KQSYv7mB27U6xt8tFv+p1i9Hr4LWQVEEWX2dYCl8qio+kObOQQFI4xhManXnp06d
+8Vp+7IbS2H9jLIPK+1byiozLWaT8pr15PRRgQL1XuvlBwEdoBmF/usMP/tV4MaC
Ns8CXIXB5guiA96HDSsK0dsDzQgSqrX8ZrqSP0007lkW4kOW/TqsxMQhtTbEppOO
BadS8y0QIBcrOl2WGICJlfSqm3BKb7/YCbR+jdAaS5LHT5YxM6/k4a0KjPt1ijpM
VibEmNkimCin3cpVbxAouptTdP1KyNoeTCrDU1yGM3o/tMAy0axjPhos9Iv2avvn
xtBGzMFe6XzRQR25mvXBsEd/v7Dbe/zBJPLRFGFr7NhR+2XBSq5B7ncaj9PwK5uy
r4uqloDb
=xRUi
-----END PGP SIGNATURE-----

@bitcoin bitcoin deleted a comment from Meru852 May 1, 2022
@fanquake fanquake requested a review from ajtowns May 16, 2022 13:50
Copy link
Contributor

@ajtowns ajtowns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 11e7908 -- code review only

@maflcko maflcko merged commit 07cb4de into bitcoin:master May 16, 2022
@martinus martinus deleted the 2022-04_prevector_fixes branch May 16, 2022 15:46
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request May 21, 2022
The prevector implementation currently can't be used with types that are
not trivially copyable, due to the use of memmove. Trivially copyable
implies that it is trivially destructible, see
https://eel.is/c++draft/class.prop#1.3

That means that the checks for std::is_trivially_destructible are not
necessary, and in fact where used it wouldn't be enough. E.g. in
`erase(iterator, iterator)` the elements in range first-last are destructed,
but it does not destruct elements left after `memmove`.

This commit removes the checks for `std::is_trivially_destructible`
and instead adds a `static_assert(std::is_trivially_copyable_v<T>);` to
make sure `prevector` is only used with supported types.

Github-Pull: bitcoin#24962
Rebased-From: 11e7908
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 28, 2022
11e7908 prevector: only allow trivially copyable types (Martin Leitner-Ankerl)

Pull request description:

  prevector uses `memmove` to move around data, that means it can only be used with types that are trivially copyable. That implies that the types are trivially destructible, thus the checks for `is_trivially_destructible` are not needed.

ACKs for top commit:
  w0xlt:
    ACK bitcoin@11e7908
  MarcoFalke:
    review ACK 11e7908 🏯
  ajtowns:
    ACK 11e7908 -- code review only

Tree-SHA512: cbb4d8bfa095100677874b552d92c324c7d6354fcf7adab2ed52f57bd1793762871798b5288064ed1af2d2903a0ec9dbfec48d99955fc428f18cc28d6840dccc
@bitcoin bitcoin locked and limited conversation to collaborators May 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants