Skip to content

Conversation

LarryRuane
Copy link
Contributor

The CBufferedFile object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the SetPos method.

Such rewinding is done in LoadExternalBlockFile() (currently the only user of this object), which deserializes a series of CBlock objects. If that function encounters something unexpected in the data stream, which is coming from a blocks/blk00???.dat file, it "rewinds" to an earlier position in the stream to try to get in sync again. The CBufferedFile object does not actually rewind its file offset; it simply repositions its internal offset, nReadPos, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

If LoadExternalBlockFile() needs to rewind (call blkdat.SetPos()), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the SetPos() sometimes works.

This PR adds a unit test for CBufferedFile that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand LoadExternalBlockFile() and for future users of this object.

This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's read method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/streams.h Outdated
} else {
nSrcPos += nBytes;
return true;
throw std::ios_base::failure(feof(src) ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK this hunk, it only changes formatting.

src/streams.h Outdated
}

public:
CBufferedFile(FILE *fileIn, uint64_t nBufSize, uint64_t nRewindIn, int nTypeIn, int nVersionIn) :
nType(nTypeIn), nVersion(nVersionIn), nSrcPos(0), nReadPos(0), nReadLimit(std::numeric_limits<uint64_t>::max()), nRewind(nRewindIn), vchBuf(nBufSize, 0)
nType(nTypeIn), nVersion(nVersionIn), nSrcPos(0), nReadPos(0),
nReadLimit(std::numeric_limits<uint64_t>::max()), nRewind(nRewindIn), vchBuf(nBufSize, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK this hunk, only splits the line in 2 lines.

return false;
} else if (nReadPos > nSrcPos) {
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep elses.

src/streams.h Outdated
if (nReadPos + nRewind < nSrcPos) {
nReadPos = nSrcPos - nRewind;
size_t bufsize = vchBuf.size();
if (nSrcPos > bufsize && nPos < nSrcPos - bufsize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is wrong, vchBuf.size() if constant and the actual read position is obtained from nReadPos % vchBuf.size() - it's a ring buffer.

Copy link
Contributor Author

@LarryRuane LarryRuane Aug 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this carefully, and I'm pretty sure it's right, please check again. I did simplify it slightly (without changing the behavior) in the second commit; it's now like this:

if (nPos + bufsize < nSrcPos) {

This says that the distance from the new (requested) read position, nPos, to the source pointer, nSrcPos must be no greater than bufsize. If nPos is further back than that, it will be pointing to bytes from a later write pass across the buffer and will get data that's too new. Maybe a small example will help. Suppose the file consists of a b c d e f g ... and the buffer is 5 bytes. After reading g (total of 7 bytes) from the file into the ring buffer, the buffer looks like this:

[ f g c d e ]
      ^
      nSrcPos = 7

The nReadPos (file read position) can be in the range 2 through 7 and we will get correct data. If nReadPos is less than 2, let's say it's 1, then reading at that position should return b but will instead will return g, which is incorrect (g is too new). The smallest nReadPos can be is 2, which is 7 - 5 (nSrcPos - bufsize).

The existing SetPos() is wrong; it should not depend on nRewind. That variable is only needed during filling, and prevents nSrcPos from getting too far ahead of nReadPos, because if that happens, then nReadPos can't back up. So, in this little example, if nRewind is 2, then Fill() will not let nSrcPos to get more than 5 (buffer size - rewind size) ahead of nReadPos. But, again, nRewind has no relevance to SetPos().

Copy link
Contributor

@mzumsande mzumsande Aug 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that the first check in SetPos() is incorrect. This does not lead to problems in production thanks to the initialization values of CBufferedFile vchBuf.size()=2*MAX_BLOCK_SERIALIZED_SIZE=8000000 and nRewind=MAX_BLOCK_SERIALIZED_SIZE+8=4000008 in LoadExternalBlockFile().

Accordingly, nSrcPos = vchBuf.size() - nRewind = 3999992 < nRewind after the first buffer fill and the (erroneous) condition nReadPos + nRewind < nSrcPos in SetPos() is never true.

When I change LoadExternalBlockFile() slightly by initializing CBufferedFile with nRewind= MAX_BLOCK_SERIALIZED_SIZE-8 instead of MAX_BLOCK_SERIALIZED_SIZE+8, the functional test feature_reindex fails because now above condition can be true. Since all blocks in this test are much smaller than the maximum block size, this shouldn't happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzumsande, it's broken even with the current production initialization values. I made a temporary branch, LarryRuane@bc9c987, that simulates, after reading each block (during reindexing), what would happen if we really did need to rewind. (This temp branch doesn't include the changes from this PR.) bitcoind --reindex failed pretty quickly (at line 4405).

The temp branch actually verifies two things, (1) the call to SetPos() returns true (the production code doesn't look at the return value), which means that the requested position was accommodated (as it should always be in the current context), and, (2), after rewinding, we get the expected data (by reading one byte).

I ran the same test with the code in the current PR's branch, and it doesn't fail.

I think the easiest way to understand why the current code is broken is to note that Fill() can push nSrcPos far ahead of nReadPos. For best efficiency,Fill() always reads as much as it can from the file in a single read, so if there's a lot of "runway" ahead of it (lots of space to the end of the buffer), it will read a lot. It may not go quite to the end of the buffer, if doing so would violate the rewind guarantee, but it can go much farther ahead of nReadPos than the rewind amount. Now look again at the current SetPos():

if (nReadPos + nRewind < nSrcPos) {
    nReadPos = nSrcPos - nRewind;
    return false;
}

This condition could easily be true if nSrcPos has gotten far ahead of nReadPos. But that doesn't mean that nReadPos needs to be adjusted. Actually, this code could move nReadPos beyond the maximum position it's ever had, meaning that it's skipping data! You're at the farthest position you've ever reached, say, offset 40, and you request to rewind to position 35, and you end up at position 45! The condition is just wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I was more trying to understand why reindexing currently works in spite of the wrong condition. With different initialization values even no-rewind uses of SetPos like SetPos(current_position) could fail.

@LarryRuane LarryRuane changed the title util: CBufferedFile fixes and unit test [WIP] util: CBufferedFile fixes and unit test Aug 10, 2019
@LarryRuane LarryRuane force-pushed the CBufferedFile-fixes branch from 4ff2487 to 269efa1 Compare August 12, 2019 05:58
@LarryRuane
Copy link
Contributor Author

The second commit also adds a randomized test. It's pretty common for the kind of code under scrutiny here (the CBufferedFile object) to have subtle off-by-one and boundary condition bugs. A functional unit test like the one in the first commit is good, but it's hard to be sure the test covers all possible strange conditions. A random test will try many weird things that no person would ever think of.

I ran this random test several million iterations, which makes me extremely confident that the code under test is correct. Of course, whether the test code itself is correct is very crucial! So please review the test code, thanks.

@LarryRuane LarryRuane changed the title [WIP] util: CBufferedFile fixes and unit test util: CBufferedFile fixes and unit test Aug 12, 2019
@practicalswift
Copy link
Contributor

@LarryRuane I haven't looked at your changes yet but wanted to mention that you might want to check using contrib/devtools/test_deterministic_coverage.sh that the unit test is line coverage deterministic after this change.

Line coverage determinism is a necessary condition for meaningful line coverage measuring :-)

@LarryRuane
Copy link
Contributor Author

@practicalswift, thanks, I don't understand that script yet, but you reminded me that I have a question about the random test I wrote. It can trivially be made deterministic by seeding the random number generator at the beginning of the test. Would that be desirable, or not? It seems to me there are advantages either way. If the test is deterministic, then every time CI runs, it's not testing anything new; there could be a bug that by chance isn't uncovered (but would be by a different seed). On the other hand, if the test is non-deterministic, it's possible that a CI run would randomly fail for reasons having nothing to do with the PR that's causing CI to run. That would be very mysterious (at least until investigated).

If I had to decide, I'd say it's better for the test to be deterministic. But I don't know what the conventions are here. (Apologies in advance if this is documented somewhere.)

I'll also try to understand the coverage tool, looks very interesting, thank you.

@LarryRuane LarryRuane force-pushed the CBufferedFile-fixes branch from d4de597 to 864420b Compare August 12, 2019 15:44
@LarryRuane LarryRuane changed the title util: CBufferedFile fixes and unit test [WIP] util: CBufferedFile fixes and unit test Aug 12, 2019
@LarryRuane LarryRuane changed the title [WIP] util: CBufferedFile fixes and unit test util: CBufferedFile fixes and unit test Aug 12, 2019
size_t find = currentOffset + InsecureRandRange(8);
if (find >= fileSize)
find = fileSize - 1;
bf.FindByte(find);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When testing this I encountered the following UBSan warning:

test/streams_tests.cpp:396:29: runtime error: implicit conversion from type 'size_t' (aka 'unsigned long') of value 132 (64-bit, unsigned) to type 'char' changed the value to -124 (8-bit, signed)

@laanwj
Copy link
Member

laanwj commented Aug 14, 2019

It can trivially be made deterministic by seeding the random number generator at the beginning of the test. Would that be desirable, or not?

Deterministic tests are preferred in this project. No one really investigates individual random CI failures because there tends to be a lot of spurious failures that have nothing to do with the code, but with unreliability of the Travis platform such as timeouts.

@maflcko
Copy link
Member

maflcko commented Aug 14, 2019

I think you have two options:

  • Pick a random seed, print/log it at the beginning of the test in some way
  • Pick a constant as the seed

@mzumsande
Copy link
Contributor

Hi, I am currently trying to understand CBufferedFile and the tests, both confuse me still 😃. Seems like substantial parts of streams_buffered_file test are added first and removed again in a later commit. Could you squash commits that fix errors in earlier ones?

@LarryRuane LarryRuane force-pushed the CBufferedFile-fixes branch from ae82e1f to 1663586 Compare August 16, 2019 21:13
@LarryRuane
Copy link
Contributor Author

Could you squash commits that fix errors in earlier ones?

Yes, I ended up squashing down to a single commit, because there really are no phases or stages to this PR. Thanks for reviewing.

@mzumsande
Copy link
Contributor

Thanks for squashing - I agree that this fixes a bug, which hasn't lead to errors because of how CBufferedFile is initialized in LoadExternalBlockFile() (see comment above). Tests look good to me, will look at them in more detail (and also perform a reindex on testnet) in the next days.

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 24, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@laanwj
Copy link
Member

laanwj commented Sep 10, 2019

Looks good to me, thanks for adding the extensive testing.

ACK after squash. efd2474

@mzumsande
Copy link
Contributor

I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

laanwj added a commit that referenced this pull request Sep 26, 2019
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
@laanwj laanwj merged commit efd2474 into bitcoin:master Sep 26, 2019
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 26, 2019
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
@LarryRuane LarryRuane deleted the CBufferedFile-fixes branch November 7, 2019 00:00
zkbot added a commit to zcash/zcash that referenced this pull request Mar 3, 2020
DeckerSU added a commit to DeckerSU/KomodoOcean that referenced this pull request Mar 16, 2020
bitcoin/bitcoin#16577
zcash/zcash#4265

- added CBufferedFile fix (thx LarryRuane)
- added unit test (streams_tests.cpp rewritten for Google C++ Testing Framework
- additional komodo_block_load test (read 2 blocks from binary file)
- removed CBufferedFile::SetAnyPos method (had incorrect logic)
- removed komodo_index2pubkey33, komodo_blockload (as not-used and related to SetAnyPos)
-
DeckerSU added a commit to DeckerSU/komodo that referenced this pull request Mar 17, 2020
- added CBufferedFile fix (thx LarryRuane)
- added unit test (streams_tests.cpp rewritten for Google C++ Testing Framework)
- additional komodo_block_load test (read 2 blocks from binary file)

bitcoin/bitcoin#16577
zcash#4265
DeckerSU/KomodoOcean@9700247
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 23, 2020
Summary:
PR description:
> The CBufferedFile object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the SetPos method.
>
> Such rewinding is done in LoadExternalBlockFile() (currently the only user of this object), which deserializes a series of CBlock objects. If that function encounters something unexpected in the data stream, which is coming from a blocks/blk00???.dat file, it "rewinds" to an earlier position in the stream to try to get in sync again. The CBufferedFile object does not actually rewind its file offset; it simply repositions its internal offset, nReadPos, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.
>
> If LoadExternalBlockFile() needs to rewind (call blkdat.SetPos()), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the SetPos() sometimes works.
>
> This PR adds a unit test for CBufferedFile that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand LoadExternalBlockFile() and for future users of this object.
>
> This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).
>
> Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's read method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

This is a backport of Core [[bitcoin/bitcoin#16577 | PR16577]]

Test Plan: `ninja && ninja check`

Reviewers: O1 Bitcoin ABC, #bitcoin_abc, Fabien

Reviewed By: O1 Bitcoin ABC, #bitcoin_abc, Fabien

Subscribers: Fabien

Differential Revision: https://reviews.bitcoinabc.org/D8073
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 27, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 28, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 29, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 12, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 13, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 13, 2021
efd2474 util: CBufferedFile fixes (Larry Ruane)

Pull request description:

  The `CBufferedFile` object guarantees its user is able to "rewind" the data stream (that's being read from a file) up to a certain number of bytes, as specified by the user in the constructor. This guarantee is not honored due to a bug in the `SetPos` method.

  Such rewinding is done in `LoadExternalBlockFile()` (currently the only user of this object), which deserializes a series of `CBlock` objects. If that function encounters something unexpected in the data stream, which is coming from a `blocks/blk00???.dat` file, it "rewinds" to an earlier position in the stream to try to get in sync again. The `CBufferedFile` object does not actually rewind its file offset; it simply repositions its internal offset, `nReadPos`, to an earlier position within the object's private buffer; this is why there's a limit to how far the user may rewind.

  If `LoadExternalBlockFile()` needs to rewind (call `blkdat.SetPos()`), the stream may not be positioned as it should be, causing errors in deserialization. This need to rewind is probably rare, which is likely why this bug hasn't been noticed already. But if this object is used elsewhere in the future, this could be a serious problem, especially as, due to the nature of the bug, the `SetPos()` _sometimes_ works.

  This PR adds a unit test for `CBufferedFile` that fails due to this bug. (Until now it has had no unit tests.) The unit test provides good documentation and examples for developers trying to understand `LoadExternalBlockFile()` and for future users of this object.

  This PR also adds code to throw an exception from the constructor if the rewind argument is not less than the buffer size (since that doesn't make any sense).

  Finally, I discovered that the object is too restrictive in one respect: When the deserialization methods call this object's `read` method, a check ensures that the number of bytes being requested is less than the size of the buffer (adjusting for the rewind size), else it throws an exception. This restriction is unnecessary; the object being deserialized can be larger than the buffer because multiple reads from disk can satisfy the request.

ACKs for top commit:
  laanwj:
    ACK ~after squash.~ efd2474
  mzumsande:
    I had intended to follow up earlier on my last comment, ACK efd2474. I reviewed the code, ran tests and did a successful reindex on testnet with this branch.

Tree-SHA512: 695529e0af38bae2af4e0cc2895dda56a71b9059c3de04d32e09c0165a50f6aacee499f2042156ab5eaa6f0349bab6bcca4ef9f6f9ded4e60d4483beab7e4554
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Aug 11, 2021
ecde04a [Consensus] Bump Active Protocol version to 70923 for v5.3 (random-zebra)
b63e4f5 Consensus: Add v5.3 enforcement height for testnet. (furszy)
f44be94 Only relay IPv4, IPv6, Tor addresses (Pieter Wuille)
015298c fix: tor: Call event_base_loopbreak from the event's callback (furszy)
34ff7a8 Consensus: Add mnb ADDRv2 guard. (furszy)
b4515dc GUI: Present v3 onion addresses properly in MNs list. (furszy)
337d43d tests: don't export in6addr_loopback (Vasil Dimov)
2cde8e0 GUI: Do not show the tor v3 onion address in the topbar. (furszy)
0b5f406 Doc: update tor.md with latest upstream information. (furszy)
89df7f2 addrman: ensure old versions don't parse peers.dat (Vasil Dimov)
bb90c5c test: add getnetworkinfo network name regression tests (Jon Atack)
d8e01b5 rpc: update GetNetworksInfo() to not return unsupported networks (Jon Atack)
57fc7b0 net: update GetNetworkName() with all enum Network cases (Jon Atack)
647d60b tests: Modify rpc_bind to conform to bitcoin#14532 behaviour. (Carl Dong)
d4d6729 Allow running rpc_bind.py --nonloopback test without IPv6 (Kristaps Kaupe)
4a034d8 test: Add rpc_bind test to default-run tests (Wladimir J. van der Laan)
61a08af [tests] bind functional test nodes to 127.0.0.1  Prevents OSX firewall (Sjors Provoost)
6a4f1e0 test: Add basic addr relay test (furszy)
78aa61c net: Make addr relay mockable (furszy)
ba954ca Send and require SENDADDRV2 before VERACK (Pieter Wuille)
61c2ed4 Bump net protocol version + don't send 'sendaddrv2' to pre-70923 software (furszy)
ccd508a tor: make a TORv3 hidden service instead of TORv2 (Vasil Dimov)
6da9a14 net: advertise support for ADDRv2 via new message (furszy)
e58d5d0 Migrate to test_large_inv() to Misbehaving logging. (furszy)
d496b64 [QA] fix mininode CAddress ser/deser (Jonas Schnelli)
cec9567 net: CAddress & CAddrMan: (un)serialize as ADDRv2 Change the serialization of `CAddrMan` to serialize its addresses in ADDRv2/BIP155 format by default. Introduce a new `CAddrMan` format version (3). (furszy)
b8c1dda streams update: get rid of nType and nVersion. (furszy)
3eaa273 Support bypassing range check in ReadCompactSize (Pieter Wuille)
a237ba4 net: recognize TORv3/I2P/CJDNS networks (Vasil Dimov)
8e50853 util: make EncodeBase32 consume Spans (Sebastian Falbesoner)
1f67e30 net: CNetAddr: add support to (un)serialize as ADDRv2 (Vasil Dimov)
2455420 test: move HasReason so it can be reused (furszy)
d41adb4 util: move HasPrefix() so it can be reused (Vasil Dimov)
f6f86af Unroll Keccak-f implementation (Pieter Wuille)
45222e6 Implement keccak-f[1600] and SHA3-256 (Pieter Wuille)
08ad06d net: change CNetAddr::ip to have flexible size (furszy)
3337219 net: improve encapsulation of CNetAddr. (furszy)
910d5c4 test: Do not instantiate CAddrDB for static call (Hennadii Stepanov)
6b607ef Drop IsLimited in favor of IsReachable (Ben Woosley)
a40711b IsReachable is the inverse of IsLimited (DRY). Includes unit tests (marcaiaf)
8839828 net: don't accept non-left-contiguous netmasks (Vasil Dimov)
5d7f864 rpcbind: Warn about exposing RPC to untrusted networks (Luke Dashjr)
2a6abd8 CNetAddr: Add IsBindAny method to check for INADDR_ANY (Luke Dashjr)
4fdfa45 net: Always default rpcbind to localhost, never "all interfaces" (Luke Dashjr)
31064a8 net: Minor accumulated cleanups (furszy)
9f9c871 tests: Avoid using C-style NUL-terminated strings as arguments (practicalswift)
f6c52a3 tests: Add tests to make sure lookup methods fail on std::string parameters with embedded NUL characters (practicalswift)
a751b9b net: Avoid using C-style NUL-terminated strings as arguments in the netbase interface (furszy)
f30869d test: add IsRFC2544 tests (Mark Tyneway)
ed5abe1 Net: Proper CService deserialization + GetIn6Addr return false if addr isn't an IPv6 addr (furszy)
86d73fb net: save the network type explicitly in CNetAddr (Vasil Dimov)
ad57dfc net: document `enum Network` (Vasil Dimov)
cb160de netaddress: Update CNetAddr for ORCHIDv2 (Carl Dong)
c3c04e4 net: Better misbehaving logging (furszy)
3660487 net: Use C++11 member initialization in protocol (Marco)
082baa3 refactor: Drop unused CBufferedFile::Seek() (Hennadii Stepanov)
e2d776a util: CBufferedFile fixes (Larry Ruane)
6921f42 streams: backport OverrideStream class (furszy)

Pull request description:

  Conjunction of a large number of back ports, updates and refactorings that made with the final goal of implementing v3 Onion addresses support (BIP155 https://github.com/bitcoin/bips/blob/master/bip-0155.mediawiki) before the tor v2 addresses EOL, scheduled, by the Tor project, for (1) July 15th: v2 addr support removal from the code base, and (2) October 15th: v2 addr network disable, where **every peer in our network running under Tor will loose the connection and drop the network**.

  As BIP155 describes, this is introducing a new P2P message to gossip longer node addresses over the P2P network. This is required to support new-generation Onion addresses, I2P, and potentially other networks that have longer endpoint addresses than fit in the 128 bits of the current addr message.

  In order to achieve the end goal, had to:
  1.  Create Span class and push it up to latest Bitcoin implementation.
  2.  Update the whole serialization framework and every object using it up to latest Bitcoin implementation (3-4 years ahead of what we currently are in master).
  3.  Update the address manager implementing ASN-based bucketing of the network nodes.
  4.  Update and refactor the netAddress and address manager tests to latest Bitcoin implementation (4 years ahead of what we currently are in master).
  5.  Several util string, vector, encodings, parsing, hashing backports and more..

  Important note:
  This PR it is not meant to be merged as a standalone PR, will decouple smaller ones moving on. Adding on each sub-PR its own description isolated from this big monster.

  Second note:
  This is still a **work-in-progress**, not ready for testing yet. I'm probably missing to mention few PRs that have already adapted to our sources. Just making it public so can decouple the changes, we can start merging them and i can continue working a bit more confortable (rebase a +170 commits separate branch is not fun..).

  ### List of back ported and adapted PRs:

  Span and Serialization:
  ----------------
  *  bitcoin#12886.
  *  bitcoin#12916.
  *  bitcoin#13558.
  *  bitcoin#13697. (Only Span's commit 29943a9)
  *  bitcoin#17850.
  *  bitcoin#17896.
  *  bitcoin#12752.
  *  bitcoin#16577.
  *  bitcoin#16670. (without faebf62)
  *  bitcoin#17957.
  *  bitcoin#18021.
  *  bitcoin#18087.
  *  bitcoin#18112 (only from 353f376 that we don't support).
  *  bitcoin#18167.
  *  bitcoin#18317.
  *  bitcoin#18591 (only Span's commit 0fbde48)
  *  bitcoin#18468.
  *  bitcoin#19020.
  *  bitcoin#19032.
  *  bitcoin#19367.
  *  bitcoin#19387.

  Net, NetAddress and AddrMan:
  ----------------

  *  bitcoin#7932.
  *  bitcoin#10756.
  *  bitcoin#10765.
  *  bitcoin#12218.
  *  bitcoin#12855.
  *  bitcoin#13532.
  *  bitcoin#13575.
  *  bitcoin#13815.
  *  bitcoin#14532.
  *  bitcoin#15051.
  *  bitcoin#15138.
  *  bitcoin#15689.
  *  bitcoin#16702.
  *  bitcoin#17243.
  *  bitcoin#17345.
  *  bitcoin#17754.
  *  bitcoin#17758.
  *  bitcoin#17812.
  *  bitcoin#18023.
  *  bitcoin#18454.
  *  bitcoin#18512.
  *  bitcoin#19314.
  *  bitcoin#19687

  Keys and Addresses encoding:
  ----------------
  * bitcoin#11372.
  * bitcoin#17511.
  * bitcoin#17721.

  Util:
  ----------------
  * bitcoin#9140.
  * bitcoin#16577.
  * bitcoin#16889.
  * bitcoin#19593.

  Bench:
  ----------------
  * bitcoin#16299.

  BIP155:
  ----------------
  *  bitcoin#19351.
  *  bitcoin#19360.
  *  bitcoin#19534.
  *  bitcoin#19628.
  *  bitcoin#19841.
  *  bitcoin#19845.
  *  bitcoin#19954.
  *  bitcoin#19991 (pending).
  *  bitcoin#19845.
  *  bitcoin#20000 (pending).
  *  bitcoin#20120.
  *  bitcoin#20284.
  *  bitcoin#20564.
  *  bitcoin#21157 (pending).
  *  bitcoin#21564 (pending).
  *  Fully removed v2 onion addr support.
  *  Add hardcoded seeds.
  *  Add release-notes, changes to files.md and every needed documentation.

  I'm currently working on the PRs marked as "pending", this isn't over, but I'm pretty pretty close :). What a long road..

ACKs for top commit:
  random-zebra:
    utACK ecde04a
  Fuzzbawls:
    ACK ecde04a

Tree-SHA512: 82c95fbda76fce63f96d8a9af7fa9a89cb1e1b302b7891e27118a6103af0be23606bf202c7332fa61908205e6b6351764e2ec23d753f1e2484028f57c2e8b51a
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants