Skip to content

Conversation

l0rinc
Copy link
Contributor

@l0rinc l0rinc commented May 3, 2024

Split out the additional tests from the base58 optimization PR as suggested #29473 (comment)

@DrahtBot
Copy link
Contributor

DrahtBot commented May 3, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK edilmedeiros
Stale ACK tdb3

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30571 (test: [refactor] Use g_rng/m_rng directly by maflcko)
  • #30377 (refactor: Replace ParseHex with consteval ArrayFromHex by hodlinator)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot added the Tests label May 3, 2024
Copy link
Contributor

@tdb3 tdb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK for 9431bc9
Built and ran unit tests (all passed).
Left one nit, but the nit is outside the scope of this PR, so is probably better left to a separate PR.

Comment on lines +95 to +97
auto leadingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
auto trailingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Probably outside the scope of this PR (this PR is adding tests, so business logic changes are extraneous), but at first glance, seems like these statements could be simplified, since InsecureRandRange() can return 0, the std::string constructor can handle 0 count, and string operator+ can handle empty string addition. Maybe I'm missing something?

For example:

diff --git a/src/test/base58_tests.cpp b/src/test/base58_tests.cpp
index 49ef9ff5b5..beb5ef3335 100644
--- a/src/test/base58_tests.cpp
+++ b/src/test/base58_tests.cpp
@@ -92,8 +92,8 @@ BOOST_AUTO_TEST_CASE(base58_random_encode_decode_with_optional_spaces)
         auto zeroes = InsecureRandBool() ? InsecureRandRange(len + 1) : 0;
         auto data = Cat(std::vector<unsigned char>(zeroes, '\000'), g_insecure_rand_ctx.randbytes(len - zeroes));
 
-        auto leadingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
-        auto trailingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
+        auto leadingSpaces = std::string(InsecureRandRange(10), ' ');
+        auto trailingSpaces = std::string(InsecureRandRange(10), ' ');
         auto encoded = leadingSpaces + EncodeBase58Check(data) + trailingSpaces;
 
         std::vector<unsigned char> decoded;

Copy link
Contributor Author

@l0rinc l0rinc May 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking @tdb3, the results would be similar, but since I assumed the spaces are rare in reality, I gave it different odds.
In my impl the probability that we won't have any leading (or trailing) spaces was 50% + 50%*10%, in your impl it's 10%, so spaces would be in most samples.
I'm fine with both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's right (higher probability of no spaces over rand range alone). I don't have a preference, just an observation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like all this, but it would be better to explain in the PR description why it is important/beneficial to include this change since this increases the test complexity.

Probably would be even better to have a separate test case to check for leading and trailing spaces. This test case is intended to check for leading zeros and now it checks for three concerns (leading zeros, leading spaces, and trailing spaces) instead of one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it checks for three concerns

That's a property based test, i.e. for random valid inputs it checks that certain conditions hold - in this case a roundtrip, that decoding an encoded results in the original trimmed value.
It's meant to find corner cases that we haven't though of, that's its single concern, we have separate tests for each corner case we know about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added short explanations to the commit message

Copy link
Contributor

@edilmedeiros edilmedeiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Built and ran the unit tests.

The new messages seem to be the opposite of what the tests do, tough.

@l0rinc
Copy link
Contributor Author

l0rinc commented May 7, 2024

Thanks for the review @edilmedeiros, some of the existing BOOST_CHECK_MESSAGE messages are either in indicative or subjunctive grammatical moods (even in the same file, as you can see).
I'm fine with both, but if you think the subjunctive is more readable, I'll rephrase.

@edilmedeiros
Copy link
Contributor

Thanks for the review @edilmedeiros, some of the existing BOOST_CHECK_MESSAGE messages are either in indicative or subjunctive grammatical moods (even in the same file, as you can see). I'm fine with both, but if you think the subjunctive is more readable, I'll rephrase.

I have no personal preference about it, but this is not my point.

Take for instance Mismatch for test #2: expected 626262, got 626262' has passed.

How can (expected) 626262 not match (got) 626262?

@l0rinc l0rinc force-pushed the paplorinc/base58-tests branch from 9431bc9 to fc0cc2d Compare May 7, 2024 21:24
Copy link
Contributor

@edilmedeiros edilmedeiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave another deep look at this, thanks again for submitting this PR.

Beside the specific code comments, please add to the commit message why are you adding new test cases, what they improve in the test logic (and how they help the work on #29473).

}

BOOST_CHECK(!DecodeBase58("invalid"s, result, 100));
BOOST_CHECK(!DecodeBase58("invalid\0"s, result, 100));
BOOST_CHECK(!DecodeBase58("\0invalid"s, result, 100));

BOOST_CHECK(DecodeBase58("good"s, result, 100));
BOOST_CHECK( DecodeBase58("good"s, result, 100));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BOOST_CHECK( DecodeBase58("good"s, result, 100));
BOOST_CHECK(DecodeBase58("good"s, result, 100));

I understand the rational of aligning this with surrounding context, but does seem against guidelines and will look like a typo for others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks more like a typo, see lines 67 and 78.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about either, removed the space

EncodeBase58(sourcedata) == base58string,
strTest);
EncodeBase58(sourcedata) == base58string,
strTest << "\nEncoding `" << HexStr(Span(sourcedata)) << "` as `" << EncodeBase58(sourcedata) << "` should match `" << base58string << "`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
strTest << "\nEncoding `" << HexStr(Span(sourcedata)) << "` as `" << EncodeBase58(sourcedata) << "` should match `" << base58string << "`"
strTest << ": got \"" << EncodeBase58(sourcedata) << "\""

What about a little less verbose and taking advantage of the strTest string that has both input and expected outcome?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't realize this, thanks!

test/base58_tests.cpp:40: info: check '["271F359E","zzzzy"]: got "zzzzy"' has passed

BOOST_CHECK(DecodeBase58(base58string, result, 256));
BOOST_CHECK_MESSAGE(
result == expected,
strTest << "\nDecoding `" << base58string << "` as `" << HexStr(result) << "` should match `" << HexStr(expected) << "`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
strTest << "\nDecoding `" << base58string << "` as `" << HexStr(result) << "` should match `" << HexStr(expected) << "`"
strTest << ": got \"" << EncodeBase58(sourcedata) << "\""

Same suggestion as above.

Copy link
Contributor Author

@l0rinc l0rinc May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/base58_tests.cpp:65: info: check '["271F35A1","211112"]: got XXX "271f35a1"' has passed

Comment on lines +95 to +97
auto leadingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
auto trailingSpaces = InsecureRandBool() ? std::string(InsecureRandRange(10), ' ') : "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like all this, but it would be better to explain in the PR description why it is important/beneficial to include this change since this increases the test complexity.

Probably would be even better to have a separate test case to check for leading and trailing spaces. This test case is intended to check for leading zeros and now it checks for three concerns (leading zeros, leading spaces, and trailing spaces) instead of one.

auto ok = DecodeBase58Check(encoded, decoded, len + InsecureRandRange(257 - len));
BOOST_CHECK(ok);
BOOST_CHECK(data == decoded);
BOOST_CHECK_MESSAGE(!DecodeBase58Check(encoded, decoded, InsecureRandRange(len)), "Decoding should fail for smaller max_ret_len");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place where there's a random parameter that's not reported making potential debugging impossible.

Also, the text is no good since max_ret_len is the maximum return size. Better something like Decoding exceeds xxx length. where xxx prints the random parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, added the values to the error message:

test/base58_tests.cpp:102: error: in "base58_tests/base58_random_encode_decode_with_optional_spaces": Decoding should fail for `invalidSmallResultLength` (61)
test/base58_tests.cpp:104: error: in "base58_tests/base58_random_encode_decode_with_optional_spaces": Decoding should succeed within sufficiently large result length (134)

BOOST_CHECK(ok);
BOOST_CHECK(data == decoded);
BOOST_CHECK_MESSAGE(!DecodeBase58Check(encoded, decoded, InsecureRandRange(len)), "Decoding should fail for smaller max_ret_len");
BOOST_CHECK_MESSAGE( DecodeBase58Check(encoded, decoded, len + InsecureRandRange(257 - len)), "Decoding should succeed within valid length range");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing about unreported random parameter in the test. This is worse yet since there's a weird logic to get the param. It's good that you submitted this PR, the original test deserved a better look already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain in the PR first comment and in the commit message why are you adding these specific test cases, what do they improve in the existing test logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed review, will do that this week

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done

@l0rinc l0rinc force-pushed the paplorinc/base58-tests branch 5 times, most recently from b1142e3 to 861ab92 Compare May 11, 2024 20:52
Add better errors for base58_EncodeBase58 and base58_DecodeBase58 to see the differing value in case of failure.

Extended the base58_random_encode_decode_with_optional_spaces property based test - containing a simple roundtrip with decoding validation - to stress the leading and trailing space parsing as well.

Also extended the base58_encode_decode.json file with a few corner cases - e.g. on a transition of power of 58 to check the boundaries.

Co-authored-by: Edil Medeiros <jose.edil@gmail.com>
@@ -26,17 +26,18 @@ BOOST_AUTO_TEST_CASE(base58_EncodeBase58)
UniValue tests = read_json(json_tests::base58_encode_decode);
for (unsigned int idx = 0; idx < tests.size(); idx++) {
const UniValue& test = tests[idx];
std::string strTest = test.write();
auto strTest = test.write();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes like adding auto instead of the actual type are mostly noise. I don't think there is precedence for merging these. Can you drop them again (here and in other places where it is the sole change on that line)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered the actual type to be just noise in these cases, but you seem to have a stronger preference for minimal diff, reverted.

std::vector<unsigned char> sourcedata = ParseHex(test[0].get_str());
std::string base58string = test[1].get_str();
auto encodedSource = EncodeBase58(ParseHex(test[0].get_str()));
auto base58string = test[1].get_str();
BOOST_CHECK_MESSAGE(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are touching this, why not just make this BOOST_CHECK_EQUAL(EncodeBase58(sourcedata), base58string); and drop all the other noisy changes here?

@@ -56,8 +57,12 @@ BOOST_AUTO_TEST_CASE(base58_DecodeBase58)
}
std::vector<unsigned char> expected = ParseHex(test[0].get_str());
std::string base58string = test[1].get_str();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded whitespace change.

@@ -71,7 +76,7 @@ BOOST_AUTO_TEST_CASE(base58_DecodeBase58)

// check that DecodeBase58 skips whitespace, but still fails with unexpected non-whitespace at the end.
BOOST_CHECK(!DecodeBase58(" \t\n\v\f\r skip \r\f\v\n\t a", result, 3));
BOOST_CHECK( DecodeBase58(" \t\n\v\f\r skip \r\f\v\n\t ", result, 3));
BOOST_CHECK(DecodeBase58(" \t\n\v\f\r skip \r\f\v\n\t ", result, 3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whitespace was intentional such that the escape patterns can easily be compared with one another. Please leave it like it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was specifically requested, but I'll revert, since I liked the spaces: #30035 (comment)

BOOST_CHECK_MESSAGE(result.size() == expected.size() && std::equal(result.begin(), result.end(), expected.begin()), strTest);
BOOST_CHECK_MESSAGE(
result == expected,
strTest << ": got \"" << HexStr(result) << "\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this change is really worth it (and the one adding more context to the case above). There is no randomness involved here and the programmer will have to debug anyway if there is a failure. The other changes for printing some context do make sense, since there is randomness involved and the failure case may not be reproduced immediately.

auto ok = DecodeBase58Check(encoded, decoded, len + InsecureRandRange(257 - len));
BOOST_CHECK(ok);
BOOST_CHECK(data == decoded);
auto invalidSmallResultLength = InsecureRandRange(len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stick to the symbol naming conventions in https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c - specifically use snake_case everywhere.

Copy link
Contributor Author

@l0rinc l0rinc Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of the code was using this style.

@@ -81,19 +86,26 @@ BOOST_AUTO_TEST_CASE(base58_DecodeBase58)
BOOST_CHECK(!DecodeBase58Check("3vQB7B6MrGQZaxCuFg4oh\0" "0IOl"s, result, 100));
}

BOOST_AUTO_TEST_CASE(base58_random_encode_decode)
BOOST_AUTO_TEST_CASE(base58_random_encode_decode_with_optional_spaces)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this change makes sense, since there is already other stuff added to the test data. I would just leave it as is.

@l0rinc
Copy link
Contributor Author

l0rinc commented Aug 15, 2024

I don't think this change makes sense

k, closing.

@l0rinc l0rinc closed this Aug 15, 2024
@@ -11,6 +11,13 @@
["ecac89cad93923c02321", "EJDM8drfXA6uyA"],
["10c8511e", "Rt5zm"],
["00000000000000000000", "1111111111"],
["00000000000000000000000000000000000000000000000000000000000000000000000000000000", "1111111111111111111111111111111111111111"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to just add the new data, no?

Copy link
Contributor Author

@l0rinc l0rinc Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipped the rest of the changes, moved these over to #30746

@TheCharlatan
Copy link
Contributor

I would have ACKed this if it were cleaned up a bit. The new test data and better error messages in the randomized tests are good changes.

@l0rinc l0rinc deleted the paplorinc/base58-tests branch August 29, 2024 17:32
achow101 added a commit that referenced this pull request Feb 14, 2025
…z (and padding) tests

f919d91 fuzz: Add fuzzing for max_ret_len in DecodeBase58/DecodeBase58Check (Lőrinc)
635bc58 test: Fuzz Base32/Base58/Base64 roundtrip conversions (Lőrinc)
5dd3a0d test: Extend base58_encode_decode.json with edge cases (Lőrinc)
ae40cf1 test: Add padding tests for Base32/Base64 (Lőrinc)

Pull request description:

  Added fuzzed roundtrips for `base[32|58|64]` encoding to make sure encoding/decoding are symmetric.
  Note that if we omit the padding in `EncodeBase32` we won't be able to decode it with `DecodeBase32`.
  Added dedicated padding tests to cover failure behavior
  Also moved over the Base58 json test edge cases from #30035

ACKs for top commit:
  hodlinator:
    re-ACK f919d91
  achow101:
    ACK f919d91

Tree-SHA512: 6a6c63d0a659b70d42aad7a8f37ce6e372756e2c88c84e7be5c1ff1f2a7c58860ed7113acbe1a9658a7d19deb91f0abe2ec527ed660335845cd1e0a9380b4295
@bitcoin bitcoin locked and limited conversation to collaborators Aug 29, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants