Several randomness improvements #29625

sipa · 2024-03-11T19:36:49Z

This PR contains a number of vaguely-related improvements to the random module.

The specific changes and more detailed rationale are in the commit messages, but the highlights are:

XoRoShiRo128PlusPlus (previously a test-only RNG) moves to random.h and becomes InsecureRandomContext, which is even faster than FastRandomContext but non-cryptographic. It also gets all helper randomness functions (randrange, fillrand, ...), making it a lot more succinct to use.
During tests, all randomness is made deterministic (except for GetStrongRandBytes) but non-repeating (like GetRand() used to be when g_mock_deterministic_tests was used), seeded either with a fixed value or with a random seed (which can be overridden by an environment variable).
Several infrequently used top-level functions (GetRandMillis, GetRandMicros, GetExponentialRand) are converted into member functions of FastRandomContext (and InsecureRandomContext).
GetRand<T>() (without argument) can now return the maximum value of the type (previously e.g. GetRand<uint32_t>() would never return 0xffffffff).
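
For readers unfamiliar with the generator named in the first highlight, the following is a minimal sketch of xoroshiro128++ seeded through SplitMix64, following the publicly documented reference algorithm. The names `RotL`, `SplitMix64` and `Xoroshiro128pp` are assumptions made for illustration; this is not the interface of InsecureRandomContext in random.h.

```cpp
#include <cstdint>

// Rotate x left by k bits (0 < k < 64).
constexpr uint64_t RotL(uint64_t x, int k) noexcept
{
    return (x << k) | (x >> (64 - k));
}

// One SplitMix64 step, used here only to expand a 64-bit seed into state words.
constexpr uint64_t SplitMix64(uint64_t& state) noexcept
{
    uint64_t z = (state += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// Hypothetical stand-alone xoroshiro128++ generator: fast, but NOT cryptographic.
class Xoroshiro128pp
{
    uint64_t m_s0;
    uint64_t m_s1;

public:
    explicit constexpr Xoroshiro128pp(uint64_t seed) noexcept
        : m_s0(SplitMix64(seed)), m_s1(SplitMix64(seed)) {}

    // Produce the next 64-bit output and advance the 128-bit state.
    constexpr uint64_t rand64() noexcept
    {
        const uint64_t s0 = m_s0;
        uint64_t s1 = m_s1;
        const uint64_t result = RotL(s0 + s1, 17) + s0;
        s1 ^= s0;
        m_s0 = RotL(s0, 49) ^ s1 ^ (s1 << 21);
        m_s1 = RotL(s1, 28);
        return result;
    }
};
```

Two generators constructed from the same seed produce the same sequence, which is the property that makes such a generator convenient for reproducible tests.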

DrahtBot · 2024-03-11T19:36:52Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	achow101, maflcko, hodlinator, dergoegge
Concept ACK	theuni
Stale ACK	EthanHeilman

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30377 (refactor: Make uint256S(const char*) consteval by hodlinator)
#30277 ([DO NOT MERGE] Erlay: bandwidth-efficient transaction relay protocol (Full implementation) by sr-gi)
#30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
#29543 (refactor: Avoid unsigned integer overflow in script/interpreter.cpp by hebasto)
#29536 (fuzz: fuzz connman with non-empty addrman + ASMap by brunoerg)
#29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
#26114 (net: Make AddrFetch connections to fixed seeds by mzumsande)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot · 2024-03-12T04:41:53Z

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/22530315845

dergoegge · 2024-03-12T10:09:39Z

Concept ACK

There are a few places in the fuzz tests where this will make it easy to replace FastRandomContext with an InsecureRandomContext, which is beneficial for performance (e.g. the addrman harnesses partially fill the addrman with addresses from rng), and we don't need cryptographic rng there anyway.

During tests, all randomness is made deterministic

Great, this should help with #29018

Sjors · 2024-03-12T15:11:44Z

What's the impact on the fuzz corpus of switching to a different (?) deterministic RNG?

dergoegge · 2024-03-12T15:53:35Z

What's the impact on the fuzz corpus of switching to a different (?) deterministic RNG?

I would expect that switching to a different rng should have no meaningful effect on the corpus itself. The corpus for a particular harness might change but the coverage for the code we intend to test should remain the same. This is because using rng in a fuzz harness only makes sense in very rare cases. It should never be used in a way that can significantly affect the coverage reached, otherwise there is no point in using a coverage-guided fuzzer, we could just pipe /dev/random to our harnesses.

For example, if we need to populate some data that we don't really expect to have an impact on the thing we are testing, we might use rng instead of consuming from the fuzz input (we do this in the p2p transport harnesses to fill message contents, which are essentially irrelevant to the transport logic).

Switching to deterministic rng can cause a corpus' coverage to grow because coverage-guided feedback loops start working more reliably when the code under test is deterministic. This can vary from harness to harness, but we've seen coverage-guided fuzzers find bugs once we've improved on determinism.

achow101 · 2024-07-01T19:14:12Z

ACK ce80942

maflcko

re-ACK ce80942 🐈

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK ce8094246ee95232e9d84f7e37f3c0a43ef587ce 🐈
iPUv215Td9GqZ8iEiD8qyKJufeqJCMlNIHYw/Ha/B/66jmXmcrePorRaTQ+YqnML9Nkr4A+IaY3dqwDPvXI5Cg==

Reminder for myself: #29625 (comment)

maflcko · 2024-07-03T14:30:40Z

src/net_processing.cpp

@@ -2522,7 +2522,7 @@ void PeerManagerImpl::ProcessGetBlockData(CNode& pfrom, Peer& peer, const CInv&
                if (a_recent_compact_block && a_recent_compact_block->header.GetHash() == pindex->GetBlockHash()) {
                    MakeAndPushMessage(pfrom, NetMsgType::CMPCTBLOCK, *a_recent_compact_block);
                } else {
-                    CBlockHeaderAndShortTxIDs cmpctblock{*pblock, GetRand<uint64_t>()};
+                    CBlockHeaderAndShortTxIDs cmpctblock{*pblock, FastRandomContext().rand64()};


nit: (haven't tried), but in a follow-up this could use m_rng, because the lock is already taken IIRC.

Done in #30393

hodlinator

ACK ce80942

Mainly code review. Did not attempt to check thread safety compliance. (Passed make check & test/functional/test_runner.py).

`e2d1f84` - random: make GetRand() support entire range (incl. max)

The commit message title probably should be "random: make GetRand<T>() support entire range (incl. max)", since the overloads taking parameters are still exclusive at the upper end. It felt dangerous that the behavior of GetRand<T>(void) differed in this way from the overloads. I'm happy others suggested inlining the calls, as randrange() and rand<T>() are much clearer!
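
The distinction raised here, an exclusive upper bound for the range-taking helpers versus the full range of the type for the argument-less form, can be sketched as follows. `std::mt19937_64` stands in for the PR's RNG classes, and `RandRange` and `RandFull` are hypothetical free functions, not the member functions in random.h.

```cpp
#include <cstdint>
#include <limits>
#include <random>
#include <type_traits>

// Uniform value in [0, range): the upper bound is exclusive. Requires range > 0.
// Rejection sampling on a bit mask avoids modulo bias.
inline uint64_t RandRange(std::mt19937_64& rng, uint64_t range)
{
    const uint64_t max = range - 1;
    uint64_t mask = max;
    // Smear the highest set bit downwards so that mask is 2^k - 1 and mask >= max.
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    while (true) {
        const uint64_t candidate = rng() & mask;
        if (candidate <= max) return candidate;
    }
}

// Uniform value over the entire range of unsigned type T, including its maximum.
template <typename T>
T RandFull(std::mt19937_64& rng)
{
    static_assert(std::is_unsigned_v<T>, "sketch only handles unsigned types");
    return static_cast<T>(rng() & std::numeric_limits<T>::max());
}
```

In this sketch, `RandRange(rng, 0xffffffff)` can never return 0xffffffff, while `RandFull<uint32_t>(rng)` can.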

hodlinator · 2024-07-02T19:41:16Z

src/random.h

+        if (bits == 64) return Impl().rand64();
+        uint64_t ret;
+        if (bits <= bitbuf_size) {
+            // If there is enough entropy left in bitbuf, return its bottom bits bits.


In 21ce9d8: nit: typo - "bits bits"

It's not actually a typo, it means "Return the bottom bits many bits", but I do see why it'd be confusing. Will address if I have to repush.
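
The "bottom bits many bits" idea can be illustrated with a stand-alone sketch of a bit buffer that hands out the low `bits` bits of a cached 64-bit word and fetches a new word only when it runs short. `BitBuffer` and its source callback are assumptions made for illustration; the PR implements this inside its RNG classes, and details there may differ.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

class BitBuffer
{
    std::function<uint64_t()> m_source; // produces fresh 64-bit random words
    uint64_t m_bitbuf{0};               // unused random bits, kept in the low end
    int m_bitbuf_size{0};               // number of valid bits in m_bitbuf

public:
    explicit BitBuffer(std::function<uint64_t()> source) : m_source(std::move(source)) {}

    // Return `bits` random bits (0 <= bits <= 64) in the low end of the result.
    uint64_t randbits(int bits)
    {
        if (bits == 64) return m_source();
        uint64_t ret;
        if (bits <= m_bitbuf_size) {
            // Enough bits are buffered: hand out the bottom `bits` bits.
            ret = m_bitbuf;
            m_bitbuf >>= bits;
            m_bitbuf_size -= bits;
        } else {
            // Not enough: fetch a new word, stack it above the leftover bits,
            // and keep whatever is not returned for later calls.
            const uint64_t gen = m_source();
            ret = (gen << m_bitbuf_size) | m_bitbuf;
            m_bitbuf = gen >> (bits - m_bitbuf_size);
            m_bitbuf_size = 64 + m_bitbuf_size - bits;
        }
        // Mask off everything above the requested number of bits.
        return ret & ((uint64_t{1} << bits) - 1);
    }
};
```

With a source that returns 0x0807060504030201, eight successive `randbits(8)` calls yield 1 through 8 while consuming only a single 64-bit word.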

hodlinator · 2024-07-03T19:55:37Z

src/random.cpp

+    //   is F(x) = -log(1 - x).
+    //
+    // Combining the two, and using log1p(x) = log(1 + x), we obtain the following:
+    return -std::log1p((uniform >> 11) * -0x1.0p-53);


In d5fcbe9: Always throwing away 11 bits of entropy in the new version compared to 0 before the PR. I guess you want to preserve the expression from the linked site and it's unclear whether not throwing away the bits would be more performant.

Yeah, I wanted to keep MakeExponential as stand-alone reviewable as possible. I don't think any of the call sites are so performance-critical that the difference matters.
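
To make the expression under discussion concrete, here is a small stand-alone sketch. It keeps the top 53 bits of a 64-bit value (discarding the low 11), scales them into x in [0, 1), and applies the inverse CDF F(x) = -log(1 - x) through `log1p`. The function name `ExponentialFromUniform` is an assumption for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Map a uniformly random 64-bit integer to an exponentially distributed
// double with mean 1.
double ExponentialFromUniform(uint64_t uniform) noexcept
{
    // (uniform >> 11) is an integer in [0, 2^53); multiplying by -2^-53
    // yields -x for some x in [0, 1).
    // -log1p(-x) equals -log(1 - x), the inverse CDF of the distribution.
    return -std::log1p((uniform >> 11) * -0x1.0p-53);
}
```

Because x never reaches 1, the result is always finite; the largest possible output is 53 * ln(2), roughly 36.7.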

hodlinator · 2024-07-03T21:15:41Z

src/test/util/net.cpp

@@ -126,7 +126,7 @@ std::vector<NodeEvictionCandidate> GetRandomNodeEvictionCandidates(int n_candida
            /*fRelevantServices=*/random_context.randbool(),
            /*m_relay_txs=*/random_context.randbool(),
            /*fBloomFilter=*/random_context.randbool(),
-            /*nKeyedNetGroup=*/random_context.randrange(100),
+            /*nKeyedNetGroup=*/random_context.randrange(100u),


In e2d1f84: Now we are in C++20 land, it might be time to use real designated initializers instead of comments when touching blocks like these?

.id=id,
.m_connected=std::chrono::seconds{random_context.randrange(100)},
.m_min_ping_time=std::chrono::microseconds{random_context.randrange(100)},
.m_last_block_time=std::chrono::seconds{random_context.randrange(100)},
.m_last_tx_time=std::chrono::seconds{random_context.randrange(100)},
.fRelevantServices=random_context.randbool(),
.m_relay_txs=random_context.randbool(),
.fBloomFilter=random_context.randbool(),
.nKeyedNetGroup=random_context.randrange(100u),
.prefer_evict=random_context.randbool(),
.m_is_local=random_context.randbool(),
.m_network=ALL_NETWORKS[random_context.randrange(ALL_NETWORKS.size())],
.m_noban=false,
.m_conn_type=ConnectionType::INBOUND,

This is already a list initialization, so I don't think clang-tidy can pick up the named args at all. Happy to review a follow-up, if you decide to open one.

Follow-up: #30397
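
As an illustration of the suggestion, the snippet below contrasts comment-annotated positional initialization with C++20 designated initializers on a small hypothetical aggregate. `Candidate` and its fields are invented for this example and are far smaller than the real NodeEvictionCandidate.

```cpp
#include <cstdint>

// Hypothetical aggregate, for illustration only.
struct Candidate {
    int id;
    bool relay_txs;
    uint64_t keyed_net_group;
    bool prefer_evict;
};

// Positional form: the comments are not checked by the compiler, so swapping
// the two bool arguments would still compile.
inline Candidate MakePositional(int id)
{
    return Candidate{
        /*id=*/id,
        /*relay_txs=*/true,
        /*keyed_net_group=*/7u,
        /*prefer_evict=*/false,
    };
}

// Designated form (C++20): the field names are checked by the compiler and
// must appear in declaration order.
inline Candidate MakeDesignated(int id)
{
    return Candidate{
        .id = id,
        .relay_txs = true,
        .keyed_net_group = 7u,
        .prefer_evict = false,
    };
}
```

Both functions build the same object; the designated form turns a misnamed or reordered field into a compile error instead of a silent bug.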

hodlinator · 2024-07-03T21:57:43Z

src/txmempool.cpp

    if (m_opts.check_ratio == 0) return;

-    if (GetRand(m_opts.check_ratio) >= 1) return;
+    if (FastRandomContext().randrange(m_opts.check_ratio) >= 1) return;


In ddc184d: The commit message in e2d1f84 made me do a survey of (mis)uses of GetRand() . 2 similar cases stood out to me at first, but they appear correct after some noodling. This is one of them and the other is check_block_index in validation.cpp.

(check_ratio is often set to 1 (always) or 0 (never)).

Sharing the same return path actually makes the behavior slightly quicker for me to grok:

if (m_opts.check_ratio == 0 || FastRandomContext().randrange(m_opts.check_ratio) >= 1) return;
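
The early-return logic above amounts to "run the check with probability 1/check_ratio, and never when check_ratio is 0". A stand-alone sketch of that reading, with `std::mt19937_64` standing in for FastRandomContext and `ShouldRunCheck` as a hypothetical helper:

```cpp
#include <cstdint>
#include <random>

// Decide whether an expensive consistency check should run this time.
bool ShouldRunCheck(std::mt19937_64& rng, uint64_t check_ratio)
{
    if (check_ratio == 0) return false; // checks disabled
    // Uniform draw from [0, check_ratio); only the value 0 triggers the check,
    // which mirrors "randrange(check_ratio) >= 1 means return early".
    std::uniform_int_distribution<uint64_t> dist(0, check_ratio - 1);
    return dist(rng) == 0;
}
```

With check_ratio set to 1 the draw can only be 0, so the check always runs, matching the "1 (always) or 0 (never)" remark.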

dergoegge

utACK ce80942

…ction

6ecda04 random: drop ad-hoc Shuffle in favor of std::shuffle (Pieter Wuille)
da28a26 bench random: benchmark more functions, and add InsecureRandomContext (Pieter Wuille)
0a9bbc6 random bench refactor: move to new bench/random.cpp (Pieter Wuille)

Pull request description:

This adds benchmarks for various operations on `FastRandomContext` and `InsecureRandomContext`, and then removes the ad-hoc `Shuffle` functions, now that it appears that standard library `std::shuffle` has comparable performance. The other reason for keeping `Shuffle`, namely the fact that libstdc++ used self-move (which debug mode panics on) has been fixed as well (see #29625 (comment)).

ACKs for top commit:
achow101: ACK 6ecda04
hodlinator: ACK 6ecda04
dergoegge: Code review ACK 6ecda04

Tree-SHA512: 2560b7312410581ff2b9bd0716e0f1558d910b5eadb9544785c972384985ac0f11f72d6b2797cfe2e7eb71fa57c30cffd98cc009cb4ee87a18b1524694211417

e233ec0 refactor: Use designated initializer (Hodlinator)

Pull request description:

Block was recently touched (e2d1f84) and the codebase recently switched to C++20 which allows this to improve robustness. Follow-up suggested in #29625 (comment)

ACKs for top commit:
maflcko: ACK e233ec0

Tree-SHA512: ce3a18f513421e923710a43c8f97db1badb7ff5c6bdbfd62d9543312d2225731db5c14bef16feb47c43b84fad4dc24485086634b680feba422d2b7b363e13fa6

hebasto · 2024-07-14T06:51:42Z

Ported to the CMake-based build system in hebasto#264.

maflcko · 2024-07-31T10:52:24Z

src/random.cpp

-    uint256 ret;
-    rng.Keystream(MakeWritableByteSpan(ret));
-    return ret;
+    ProcRand(nullptr, 0, RNGLevel::PERIODIC, /*always_use_real_rng=*/false);


test-only follow-up question: Just a nit, because this only affects tests, but for consistency, I couldn't figure out why the periodic seed does not use the real RNG. The result is only used internally, so it should be fine, in line with the other calls. For example, SeedStrengthen is called as part of the periodic seeding and also uses the real RNG. Also, if this were problematic and made tests non-deterministic, the same issues should appear when someone in the tests called GetStrongRandBytes or Random_SanityCheck, no?

DrahtBot added the CI failed label Mar 12, 2024

This was referenced Mar 12, 2024

net: Make AddrFetch connections to fixed seeds #26114

Closed

Enable HW-accelerated implementations of SHA256 for MSVC builds #24773

Closed

sipa force-pushed the 202403_rand_rework branch 2 times, most recently from b8d2aa9 to 3ad67b0 Compare March 12, 2024 20:55

This was referenced Mar 13, 2024

scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] #29641

Draft

Drop log category in SeedStartup #29480

Closed

sipa force-pushed the 202403_rand_rework branch 8 times, most recently from c07a68c to b5c10a4 Compare March 13, 2024 19:32

DrahtBot mentioned this pull request Mar 14, 2024

Make (Read/Write)BinaryFile work with char vector, use AutoFile #29229

Closed

sipa force-pushed the 202403_rand_rework branch 5 times, most recently from 019d483 to 1b75d68 Compare March 14, 2024 20:30

random: replace construct/assign with explicit Reseed()

ce80942

sipa force-pushed the 202403_rand_rework branch from aeb4de2 to ce80942 Compare July 1, 2024 16:40

DrahtBot requested a review from maflcko July 1, 2024 19:14

DrahtBot mentioned this pull request Jul 2, 2024

refactor: Replace ParseHex with consteval ""_hex literals #30377

Merged

maflcko approved these changes Jul 3, 2024

maflcko requested review from dergoegge, EthanHeilman and theuni and removed request for EthanHeilman, theuni and dergoegge July 3, 2024 14:31

hodlinator approved these changes Jul 3, 2024

dergoegge approved these changes Jul 4, 2024

fanquake merged commit 5c0cd20 into bitcoin:master Jul 4, 2024

maflcko mentioned this pull request Jul 4, 2024

refactor: use existing RNG object in ProcessGetBlockData #30393

Merged

hebasto added the Needs CMake port label Jul 4, 2024

sipa mentioned this pull request Jul 5, 2024

random: add benchmarks and drop unnecessary Shuffle function #30396

Merged

hodlinator mentioned this pull request Jul 5, 2024

refactor: Use designated initializer in test/util/net.cpp #30397

Merged

ryanofsky mentioned this pull request Jul 9, 2024

logging: Replace LogError and LogWarning with LogAlert #30364

Closed

Sjors mentioned this pull request Jul 11, 2024

Stratum v2 Noise Protocol #29346

Closed

hebasto mentioned this pull request Jul 14, 2024

cmake: Regular rebasing of the cmake-staging branch hebasto/bitcoin#264

Closed

hebasto removed the Needs CMake port label Jul 14, 2024

maflcko reviewed Jul 31, 2024

This was referenced Aug 22, 2024

Unit test failures when using multiple jobs and RANDOM_CTX_SEED #30696

Closed

test: Fix RANDOM_CTX_SEED use with parallel tests #30737

Closed

bitcoin locked and limited conversation to collaborators Jul 31, 2025
