Skip to content

Conversation

ValarDragon
Copy link
Contributor

@ValarDragon ValarDragon commented May 5, 2024

Closes #3005

As noted in that issue, we currently are doing extra CPU overhead and mutex contention for getting a random number. This PR removes this overhead by making every performance sensitive thread have its own rand instance.

In a subsequent PR, we can cleanup all the testing usages, and likely just entirely delete our internal randomness package. I didn't do that in this PR to keep it straightforward to verify.


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments
  • Title follows the Conventional Commits spec

@ValarDragon ValarDragon requested a review from a team as a code owner May 5, 2024 23:47
@ValarDragon ValarDragon requested a review from a team May 5, 2024 23:47
@ValarDragon ValarDragon changed the title Make every gossip thread use its own randomness instance, reducing contention perf: Make every gossip thread use its own randomness instance, reducing contention May 5, 2024
Comment on lines 314 to 316
// NOTE: This relies on the os's random number generator.
// For real security, we should salt that with some seed.
// See github.com/cometbft/cometbft/crypto for a more secure reader.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this comment, but will leave to the general cleanup PR for this package to remove.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment does not make sense given the preamble of this source file...

Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ValarDragon ❤️

Comment on lines +45 to +46
// G404: Use of weak random number generator (math/rand instead of crypto/rand)
//nolint:gosec
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the way we create the local Rand object also has this nolint, so its not newly introduced.

Copy link

@cason cason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a synchronized random source here does not make sense.

The solution, however, is pretty complex, but it is what we can do right now. I would also check other internal uses of this internal random library to understand when we actually need a thread-safe version of it.

@@ -87,6 +87,24 @@ func TestRngConcurrencySafety(_ *testing.T) {
wg.Wait()
}

// Makes a new stdlib random instance 100 times concurrently.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we test that cRandBytes is thread safe instead?

Comment on lines 314 to 316
// NOTE: This relies on the os's random number generator.
// For real security, we should salt that with some seed.
// See github.com/cometbft/cometbft/crypto for a more secure reader.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment does not make sense given the preamble of this source file...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow impressed that the bit array method is only used here. I wonder whether implementing part of this logic here, instead of there, could be a better solution.

At the end, we are counting the 1 bits, then getting a random index from zero to this number, then retrieving the i-th 1 bit. The random part we could implement outside the bit array library.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented May 7, 2024

Every other usage from what I can tell can just be replaced with math.Rand, but wanted to do that in a separate PR. (Math.rand's global instance is thread safe, by internally doing CAS operations and a retry loop with no wait time)

@cason
Copy link

cason commented May 8, 2024

The problem of math/rand is the setup/seed used. Paying the synchronization costs does not make much sense in most of our use cases.

@melekes melekes added the backport-to-v1.x Tell Mergify to backport the PR to v1.x label May 9, 2024
@melekes melekes added this pull request to the merge queue May 9, 2024
Merged via the queue into cometbft:main with commit df28e73 May 9, 2024
mergify bot pushed a commit that referenced this pull request May 9, 2024
…ing contention (#3006)

Closes #3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit df28e73)

# Conflicts:
#	internal/consensus/reactor.go
melekes added a commit that referenced this pull request May 9, 2024
…ing contention (backport #3006) (#3045)

Closes #3005 

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #3006 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request May 26, 2024
…ing contention (cometbft#3006)

Closes cometbft#3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
mergify bot pushed a commit to osmosis-labs/cometbft that referenced this pull request May 28, 2024
…ing contention (cometbft#3006)

Closes cometbft#3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit f55b9f4)
PaddyMc added a commit to osmosis-labs/cometbft that referenced this pull request May 28, 2024
…… (backport #77) (#82)

* perf: Make every gossip thread use its own randomness instance, reducing contention (cometbft#3006)

Closes cometbft#3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit f55b9f4)

* Add Changelgo

(cherry picked from commit ce04f04)

# Conflicts:
#	CHANGELOG.md

* Fix changelog further

(cherry picked from commit bd34ce6)

# Conflicts:
#	CHANGELOG.md

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Dev Ojha <dojha@berkeley.edu>
Co-authored-by: PaddyMc <paddymchale@hotmail.com>
PaddyMc added a commit to osmosis-labs/cometbft that referenced this pull request Aug 19, 2024
…… (backport #77) (#82)

* perf: Make every gossip thread use its own randomness instance, reducing contention (cometbft#3006)

Closes cometbft#3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit f55b9f4)

* Add Changelgo

(cherry picked from commit ce04f04)

* Fix changelog further

(cherry picked from commit bd34ce6)

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Dev Ojha <dojha@berkeley.edu>
Co-authored-by: PaddyMc <paddymchale@hotmail.com>
ValarDragon added a commit to osmosis-labs/cometbft that referenced this pull request Aug 19, 2024
…… (backport #77) (#82) (#132)

* perf: Make every gossip thread use its own randomness instance, reducing contention (cometbft#3006)

Closes cometbft#3005

As noted in that issue, we currently are doing extra CPU overhead and
mutex contention for getting a random number. This PR removes this
overhead by making every performance sensitive thread have its own rand
instance.

In a subsequent PR, we can cleanup all the testing usages, and likely
just entirely delete our internal randomness package. I didn't do that
in this PR to keep it straightforward to verify.

---

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [ ] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit f55b9f4)

* Add Changelgo

(cherry picked from commit ce04f04)

* Fix changelog further

(cherry picked from commit bd34ce6)

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Dev Ojha <dojha@berkeley.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-to-v1.x Tell Mergify to backport the PR to v1.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Delete usage of comet's libs/rand, make each gossip routine maintain its own rand instance.
3 participants