Fix the cache cleaning of multiaddresses on peer dial #6873
Conversation
📝 Walkthrough: This pull request modifies how peer discovery events are handled in the discovery module.
Force-pushed a822d66 to 71d86ae
Actionable comments posted: 0
🧹 Nitpick comments (5)
transport/p2p/src/behavior/discovery.rs (5)

- Line 23: Consider removing or repurposing this empty enum. Having an empty public enum may cause confusion and might be unnecessary if no variants are planned. You could remove it entirely or add a placeholder variant for clarity.
- Line 34: Reevaluate using a single address per peer. Storing only one multiaddress in `all_peers` could discard potentially useful addresses. Consider storing a collection (e.g., `HashSet<Multiaddr>`) to handle multiple valid addresses per peer.
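A minimal sketch of the suggested collection-based approach, using plain strings as stand-ins for libp2p's `PeerId` and `Multiaddr` types (the `PeerAddressBook` name and its methods are hypothetical, not part of the crate):

```rust
use std::collections::{HashMap, HashSet};

// Illustrative stand-ins for libp2p::PeerId and libp2p::Multiaddr.
type PeerId = String;
type Multiaddr = String;

#[derive(Default)]
pub struct PeerAddressBook {
    /// All known addresses per peer, instead of a single multiaddress.
    all_peers: HashMap<PeerId, HashSet<Multiaddr>>,
}

impl PeerAddressBook {
    /// Record an announced address, keeping any previously known ones.
    pub fn announce(&mut self, peer: PeerId, addr: Multiaddr) {
        self.all_peers.entry(peer).or_default().insert(addr);
    }

    /// All currently known addresses for a peer (empty if unknown).
    pub fn addresses(&self, peer: &PeerId) -> Vec<Multiaddr> {
        self.all_peers
            .get(peer)
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

With a `HashSet` per peer, a re-announcement of a known address is a no-op, while new addresses accumulate instead of overwriting the previous one.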
- Lines 116-128: Potentially re-adding stale addresses on dial failures. Reinserting the multiaddress into the swarm is useful for reliability; however, consider how to detect and retire stale addresses if multiple attempts fail. Otherwise, you may repeatedly retry dead endpoints.
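One way to retire dead endpoints, sketched under the assumption of a simple consecutive-failure counter (`DialCache` and `MAX_DIAL_FAILURES` are hypothetical names, not from the crate):

```rust
use std::collections::HashMap;

/// Hypothetical threshold after which an address is considered dead.
const MAX_DIAL_FAILURES: u32 = 3;

#[derive(Default)]
pub struct DialCache {
    /// address -> consecutive failed dial attempts
    failures: HashMap<String, u32>,
}

impl DialCache {
    /// Record a failed dial; returns `false` once the address should be
    /// retired instead of reinserted into the swarm.
    pub fn record_failure(&mut self, addr: &str) -> bool {
        let count = self.failures.entry(addr.to_string()).or_insert(0);
        *count += 1;
        *count < MAX_DIAL_FAILURES
    }

    /// A successful dial resets the failure counter for the address.
    pub fn record_success(&mut self, addr: &str) {
        self.failures.remove(addr);
    }
}
```

Reinserting only while `record_failure` returns `true` bounds how often a dead endpoint is retried, while a single success rehabilitates the address.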
- Lines 168-173: Address might be outdated or incomplete. Re-pushing the multiaddress on `PeerDiscovery::Allow` is practical but could lead to attempting outdated addresses. More granular logic (e.g., verifying success before re-appending) may be beneficial.
- Lines 190-197: Consider handling all announced addresses instead of just the last. Relying solely on `multiaddresses.last()` discards earlier addresses, which might still be valid. You could iterate through all addresses or store them in a more comprehensive data structure.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- transport/p2p/src/behavior/discovery.rs (6 hunks)
- transport/p2p/src/swarm.rs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: hoprd / docker
- GitHub Check: hopli / docker
- GitHub Check: Docs / Rust docs
- GitHub Check: tests-smoke-websocket
- GitHub Check: tests-unit-nightly
- GitHub Check: tests-unit
- GitHub Check: tests-smart-contracts
- GitHub Check: Linter
🔇 Additional comments (2)
transport/p2p/src/behavior/discovery.rs (1)

- Line 55: Same concern regarding multiple addresses. This initialization aligns with the current single-address-per-peer approach. However, if you decide to store multiple addresses per peer, update this initial value accordingly.
transport/p2p/src/swarm.rs (1)

- Line 501: Verify the impact of ignoring discovery events. Dropping all `Discovery` events may limit visibility into newly announced peers or addresses. Ensure upstream logic sufficiently covers these cases and that no critical network behavior is lost.
The latest dAppNode package for DAppNodePackage-Hopr has been built and uploaded with IPFS hash: /ipfs/QmTrntCxz63TpcQtK2fLaZ8yxhZ4iBn62UR7qCXJhJdAAT
Force-pushed cbcd1ad to f360243
Actionable comments posted: 0
🧹 Nitpick comments (1)
transport/p2p/src/lib.rs (1)

- Lines 87-89: LGTM! Field order is crucial for correct peer address handling. The placement of the `discovery` field at the end of the struct is essential: Rust drops struct fields in declaration order, so placing `discovery` last ensures it is dropped after the request_response components, allowing it to properly handle peer addresses that might have been removed by them. This ordering dependency should be documented in the module-level documentation to ensure it's not accidentally changed in future refactors.
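The drop-order behavior can be demonstrated with a self-contained sketch (the struct and field names are illustrative, not the actual `HoprNetworkBehavior`): Rust drops struct fields in declaration order, so the field declared last is dropped last.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Records its name into a shared log when dropped.
struct Tracer {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Tracer {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Illustrative stand-in for a behaviour struct: `discovery` is declared
/// last, so it is dropped last.
struct Behaviour {
    request_response: Tracer,
    discovery: Tracer,
}

/// Returns the order in which the fields were dropped.
pub fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let b = Behaviour {
        request_response: Tracer { name: "request_response", log: Rc::clone(&log) },
        discovery: Tracer { name: "discovery", log: Rc::clone(&log) },
    };
    drop(b);
    // After `b` is gone, only our Rc clone remains, so unwrapping succeeds.
    Rc::try_unwrap(log).unwrap().into_inner()
}
```

Here the first-declared field is destroyed first, confirming that moving `discovery` to the end defers its destruction until after the other components.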
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- transport/p2p/src/lib.rs (1 hunks)
🔇 Additional comments (2)
transport/p2p/src/lib.rs (2)

- Lines 115-164: LGTM! Initialization order is correct. The initialization order in the `new` method is correct, with `discovery` being initialized first. This maintains proper functionality while working with the new field order.
- Lines 76-90: ❓ Verification inconclusive. Verify thread safety of peer address handling. The field ordering ensures correct drop order, but we should verify that peer address updates are properly synchronized between request_response components and discovery. Let's check for any potential race conditions in the discovery module:

🏁 Script executed:

```shell
#!/bin/bash
# Description: Check for synchronization mechanisms in discovery module
# Look for mutex/atomic usage around peer address handling
rg -A 5 'Mutex|RwLock|atomic|sync' transport/p2p/src/behavior/discovery.rs
```

Length of output: 76
Manual verification needed for thread safety in peer address updates. The discovery module does not appear to use explicit synchronization mechanisms (e.g. Mutex, RwLock, atomics) per our search. While the field ordering in `HoprNetworkBehavior` ensures a deterministic drop order (with `discovery` being dropped last), it's unclear if the design sufficiently protects concurrent peer address modifications when request_response components remove addresses and discovery re-adds them. Please manually verify that the peer address handling is thread safe under concurrent scenarios.
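A hedged sketch of one possible synchronization scheme, wrapping a shared peer-address map in a `Mutex` so concurrent inserts and removals serialize (all names here are illustrative; the crate may instead rely on single-threaded swarm polling, which would make explicit locking unnecessary):

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex};
use std::thread;

/// Shared peer-address map: peer id -> set of multiaddress strings.
type Addresses = Arc<Mutex<HashMap<String, HashSet<String>>>>;

/// One thread announces addresses while another removes them; the Mutex
/// ensures every update is applied atomically. Returns the final count.
pub fn concurrent_updates() -> usize {
    let addrs: Addresses = Arc::new(Mutex::new(HashMap::new()));

    let writer = {
        let addrs = Arc::clone(&addrs);
        thread::spawn(move || {
            for i in 0..100 {
                addrs
                    .lock()
                    .unwrap()
                    .entry("peer".to_string())
                    .or_default()
                    .insert(format!("/ip4/10.0.0.{i}/tcp/4001"));
            }
        })
    };
    let remover = {
        let addrs = Arc::clone(&addrs);
        thread::spawn(move || {
            for _ in 0..100 {
                // Remove an arbitrary address if one is present.
                if let Some(set) = addrs.lock().unwrap().get_mut("peer") {
                    let first = set.iter().next().cloned();
                    if let Some(a) = first {
                        set.remove(&a);
                    }
                }
            }
        })
    };
    writer.join().unwrap();
    remover.join().unwrap();
    let guard = addrs.lock().unwrap();
    guard.get("peer").map(|s| s.len()).unwrap_or(0)
}
```

The interleaving is nondeterministic, but every individual update happens under the lock, so the map never observes a torn state.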
The latest dAppNode package for DAppNodePackage-Hopr has been built and uploaded with IPFS hash: /ipfs/QmZdkrQzTrrVKvEpjgNJy8Z9yWVBDwDZWBRxnFKGSXUTSQ
Actionable comments posted: 2
🧹 Nitpick comments (2)
transport/p2p/src/behavior/discovery.rs (2)

- Line 23: Consider documenting or removing the empty enum. The `Event` enum is now empty after removing the `NewPeerMultiddress` variant. If it's no longer needed, consider removing it entirely. Otherwise, add documentation explaining why it's kept empty.
- Line 34: Add documentation for the `all_peers` field. This field is crucial for the new caching behavior. Please add documentation explaining its purpose in retaining peer addresses across dial attempts.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- transport/p2p/src/behavior/discovery.rs (7 hunks)
The latest dAppNode package for DAppNodePackage-Hopr has been built and uploaded with IPFS hash: /ipfs/QmSfcXY3kCERLfioZBHwdddeeqKw5diQSyrReDpeahgyo1
The latest dAppNode package for DAppNodePackage-Hopr has been built and uploaded with IPFS hash: /ipfs/QmcMUsCFmSkhJCXCVMVjJTwwW5EocHTrnqRZhQobEvCDdz
crate version bump missing
The latest dAppNode package for DAppNodePackage-Hopr has been built and uploaded with IPFS hash: /ipfs/QmU8SCyvDZw8oPZwdamWK5GWazxDuNcqHK88NcnqpsnzwA
As identified by @tolbrino, the swarm voids the cache for the peer id that failed to dial, making the multiaddress unknown for subsequent redials and causing an incremental decline in peer connectability.
To solve this, the node now holds the entire announced peer list and, on failed dials, reinserts the multiaddress for the peer that failed to be dialed.
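The mechanism described above can be sketched as follows (string stand-ins are used for libp2p's `PeerId`/`Multiaddr`; `Discovery`, `dial_cache`, and the method names are hypothetical simplifications, not the actual crate API):

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct Discovery {
    /// Every announced peer with its last known multiaddress, retained
    /// for the lifetime of the node.
    all_peers: HashMap<String, String>,
    /// The swarm's volatile dial cache, which is voided on dial failure.
    dial_cache: HashMap<String, String>,
}

impl Discovery {
    /// A peer announcement populates both the retained list and the cache.
    pub fn on_announce(&mut self, peer: &str, addr: &str) {
        self.all_peers.insert(peer.to_string(), addr.to_string());
        self.dial_cache.insert(peer.to_string(), addr.to_string());
    }

    /// On a failed dial the swarm drops the cached address, so restore it
    /// from the retained peer list to keep the peer redialable.
    pub fn on_dial_failure(&mut self, peer: &str) {
        self.dial_cache.remove(peer); // what the swarm does on failure
        if let Some(addr) = self.all_peers.get(peer) {
            self.dial_cache.insert(peer.to_string(), addr.clone());
        }
    }

    /// Address the next dial attempt would use, if any.
    pub fn cached(&self, peer: &str) -> Option<&String> {
        self.dial_cache.get(peer)
    }
}
```

Without the reinsertion step, each failed dial would permanently shrink the set of dialable peers, which is the incremental connectability decline described above.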
Closes #6869