-
Notifications
You must be signed in to change notification settings - Fork 902
Error from RPC send_response
when request doesn't exist on the active inbound requests
#7663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
mergify
merged 4 commits into
sigp:release-v7.1.0
from
jxs:handle-rpc-send-response-errors
Jul 9, 2025
Merged
Error from RPC send_response
when request doesn't exist on the active inbound requests
#7663
mergify
merged 4 commits into
sigp:release-v7.1.0
from
jxs:handle-rpc-send-response-errors
Jul 9, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
send_response
when request doesn't exist on the active inbound requests.send_response
when request doesn't exist on the active inbound requests
michaelsproul
added a commit
that referenced
this pull request
Jun 29, 2025
… inbound requests (#7663) Squashed commit of the following: commit 0ce690c Author: João Oliveira <hello@jxs.pt> Date: Fri Jun 27 22:50:11 2025 +0100 Error from RPC `send_response` when request doesn't exist on the active inbound requests. And handle the error from the main service to check if it may be a data race or a critical bug commit 522e00f Author: Jimmy Chen <jimmy@sigmaprime.io> Date: Sat Jun 28 05:01:46 2025 +1000 Fix incorrect `waker` update condition (#7656) This bug was first found and partially fixed by @VolodymyrBg in #7317 - this PR applies the same fix everywhere else. The old logic updated the waker when it already matched the context, and did nothing when it was stale: ```rust if waker.will_wake(cx.waker()) { self.waker = Some(cx.waker().clone()); } ``` This is the wrong way around. We only want to update the waker if it doesn't match the current context: ```rust if !waker.will_wake(cx.waker()) { self.waker = Some(cx.waker().clone()); } ``` I don't think we've ever noticed any issues, but it’s a subtle bug that could lead to missed wakeups. commit 83cad25 Author: Jimmy Chen <jimmy@sigmaprime.io> Date: Sat Jun 28 04:21:17 2025 +1000 Fix Rust 1.88 clippy errors & execution engine tests (#7657) Fix Rust 1.88 clippy errors. commit 9b1f3ed Author: Pawan Dhananjay <pawandhananjay@gmail.com> Date: Thu Jun 26 17:26:38 2025 -0700 Add gossip check (#7652) N/A Add an additional gossip condition. commit a0a6b93 Author: Daniel Knopik <107140945+dknopik@users.noreply.github.com> Date: Wed Jun 25 08:22:24 2025 +0200 Do not compute sync selection proofs for the sync duty at the current slot (#7551) commit 8e3c5d1 Author: chonghe <44791194+chong-he@users.noreply.github.com> Date: Wed Jun 25 13:33:17 2025 +0800 Rust 1.89 compiler lint fix (#7644) Fix lints for Rust 1.89 beta compiler commit 56b2d4b Author: Eitan Seri-Levi <eserilev@ucsc.edu> Date: Tue Jun 24 09:29:10 2025 +0300 Remove instrumenting log level (#7636) #7155 Theres some additional places we set instrumenting log levels that wasn't covered in #7620 commit fd643c3 Author: Michael Sproul <michael@sigmaprime.io> Date: Mon Jun 23 23:11:46 2025 +1000 Un-ignore EF test for v1.6.0-alpha.1 (#7632) Closes: - #7547 Run the test that was previously ignored when we were between spec versions. commit cef04ee Author: chonghe <44791194+chong-he@users.noreply.github.com> Date: Mon Jun 23 16:37:49 2025 +0800 Implement `validator_identities` Beacon API endpoint (#7462) * #7442
michaelsproul
added a commit
that referenced
this pull request
Jul 1, 2025
… inbound requests (#7663) Squashed commit of the following: commit 0ce690c Author: João Oliveira <hello@jxs.pt> Date: Fri Jun 27 22:50:11 2025 +0100 Error from RPC `send_response` when request doesn't exist on the active inbound requests. And handle the error from the main service to check if it may be a data race or a critical bug commit 522e00f Author: Jimmy Chen <jimmy@sigmaprime.io> Date: Sat Jun 28 05:01:46 2025 +1000 Fix incorrect `waker` update condition (#7656) This bug was first found and partially fixed by @VolodymyrBg in #7317 - this PR applies the same fix everywhere else. The old logic updated the waker when it already matched the context, and did nothing when it was stale: ```rust if waker.will_wake(cx.waker()) { self.waker = Some(cx.waker().clone()); } ``` This is the wrong way around. We only want to update the waker if it doesn't match the current context: ```rust if !waker.will_wake(cx.waker()) { self.waker = Some(cx.waker().clone()); } ``` I don't think we've ever noticed any issues, but it’s a subtle bug that could lead to missed wakeups. commit 83cad25 Author: Jimmy Chen <jimmy@sigmaprime.io> Date: Sat Jun 28 04:21:17 2025 +1000 Fix Rust 1.88 clippy errors & execution engine tests (#7657) Fix Rust 1.88 clippy errors. commit 9b1f3ed Author: Pawan Dhananjay <pawandhananjay@gmail.com> Date: Thu Jun 26 17:26:38 2025 -0700 Add gossip check (#7652) N/A Add an additional gossip condition. commit a0a6b93 Author: Daniel Knopik <107140945+dknopik@users.noreply.github.com> Date: Wed Jun 25 08:22:24 2025 +0200 Do not compute sync selection proofs for the sync duty at the current slot (#7551) commit 8e3c5d1 Author: chonghe <44791194+chong-he@users.noreply.github.com> Date: Wed Jun 25 13:33:17 2025 +0800 Rust 1.89 compiler lint fix (#7644) Fix lints for Rust 1.89 beta compiler commit 56b2d4b Author: Eitan Seri-Levi <eserilev@ucsc.edu> Date: Tue Jun 24 09:29:10 2025 +0300 Remove instrumenting log level (#7636) #7155 Theres some additional places we set instrumenting log levels that wasn't covered in #7620 commit fd643c3 Author: Michael Sproul <michael@sigmaprime.io> Date: Mon Jun 23 23:11:46 2025 +1000 Un-ignore EF test for v1.6.0-alpha.1 (#7632) Closes: - #7547 Run the test that was previously ignored when we were between spec versions. commit cef04ee Author: chonghe <44791194+chong-he@users.noreply.github.com> Date: Mon Jun 23 16:37:49 2025 +0800 Implement `validator_identities` Beacon API endpoint (#7462) * #7442
0ce690c
to
c422ed2
Compare
when request doesn't exist on the active inbound requests. And handle the error from the main service to check if it may be a data race or a critical bug
c422ed2
to
361ebf4
Compare
michaelsproul
added a commit
that referenced
this pull request
Jul 3, 2025
… inbound requests (#7663) Squashed commit of the following: commit 361ebf4 Author: João Oliveira <hello@jxs.pt> Date: Fri Jun 27 22:50:11 2025 +0100 Error from RPC `send_response` when request doesn't exist on the active inbound requests. And handle the error from the main service to check if it may be a data race or a critical bug
ackintosh
reviewed
Jul 7, 2025
ackintosh
reviewed
Jul 7, 2025
Some required checks have failed. Could you please take a look @jxs? 🙏 |
Merged
michaelsproul
added a commit
that referenced
this pull request
Jul 9, 2025
… inbound requests (#7663) Squashed commit of the following: commit 093b659 Author: ackintosh <sora.akatsuki@gmail.com> Date: Wed Jul 9 07:30:59 2025 +0900 Respect peer_disconnected when counting active requests commit 29686e8 Author: João Oliveira <hello@jxs.pt> Date: Tue Jul 8 00:44:45 2025 +0100 address Akihito review commit 9482db0 Author: João Oliveira <hello@jxs.pt> Date: Mon Jul 7 16:15:21 2025 +0100 do not remove on disconnect, rather mark the request commit 361ebf4 Author: João Oliveira <hello@jxs.pt> Date: Fri Jun 27 22:50:11 2025 +0100 Error from RPC `send_response` when request doesn't exist on the active inbound requests. And handle the error from the main service to check if it may be a data race or a critical bug
michaelsproul
approved these changes
Jul 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging ahead of release ✅
6 tasks
ethDreamer
pushed a commit
to ethDreamer/lighthouse
that referenced
this pull request
Jul 28, 2025
…ve inbound requests (sigp#7663) Lighthouse is currently loggign a lot errors in the `RPC` behaviour whenever a response is received for a request_id that no longer exists in active_inbound_requests. This is likely due to a data race or timing issue (e.g., the peer disconnecting before the response is handled). This PR addresses that by removing the error logging from the RPC layer. Instead, RPC::send_response now simply returns an Err, shifting the responsibility to the main service. The main service can then determine whether the peer is still connected and only log an error if the peer remains connected. Thanks @ackintosh for helping debug!
ethDreamer
pushed a commit
to ethDreamer/lighthouse
that referenced
this pull request
Jul 28, 2025
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue Addressed
Lighthouse is currently loggign a lot errors in the
RPC
behaviour whenever a response is received for a request_id that no longer exists in active_inbound_requests. This is likely due to a data race or timing issue (e.g., the peer disconnecting before the response is handled).This PR addresses that by removing the error logging from the RPC layer. Instead, RPC::send_response now simply returns an Err, shifting the responsibility to the main service. The main service can then determine whether the peer is still connected and only log an error if the peer remains connected.
Thanks @ackintosh for helping debug!