-
Notifications
You must be signed in to change notification settings - Fork 901
Description
When processing light client optimistic updates we are returning a UnknownBlockParentRoot
error for cases where we do know the parent block:
lighthouse/beacon_node/beacon_chain/src/light_client_optimistic_update_verification.rs
Lines 70 to 72 in 029b4f2
if canonical_root != head_block.message().parent_root() { | |
return Err(Error::UnknownBlockParentRoot(canonical_root)); | |
} |
The effect of this is that we will requeue messages unnecessarily (instead of ignoring them immediately):
lighthouse/beacon_node/network/src/network_beacon_processor/gossip_methods.rs
Lines 2051 to 2077 in 029b4f2
LightClientOptimisticUpdateError::UnknownBlockParentRoot(parent_root) => { | |
metrics::inc_counter( | |
&metrics::BEACON_PROCESSOR_REPROCESSING_QUEUE_SENT_OPTIMISTIC_UPDATES, | |
); | |
debug!( | |
self.log, | |
"Optimistic update for unknown block"; | |
"peer_id" => %peer_id, | |
"parent_root" => ?parent_root | |
); | |
if let Some(sender) = reprocess_tx { | |
let processor = self.clone(); | |
let msg = ReprocessQueueMessage::UnknownLightClientOptimisticUpdate( | |
QueuedLightClientUpdate { | |
parent_root, | |
process_fn: Box::new(move || { | |
processor.process_gossip_optimistic_update( | |
message_id, | |
peer_id, | |
light_client_optimistic_update, | |
None, // Do not reprocess this message again. | |
seen_timestamp, | |
) | |
}), | |
}, | |
); |
The reason the check is wrong is that it is only checking the update's canonical_root against the head's parent root. This is correct in the case where e.g. our head is 1 slot behind the update, so we have update
at slot 31, head at slot 31, and slot_31_root != slot_30_root
. That's fine. The cases that are dodgy are if we have old light client updates, like:
update
slot is 31, head slot is 33. In this case the parent of the head is the slot 32 block, which is not the slot 31 block, so we returnUnknownBlockParentRoot
and requeue the message. Requeueing does not help, because when we re-process this message a slot later, the head will either be the same or even newer, and the check will still fail.
The reprocess queue requeues messages a maximum of once, so the impact of this issue is quite minimal: we will requeue some messages unnecessarily and keep them around for up to 12s. This is also only likely to occur if we are lagging on processing updates a little.
We can detect the frequency of this occuring by looking for the log:
Not sending light client update because it had been reprocessed
There were only 33 instances of this per node prior to the fix made here:
But we should continue to monitor, and eventually implement a fix. Perhaps checking the slot of the update to work out whether it is too old and can just be ignored.