
Improve range sync with PeerDAS #6258

@dapplion

Description


Roadmap

  1. Downscore peers for invalid data (Fix PeerDAS sync scoring #7352)
  • Track who sent what in batches. We need to change PeerId to a PeerGroup, which allows tracking who sent each column (see the sketch after this list)
  • Make beacon processor errors more descriptive, so that sync can know which column caused the RpcBlock to be invalid
  2. Downscore peers for custody failures
  • Send requests to peers that are part of the SyncingChain peerset, instead of the global pool. This will cause SyncingChains to error frequently with NoPeer errors
    • Modify syncing batches to allow them to stay in the AwaitingDownload state when they have no peers
    • Remove good_peers_on_sampling_subnets
    • Extras:
      • Consider adding a fallback mechanism where we fetch from the global peer set, but only from peers that are synced up to the requested batch; in that case, don't penalize custody failures
      • Assume all finalized peers are in the same chain and improve SyncingChain grouping
      • Implement StatusV2
  • Change the by_range sync download algorithm to fetch blocks first, then columns. Use the blocks as the source of truth to match against columns and to penalize custody failures.
    • V1: Assume the block peer is honest and believe the blocks are canonical
    • V2: Send blocks to the processor to verify the proposer index and proposer signature (significantly increases the cost of an attack). Note: this will require much more complex logic, similar to lookup sync. Note 2: batch syncing will no longer download and process in parallel, because processing becomes sequential and we need to process blocks before completing a batch download.
  3. Request individual columns when a peer fails to serve the columns
  4. Reconstruct if we can't download all columns that we need but we have >= 50%
  5. Improve peer selection logic: which peer should we select next for column requests? E.g. if a peer has a custody failure, should we never request from it again, or only prevent requests to it for some time?
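A minimal, self-contained sketch of the peer-tracking idea in item 1, under assumed names (PeerGroup, record_column, peer_to_penalize are illustrative, not Lighthouse identifiers): a batch records which peer served the blocks and which peer served each column, so an invalid column can be attributed to exactly one peer.

```rust
use std::collections::HashMap;

// Stand-ins for libp2p's PeerId and the data column index type.
type PeerId = u64;
type ColumnIndex = u64;

/// Hypothetical per-batch attribution: who served the blocks, and who served
/// each custody column.
#[derive(Default, Debug)]
struct PeerGroup {
    block_peer: Option<PeerId>,
    column_peers: HashMap<ColumnIndex, PeerId>,
}

impl PeerGroup {
    fn record_column(&mut self, column: ColumnIndex, peer: PeerId) {
        self.column_peers.insert(column, peer);
    }

    /// If processing reports `column` as invalid, this is the only peer that
    /// should be downscored.
    fn peer_to_penalize(&self, column: ColumnIndex) -> Option<PeerId> {
        self.column_peers.get(&column).copied()
    }
}

fn main() {
    let mut group = PeerGroup { block_peer: Some(1), ..Default::default() };
    group.record_column(7, 2);
    group.record_column(21, 3);
    // The beacon processor flags column 21 as invalid: downscore peer 3 only.
    assert_eq!(group.peer_to_penalize(21), Some(3));
    assert_eq!(group.block_peer, Some(1));
}
```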

Extras

  1. Refactor verify_kzg_for_rpc_blocks out of the da_checker, since it does not use the cache
  2. Change RpcBlock to hold a Vec of DataColumnSidecars so we don't need a spec reference (see the sketch below)
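A hedged sketch of what Extras item 2 could look like, with stand-in types rather than Lighthouse's real definitions: if the column variant holds a plain Vec, building it from RPC responses no longer needs a spec reference to bound the list length.

```rust
// Stand-in types; not Lighthouse's real definitions.
struct SignedBeaconBlock;
struct BlobSidecar;
struct DataColumnSidecar;

/// Hypothetical shape of the change: the column variant holds a plain `Vec`,
/// so constructing it requires no spec reference to size a bounded list.
enum RpcBlock {
    Block(SignedBeaconBlock),
    BlockAndBlobs(SignedBeaconBlock, Vec<BlobSidecar>),
    BlockAndCustodyColumns(SignedBeaconBlock, Vec<DataColumnSidecar>),
}
```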

Why retrying individual requests is useful

Currently range sync and backfill sync fetch blocks and blobs from the network with this sequence:

  • Out of the pool of peers in the chain (peers that agree on some fork-choice state), select ONE peer
  • Immediately issue blocks_by_range and blobs_by_range to the SAME ONE peer
  • If any of those requests error, fail BOTH requests and retry with another peer

This strategy is not optimal but good enough for now. However, with PeerDAS, the worst-case number of requests per batch increases from 2 (blocks + blobs) to 2 + DATA_COLUMN_SIDECAR_SUBNET_COUNT / CUSTODY_REQUIREMENT = 2 + 32 = 34 (if not connected to any larger nodes).

If we extend the current paradigm, a single failure on a columns_by_range request will trigger a retry of all 34 requests. Not optimal 😅

A solution is to make the "components_by_range" request able to retry each individual request. This is what block lookup requests do, where each component (block, blobs, custody) has its own state and retry count.
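A minimal sketch of this per-component retry idea under assumed names (ComponentsByRangeRequest, ComponentState, and MAX_RETRIES are illustrative, not Lighthouse's actual types): each component of the range request tracks its own state and retry count, so a failed column request is retried on its own while the block and blob requests are left untouched.

```rust
use std::collections::HashMap;

type ColumnIndex = u64;

const MAX_RETRIES: u8 = 3;

#[derive(Debug)]
enum ComponentState {
    Downloading { retries: u8 },
    Downloaded,
    Failed,
}

/// Hypothetical per-batch request where blocks, blobs and each column keep
/// independent state, mirroring how block lookups track components.
#[derive(Debug)]
struct ComponentsByRangeRequest {
    blocks: ComponentState,
    blobs: ComponentState,
    columns: HashMap<ColumnIndex, ComponentState>,
}

impl ComponentsByRangeRequest {
    /// A single column request failed: bump only that component's retry
    /// count; blocks, blobs and the other columns are not re-requested.
    fn on_column_failure(&mut self, column: ColumnIndex) {
        if let Some(state) = self.columns.get_mut(&column) {
            let next = match *state {
                ComponentState::Downloading { retries } if retries < MAX_RETRIES => {
                    ComponentState::Downloading { retries: retries + 1 }
                }
                _ => ComponentState::Failed,
            };
            *state = next;
        }
    }

    fn is_complete(&self) -> bool {
        matches!(self.blocks, ComponentState::Downloaded)
            && matches!(self.blobs, ComponentState::Downloaded)
            && self
                .columns
                .values()
                .all(|s| matches!(s, ComponentState::Downloaded))
    }
}

fn main() {
    let mut req = ComponentsByRangeRequest {
        blocks: ComponentState::Downloaded,
        blobs: ComponentState::Downloaded,
        columns: HashMap::from([
            (7, ComponentState::Downloaded),
            (21, ComponentState::Downloading { retries: 0 }),
        ]),
    };
    // Only column 21 is retried; nothing else is re-requested.
    req.on_column_failure(21);
    assert!(!req.is_complete());
}
```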

Labels

das (Data Availability Sampling), hardening, major-task (A significant amount of work or conceptual task), syncing, v8.0.0 (Q4 2025 Fusaka Mainnet Release)
