-
Notifications
You must be signed in to change notification settings - Fork 902
Closed
Labels
Description
I've been trying to sync a fresh supernode to peerdas-devnet-4
, and finalized sync seems to frequently get stuck.
I've observed RPC requests sent and completed but eventually the batch failed with RPC Error - but i don't see any RPC error for this batch during this period.
See some partial logs below
- 15 requests were sent for this batch epoch = 41, range request id = 35
- 15 requests were completed
- and a
Batch failed. RPC Error
followed immediately, with no RPC error logs found in this period.
I'm suspecting we have an RPC error log suspressed somewhere, it's not obvious from the log what the rpc error is, and the node just keep retrying to sync the same batch and unable to make progress.
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 36/35/RangeSync/41/1, peer: 16Uiu2HAmDdyTRG4dESR8v9nnAbdwuqWXHDQ8hLwhgmSDV1mooiw9, epoch: 41, slots: 32, method: BlocksByRange, service: sync, module: network::sync::network_context:788
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 37/35/RangeSync/41/1, peer: 16Uiu2HAmH5PwwsfXLCGqSi2xFgPu7Zy9xrzXKTqc5ZwxkkdhYbVy, columns: [105, 37, 35, 36, 76, 54, 96, 26], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 38/35/RangeSync/41/1, peer: 16Uiu2HAm9PijSZpm5QUphXRoBtkhUZPkGJ4Rgxk4Bny91oZPYZLG, columns: [100, 3, 13, 126, 63, 7, 103, 85, 41], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 39/35/RangeSync/41/1, peer: 16Uiu2HAm35d5GWdmRH2XtjM4PMi6MXyCSwXgZi5KwkGNijbSmARB, columns: [89, 8, 79, 110, 75, 15, 109, 58, 1], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 40/35/RangeSync/41/1, peer: 16Uiu2HAmDdyTRG4dESR8v9nnAbdwuqWXHDQ8hLwhgmSDV1mooiw9, columns: [93, 45, 46, 60, 47, 115], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 41/35/RangeSync/41/1, peer: 16Uiu2HAkzUzciBZ9mW18RAwhGowdz6SRbt36gS1NZupVXUkGENW7, columns: [18, 43, 57, 30, 120, 39, 95, 2, 56, 98, 127], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 42/35/RangeSync/41/1, peer: 16Uiu2HAm4SC4chiQADwP7M8xu2nigb7GsabhKfULHZ2gMAcw6fmD, columns: [24, 4, 71, 102, 44, 21, 118, 87, 11, 53], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 43/35/RangeSync/41/1, peer: 16Uiu2HAmNSdQ4yXgyBUpQ5yGRE32L4t4kkQahfaHCxcmZHbtT7md, columns: [99, 49, 6, 80, 40, 114, 61], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 44/35/RangeSync/41/1, peer: 16Uiu2HAmAAZ5wP6fvpe1b9tWNmgA2Wn8MsrNYVhkw5WohcAoaHKR, columns: [25, 121, 104, 108, 111, 48, 92, 14], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 45/35/RangeSync/41/1, peer: 16Uiu2HAmJyaVGkRGR9ACqompkoge8T2x4KFH4KbzDkh7zz6uN2JX, columns: [125, 77, 17, 74, 23, 123, 72, 84, 59, 124, 107], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 46/35/RangeSync/41/1, peer: 16Uiu2HAmChnSSEeAB9w5Gw2zqF23ufsrPAUqZWWo3mUgJhWJWRP4, columns: [112, 62, 91, 32, 42], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 47/35/RangeSync/41/1, peer: 16Uiu2HAmR4AjDzeAMa8ud5E4L3tu92D3GfBJ3z92MbqUEwHiFgnN, columns: [69, 70, 116, 83, 119, 94, 31, 0, 9, 68, 64, 65, 90], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 48/35/RangeSync/41/1, peer: 16Uiu2HAkwLdYWXA39aykj6uLy6knHo7V6CaMVU7Xq2smc6Ysw5eh, columns: [10, 16, 19, 27, 117, 113, 12, 50, 97, 29], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 49/35/RangeSync/41/1, peer: 16Uiu2HAmNBbHKNxDDgZ1wjRRaodUavr9oiQ7awRcmT2m42VATZpn, columns: [88, 67, 101, 55, 122, 82, 5, 81, 20, 51], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:17.144 DEBG Sync RPC request sent, id: 50/35/RangeSync/41/1, peer: 16Uiu2HAmLCfZdW9qfBvmvVVCKSPDKNeyAZTgx3eZSxF11Ao85JCb, columns: [33, 38, 86, 66, 73, 52, 22, 78, 106, 34, 28], epoch: 41, slots: 32, method: DataColumnsByRange, service: sync, module: network::sync::network_context:870
Feb 12 04:38:19.875 DEBG Sync RPC request completed, count: 209, method: DataColumnsByRange, id: 45/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:23.952 DEBG Sync RPC request completed, count: 190, method: DataColumnsByRange, id: 48/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:24.485 DEBG Sync RPC request completed, count: 209, method: DataColumnsByRange, id: 41/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:25.326 DEBG Sync RPC request completed, count: 190, method: DataColumnsByRange, id: 42/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:25.433 DEBG Sync RPC request completed, count: 143, method: DataColumnsByRange, id: 39/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:26.803 DEBG Sync RPC request completed, count: 95, method: DataColumnsByRange, id: 46/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:27.350 DEBG Sync RPC request completed, count: 209, method: DataColumnsByRange, id: 50/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:28.968 DEBG Sync RPC request completed, count: 133, method: DataColumnsByRange, id: 43/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:29.719 DEBG Sync RPC request completed, count: 152, method: DataColumnsByRange, id: 44/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:31.173 DEBG Sync RPC request completed, count: 152, method: DataColumnsByRange, id: 37/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:31.272 DEBG Sync RPC request completed, count: 171, method: DataColumnsByRange, id: 38/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:32.228 DEBG Sync RPC request completed, count: 190, method: DataColumnsByRange, id: 49/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:35.236 DEBG Sync RPC request completed, count: 247, method: DataColumnsByRange, id: 47/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:38.363 DEBG Sync RPC request completed, count: 27, method: BlocksByRange, id: 36/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:38.363 DEBG Sync RPC request completed, count: 114, method: DataColumnsByRange, id: 40/35/RangeSync/41/1, service: sync, module: network::sync::network_context:1128
Feb 12 04:38:38.396 DEBG Batch failed. RPC Error, batch_state: [D,D,p,p,p,], request_id: 35, peer_id: 16Uiu2HAmDdyTRG4dESR8v9nnAbdwuqWXHDQ8hLwhgmSDV1mooiw9, batch_state: Downloading(16Uiu2HAmDdyTRG4dESR8v9nnAbdwuqWXHDQ8hLwhgmSDV1mooiw9, 35), batch_epoch: 41, chain: 1, s
ervice: range_sync, service: sync, module: network::sync::range_sync::chain:867
Full logs attached, I restarted the node a few times, so you may want to search for the timestamps from the partial logs above:
beacon.log.zip
The kurtosis config to sync peerdas-devnet-4:
participants:
- cl_type: lighthouse
cl_image: sigp/lighthouse:latest-unstable
cl_extra_params:
- --subscribe-all-data-column-subnets
- cl_type: lighthouse
cl_image: sigp/lighthouse:latest-unstable
network_params:
network: peerdas-devnet-4
snooper_enabled: false
additional_services:
- prometheus_grafana