-
Notifications
You must be signed in to change notification settings - Fork 742
Closed
Labels
affects-5.4This bug affects the 5.4.x(LTS) versions.This bug affects the 5.4.x(LTS) versions.affects-6.1This bug affects the 6.1.x(LTS) versions.This bug affects the 6.1.x(LTS) versions.affects-6.5This bug affects the 6.5.x(LTS) versions.This bug affects the 6.5.x(LTS) versions.affects-7.1This bug affects the 7.1.x(LTS) versions.This bug affects the 7.1.x(LTS) versions.affects-7.5This bug affects the 7.5.x(LTS) versions.This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.This bug affects the 8.5.x(LTS) versions.report/customerCustomers have encountered this bug.Customers have encountered this bug.severity/majortype/bugThe issue is confirmed as a bug.The issue is confirmed as a bug.
Description
Bug Report
Lines 149 to 169 in 40fe83e
if err != nil { | |
if ev, ok := status.FromError(err); ok { | |
if ev.Code() == codes.Canceled { | |
return | |
} | |
} | |
log.Error("server failed to establish sync stream with leader", zap.String("server", s.server.Name()), zap.String("leader", s.server.GetLeader().GetName()), errs.ZapError(err)) | |
time.Sleep(time.Second) | |
continue | |
} | |
log.Info("server starts to synchronize with leader", zap.String("server", s.server.Name()), zap.String("leader", s.server.GetLeader().GetName()), zap.Uint64("request-index", s.history.getNextIndex())) | |
for { | |
resp, err := stream.Recv() | |
if err != nil { | |
s.streamingRunning.Store(false) | |
log.Error("region sync with leader meet error", errs.ZapError(errs.ErrGRPCRecv, err)) | |
if err = stream.CloseSend(); err != nil { | |
log.Error("failed to terminate client stream", errs.ZapError(errs.ErrGRPCCloseSend, err)) | |
} | |
time.Sleep(time.Second) | |
break |
Right now, we use sleep to prevent retrying to sync with the leader too frequently. However, it will indeed affect the PD election time if the etcd leader changes. The follower must wait for 1s to exit the loop before starting the election. See both line 156 and line 168. We can use the context to make it exit ASAP.
Here is the case that slows down the leader election:
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:1348] [\"ec5a786a755b23bd [term 74] received MsgTimeoutNow from 81a4031a31407385 and starts an election to get leadership.\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:719] [\"ec5a786a755b23bd became candidate at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:830] [\"ec5a786a755b23bd received MsgVoteResp from ec5a786a755b23bd at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:817] [\"ec5a786a755b23bd [logterm: 74, index: 671118265] sent MsgVote request to 6b3fd9f5a8c74ff3 at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:817] [\"ec5a786a755b23bd [logterm: 74, index: 671118265] sent MsgVote request to 81a4031a31407385 at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:817] [\"ec5a786a755b23bd [logterm: 74, index: 671118265] sent MsgVote request to ad4c4e35a9bd4181 at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [raft.go:817] [\"ec5a786a755b23bd [logterm: 74, index: 671118265] sent MsgVote request to f587050a2df8fa3f at term 75\"]",
"[2024/11/14 18:56:49.246 +00:00] [INFO] [node.go:333] [\"raft.node: ec5a786a755b23bd lost leader 81a4031a31407385 at term 75\"]",
"[2024/11/14 18:56:49.249 +00:00] [INFO] [raft.go:830] [\"ec5a786a755b23bd received MsgVoteResp from 81a4031a31407385 at term 75\"]",
"[2024/11/14 18:56:49.249 +00:00] [INFO] [raft.go:1295] [\"ec5a786a755b23bd has received 2 MsgVoteResp votes and 0 vote rejections\"]",
"[2024/11/14 18:56:49.250 +00:00] [INFO] [raft.go:830] [\"ec5a786a755b23bd received MsgVoteResp from 6b3fd9f5a8c74ff3 at term 75\"]",
"[2024/11/14 18:56:49.250 +00:00] [INFO] [raft.go:1295] [\"ec5a786a755b23bd has received 3 MsgVoteResp votes and 0 vote rejections\"]",
"[2024/11/14 18:56:49.250 +00:00] [INFO] [raft.go:771] [\"ec5a786a755b23bd became leader at term 75\"]",
"[2024/11/14 18:56:49.250 +00:00] [INFO] [node.go:327] [\"raft.node: ec5a786a755b23bd elected leader ec5a786a755b23bd at term 75\"]",
"[2024/11/14 18:56:49.284 +00:00] [INFO] [leadership.go:375] [\"current leadership is deleted\"] [revision=647767897] [leader-key=/pd/7366397889337882103/leader] [purpose=\"leader election\"]",
"[2024/11/14 18:56:49.284 +00:00] [ERROR] [client.go:161] [\"region sync with leader meet error\"] [error=\"[PD:grpc:ErrGRPCRecv]receive response error: rpc error: code = Canceled desc = context canceled\"]",
"[2024/11/14 18:56:50.284 +00:00] [INFO] [server.go:1669] [\"pd leader has changed, try to re-campaign a pd leader\"]",
"[2024/11/14 18:56:50.284 +00:00] [INFO] [server.go:1706] [\"start to campaign PD leader\"] [campaign-leader-name=xxx.pd-1]",
"[2024/11/14 18:56:50.286 +00:00] [INFO] [lease.go:66] [\"lease granted\"] [lease-id=4720215488460578587] [lease-timeout=3] [purpose=\"leader election\"]",
"[2024/11/14 18:56:50.288 +00:00] [INFO] [leadership.go:182] [\"check campaign resp\"] [resp=\"{\\\"header\\\":{\\\"cluster_id\\\":8535908639037064845,\\\"member_id\\\":17031057339485725629,\\\"revision\\\":64776
7925,\\\"raft_term\\\":75},\\\"succeeded\\\":true,\\\"responses\\\":[{\\\"Response\\\":{\\\"ResponsePut\\\":{\\\"header\\\":{\\\"revision\\\":647767925}}}}]}\"]",
"[2024/11/14 18:56:50.288 +00:00] [INFO] [leadership.go:191] [\"write leaderData to leaderPath ok\"] [leader-key=/pd/7366397889337882103/leader] [purpose=\"leader election\"]",
"[2024/11/14 18:56:50.288 +00:00] [INFO] [server.go:1732] [\"campaign PD leader ok\"] [campaign-leader-name=xxx.pd-1]",
Metadata
Metadata
Assignees
Labels
affects-5.4This bug affects the 5.4.x(LTS) versions.This bug affects the 5.4.x(LTS) versions.affects-6.1This bug affects the 6.1.x(LTS) versions.This bug affects the 6.1.x(LTS) versions.affects-6.5This bug affects the 6.5.x(LTS) versions.This bug affects the 6.5.x(LTS) versions.affects-7.1This bug affects the 7.1.x(LTS) versions.This bug affects the 7.1.x(LTS) versions.affects-7.5This bug affects the 7.5.x(LTS) versions.This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.This bug affects the 8.5.x(LTS) versions.report/customerCustomers have encountered this bug.Customers have encountered this bug.severity/majortype/bugThe issue is confirmed as a bug.The issue is confirmed as a bug.