Unexpected election when one of the followers is network-partitioned from the leader #9020

@lhy1024

Description

Enhancement Task

An unexpected election occurs when one of the followers experiences a network partition from the leader.

If the etcd client is using a follower endpoint and encounters a failure when attempting to save a timestamp, the allocator_manager will reset the leader.

[2025/01/21 20:39:49.149 +08:00] [WARN] [lease.go:187] ["lease keep alive failed"] [purpose="leader election"] [start=2025/01/21 20:39:46.149 +08:00] [error="context deadline exceeded"]
[2025/01/21 20:39:53.032 +08:00] [WARN] [etcd_kv.go:180] ["txn runs too slow"] [response=null] [cost=5.721552451s] [error="rpc error: code = Unavailable desc = error reading from server: read tcp 10.200.26.212:34736->10.200.27.59:2379: read: connection timed out"]
[2025/01/21 20:39:53.032 +08:00] [WARN] [tso.go:333] ["save timestamp failed"] [] [timestamp-path=timestamp] [error="rpc error: code = Unavailable desc = error reading from server: read tcp 10.200.26.212:34736->10.200.27.59:2379: read: connection timed out"]
[2025/01/21 20:39:53.032 +08:00] [WARN] [allocator_manager.go:289] ["failed to update allocator's timestamp"] [] [name=tc-pd-1] [error="rpc error: code = Unavailable desc = error reading from server: read tcp 10.200.26.212:34736->10.200.27.59:2379: read: connection timed out"]

In fact, according to etcd-io/etcd#8711, a manual retry is needed here: the next request succeeds against the new endpoint, so a single failed save should not have to reset the leader.

Metadata

    Labels

    affects-6.5: This bug affects the 6.5.x (LTS) versions.
    affects-7.1: This bug affects the 7.1.x (LTS) versions.
    affects-7.5: This bug affects the 7.5.x (LTS) versions.
    affects-8.1: This bug affects the 8.1.x (LTS) versions.
    affects-8.5: This bug affects the 8.5.x (LTS) versions.
    severity/major
    type/bug: The issue is confirmed as a bug.
