Description
How to categorize this issue?
/area control-plane
/kind bug
What happened:
A specific Gardener e2e kind test fails often: Shoot Tests Hibernated Shoot [It] Create, Migrate and Delete [Shoot, control-plane-migration, hibernated]
The creation, migration and hibernation steps succeed. To delete the migrated shoot, which is hibernated at that point, the etcd cluster has to be woken up. At this stage the etcd cluster does not become ready.
In one such occurrence we see the following logs in etcd-events-2 (backup-restore container):
2025-02-17T12:45:52.969873914Z stderr F 2025-02-17 12:45:52.968607 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:52.970531124Z stderr F 2025-02-17 12:45:52.970317 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.055124837Z stderr F 2025-02-17 12:45:53.054945 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.062374513Z stderr F 2025-02-17 12:45:53.062106 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.153435731Z stderr F 2025-02-17 12:45:53.153314 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.160917167Z stderr F 2025-02-17 12:45:53.160807 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.251792044Z stderr F 2025-02-17 12:45:53.251680 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.264667024Z stderr F 2025-02-17 12:45:53.264552 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
For complete logs see: etcd-events-2-backup-restore.log
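To make the log lines above easier to read, here is a minimal Go sketch that pulls the three IDs out of such a line. The interpretation of the fields (the value in brackets is the ID of the dialed peer, the value after `=` is the cluster ID reported by that peer, and `local` is the cluster ID of the member writing the log) is my reading of the message format, not something confirmed in this issue. The names `mismatch` and `parseMismatch` are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// mismatch holds the three hex IDs carried by a "cluster ID mismatch" log line.
// The field meanings are an assumption about the rafthttp message format.
type mismatch struct {
	PeerMemberID   string // ID of the remote member that was dialed
	PeerClusterID  string // cluster ID reported by that remote member
	LocalClusterID string // cluster ID the logging member believes it belongs to
}

var mismatchRe = regexp.MustCompile(
	`cluster ID mismatch: peer\[([0-9a-f]+)\]=([0-9a-f]+), local=([0-9a-f]+)`)

// parseMismatch extracts the IDs from a log line.
// ok is false if the line does not contain a cluster ID mismatch message.
func parseMismatch(line string) (m mismatch, ok bool) {
	sub := mismatchRe.FindStringSubmatch(line)
	if sub == nil {
		return mismatch{}, false
	}
	return mismatch{
		PeerMemberID:   sub[1],
		PeerClusterID:  sub[2],
		LocalClusterID: sub[3],
	}, true
}

func main() {
	line := "2025-02-17 12:45:52.968607 E | rafthttp: request sent was ignored " +
		"(cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)"
	if m, ok := parseMismatch(line); ok {
		fmt.Printf("peer member %s reports cluster %s, local cluster is %s\n",
			m.PeerMemberID, m.PeerClusterID, m.LocalClusterID)
	}
}
```

In all the lines quoted above the local cluster ID is the same (39b1e34c77b1db7a) and differs from the one reported by the peer, which is consistent with etcd-events-2 having formed its own cluster.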
You would typically see cluster ID mismatch in the 3 scenarios that are documented here.
Prior to starting the embedded etcd process, initialization is triggered by etcd-wrapper. Once the initialization succeeds, etcd-wrapper requests the etcd config. etcd-backup-restore computes the etcd config here. One of the key parameters in the etcd config is initial-cluster-state, which is determined here to distinguish whether this member bootstraps/joins a new cluster or joins an existing cluster.
If the member list API call fails (see IsLearnerPresent) for any reason, this function correctly returns an error, but the calling function swallows it (see here) and assumes initial-cluster-state=new. This is done for the 0->3 replicas bootstrap case, because while bootstrapping a new cluster etcd Member API calls will never succeed. Even in case of errors, we have to serve the config with initial-cluster-state=new to let the bootstrap succeed.
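The decision described above can be sketched roughly as follows. This is a simplified illustration and not the actual etcd-backup-restore code: `learnerCheck`, `initialClusterState` and `errMemberAPI` are hypothetical names, and the real function takes more inputs.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	stateNew      = "new"
	stateExisting = "existing"
)

// errMemberAPI stands in for any failure of the member list call.
var errMemberAPI = errors.New("member list call failed")

// learnerCheck is a hypothetical stand-in for the IsLearnerPresent lookup.
type learnerCheck func() (bool, error)

// initialClusterState mirrors the flow described above: an error from the
// lookup is dropped and handled exactly like "this member is not a learner".
func initialClusterState(isLearnerPresent learnerCheck) string {
	learner, err := isLearnerPresent()
	if err != nil {
		// Error swallowed. This is required for the 0->3 bootstrap,
		// where the member API can never answer.
		return stateNew
	}
	if learner {
		return stateExisting
	}
	return stateNew
}

func main() {
	bootstrap := func() (bool, error) { return false, errMemberAPI }
	learner := func() (bool, error) { return true, nil }
	learnerDuringOutage := func() (bool, error) { return false, errMemberAPI }

	fmt.Println("bootstrap:            ", initialClusterState(bootstrap))
	fmt.Println("learner, API ok:      ", initialClusterState(learner))
	fmt.Println("learner, API failing: ", initialClusterState(learnerDuringOutage))
}
```

The first and third cases are indistinguishable to the function, since both only see a failed call. That is what makes the fallback safe for bootstrap but unsafe for a member that has already been added to an existing cluster.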
However, the above code-flow has a negative consequence as well. Consider the following case:
- The data directory of one of the etcd members gets corrupted while bringing up the cluster from 0->3.
- Etcd-backup-restore validates the data directory and finds it corrupt. It will trigger the single member restoration (see this for more information).
- As part of single-member-restoration, it will add this member as a learner, after which it will trigger the initialization. Once initialization is successful, it will serve an etcd config.
- While computing the initial-cluster-state, if there is an error while making the etcd Member API call (due to transient quorum loss, possibly caused by VPA eviction etc.), it assumes initial-cluster-state as new. This will cause Cluster ID mismatch, as new is not the correct initial-cluster-state for a learner.
- This will force this member to create a new member ID which will never match the member IDs that are known to the other 2 members of the etcd cluster. Once it dials the other 2 members, they will reject the call with the Cluster ID mismatch response.
What you expected to happen:
initial-cluster-state should always be computed correctly.
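One possible direction, shown here only as a sketch and not as the fix chosen by the maintainers: let the caller pass in what it already knows (whether this member was just added as a learner during single-member restoration) and refuse to fall back to new in that case, so that the request can be retried. The parameter `restoredAsLearner` and the function `strictInitialClusterState` are hypothetical names introduced for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errMemberAPI stands in for any failure of the member list call.
var errMemberAPI = errors.New("member list call failed")

// strictInitialClusterState only falls back to "new" on an API error when
// nothing indicates that this member already belongs to an existing cluster.
// restoredAsLearner is a hypothetical flag set by the restoration path.
func strictInitialClusterState(restoredAsLearner bool, isLearnerPresent func() (bool, error)) (string, error) {
	learner, err := isLearnerPresent()
	switch {
	case err == nil && learner:
		return "existing", nil
	case err == nil:
		return "new", nil
	case restoredAsLearner:
		// The member was added to a running cluster, so "new" would be wrong.
		// Surface the error and let the caller retry later.
		return "", fmt.Errorf("cannot determine initial-cluster-state: %w", err)
	default:
		// Plain bootstrap: the member API cannot answer yet, "new" is correct.
		return "new", nil
	}
}

func main() {
	failing := func() (bool, error) { return false, errMemberAPI }

	state, err := strictInitialClusterState(false, failing)
	fmt.Printf("bootstrap: state=%q err=%v\n", state, err)

	state, err = strictInitialClusterState(true, failing)
	fmt.Printf("restored learner: state=%q err=%v\n", state, err)
}
```

The trade-off is that a restored learner waits until the member API is reachable again instead of starting immediately with a state that leads to Cluster ID mismatch.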