The latest etcdctl seems buggy: etcdctl will request fail when the first etcd cluster endpoint down #11176

@lining2020x

This issue is related to kubernetes/kubernetes#72102. I can still reproduce the problem with the etcdctl from the latest etcd release (v3.4.1).

Environment:

[root@ln-node0 etcd-test]# ./etcd-v3.4.1/etcdctl version
etcdctl version: 3.4.1
API version: 3.4

[root@ln-node0 etcd-test]# ./etcd-v3.4.1/etcd -version
etcd Version: 3.4.1
Git SHA: a14579fbf
Go Version: go1.12.9
Go OS/Arch: linux/amd64

How to reproduce the problem:
The repro steps come from #10911 (comment) and kubernetes/kubernetes#72102 (comment).

  1. Create a 3-node etcd cluster with TLS enabled. Each certificate should contain only the name/IP of the node that serves it.
  2. Stop the first etcd node (the first endpoint in the list).
  3. Send a request with etcdctl, passing all three endpoints.
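
To confirm the precondition in step 1, the Subject Alternative Name entries of a generated certificate can be listed with openssl. This is a minimal sketch: `extract_sans` is a hypothetical helper name, and the parsing assumes the usual two-line layout of `openssl x509 -text` output.

```shell
# Print the SAN entries, one per line, from `openssl x509 -noout -text`
# output supplied on stdin. extract_sans is a hypothetical helper name.
extract_sans() {
    grep -A1 'Subject Alternative Name' | tail -n 1 | tr ',' '\n' | sed 's/^ *//'
}

# Usage (path taken from the setup script; adjust as needed):
#   openssl x509 -in /tmp/etcd/server-ln-node1.pem -noout -text | extract_sans
```

For this repro, each server certificate should list only its own node name and IP.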

What you expected to happen:
I expect step 3 to succeed, because two of the three endpoints are still up.

1 # ETCDCTL_API=3 ./etcd-v3.4.1/etcdctl --endpoints=https://ln-node1:2379,https://ln-node2:2379,https://ln-node3:2379 --cacert=/tmp/etcd/ca.pem --cert=/tmp/etcd/client.pem --key=/tmp/etcd/client-key.pem put bar foo
OK

2 # ssh ln-node1 systemctl stop etcd@ln-node1

3 # ETCDCTL_API=3 ./etcd-v3.4.1/etcdctl --endpoints=https://ln-node1:2379,https://ln-node2:2379,https://ln-node3:2379 --cacert=/tmp/etcd/ca.pem --cert=/tmp/etcd/client.pem --key=/tmp/etcd/client-key.pem put bar foo
{"level":"warn","ts":"2019-09-24T15:59:20.611+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-3b69f8a2-9b9a-4399-aedd-6244012167a0/ln-node1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 10.150.7.194:2379: connect: connection refused\""}

4 # ETCDCTL_API=3 ./etcd-v3.3.13/etcdctl --endpoints=https://ln-node1:2379,https://ln-node2:2379,https://ln-node3:2379 --cacert=/tmp/etcd/ca.pem --cert=/tmp/etcd/client.pem --key=/tmp/etcd/client-key.pem put bar foo
OK

As shown above, etcdctl v3.4.1 fails while v3.3.13 succeeds with the same endpoints and certificates, which is very strange.
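
To separate "an endpoint is down" from "the client does not fail over", each endpoint can be probed on its own with `etcdctl endpoint health`. The following is only a diagnostic sketch, not a fix: `probe_endpoints` is a hypothetical helper, and the `ETCDCTL`, `CACERT`, `CERT` and `KEY` defaults are assumptions copied from the commands above.

```shell
# Assumed defaults, taken from the transcript above; override as needed.
ETCDCTL=${ETCDCTL:-./etcd-v3.4.1/etcdctl}
CACERT=${CACERT:-/tmp/etcd/ca.pem}
CERT=${CERT:-/tmp/etcd/client.pem}
KEY=${KEY:-/tmp/etcd/client-key.pem}

# Probe each endpoint of a comma-separated list individually.
# Prints "<endpoint> OK" or "<endpoint> FAILED" and returns the
# number of endpoints that failed the health check.
probe_endpoints() {
    local endpoints="$1"
    local ep
    local failed=0
    for ep in $(printf '%s' "$endpoints" | tr ',' ' '); do
        if ETCDCTL_API=3 "$ETCDCTL" --endpoints="$ep" \
            --cacert="$CACERT" --cert="$CERT" --key="$KEY" \
            endpoint health >/dev/null 2>&1; then
            echo "$ep OK"
        else
            echo "$ep FAILED"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Usage:
#   probe_endpoints https://ln-node1:2379,https://ln-node2:2379,https://ln-node3:2379
```

If only the stopped node is reported as FAILED while the combined `put` still errors out, the remaining members are reachable and the failure lies in how the client handles the endpoint list.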

Here is the script I used to set up the etcd cluster.

[root@ln-node0 etcd-test]# cat etcd-issue-repro.sh 
HOST1=10.150.7.194
HOST2=10.150.7.131
HOST3=10.150.7.132

NAME1=ln-node1
NAME2=ln-node2
NAME3=ln-node3

HOSTS=(${HOST1} ${HOST2} ${HOST3})
NAMES=(${NAME1} ${NAME2} ${NAME3})

rm -rf /tmp/etcd
mkdir -p /tmp/etcd && cd /tmp/etcd

# generate the root CA
cat >ca-config.json <<EOF
{
    "signing": {
        "default": {
            "expiry": "87600h"
        },
        "profiles": {
            "server": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            },
            "client": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
EOF
cat >ca-csr.json <<EOF
{
    "CN": "etcd",
    "key": {
        "algo": "rsa",
        "size": 2048
    }
}
EOF
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

# generate the client certificate, signed by the root CA
cat >client.json <<EOF
{
    "CN": "client cn",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
EOF
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client


# generate server and peer certificates for each etcd cluster node
for i in "${!HOSTS[@]}"; do
	HOST=${HOSTS[$i]}
	NAME=${NAMES[$i]}

	cat >config.json <<EOF
{
    "CN": "etcd cn",
    "hosts": [
        "${NAME}",
        "${HOST}"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
EOF
	cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server config.json | cfssljson -bare server-${NAME}
	cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer config.json | cfssljson -bare peer-${NAME}

	ssh ${HOST} "rm -rf /etc/etcd/pki && mkdir -p /etc/etcd/pki"
	scp ca.pem server-${NAME}.pem server-${NAME}-key.pem peer-${NAME}.pem peer-${NAME}-key.pem ${HOST}:/etc/etcd/pki/
done


# generate etcd systemd service unit for each node
for i in "${!HOSTS[@]}"; do
	HOST=${HOSTS[$i]}
	NAME=${NAMES[$i]}
	ssh ${HOST} "systemctl stop etcd@${NAME}"
	ssh ${HOST} "rm -rf /var/lib/etcd && mkdir -p /var/lib/etcd"
	ssh ${HOST} "cat > /etc/systemd/system/etcd@.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd

[Service]
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/usr/local/bin/etcd \
    --name etcd-%H \
    --data-dir /var/lib/etcd \
    --initial-advertise-peer-urls https://${HOST}:2380 \
    --listen-peer-urls https://${HOST}:2380 \
    --listen-client-urls https://${HOST}:2379 \
    --advertise-client-urls https://${HOST}:2379 \
    --initial-cluster-token etcd-cluster-token \
    --initial-cluster etcd-${NAME1}=https://${HOST1}:2380,etcd-${NAME2}=https://${HOST2}:2380,etcd-${NAME3}=https://${HOST3}:2380 \
    --initial-cluster-state new \
    --cert-file=/etc/etcd/pki/server-%H.pem --key-file=/etc/etcd/pki/server-%H-key.pem \
    --client-cert-auth --trusted-ca-file=/etc/etcd/pki/ca.pem \
    --peer-client-cert-auth --peer-trusted-ca-file=/etc/etcd/pki/ca.pem \
    --peer-cert-file=/etc/etcd/pki/peer-%H.pem --peer-key-file=/etc/etcd/pki/peer-%H-key.pem

[Install]
WantedBy=multi-user.target
EOF"
	ssh ${HOST} "cat  /etc/systemd/system/etcd@.service"
done

# start etcd
ssh ${NAME1} systemctl daemon-reload
ssh ${NAME2} systemctl daemon-reload
ssh ${NAME3} systemctl daemon-reload

ssh ${NAME1} systemctl restart etcd@${NAME1} &
ssh ${NAME2} systemctl restart etcd@${NAME2} &
ssh ${NAME3} systemctl restart etcd@${NAME3}
# wait for the two background restarts before printing the verify command
wait


echo ""
echo "RUN THE FOLLOWING COMMAND TO VERIFY:"
echo "ETCDCTL_API=3 etcdctl --endpoints=https://${NAME1}:2379,https://${NAME2}:2379,https://${NAME3}:2379 --cacert=/tmp/etcd/ca.pem --cert=/tmp/etcd/client.pem --key=/tmp/etcd/client-key.pem  put foo bar"
