
Conversation

@jrajahalme jrajahalme commented Apr 20, 2020

kubectl is only guaranteed to be compatible with a limited range of
kube-apiserver releases:

> kubectl is supported within one minor version (older or newer) of
> kube-apiserver.

So far we have been using the latest kubectl on the host (currently 1.18) to
control clusters running K8s 1.11 through 1.18. This no longer worked
with 'istioctl', which complained about "kubectl not being found in
$PATH". When istioctl was paired with the same version of kubectl as the
cluster, it started working again.

Downgrading kubectl on the test hosts may not be practical, but the CI
infra also supports running kubectl on the cluster's master node
(k8s1). This is selected via the value of the Ginkgo
'cilium.kubeconfig' option. When 'cilium.kubeconfig' is non-empty, it
is assumed to be a path to a valid kubeconfig for connecting the kubectl
on the host to the test cluster. When 'cilium.kubeconfig' is empty, the CI
Ginkgo helpers assume that kubectl should be run on
"k8s1". 'test/vagrant-ci-start.sh' expects the KUBECONFIG environment
variable to be set to the path of the file into which the kubeconfig
fetched from the test cluster should be stored. Modify
'test/vagrant-ci-start.sh' to accept an undefined KUBECONFIG as a signal
that kubectl needs to be run on the test cluster's master node (k8s1).
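
For illustration, a minimal sketch of the intended KUBECONFIG handling in 'test/vagrant-ci-start.sh' (hypothetical, not the actual script contents; the kubeconfig location on k8s1 and the vagrant machine name are assumptions):

    # Minimal sketch; assumes the kubeconfig lives at /home/vagrant/.kube/config on k8s1
    if [ -n "${KUBECONFIG}" ]; then
        # KUBECONFIG set: fetch the kubeconfig from the test cluster into that file
        # so kubectl can be run on the host.
        vagrant ssh "k8s1-${K8S_VERSION}" -c "sudo cat /home/vagrant/.kube/config" > "${KUBECONFIG}"
    else
        # KUBECONFIG undefined: do not fetch anything; the Ginkgo helpers will run
        # kubectl over SSH on the cluster's master node (k8s1).
        echo "KUBECONFIG not set; kubectl will be run on k8s1"
    fi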

Finally, remove both the setting of the KUBECONFIG environment variable
before calling 'vagrant-ci-start.sh' and the Ginkgo option
'cilium.kubeconfig' from 'ginkgo-kubernetes-all.Jenkinsfile', which is
used to run the CI K8s test suite on k8s versions from 1.11 to
1.17. This way we always use the kubectl installed as part of the
test cluster itself, on the master node. This solves the
compatibility problem with istioctl and should also help guard against
introducing kubectl syntax that is not compatible with
the target k8s version.
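
Schematically, the Jenkinsfile change amounts to something like the following (hypothetical before/after shell steps; the workspace path and the exact ginkgo invocation are assumptions, not the actual 'ginkgo-kubernetes-all.Jenkinsfile' contents):

    # Before (hypothetical): kubeconfig fetched to the host and handed to the tests
    export KUBECONFIG="${WORKSPACE}/vagrant.kubeconfig"
    ./test/vagrant-ci-start.sh
    ginkgo -- -cilium.kubeconfig="${KUBECONFIG}"

    # After (hypothetical): neither is set, so the helpers run kubectl on k8s1
    ./test/vagrant-ci-start.sh
    ginkgo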

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>

@jrajahalme jrajahalme added the wip and release-note/ci labels Apr 20, 2020
@jrajahalme jrajahalme requested review from a team as code owners April 20, 2020 16:38
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/k8s-kubectl-not-found-debug branch 2 times, most recently from 061b97b to 720cb3d Compare April 20, 2020 16:46
coveralls commented Apr 20, 2020


Coverage increased (+0.01%) to 44.783% when pulling f755217 on pr/jrajahalme/k8s-kubectl-not-found-debug into 822a64e on master.

@aanm aanm marked this pull request as draft April 23, 2020 10:12
@aanm aanm removed the wip label Apr 23, 2020
aanm commented Apr 23, 2020

This PR has been marked as a draft PR since it had a WIP label. Please click "Ready for review" (below) once the PR is ready to be reviewed. CI will still run for draft PRs.

@jrajahalme (Member Author)

@aanm istioctl failing with K8s 1.14 and 1.15 may be due to the installed kubectl being version 1.18. When testing locally on Mac I had to downgrade kubectl to match the server version to make it work. Do you know how I could downgrade kubectl, or maybe we should do it universally for older k8s versions?

Ref:

kubectl
kubectl is supported within one minor version (older or newer) of kube-apiserver.

Example:

kube-apiserver is at 1.18
kubectl is supported at 1.19, 1.18, and 1.17
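
For reference, downgrading on a Linux host could look something like this (example only; the 1.14.10 version and install path are just illustrations, using the upstream kubernetes-release download bucket):

    # Example only: fetch a kubectl client that matches a 1.14 apiserver
    curl -sSLo kubectl \
      https://storage.googleapis.com/kubernetes-release/release/v1.14.10/bin/linux/amd64/kubectl
    chmod +x kubectl
    sudo mv kubectl /usr/local/bin/kubectl
    # Verify the client version now matches the server's minor version
    kubectl version --short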

aanm commented Apr 23, 2020

@aanm istioctl failing with K8s 1.14 and 1.15 may be due to the installed kubectl being version 1.18. When testing locally on Mac I had to downgrade kubectl to match the server version to make it work. Do you know how I could downgrade kubectl, or maybe we should do it universally for older k8s versions?

Ref:

kubectl
kubectl is supported within one minor version (older or newer) of kube-apiserver.
Example:
kube-apiserver is at 1.18
kubectl is supported at 1.19, 1.18, and 1.17

Downgrade kubectl where? in istioctl?

@jrajahalme (Member Author)

istioctl is complaining that it can't find kubectl in $PATH. My hunch is that the installed kubectl version (1.18) is found, but then fails due to some compatibility reason. I'm assuming istioctl is trying to invoke kubectl from the $PATH, but the installed kubectl version (at /usr/local/bin/kubectl) is version 1.18, even though the VM is booted with K8S_VERSION=1.1[45].

This is the output of the commands I added in this PR for debugging purposes:

time="2020-04-20T19:07:53Z" level=debug msg="running command: which kubectl"
cmd: "which kubectl" exitCode: 0 duration: 1.683695ms stdout:
/usr/local/bin/kubectl

time="2020-04-20T19:07:53Z" level=debug msg="running command: echo $PATH"
cmd: "echo $PATH" exitCode: 0 duration: 892.9µs stdout:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

time="2020-04-20T19:07:53Z" level=debug msg="running command: `which kubectl` version"
cmd: "`which kubectl` version" exitCode: 0 duration: 51.389712ms stdout:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.10", GitCommit:"575467a0eaf3ca1f20eb86215b3bde40a5ae617a", GitTreeState:"clean", BuildDate:"2019-12-11T12:32:32Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
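
For what it's worth, a hypothetical shell check along these lines could catch such a skew in CI; it assumes the long 'kubectl version' output format shown above:

    # Hypothetical skew check: parse the Minor fields from 'kubectl version'
    client_minor=$(kubectl version 2>/dev/null | awk -F'"' '/Client Version/ {print $4}')
    server_minor=$(kubectl version 2>/dev/null | awk -F'"' '/Server Version/ {print $4}')
    skew=$(( client_minor - server_minor ))
    if [ "${skew#-}" -gt 1 ]; then
        echo "kubectl client minor ${client_minor} is more than one minor version away from kube-apiserver minor ${server_minor}"
    fi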

@jrajahalme (Member Author)

@aanm Do you know why the kubectl reports client version 1.18 on a test VM with K8S_VERSION=1.14??

@jrajahalme (Member Author)

The VM start scripts show:

11:58:16      k8s1-1.14: Setting up kubectl (1.14.10-00) ...

But then, when running tests, kubectl version reports client version 1.18???

aanm commented Apr 23, 2020

The VM start scripts show:

11:58:16      k8s1-1.14: Setting up kubectl (1.14.10-00) ...

But then, when running tests, kubectl version reports client version 1.18???

IIRC @jrajahalme that is the version of the kubectl installed in the nodes (not VMs)

@jrajahalme (Member Author)

@aanm Exactly, istioctl is run on the host, just like any kubectl command run in the test?

@jrajahalme (Member Author)

So I conflated the VM k8s install with the host kubectl above, sorry.

@jrajahalme (Member Author)

This PR adds:

		res = kubectl.ExecShort("`which kubectl` version")

Which reports client version 1.18. It would make sense that this is coming from the host, but kubectl is initialized as:

		kubectl = helpers.CreateKubectl(helpers.K8s1VMName(), logger)

aanm commented Apr 23, 2020

This PR adds:

		res = kubectl.ExecShort("`which kubectl` version")

Which reports client version 1.18. It would make sense that this is coming from the host, but kubectl is initialized as:

		kubectl = helpers.CreateKubectl(helpers.K8s1VMName(), logger)

@jrajahalme and isn't that VM running k8s 1.18?

jrajahalme commented Apr 23, 2020

@aanm I don't know, is it different from the ones started under K8S_VERSION=1.14? If so, is there a way to start K8s1VM with K8s 1.14 instead?

@jrajahalme (Member Author)

And why would we configure k8s1 and k8s2 with different versions of Kubernetes?

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/k8s-kubectl-not-found-debug branch from 720cb3d to 76d2ee0 Compare April 23, 2020 23:20
@jrajahalme (Member Author)

test-missed-k8s

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/k8s-kubectl-not-found-debug branch 6 times, most recently from 6abed98 to 62d5c00 Compare April 24, 2020 01:34
@jrajahalme (Member Author)

test-missed-k8s

@jrajahalme jrajahalme changed the title from "CI: Istio debugging" to "test: Run kubectl in test VM for older K8s releases" Apr 24, 2020
Add the missing file suffix (.sh) to print-node-ip calls in
Jenkinsfile. This prevents unnecessary Cilium compilation and helps
speed up test runs.

Add OSX support to 'test/print-node-ip.sh'. Use simpler 'cut' instead
of 'awk' for Linux.

Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/k8s-kubectl-not-found-debug branch from cb2b100 to f755217 Compare April 24, 2020 06:04
@jrajahalme (Member Author)

test-me-please

@jrajahalme (Member Author)

never-tell-me-the-odds

@jrajahalme jrajahalme added the kind/bug/CI and priority/release-blocker labels Apr 24, 2020
@jrajahalme jrajahalme marked this pull request as ready for review April 24, 2020 06:07
@nebril nebril left a comment

Great job with finding the root cause of the issue!

While this change fixes the problem, I believe that going back to "the old way" of running kubectl via SSH isn't the way to go.

SSH-specific execution code is left in the test framework exclusively to support the Runtime tests. Historically, SSH to the vagrant VMs was unreliable and made debugging harder. As we retire the Runtime tests, we should be able to delete this code with minimal pain.

I would strongly prefer fixing this by adding multiple versions of the kubectl binary on the CI nodes and manipulating PATH. I can add downloading of all the needed kubectl binaries to the node provisioning scripts; after that, this change would just need to symlink to them appropriately and set the PATH.
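
A minimal sketch of what that could look like (hypothetical paths, versions, and variable names):

    # Provisioning (hypothetical): install one kubectl per supported minor version
    for v in 1.11.10 1.14.10 1.16.8 1.18.2; do
        curl -sSLo "/usr/local/bin/kubectl-${v%.*}" \
          "https://storage.googleapis.com/kubernetes-release/release/v${v}/bin/linux/amd64/kubectl"
        chmod +x "/usr/local/bin/kubectl-${v%.*}"
    done
    # At test time: point 'kubectl' at the binary matching the cluster's K8S_VERSION (e.g. "1.14")
    ln -sf "/usr/local/bin/kubectl-${K8S_VERSION}" /usr/local/bin/kubectl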

Also, thanks for fixing the print-node-ip bugs; I am not sure how these could have persisted for so long.

@jrajahalme (Member Author)

test-me-please

@jrajahalme (Member Author)

@nebril Running the K8s CI tests locally on a Mac host without Jenkins, started with test/vagrant-local-start.sh, depends on running the Ginkgo Execs via SSH. I'd like to keep this functionality if possible.

I'll revise this PR after you have made the old kubectl versions available. We probably should use an old version for the k8s 1.11 tests in the regular ginkgo k8s test suite as well.

@christarazi christarazi left a comment

Good find!

nebril commented Apr 27, 2020

After discussing offline, let's merge this as an immediate mitigation.
