kubeadm: add an upgrade health check that deploys a Job #81319
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: neolit123. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/hold
Thanks @neolit123 !
I am not sure if this is going to pull us through. What if the pods are in a crash loop?
They may be reported as all running, but restart in a few seconds. Hence, the cluster won't be operational.
In fact, what is an operational cluster? If all of the control plane pods are running, but there is no CNI plugin installed, the cluster is still as good as useless.
The closest we can get to answering the original question is to use the liveness probe where available (but that may pose its own set of security risks if not done properly).
Having a check that spans more than the API server and etcd means that a misconfigured scheduler, controller manager, proxy, etc. will be more difficult to fix than just executing kubeadm upgrade apply --config=fixed-config.yaml.
we can hold for N seconds and see if the Pods are still running after this period, but then the crash loop period is unknown, so this is yet another arbitrary constant.
IMO, CNI demands a separate check that ensures that the Pods have networking.
then again, if the liveness probes are working as expected, the Pods should be in a Running state.
yet, i must admit my biggest concern is not that this change is checking a Running state, but rather that it's checking the Running state on all CP Pods, which can break potential parallel CP node upgrades.
@rosti do you have a proposal on how to better approach the security audit ticket, or are your comments suggesting that you do not agree with it and we should take no action?
/retest
if this PR merges, we can do a health check on the local components' /health* endpoints:
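For illustration only, a minimal sketch of what such a local /healthz probe could look like; the insecure local ports (10251 for the scheduler, 10252 for the controller manager) are assumptions based on the defaults of that era, not something this PR implements:

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Assumed insecure local ports; the API server is omitted because its
	// /healthz endpoint requires TLS and credentials.
	endpoints := map[string]string{
		"kube-scheduler":          "http://127.0.0.1:10251/healthz",
		"kube-controller-manager": "http://127.0.0.1:10252/healthz",
	}
	client := &http.Client{Timeout: 5 * time.Second}
	for name, url := range endpoints {
		resp, err := client.Get(url)
		if err != nil {
			fmt.Printf("%s: unreachable: %v\n", name, err)
			continue
		}
		resp.Body.Close()
		fmt.Printf("%s: HTTP %d\n", name, resp.StatusCode)
	}
}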
That sounds a bit like reimplementing the liveness probes in a different way.
That's my point. We should see what we need to check and what not. Obviously, we cannot ensure a fully operational cluster and it's not our job to verify that.
If a liveness probe fails, it will restart the container in question according to the restart policy. The pod will be in Running state if there is at least one container that is starting, running, or restarting.
+1 to that.
Indeed, only a single instance of every component is required, at a minimum, to claim that we are operational.
Actually, I deliberated quite a lot on this one and I couldn't find a solution that seemed reliable enough to me.
this PR should be closed as the approach is not ideal, but instead i'm going to refactor it to use another approach as discussed during the office hours.
force-pushed from 6064dca to c1160de
@rosti @kubernetes/sig-cluster-lifecycle-pr-reviews
/retest
Thanks @neolit123 !
I don't like the idea of using a daemon set; a job would be much better in my opinion. We just want to verify that we can start something on the cluster.
A simple Job, like this one, should do the trick:
apiVersion: batch/v1
kind: Job
metadata:
  name: health-check
  namespace: kube-system
spec:
  backoffLimit: 0
  activeDeadlineSeconds: 30
  template:
    spec:
      restartPolicy: Never
      terminationGracePeriodSeconds: 0
      securityContext:
        runAsUser: 999
        runAsGroup: 999
        runAsNonRoot: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: health-check
        image: k8s.gcr.io/pause:3.1
        args: ["-v"]
We may also want to give the health check a random name suffix, to ensure uniqueness.
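A minimal sketch of one way to get that uniqueness, using metav1.ObjectMeta.GenerateName so the API server appends a random suffix; the prefix and the trimmed-down spec below are illustrative, not the PR's actual code:

package healthcheck

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// healthCheckJob returns a Job whose final name gets a random suffix appended
// by the API server, because GenerateName is set instead of Name. The pod spec
// is trimmed down compared to the full manifest above.
func healthCheckJob() *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: "upgrade-health-check-", // illustrative prefix
			Namespace:    metav1.NamespaceSystem,
		},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{
						{
							Name:  "health-check",
							Image: "k8s.gcr.io/pause:3.1",
							Args:  []string{"-v"},
						},
					},
				},
			},
		},
	}
}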
/hold
wish you had expressed your comments during our discussion with @fabriziopandini during the office hours last week (was it?).
@neolit123 we can go with a daemon set for 1.16, but let's copy over the spec and s/prepull/health-check over it.
i will try to experiment with a Job later this week.
/remove-kind feature
force-pushed from c1160de to d777c91
updated to use a Job for the preflight check.
force-pushed from 2adc722 to 29bad3c
Thanks @neolit123 !
Looks good and I have just non-blocking nits.
I would have used the TTL controller (even though it's still in alpha), but it's fine like this too.
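For reference, a minimal sketch of what the TTL-controller alternative could look like, assuming the alpha TTLAfterFinished feature gate were enabled on the cluster; the helper name is illustrative:

package healthcheck

import batchv1 "k8s.io/api/batch/v1"

// withTTL sets TTLSecondsAfterFinished so the Job would be garbage collected
// automatically after it finishes, instead of kubeadm deleting it manually.
// This only takes effect if the (alpha) TTLAfterFinished feature gate is on.
func withTTL(job *batchv1.Job, seconds int32) *batchv1.Job {
	job.Spec.TTLSecondsAfterFinished = &seconds
	return job
}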
/lgtm
- Add a new preflight check for upgrade that runs the pause container with -v in a Job.
- Wait for the Job to complete and return an error after N seconds.
- Manually clean the Job because we don't have the TTL controller enabled in kubeadm yet (it's still alpha).
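A rough sketch of that create/wait/delete flow using client-go (assuming the context-free client signatures of that release); the function and variable names are illustrative, not the PR's exact implementation:

package healthcheck

import (
	"fmt"
	"time"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
)

// runHealthCheckJob creates the Job, waits until it reports a succeeded Pod or
// the timeout expires, and always deletes the Job afterwards because the TTL
// controller cannot be assumed to be enabled.
func runHealthCheckJob(client clientset.Interface, job *batchv1.Job, timeout time.Duration) error {
	created, err := client.BatchV1().Jobs(job.Namespace).Create(job)
	if err != nil {
		return fmt.Errorf("could not create health check Job: %v", err)
	}

	defer func() {
		prop := metav1.DeletePropagationForeground
		_ = client.BatchV1().Jobs(created.Namespace).Delete(created.Name, &metav1.DeleteOptions{PropagationPolicy: &prop})
	}()

	return wait.PollImmediate(time.Second, timeout, func() (bool, error) {
		j, err := client.BatchV1().Jobs(created.Namespace).Get(created.Name, metav1.GetOptions{})
		if err != nil {
			// tolerate transient errors and keep polling until the timeout
			return false, nil
		}
		return j.Status.Succeeded > 0, nil
	})
}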
force-pushed from 29bad3c to 906d315
- // If client.Discovery().RESTClient() is nil, the fake client is used, and that means we are dry-running. Just proceed
+ // If client.Discovery().RESTClient() is nil, the fake client is used.
+ // Return early because the kubeadm dryrun dynamic client only handles the core/v1 GroupVersion.
had to add this.
the dynamic client fails when trying to get batch/v1 Job here:
unstructuredObj, err := clg.dynamicClient.Resource(action.GetResource()).Namespace(action.GetNamespace()).Get(action.GetName(), metav1.GetOptions{})
i think we should stop using the dynamic client for dryrun, but this is a much larger refactor.
until then we have to skip the logic in this preflight check on dryrun.
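A minimal sketch of that guard, assuming the usual kubeadm pattern of detecting the fake client via a nil Discovery REST client; the function name is illustrative, not the PR's exact code:

package healthcheck

import (
	"fmt"

	clientset "k8s.io/client-go/kubernetes"
)

// isDryRun reports whether the Job-based check should be skipped: on dry-run
// the fake client is in use, its Discovery().RESTClient() is nil, and the
// kubeadm dryrun dynamic client only handles the core/v1 GroupVersion.
func isDryRun(client clientset.Interface) bool {
	if client.Discovery().RESTClient() == nil {
		fmt.Println("[dryrun] would create a health check Job and wait for it to complete")
		return true
	}
	return false
}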
+1 we can have a backlog issue for this so an eager contributor can try it.
@rosti updated. see latest comment above.
/hold cancel
Thanks @neolit123 !
/lgtm
> +1 we can have a backlog issue for this so an eager contributor can try it.
I logged the issue a couple of days ago, I think.
Getting a weird issue related to this health-check: FATAL: [preflight] Some fatal errors occurred:
What this PR does / why we need it:
- Add a new preflight check for upgrade that runs the pause container with -v in a Job.
- Wait for the Job to complete and return an error after N seconds.
- Manually clean the Job because we don't have the TTL controller enabled in kubeadm yet (it's still alpha).
Which issue(s) this PR fixes:
Fixes kubernetes/kubeadm#1717
Special notes for your reviewer:
"fixes" a security audit item, although not super critical as the upgrade process will catch further problems in failing components.
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
@kubernetes/sig-cluster-lifecycle-pr-reviews
/kind feature
/kind cleanup
/area security
/priority backlog