Conversation

neolit123
Member

@neolit123 neolit123 commented Aug 12, 2019

What this PR does / why we need it:

  • Add a new preflight check for upgrade that runs the pause container
    with -v in a Job.
  • Wait for the Job to complete and return an error after N seconds.
  • Manually clean the Job because we don't have the TTL controller
    enabled in kubeadm yet (it's still alpha).
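
A rough sketch of this create/wait/delete flow (illustrative, not the exact code in this PR), assuming a pre-1.18-style client-go without context arguments; the function name, the namespace and the 15-second timeout stand in for the "N seconds" above:

package healthcheck

import (
    "fmt"
    "time"

    batchv1 "k8s.io/api/batch/v1"
    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    clientset "k8s.io/client-go/kubernetes"
)

// runUpgradeHealthCheckJob creates a short-lived Job that runs "pause -v",
// waits for it to complete, and always deletes it afterwards because the
// TTL controller is not enabled in kubeadm yet.
func runUpgradeHealthCheckJob(client clientset.Interface) error {
    const ns = "kube-system"
    const name = "upgrade-health-check"
    backoff := int32(0)

    job := &batchv1.Job{
        ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: ns},
        Spec: batchv1.JobSpec{
            BackoffLimit: &backoff, // do not retry a failed pod
            Template: corev1.PodTemplateSpec{
                Spec: corev1.PodSpec{
                    RestartPolicy: corev1.RestartPolicyNever,
                    Tolerations: []corev1.Toleration{{
                        Key:    "node-role.kubernetes.io/master",
                        Effect: corev1.TaintEffectNoSchedule,
                    }},
                    Containers: []corev1.Container{{
                        Name:  name,
                        Image: "k8s.gcr.io/pause:3.1",
                        Args:  []string{"-v"}, // prints the pause version and exits 0
                    }},
                },
            },
        },
    }

    // Manual cleanup; there is no TTL controller to garbage-collect the Job.
    defer client.BatchV1().Jobs(ns).Delete(name, &metav1.DeleteOptions{})

    if _, err := client.BatchV1().Jobs(ns).Create(job); err != nil {
        return fmt.Errorf("could not create Job %q: %v", name, err)
    }

    // Wait up to 15 seconds for the Job to report a succeeded pod.
    return wait.PollImmediate(time.Second, 15*time.Second, func() (bool, error) {
        j, err := client.BatchV1().Jobs(ns).Get(name, metav1.GetOptions{})
        if err != nil {
            return false, nil // treat transient errors as "not done yet"
        }
        return j.Status.Succeeded > 0, nil
    })
}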

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#1717

Special notes for your reviewer:
"fixes" a security audit item, although not super critical as the upgrade process will catch further problems in failing components.

Does this PR introduce a user-facing change?:

kubeadm: add an upgrade health check that deploys a Job

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@kubernetes/sig-cluster-lifecycle-pr-reviews
/kind feature
/kind cleanup
/area security
/priority backlog

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. area/security cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Aug 12, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 12, 2019
@k8s-ci-robot k8s-ci-robot requested review from kad and timothysc August 12, 2019 23:24
@neolit123
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 12, 2019
Contributor

@rosti rosti left a comment

Thanks @neolit123 !
I am not sure if this is going to pull us through. What if the pods are in a crash loop?
They may be reported as all running, but restart in a few seconds. Hence, the cluster won't be operational.
In fact, what is an operational cluster? If all of the control plane pods are running, but there is no CNI plugin installed, the cluster is still as good as useless.
The closest we can get to answering the original question is to use the liveness probe where available (but that may pose its own set of security risks if not done properly).
Having a check that spans more than the API server and etcd means that a misconfigured scheduler, controller manager, proxy, etc. will be more difficult to fix than just executing kubeadm upgrade apply --config=fixed-config.yaml

@neolit123
Member Author

neolit123 commented Aug 13, 2019

I am not sure if this is going to pull us through. What if the pods are in a crash loop?
They may be reported as all running, but restart in a few seconds. Hence, the cluster won't be operational.

we can hold for N seconds and see if the Pods are still running after this period, but then the crash loop period is unknown, so this is yet another arbitrary constant.

In fact, what is an operational cluster? If all of the control plane pods are running, but there is no CNI plugin installed, the cluster is still as good as useless.

IMO, CNI demands a separate check that ensures the Pods have networking.
it's doable, but i'm not sure we should add it, or in general, check if the cluster is perfectly healthy.

The closest we can get to answering the original question is to use the liveness probe where available (but that may pose its own set of security risks if not done properly).

then again if the liveness probes are working as expected the Pods should be in a Running state.

Having a check, that spans over more than the API server and etcd, means that a misconfigured scheduler, controller manager, proxy, etc. will be more difficult to fix, than just executing kubeadm upgrade apply --config=fixed-config.yaml

yet, --config is not a recommended pattern, as we've agreed.
it should not be used to modify or fix the cluster state.
IMO we should deprecate --config and move reconfiguration to the planned reconfigure feature.

i must admit my biggest concern is not that this change is checking a Running state, but rather that it's checking the Running state on all CP Pods, which can break potential parallel CP node upgrades.

@rosti do you have a proposal on how to better approach the security audit ticket or are your comments suggesting that you do not agree with it and we should take no action?

@neolit123
Member Author

/retest

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 13, 2019
@neolit123
Member Author

if this PR merges we can do a health check on the local components /health* endpoints:
#81385
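
as a rough illustration of that direction, probing a component's local /healthz endpoint could look something like the sketch below; the port (10257 is the kube-controller-manager secure port) and the skipped TLS verification are assumptions for the sketch, a real implementation would need proper certificate handling:

package healthcheck

import (
    "crypto/tls"
    "fmt"
    "net/http"
    "time"
)

// probeLocalHealthz performs an HTTPS GET against a local component's /healthz
// endpoint and returns an error unless it answers with 200 OK.
func probeLocalHealthz(port int) error {
    client := &http.Client{
        Timeout: 5 * time.Second,
        // The components serve self-signed certificates locally, hence the skipped
        // verification in this sketch; do not do this in real code.
        Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
    }
    resp, err := client.Get(fmt.Sprintf("https://127.0.0.1:%d/healthz", port))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("/healthz returned status %d", resp.StatusCode)
    }
    return nil
}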

@rosti
Contributor

rosti commented Aug 14, 2019

I am not sure if this is going to pull us through. What if the pods are in a crash loop?
They may be reported as all running, but restart in a few seconds. Hence, the cluster won't be operational.

we can hold for N seconds and see if the Pods are still running after this period, but then the crash loop period is unknown, so this is yet another arbitrary constant.

That sounds a bit like reimplementing the liveness probes in a different way.

In fact, what is an operational cluster? If all of the control plane pods are running, but there is no CNI plugin installed, the cluster is still as good as useless.

IMO, CNI demands a separate check that ensures the Pods have networking.
it's doable, but i'm not sure we should add it, or in general, check if the cluster is perfectly healthy.

That's my point. We should see what we need to check and what not. Obviously, we cannot ensure a fully operational cluster and it's not our job to verify that.

The closest we can get to answering the original question is to use the liveness probe where available (but that may pose its own set of security risks if not done properly).

then again if the liveness probes are working as expected the Pods should be in a Running state.

If a liveness probe fails, the kubelet will restart the container in question according to the restart policy. The pod will be in a Running state if there is at least one container that is starting, running, or restarting.
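
For illustration, a liveness probe on a control-plane container looks roughly like the sketch below (assuming the k8s.io/api types of that era, where the probe handler field is still named Handler; the image tag, port and thresholds are made up). The point is that a failing probe restarts only the container, while the pod phase typically stays Running:

package healthcheck

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/util/intstr"
)

// exampleAPIServerContainer shows a container with an HTTPS liveness probe.
// When the probe fails, the kubelet restarts this one container according to
// the pod's restartPolicy, and the pod can keep reporting phase "Running"
// even while the container crash-loops.
func exampleAPIServerContainer() corev1.Container {
    return corev1.Container{
        Name:  "kube-apiserver",
        Image: "k8s.gcr.io/kube-apiserver:v1.16.0",
        LivenessProbe: &corev1.Probe{
            Handler: corev1.Handler{
                HTTPGet: &corev1.HTTPGetAction{
                    Path:   "/healthz",
                    Port:   intstr.FromInt(6443),
                    Scheme: corev1.URISchemeHTTPS,
                },
            },
            InitialDelaySeconds: 15,
            FailureThreshold:    8,
        },
    }
}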

Having a check that spans more than the API server and etcd means that a misconfigured scheduler, controller manager, proxy, etc. will be more difficult to fix than just executing kubeadm upgrade apply --config=fixed-config.yaml

yet, --config is not a recommended pattern, as we've agreed.
it should not be used to modify or fix the cluster state.
IMO we should deprecate --config and move reconfiguration to the planned reconfigure feature.

+1 to that.

i must admit my biggest concern is not that this change is checking a Running state, but rather that it's checking the Running state on all CP Pods, which can break potential parallel CP node upgrades.

Indeed, at a minimum, only a single instance of every component is required to claim that we are operational.

@rosti do you have a proposal on how to better approach the security audit ticket or are your comments suggesting that you do not agree with it and we should take no action?

Actually, I deliberated quite a lot on this one and couldn't find a solution that seemed reliable enough to me.

@neolit123
Member Author

neolit123 commented Aug 21, 2019

this PR should be closed, as the approach is not ideal; instead, i'm going to refactor it to use another approach, as discussed during the office hours.

@neolit123 neolit123 changed the title from "kubeadm: add upgrade preflight check for 'Running' control-plane Pods" to "kubeadm: add an upgrade health check that deploys a DaemonSet" Aug 27, 2019
@neolit123 neolit123 force-pushed the 1.16-kubeadm-upgrade-health-check branch from 6064dca to c1160de on August 27, 2019 22:23
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2019
@neolit123
Member Author

@rosti @kubernetes/sig-cluster-lifecycle-pr-reviews
updated the PR to use a DaemonSet with the pause image, as @fabriziopandini recommended.

@neolit123
Member Author

/retest

Contributor

@rosti rosti left a comment

Thanks @neolit123 !

I don't like the idea of using a DaemonSet; a Job would be much better in my opinion. We just want to verify that we can start something on the cluster.

A simple Job, like this one, should do the trick:

apiVersion: batch/v1
kind: Job
metadata:
  name: health-check
  namespace: kube-system
spec:
  backoffLimit: 0                # do not retry a failed pod
  activeDeadlineSeconds: 30      # fail the Job if it has not finished within 30 seconds
  template:
    spec:
      restartPolicy: Never
      terminationGracePeriodSeconds: 0
      securityContext:           # run as an unprivileged user
        runAsUser: 999
        runAsGroup: 999
        runAsNonRoot: true
      tolerations:               # allow scheduling on control-plane (master) nodes
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: health-check
        image: k8s.gcr.io/pause:3.1
        args: ["-v"]             # pause -v prints its version and exits successfully

We may also want to give the health check Jobs a random name suffix, to ensure uniqueness.
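
A minimal sketch of that, assuming the apimachinery rand utility that kubeadm already vendors (the name prefix and suffix length are illustrative):

package healthcheck

import (
    "fmt"

    utilrand "k8s.io/apimachinery/pkg/util/rand"
)

// healthCheckJobName returns a Job name with a random 5-character suffix so
// that concurrent or leftover runs do not collide on the same object name.
func healthCheckJobName() string {
    return fmt.Sprintf("upgrade-health-check-%s", utilrand.String(5))
}
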
/hold

@neolit123
Member Author

I don't like the idea of using a DaemonSet; a Job would be much better in my opinion. We just want to verify that we can start something on the cluster.

wish you had expressed your comments during our discussion with @fabriziopandini at the office hours last week (was it?).

@rosti
Contributor

rosti commented Aug 28, 2019

@neolit123 we can go with a daemon set for 1.16, but let's copy over the spec and s/prepull/health-check over it.
Later on, we can turn it into a job.

@neolit123
Member Author

we can go with a daemon set for 1.16, but let's copy over the spec and s/prepull/health-check over it.

i will try to experiment with a Job later this week.

/remove-kind feature

@k8s-ci-robot k8s-ci-robot removed the kind/feature Categorizes issue or PR as related to a new feature. label Aug 28, 2019
@timothysc timothysc removed their request for review September 6, 2019 16:14
@neolit123 neolit123 force-pushed the 1.16-kubeadm-upgrade-health-check branch from c1160de to d777c91 on November 22, 2019 00:43
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 22, 2019
@neolit123 neolit123 changed the title from "kubeadm: add an upgrade health check that deploys a DaemonSet" to "kubeadm: add an upgrade health check that deploys a Job" Nov 22, 2019
@neolit123
Member Author

updated to use a Job for the preflight check.

@neolit123 neolit123 force-pushed the 1.16-kubeadm-upgrade-health-check branch 2 times, most recently from 2adc722 to 29bad3c on November 22, 2019 00:47
Contributor

@rosti rosti left a comment

Thanks @neolit123 !
Looks good and I have just non-blocking nits.
I would have used the TTL controller (even though it's still in alpha), but it's fine like this too.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 22, 2019
- Add a new preflight check for upgrade that runs the pause container
with -v in a Job.
- Wait for the Job to complete and return an error after N seconds.
- Manually clean the Job because we don't have the TTL controller
enabled in kubeadm yet (it's still alpha).
@neolit123 neolit123 force-pushed the 1.16-kubeadm-upgrade-health-check branch from 29bad3c to 906d315 on November 22, 2019 16:04
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 22, 2019

// If client.Discovery().RESTClient() is nil, the fake client is used, and that means we are dry-running. Just proceed
// If client.Discovery().RESTClient() is nil, the fake client is used.
// Return early because the kubeadm dryrun dynamic client only handles the core/v1 GroupVersion.
Member Author

had to add this.

the dynamic client fails when trying to get batch/v1 Job here:

unstructuredObj, err := clg.dynamicClient.Resource(action.GetResource()).Namespace(action.GetNamespace()).Get(action.GetName(), metav1.GetOptions{})

i think we should stop using the dynamic client for dryrun, but this is a much larger refactor.
until then we have to skip the logic in this preflight check on dryrun.
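
a sketch of the resulting guard, assuming the dry-run fake clientset returns nil from Discovery().RESTClient() as per the comment above (the function name and the printed message are illustrative):

package healthcheck

import (
    "fmt"

    clientset "k8s.io/client-go/kubernetes"
)

// createHealthCheckJob skips the real work when kubeadm runs in dry-run mode.
func createHealthCheckJob(client clientset.Interface) error {
    // If client.Discovery().RESTClient() is nil, the fake client is used.
    // Return early because the kubeadm dryrun dynamic client only handles the
    // core/v1 GroupVersion, so getting a batch/v1 Job through it would fail.
    if client.Discovery().RESTClient() == nil {
        fmt.Println("[dryrun] would create a Job to verify the cluster can run workloads")
        return nil
    }
    // ... create the Job, wait for it and clean it up, as in the earlier sketch ...
    return nil
}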

Contributor

+1 we can have a backlog issue for this so an eager contributor can try it.

@neolit123
Member Author

neolit123 commented Nov 22, 2019

@rosti updated. see latest comment above.

@neolit123
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 22, 2019
Contributor

@rosti rosti left a comment

Thanks @neolit123 !
/lgtm


@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 26, 2019
@neolit123
Member Author

neolit123 commented Nov 26, 2019 via email

@k8s-ci-robot k8s-ci-robot merged commit 2bc3804 into kubernetes:master Nov 26, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Nov 26, 2019
@abdennour

Getting a weird issue related to this health check:

FATAL: [preflight] Some fatal errors occurred:
[ERROR CreateJob]: could not delete Job "upgrade-health-check" in the namespace "kube-system": jobs.batch "upgrade-health-check" not found
