Skip to content

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented Sep 17, 2020

This is a direct backport to 1.8.

Fixes: #13192

Fix bug in EKS environments where Cilium agents never become ready due to a missing CiliumNode CRD schema property

@christarazi christarazi added area/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Sep 17, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.8 kind/backports This PR provides functionality previously merged into master. labels Sep 17, 2020
@christarazi
Copy link
Member Author

test-backport-1.8

@christarazi christarazi marked this pull request as ready for review September 17, 2020 04:49
@christarazi christarazi requested a review from a team as a code owner September 17, 2020 04:49
@christarazi christarazi requested review from aanm and a team September 17, 2020 04:49
@sayboras
Copy link
Member

Just a question on how we decide to do direct backport instead of backporting from upstream PR :). Thanks.

Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please write a good release note for our users

@@ -31,7 +31,7 @@ const (

// CustomResourceDefinitionSchemaVersion is semver-conformant version of CRD schema
// Used to determine if CRD needs to be updated in cluster
CustomResourceDefinitionSchemaVersion = "1.21.1"
CustomResourceDefinitionSchemaVersion = "1.21.2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This versioning is only used for Cilium [C]CNP. Lines 261-263 are missing this annotation so the "upgrade" of this CRD won't work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch. How should we workaround this then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just need to add the annotation in the Cilium Node. AFAIK, the update logic will update a CRD if the old one does not contain the annotation but the new one does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we'll need another fix on master to propagate the schema version for 1.9 or later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

master already attaches the schema versions to all CRDs:

Labels: map[string]string{

@christarazi
Copy link
Member Author

Just a question on how we decide to do direct backport instead of backporting from upstream PR :). Thanks.

@sayboras We usually do this when the issue is not on master. In this case, this bug exists on the 1.8.x line.

@christarazi christarazi changed the title [v1.8] k8s: Fix non-strucural schema with CiliumNode [v1.8] k8s: Fix non-structural schema with CiliumNode Sep 17, 2020
@christarazi
Copy link
Member Author

christarazi commented Sep 17, 2020

Also, please write a good release note for our users

@aanm I updated it, but unsure how to best word this one. Feel free to update the PR description to your liking.

@joestringer
Copy link
Member

@christarazi The question is typically "If I'm a user without deep understanding of Cilium internals, how do I know if this bug might be fixing the issue I'm hitting?". From this perspective:

Fix CiliumNode CRD schema for Kubernetes 1.16 or greater

This says OK there's some problem with the schema, but the symptoms and consequences aren't clear.

I would suggest instead something like:

Fix bug in EKS environments where Cilium agents never become ready due to a missing CiliumNode CRD schema property

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed up the release note. Just need to address the schema version issue Andre pointed out below.

@@ -31,7 +31,7 @@ const (

// CustomResourceDefinitionSchemaVersion is semver-conformant version of CRD schema
// Used to determine if CRD needs to be updated in cluster
CustomResourceDefinitionSchemaVersion = "1.21.1"
CustomResourceDefinitionSchemaVersion = "1.21.2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we'll need another fix on master to propagate the schema version for 1.9 or later.

@christarazi
Copy link
Member Author

@christarazi The question is typically "If I'm a user without deep understanding of Cilium internals, how do I know if this bug might be fixing the issue I'm hitting?".

Thanks! This is a helpful way to frame the release notes. The new release note sounds much better to me 👍

The schema was missing a `type` field on the `spec` property.

```
  conditions:
  - lastTransitionTime: "2020-09-16T21:08:18Z"
    message: '[spec.validation.openAPIV3Schema.properties[spec].type: Required value:
      must not be empty for specified object fields, spec.validation.openAPIV3Schema.type:
      Required value: must not be empty at the root]'
    reason: Violations
    status: "True"
    type: NonStructuralSchema
```

Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit allows the CiliumNode CRD to be versioned.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
@christarazi christarazi force-pushed the pr/christarazi/fix-non-structural-schema-cilium-node branch from f90c9d6 to b5d16f9 Compare September 17, 2020 18:09
@christarazi
Copy link
Member Author

test-backport-1.8

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though @aanm would know the nuance of these specifications more clearly

@christarazi
Copy link
Member Author

test-missed-k8s

@aanm aanm merged commit f842660 into cilium:v1.8 Sep 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 18, 2020
@christarazi christarazi deleted the pr/christarazi/fix-non-structural-schema-cilium-node branch November 6, 2020 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. kind/backports This PR provides functionality previously merged into master. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants