Skip to content

Conversation

christarazi
Copy link
Member

@christarazi christarazi commented Sep 17, 2020

This commit attempts to improve the readability and debugability of
(*Node).syncToAPIServer(). The main motivation for this change is to fix
the swallowing of the error when UpdateStatus fails, and the subsequent
Get succeeds.

To that end, this commit adds a warning log msg and propagates the error
in case we never succeed, to surface any trouble with updating the
status of the CiliumNode (CN) resource. Failures in updating the status
can be indicative of general connectivity problems in ENI and Azure IPAM
modes. As seen in #13193, it can
also lead to Cilium never becoming ready because the agent was never
able to retrieve ENI information required to make allocations, etc. (The
former was caused by the Operator being blocked from "publishing" the
ENI information into the CN resource, which is why the agent was unable
to retrieve it.)

Warn when failing to update CiliumNode status

This commit attempts to improve the readability and debugability of
(*Node).syncToAPIServer(). The main motivation for this change is to fix
the swallowing of the error when UpdateStatus fails, and the subsequent
Get succeeds.

To that end, this commit adds a warning log msg and propagates the error
in case we never succeed, to surface any trouble with updating the
status of the CiliumNode (CN) resource. Failures in updating the status
can be indicative of general connectivity problems in ENI and Azure IPAM
modes. As seen in cilium#13193, it can
also lead to Cilium never becoming ready because the agent was never
able to retrieve ENI information required to make allocations, etc. (The
former was caused by the Operator being blocked from "publishing" the
ENI information into the CN resource, which is why the agent was unable
to retrieve it.)

Signed-off-by: Chris Tarazi <chris@isovalent.com>
@christarazi christarazi requested a review from a team September 17, 2020 06:10
@christarazi christarazi added area/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. needs-backport/1.8 release-note/misc This PR makes changes that have no direct user impact. labels Sep 17, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Sep 17, 2020
@aanm
Copy link
Member

aanm commented Sep 17, 2020

test-me-please

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 17, 2020
@vadorovsky vadorovsky merged commit a7451f8 into cilium:master Sep 17, 2020
@christarazi christarazi deleted the pr/christarazi/improve-logging-ipam-node branch September 17, 2020 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants