pchaigno
Member

@pchaigno pchaigno commented Nov 9, 2022

Once this PR is merged, you can update the PR labels via:

$ for pr in 21831 21064 21526 21897; do contrib/backporting/set-labels.py $pr done 1.11; done

pchaigno and others added 2 commits November 9, 2022 23:02
[ upstream commit 34127e6 ]

The KPR guide contains the autoDirectNodeRoutes option in most Helm
commands, but that option isn't a requirement for KPR subfeatures and
may even fail if Kubernetes nodes are not L2-connected.

Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit e121b5d ]

Over time we've been accumulating some knowledge about particular
Linux distributions and groups of distributions that has gone largely
unnoted in our documentation. A good understanding and implementation
of these considerations are extremely important to ensure that Cilium
runs properly, so this commit attempts to add a subsection containing
this information.

Signed-off-by: Bruno M. Custódio <brunomcustodio@gmail.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno requested a review from a team as a code owner November 9, 2022 22:17
@pchaigno pchaigno added backport/1.11 kind/backports This PR provides functionality previously merged into master. labels Nov 9, 2022
@pchaigno pchaigno force-pushed the pr/v1.11-backport-2022-11-09 branch from 34a8a05 to a791ad5 Compare November 9, 2022 22:19
@dylandreimerink
Member

The unit test included in #21526 somehow triggers a nil pointer dereference in operator.startSynchronizingCiliumNodes. I will need to investigate why (the original stack trace is gone because the panic is caught and re-thrown by error handling in the test), so I will have to make a dedicated backport PR with a bugfix for v1.11.

dylandreimerink and others added 2 commits November 15, 2022 11:59
[ upstream commit 4c9c1d3 ]

This commit fixes an edge case in the `NodesPodCIDRManager`.
If there were any nodes on operator startup which have no PodCIDRs, the
operator would sometimes assign PodCIDRs to these nodes which have
already been allocated to other nodes.

The operator assumed that when `k8sCiliumNodesCacheSynced` closes, all
node events have been processed, and it then proceeded to call `Resync`
on the `nodeManager`.

The `NodesPodCIDRManager` will queue any nodes
without PodCIDRs to be allocated once the `canAllocatePodCIDRs` variable
is set. This variable is set by the `Resync`.

So, the assumption/expected behavior is that the
`NodesPodCIDRManager.Update` function has been called for all nodes in
the cache before `Resync` is called. However, this wasn't the case.
The `startSynchronizingCiliumNodes` function starts the informer
and connects the nodeManager to it. But instead of handling the events
at once, the callbacks enqueue the events, to be handled by a separate
goroutine. This means that `k8sCiliumNodesCacheSynced` is closed once
all of the node events are enqueued, not when they have been processed
by the `nodeManager`.

This commit fixes this behavior by processing all events at once
in the informer callbacks until the full sync is complete, at which
point we will switch over to using the workqueue.

Fixes: cilium#21482

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
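
The behavior described in upstream commit 4c9c1d3 (handle events directly in the callbacks until the initial sync completes, then switch to the workqueue) can be illustrated with a minimal Go sketch. This is not the operator's actual code: `syncThenQueue`, `OnEvent`, `MarkSynced`, and the buffered channel standing in for the workqueue are all hypothetical names chosen for illustration, and the handler stands in for something like `NodesPodCIDRManager.Update`.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeHandler is a hypothetical stand-in for the node manager's update
// function; it is not the real Cilium signature.
type nodeHandler func(name string)

// syncThenQueue handles events inline until MarkSynced is called and
// enqueues them afterwards. The channel is an assumed stand-in for a
// workqueue drained by a separate goroutine.
type syncThenQueue struct {
	mu     sync.Mutex
	synced bool
	handle nodeHandler
	queue  chan string
}

func newSyncThenQueue(h nodeHandler, size int) *syncThenQueue {
	return &syncThenQueue{handle: h, queue: make(chan string, size)}
}

// OnEvent plays the role of an informer callback.
func (s *syncThenQueue) OnEvent(name string) {
	s.mu.Lock()
	synced := s.synced
	s.mu.Unlock()

	if !synced {
		// Before the initial sync: process immediately, so that every
		// node is known by the time the cache reports "synced".
		s.handle(name)
		return
	}
	// After the initial sync: defer the work to the queue.
	s.queue <- name
}

// MarkSynced switches the dispatcher over to queueing.
func (s *syncThenQueue) MarkSynced() {
	s.mu.Lock()
	s.synced = true
	s.mu.Unlock()
}

func main() {
	var processed []string
	d := newSyncThenQueue(func(n string) { processed = append(processed, n) }, 8)

	d.OnEvent("node-a")
	d.OnEvent("node-b")
	// Both initial events were handled inline, so a Resync-like step
	// issued here would already see every node.
	fmt.Println(len(processed), len(d.queue))

	d.MarkSynced()
	d.OnEvent("node-c")
	// The later event is only queued, not yet processed.
	fmt.Println(len(processed), len(d.queue))
}
```

With this shape, "cache synced" implies "all initial events processed" rather than merely "all initial events enqueued", which is the assumption the commit message says `Resync` relies on.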
[ upstream commit b3cd077 ]

Clarify that Azure CNI chaining is different from
Azure CNI Powered by Cilium.

Signed-off-by: Will Daly <widaly@microsoft.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
@dylandreimerink dylandreimerink force-pushed the pr/v1.11-backport-2022-11-09 branch from a791ad5 to a136494 Compare November 15, 2022 10:59
@pchaigno
Member Author

pchaigno commented Nov 15, 2022

/test-backport-1.11

Job 'Cilium-PR-K8s-GKE' failed:

Test Name

K8sDatapathConfig MonitorAggregation Checks that monitor aggregation flags send notifications

Failure Output

FAIL: Pods are not ready in time: timed out waiting for pods with filter  to be ready: 4m0s timeout expired

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-GKE so I can create one.

@michi-covalent michi-covalent merged commit be42a2a into cilium:v1.11 Nov 15, 2022
@pchaigno pchaigno deleted the pr/v1.11-backport-2022-11-09 branch November 15, 2022 21:46