-
Notifications
You must be signed in to change notification settings - Fork 3.4k
v1.12 backports 2022-10-06 #21631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.12 backports 2022-10-06 #21631
Conversation
[ upstream commit 22cf89a ] The graceful termination feature relies on receiving terminating endpoint events from Kubernetes, which are gated by a flag. Reference the terminating endpoint metric in the feature section so that users can track the metric to check if the feature is working correctly, as a first step. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 18ef607 ] The troubleshooting guide explains packet drops seen due to conntrack map fill-up issue. Reference some of the conntrack garbage collection related metrics so that users get visibility into conntrack gc runs - how often it's being triggered, how many entries were gc'd, etc. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5b41102 ] The mal-formed instance-type looks like below in CiliumNode spec: ```yaml spec: ... alibaba-cloud: instance-type: | <?xml version="1.0" encoding="iso-8859-1"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>404 - Not Found</title> </head> <body> <h1>404 - Not Found</h1> </body> </html> ``` which prohibits cilium-operator from allocating PodIPs for this node. Signed-off-by: ArthurChiao <arthurchiao@hotmail.com> Reported-by: Jaff Cheng <jaff.cheng.sh@gmail.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5b40396 ] These are the only 2 image fields that are currently not quoted. Some users templatize image name / tag values outside of Helm. Not quoting the image fields can cause issues if these values contain special characters. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 8ae1562 ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 3fe7919 ] This commit has no functional changes. It simply renames a few variables in enableIPsec to make their relationships clearer. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 91fdc20 ] This commit adds two new arguments to UpsertIPsecEndpoint to specify the outer source and destination IP address for IPsec. It doesn't include any functional changes. - UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, ... + UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, outerLocal, outerRemote net.IP, ... Until now, those two outer IP addresses were carried as part of the first two CIDR arguments, `local` and `remote`. For example, `local` would be equal to 192.168.56.11/24 where 192.168.56.0/24 would be used to match packets in XFRM policies and 192.168.56.11 as the outer IP address in XFRM states. The outer IPs are now in separate arguments and the next commit will change the local and remote arguments to not carry the IPs. Why this change? Because in a subsequent commit, I will need the CIDR and IP arguments to diverge. For example, we will have UpsertIPsecEndpoint calls with `local=0.0.0.0/0` and `outerLocal=192.168.56.11`. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 645da80 ] The previous commit changed the UpsertIPsecEndpoint function as follows: - UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, ... + UpsertIPsecEndpoint(local, remote, fwd *net.IPNet, outerLocal, outerRemote net.IP, ... The first two CIDR arguments, `local` and `remote`, now don't need to carry the outer IP addresses (moved to `outerLocal` and `outerRemote`). We can therefore change calls to this function so that those two first arguments carry only the CIDR (i.e., changed from e.g. 192.168.56.11/24 to 192.168.56.0/24). As a result, we also don't need to mask those two arguments when we want only the CIDR part. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit b33adeb ] gops and pprof do not use the same protocol to collect profile data. Thus, the default port for pprof debug endpoints in `cilium-bugtool` should not be the one used for gops, but the default one for pprof itself. Besides, clustermesh-apiserver does not support pprof yet, but only gops. Thus, the help message for the pprof port option in cilium-bugtool is fixed accordingly. Fixes: #416319b1cd (bugtool: Default to the agent's gops port) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 5520be1 ] submit-backport tried to create a backport PR with reviews from all contributors whose fixes are being backported, including people who do not have collaborator status in the repository. GitHub only allows reviews to be assigned to collaborators, and thus rejected the review assignments. This commit changes submit-backport to filter the review assignments to only include collaborators. Fixes: #21548 Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit 1019e70 ] #21120 (comment) uncovered that WaitUntilReady() is racy when using the DemoDaemonSet. After adding an additional DaemonSet to this manifest, the number of expected pods (as calculated per numExpectedPods()) no longer matches the actually deployed pods. So WaitUntilReady() exits prematurely, and we potentially access the pods' state before they are fully set up. This explains the CI flakes we've observed in eg. #21120. Fix this by updating the internal manifest description of the demo_ds.yaml manifest. This allows WaitUntilReady() to wait for the correct number of pods to become ready when using demo_ds. Also deleteInAnyNamespace() can clean up _all_ daemon sets of this manifest. Fixes: 74d9f56 ("egressgw: manager: add support for new egressGateway policy's property") Fixes #21120 Reported-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
[ upstream commit d45999d ] The script originally just exited and assumed the kubelet would restart it, but this is quite disruptive, since that restarts the agent as well. Additionally, it was discovered that some AWS installations need the service network to be functioning before the AWS CNI file is written, leaving clusters in a broken state. So, change the script to retry looking for the AWS CNI file itself. Fixes: #21243 Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>
/test-backport-1.12 Job 'Cilium-PR-K8s-1.22-kernel-4.9' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-1.22-kernel-4.9' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment Job 'Cilium-PR-K8s-GKE' failed: Click to show.Test Name
Failure Output
If it is a flake and a GitHub issue doesn't already exist to track it, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for PR 21497, thanks! 👍
/test-1.19-4.9 |
/test-1.22-4.9 |
/test-gke |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ #21463 ✅ thanks tam!
/test-1.22-4.9 |
/test-gke |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#21577 lgtm 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR looks good. Thanks!
/test-gke |
/test-1.22-4.9 |
/test-1.22-4.9 |
/test-gke |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backport of my changes looks good. Thanks!
Reviews are in, test-gke is a known flake. Merged. |
UpsertIPsecEndpoint
(@pchaigno) -- ipsec: Refactoring aroundUpsertIPsecEndpoint
#21461