Skip to content

Conversation

rtheobald
Copy link
Contributor

@rtheobald rtheobald commented Jun 26, 2025

Previously the ingress annotation code for the external traffic policy was returning Cluster as a default. This would cause a reconciliation error when using the host network and prevent the service from being created:

failed to create or update Service: Service "cilium-ingress-basic-ingress" is invalid: spec.externalTrafficPolicy: Invalid value: "Cluster": may only be set for externally-accessible services

For the service to work in this mode, there should be no external traffic policy.

To support this, the ingress service template also had to be adjusted to check the host network flag before setting the policy.

Fixes an error where the Ingress controller, when run in host network, created an invalid Service.

Fixes: #34028

@rtheobald rtheobald requested review from a team as code owners June 26, 2025 18:09
@rtheobald rtheobald requested review from youngnick and squeed June 26, 2025 18:09
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 26, 2025
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Jun 26, 2025
@joestringer joestringer added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Jun 26, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 26, 2025
@parlakisik
Copy link
Contributor

This issue happens in 1.17.x . it should be backported to 1.17 too

@joestringer joestringer added the needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch label Jun 26, 2025
Copy link
Contributor

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this fix @rtheobald

@squeed squeed added the needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch label Jun 30, 2025
@squeed
Copy link
Contributor

squeed commented Jun 30, 2025

/test

@rtheobald
Copy link
Contributor Author

I have a fix for the integration test. I am just looking to run the conformance tests locally to figure out what is happening there.

@rtheobald rtheobald force-pushed the fix-ingress-reconcile-host-network branch from c337159 to 27b784d Compare July 3, 2025 20:23
@dylandreimerink
Copy link
Member

/test

@aanm aanm enabled auto-merge July 14, 2025 10:39
@rtheobald
Copy link
Contributor Author

For the first failing test, Cilium E2E Upgrade (ci-e2e-upgrade), it is setting this helm value underlayProtocol=ipv6 but the fatal error when bringing up Cilium agent indicates that WireGuard requires an IPv4 underlay:

2025-07-14T10:45:58.405297719Z time=2025-07-14T10:45:58.405064359Z level=fatal source=/go/src/github.com/cilium/cilium/pkg/logging/slog.go:159 msg="unable to run agent: failed to start: daemon creation failed: WireGuard requires an IPv4 underlay\nfailed to stop: unable to find controller ipcache-inject-labels" subsys=daemon

The second one, Conformance IPsec E2E (ci-ipsec-e2e), the Setup & Test (ipsec-7, 5.15-...) test seems to have been cancelled after an hour of run time. Not sure if that was scripted or an operator intervened.

The third failing test Conformance Runtime (ci-runtime) seems to be trying to run make for a target that doesn't exist in the makefile:

make: *** No rule to make target 'tests-privileged-only'. Stop. Error: Process completed with exit code 2.

@joestringer
Copy link
Member

At least some of those failures I have seen in the past when we change something in the CI on the main branch but the PR is based on an older commit. As a step forward I would suggest rebasing against the latest main, force push, then we can retrigger the tests to see if any new failures occur.

auto-merge was automatically disabled July 31, 2025 08:18

Head branch was pushed to by a user without write access

@rtheobald rtheobald force-pushed the fix-ingress-reconcile-host-network branch from 27b784d to 06eb366 Compare July 31, 2025 08:18
@joestringer
Copy link
Member

/test

@joestringer joestringer removed the needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch label Jul 31, 2025
@joestringer joestringer enabled auto-merge July 31, 2025 15:59
@joestringer joestringer added the affects/v1.17 This issue affects v1.17 branch label Jul 31, 2025
@rtheobald
Copy link
Contributor Author

Cilium Cluster Mesh upgrade (ci-clustermesh): Failed Upgrade and Downgrade Test (4, wireguard, iptables, false, 511, cluster) due to errors in the log when an lxd link for endpoint kube-system/clustermesh-apiserver-c49bff577-zrvmv could not be found after an upgrade

Conformance Cluster Mesh (ci-clustermesh): Failed installing Cilium on the second cluster because a port was already allocated on the host system:
Error: Unable to install Cilium: failed to create resource: Service "clustermesh-apiserver" is invalid: spec.ports[0].nodePort: Invalid value: 32380: provided port is already allocated

Conformance Gateway API (ci-gateway-api): Failed Gateway API Conformance Test (standard, false, ipsec) for MeshHTTPRouteMatching, MeshHTTPRouteQueryParamMatching and MeshHTTPRouteRedirectHostAndStatus as no pods were found. This had the egress gateway enabled

auto-merge was automatically disabled August 1, 2025 09:38

Head branch was pushed to by a user without write access

@rtheobald rtheobald force-pushed the fix-ingress-reconcile-host-network branch from 06eb366 to 741af69 Compare August 1, 2025 09:38
@joestringer
Copy link
Member

/test

@joestringer joestringer enabled auto-merge August 1, 2025 14:49
@joestringer
Copy link
Member

Thanks for triaging those failures. If they're affecting the tree we might want to check if there are existing CI issues filed for those so we can follow up. At a glance they seem unrelated to this PR.

Previously the ingress annotation code for the external traffic policy was returning Cluster as a default. This would cause a reconciliation error when using the host network and prevent the service from being created. For the service to work in this mode, there should be no external traffic policy.

To support this, the ingress service template has to be adjusted to check the host network flag before setting the policy

Fixes: cilium#34028

Signed-off-by: Rich Theobald <rich.theobald@gmail.com>
auto-merge was automatically disabled August 5, 2025 10:45

Head branch was pushed to by a user without write access

@rtheobald rtheobald force-pushed the fix-ingress-reconcile-host-network branch from 741af69 to 12002e7 Compare August 5, 2025 10:45
@joestringer
Copy link
Member

/test

@joestringer joestringer added dont-merge/bad-bot To prevent MLH from marking ready-to-merge. and removed dont-merge/bad-bot To prevent MLH from marking ready-to-merge. labels Aug 6, 2025
@joestringer joestringer enabled auto-merge August 6, 2025 23:28
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Aug 6, 2025
@joestringer joestringer added this pull request to the merge queue Aug 6, 2025
Merged via the queue into cilium:main with commit 0c97c19 Aug 7, 2025
68 checks passed
@YutaroHayakawa YutaroHayakawa mentioned this pull request Aug 11, 2025
10 tasks
@YutaroHayakawa YutaroHayakawa added backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Aug 11, 2025
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. labels Aug 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.17 This issue affects v1.17 branch backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

An error occurred when enabling host network mode through Helm
7 participants