Skip to content

Conversation

PhilipSchmid
Copy link
Contributor

@PhilipSchmid PhilipSchmid commented Jun 4, 2024

This PR adds support for configuring the externalTrafficPolicy Kubernetes Service spec field of the Cilium Ingress/GatewayAPI LoadBalancer/NodePort Service.

  • Added configuration option to explicitly configure the externalTrafficPolicy for the Cilium Ingress Kubernetes Service.
  • Added externalTrafficPolicy support for dedicated Cilium Ingress instances. Configurable via new ingress.cilium.io/service-external-traffic-policy Ingress annotation.
  • Added externalTrafficPolicy support for Cilium GatewayAPI. eTP is globally configurable via the gatewayAPI.externalTrafficPolicy Helm flag.
externalTrafficPolicy support for Cilium Ingress and GatewayAPI

If possible, let's please try to merge this before the 1.16 feature freeze 🤞.

Open questions based on the outcome of manual E2E testing:

  • Changing ingressController.service.externalTrafficPolicy=Cluster to ingressController.service.externalTrafficPolicy=Local and re-running make kind-install-cilium causes the Helm deployment to fail. It looks like it's just a make kind-install-cilium "issue" as it runs cilium install in the background, and if I manually run cilium upgrade with the same parameters, it works. Is this a known behavior? Just double-checking it's not something related to this change.
  • Setting an invalid ingress.cilium.io/service-external-traffic-policy value on an ingress.cilium.io/loadbalancer-mode=dedicated Ingress resource "spams" the Cilium Operator logs with multiple (~10) Reconciler error messages. Even after the initial apply, we'll see such an error every other few seconds/minutes. Is there any need to mitigate this? I don't think so, but just checking.
time="2024-06-04T16:16:39Z" level=info msg="Reconciling Ingress" controller=ingress resource=default/basic-ingress subsys=ingress
time="2024-06-04T16:16:39Z" level=debug msg="Updating dedicated resources" controller=ingress resource=default/basic-ingress subsys=ingress
time="2024-06-04T16:16:39Z" level=error msg="failed to get externalTrafficPolicy annotation from Ingress object" controller=ingress error="failed to parse externalTrafficPolicy \"invalid\"" resource=default/basic-ingress subsys=ingress
time="2024-06-04T16:16:39Z" level=error msg="Reconciler error" Ingress="{basic-ingress default}" controller=ingress controllerGroup=networking.k8s.io controllerKind=Ingress error="failed to build dedicated resources: failed to parse externalTrafficPolicy \"invalid\"" name=basic-ingress namespace=default reconcileID="\"e66ea506-b349-4391-b846-aad5ff30c2f9\"" subsys=controller-runtime
  • Passing an invalid gatewayAPI.externalTrafficPolicy value causes the Cilium Operator to log an error and fail. Does this behavior make sense? I think so, but I'd like to hear other folks' opinions on that.
time="2024-06-04T14:50:45Z" level=info msg="Cilium Operator 1.16.0-dev eb7b59e954 2024-06-03T16:38:07+02:00 go version go1.22.3 linux/arm64" subsys=cilium-operator-generic
time="2024-06-04T14:50:45Z" level=fatal msg="failed to start: invalid externalTrafficPolicy: invalid-value" subsys=cilium-operator-generic
time=2024-06-04T14:50:45Z level=error msg="Invoke failed" error="invalid externalTrafficPolicy: invalid-value" function="gateway-api.initGatewayAPIController (pkg/gateway-api/cell.go:106)"
time=2024-06-04T14:50:45Z level=info msg=Stopping
  • It looks like when changing the GwAPI eTP and doing a helm upgrade, the eTP on the actual Gateway LB Services is not updated on demand (at least for <pending> LB services not). A Cilium Operator restart fixes this and patches the new eTP in the Service.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 4, 2024
@PhilipSchmid PhilipSchmid changed the title Pr/philip/etp for ingress and gwapi externalTrafficPolicy configuration support for Ingress/GwAPI K8s Service Jun 4, 2024
@PhilipSchmid PhilipSchmid force-pushed the pr/philip/etp_for_ingress_and_gwapi branch 6 times, most recently from eb7b59e to 0879c41 Compare June 4, 2024 16:18
@mhofstetter
Copy link
Member

/test

@mhofstetter mhofstetter marked this pull request as ready for review June 4, 2024 16:51
@mhofstetter mhofstetter requested review from a team as code owners June 4, 2024 16:51
Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @PhilipSchmid 🎉

I added some comments inline.

It looks like when changing the GwAPI eTP and doing a helm upgrade, the eTP on the actual Gateway LB Services is not updated on demand (at least for LB services not). A Cilium Operator restart fixes this and patches the new eTP in the Service.

Does the Operator gets automatically restarted during the Helm Upgrade (after changing the Helm values)? If not, this would be the issue. (That's why it's working after a manual Operator restart)

Changing ingressController.service.externalTrafficPolicy=Cluster to ingressController.service.externalTrafficPolicy=Local and re-running make kind-install-cilium causes the Helm deployment to fail. It looks like it's just a make kind-install-cilium "issue" as it runs cilium install in the background, and if I manually run cilium upgrade with the same parameters, it works. Is this a known behavior? Just double-checking it's not something related to this change.

Probably not related to this PR 🤞

-> I tried to address the other questions in my comments inline

@mhofstetter mhofstetter added kind/feature This introduces new functionality. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. area/servicemesh GH issues or PRs regarding servicemesh feature/k8s-gateway-api feature/k8s-ingress labels Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jun 5, 2024
@PhilipSchmid PhilipSchmid force-pushed the pr/philip/etp_for_ingress_and_gwapi branch from 0879c41 to ecef14f Compare June 6, 2024 09:55
@PhilipSchmid
Copy link
Contributor Author

Rebased to main

@PhilipSchmid PhilipSchmid force-pushed the pr/philip/etp_for_ingress_and_gwapi branch from ecef14f to 7211c32 Compare June 6, 2024 11:48
@PhilipSchmid
Copy link
Contributor Author

Thanks for the feedback, @mhofstetter. I just implemented it and force-pushed the fixed commits.

Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helm looks good to me (one minor nit)

@PhilipSchmid PhilipSchmid force-pushed the pr/philip/etp_for_ingress_and_gwapi branch from 7211c32 to 55b69fc Compare June 6, 2024 12:05
@PhilipSchmid PhilipSchmid requested a review from mhofstetter June 6, 2024 12:06
@PhilipSchmid
Copy link
Contributor Author

@mhofstetter Regarding this 👇

It looks like when changing the GwAPI eTP and doing a helm upgrade, the eTP on the actual Gateway LB Services is not updated on demand (at least for LB services not). A Cilium Operator restart fixes this and patches the new eTP in the Service.

Does the Operator gets automatically restarted during the Helm Upgrade (after changing the Helm values)? If not, this would be the issue. (That's why it's working after a manual Operator restart)

No, the Ciliium Operator isn't automatically restarted, as it's only a kube-system/cilium-config CM change. However, the same basically also applies to all of the following GwAPI-related flags:

{{- if .Values.gatewayAPI.enabled }}
enable-gateway-api: "true"
enable-gateway-api-secrets-sync: {{ .Values.gatewayAPI.secretsNamespace.sync | quote }}
enable-gateway-api-proxy-protocol: {{ .Values.gatewayAPI.enableProxyProtocol | quote }}
enable-gateway-api-app-protocol: {{ or .Values.gatewayAPI.enableAppProtocol .Values.gatewayAPI.enableAlpn | quote }}
enable-gateway-api-alpn: {{ .Values.gatewayAPI.enableAlpn | quote }}
gateway-api-xff-num-trusted-hops: {{ .Values.gatewayAPI.xffNumTrustedHops | quote }}
gateway-api-secrets-namespace: {{ .Values.gatewayAPI.secretsNamespace.name | quote }}
gateway-api-hostnetwork-enabled: {{ .Values.gatewayAPI.hostNetwork.enabled | quote }}
gateway-api-hostnetwork-nodelabelselector: {{ include "mapToString" .Values.gatewayAPI.hostNetwork.nodes.matchLabels | quote }}
{{- end }}

For example, if I change .Values.gatewayAPI.enabled from true to false or vice-versa, the Cilium Operator also doesn't get restarted. Hence, if we would like to address this, we would need a generic solution for all "kube-system/cilium-config CM only" values.

@PhilipSchmid
Copy link
Contributor Author

I did some manual E2E testing, and it looks good, but I couldn't test it 100%. The main reason for that is that a GwAPI bug was introduced with the commits of #32744 because with the commit right before it still worked. GwAPI engineers are aware of this and provide a fix anytime soon. However, this should not impact or delay this PR.

Detailed Instructions for GwAPI on AKS BYOCNI E2E Testing

GwAPI on AKS (BYOCNI) E2E Testing

## Cluster Setup
Kube-Proxy Replacement prerequisites:

az extension add --name aks-preview
az extension update --name aks-preview

az feature register --namespace "Microsoft.ContainerService" --name "KubeProxyConfigurationPreview"
az provider register --namespace Microsoft.ContainerService
set -gx CLUSTERNAME "etp-test-philip"
set -gx RESOURCEGROUP "etp-test-philip"
set -gx LOCATION "eastus2"

az group create --name $RESOURCEGROUP --location $LOCATION

echo >kube-proxy.json '{
  "enabled": false,
  "mode": "IPVS",
  "ipvsConfig": {
      "scheduler": "LeastConnection",
      "TCPTimeoutSeconds": 900,
      "TCPFINTimeoutSeconds": 120,
      "UDPTimeoutSeconds": 300
  }
}'

az aks create --name $CLUSTERNAME --resource-group $RESOURCEGROUP \
    --network-plugin none \
    --kube-proxy-config kube-proxy.json \
    --kubernetes-version 1.29

az aks get-credentials -g $RESOURCEGROUP --name $CLUSTERNAME

kubectl get nodes -o wide

Cilium Installation

Install GwAPI CRDs:

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/experimental/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.0.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

Install Cilium:

echo >etp-test-values.yaml "\
image:
  override: "quay.io/cilium/cilium-ci:0879c41b0c952972f781af4c44c14b78b8e07f61@sha256:57d4325ab70d2e6b4de6bc17b17d3af27693b26adf5dd8933e00b403cfdae382"
operator:
  image:
    override: "quay.io/cilium/operator-generic-ci:0879c41b0c952972f781af4c44c14b78b8e07f61@sha256:e0485aecc65d3f0eef8fa67030104a5f5a0cb2d1f2c95f723cd14423433390bf"

kubeProxyReplacement: \"true\"
k8sServiceHost: <TBD>
k8sServicePort: 443

ingressController:
  enabled: true
  loadbalancerMode: shared
  service:
    externalTrafficPolicy: Local
gatewayAPI:
  enabled: true
  externalTrafficPolicy: Local"

cilium install --datapath-mode aks-byocni \
  --chart-directory=$HOME/git/github.com/philipschmid/cilium/install/kubernetes/cilium \
  --helm-values=etp-test-values.yaml

# Wait a few seconds
cilium status

Workload

Test infra workload:

---
apiVersion: v1
kind: Namespace
metadata:
  name: infra
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra
spec:
  gatewayClassName: cilium
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: All

Get the LB IP/name (wait a few seconds):

set -gx LB_IP (kubectl get services \
  --namespace infra \
  cilium-gateway-shared-gateway \
  --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $LB_IP
4.153.48.103 # <- IP in my test case

Test workload (replace LB_IP within the HTTPRoute, in my example with 4.153.48.103):

---
apiVersion: v1
kind: Namespace
metadata:
  name: workload
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-app-deployment
  namespace: workload
  labels:
    app: echo-app
spec:
  replicas: 3 # 3 replicas because my testing AKS cluster has 3 nodes
  selector:
    matchLabels:
      app: echo-app
  template:
    metadata:
      labels:
        app: echo-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - echo-app
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: echo-app
        image: ghcr.io/philipschmid/echo-app:main
        ports:
        - containerPort: 8080
        env:
        - name: PRINT_HTTP_REQUEST_HEADERS
          value: "true"
---
apiVersion: v1
kind: Service
metadata:
  name: echo-app-service
  namespace: workload
  labels:
    app: echo-app
spec:
  selector:
    app: echo-app
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
  type: ClusterIP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: workload
  namespace: workload
spec:
  parentRefs:
  - name: shared-gateway
    namespace: infra
  hostnames:
    - gwapi.$LB_IP.sslip.io # gwapi.4.153.48.103.sslip.io
  rules:
  - backendRefs:
    - name: echo-app-service
      port: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: workload
  namespace: workload
spec:
  ingressClassName: cilium
  rules:
  - host: ingress.$LB_IP.sslip.io # ingress.4.153.48.103.sslip.io
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: echo-app-service
            port:
              number: 8080

Testing

Test GwAPI:

curl http://gwapi.4.153.48.103.sslip.io

Test Ingress:

curl http://ingress.4.153.48.103.sslip.io

Clean Up

az aks delete --name $CLUSTERNAME --resource-group $RESOURCEGROUP
az group delete --name $RESOURCEGROUP

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc changes look good, thanks!

A few (non-blocking) observations on the commit formatting:

  • 3rd commit description has a typo: globaly -> globally
  • 3rd commit title (gwapi: eTP support for GwAPI) is also not super easy to read because of the many abbreviations, I'd consider expanding at least one of the two and using consistent case on gwapi/GwAPI if you keep it abbreviated
  • We usually wrap commit descriptions on 72-character wide lines (see most commit descriptions in git log)
  • On your way to become the first @cisco.com commit author! 😎

Added configuration option to explicitly configure the
externalTrafficPolicy for the Cilium Ingress Kubernetes Service.

Signed-off-by: Philip Schmid <phisch@cisco.com>
Added externalTrafficPolicy support for dedicated Cilium Ingress
instances. Configurable via new
`ingress.cilium.io/service-external-traffic-policy` Ingress annotation.

Signed-off-by: Philip Schmid <phisch@cisco.com>
Added externalTrafficPolicy (eTP) support for Cilium GatewayAPI.
eTP is globally configurable via `gatewayAPI.externalTrafficPolicy`
Helm flag.

Signed-off-by: Philip Schmid <phisch@cisco.com>
@PhilipSchmid PhilipSchmid force-pushed the pr/philip/etp_for_ingress_and_gwapi branch from 55b69fc to a7c3bcf Compare June 7, 2024 12:00
@PhilipSchmid
Copy link
Contributor Author

Thanks a lot for the input, @qmonnet! I just reworded the commit messages and force-pushed once again.

@qmonnet
Copy link
Member

qmonnet commented Jun 7, 2024

Great, thanks!

/test

Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - great work @PhilipSchmid 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects/v1.15 This issue affects v1.15 branch area/servicemesh GH issues or PRs regarding servicemesh feature/k8s-gateway-api feature/k8s-ingress kind/feature This introduces new functionality. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants