BGPv2: CiliumBGPAdvertisement ignores overlapping matches #35721

@dswaffordcw

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.16.0 and lower than v1.17.0

What happened?

When configuring CiliumBGPAdvertisement with overlapping selector-based matches, only the last match in sequence is used and all earlier matches are silently ignored. No errors are reported. Based on https://github.com/cilium/cilium/blob/main/pkg/bgpv1/manager/reconcilerv2/service.go#L295 and my own log messages, overlapping matches are carried through to this loop, where each match is overwritten by the next.

I believe an assumption was made that the administrator would only define a single match for each combination of advertisementType and selector. The lack of an error message leads me to believe that this is a use case that wasn't considered, rather than one that is intentionally unsupported.

As a cloud operator, I have a use case that would ideally be served by setting a dynamic combination of BGP communities based on the workload deployed. As an example, consider:

  • Matching the label "customer" which sets the BGP Community 100:100.
  • Matching the label "customer-vpc" which sets the BGP Community 110:110.
  • Matching the label "backbone" which specifies the BGP Community 200:200.

In this example, all advertisements from the customer's workload include at minimum BGP Community 100:100. Some advertisements may contain BGP Communities 100:100 and 110:110, while others may contain 100:100 and 200:200.
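As a rough sketch, the three rules above might be written as follows if overlapping matches were permitted. The label keys come from the list above. The `Exists` operator and the ClusterIP address type are assumptions chosen for illustration, and this fragment has not been tested.

```yaml
# Hypothetical fragment of CiliumBGPAdvertisement.spec.advertisements.
# Assumes Services carry the labels "customer", "customer-vpc", and "backbone".
- advertisementType: "Service"
  service:
    addresses:
    - ClusterIP
  selector:
    matchExpressions:
    - {key: customer, operator: Exists}
  attributes:
    communities:
      standard: [ "100:100" ]
- advertisementType: "Service"
  service:
    addresses:
    - ClusterIP
  selector:
    matchExpressions:
    - {key: customer-vpc, operator: Exists}
  attributes:
    communities:
      standard: [ "110:110" ]
- advertisementType: "Service"
  service:
    addresses:
    - ClusterIP
  selector:
    matchExpressions:
    - {key: backbone, operator: Exists}
  attributes:
    communities:
      standard: [ "200:200" ]
```

A Service labeled with both "customer" and "customer-vpc" matches the first two entries. The desired result is an advertisement carrying 100:100 and 110:110, whereas the behavior described above keeps only the community from the last matching entry.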

I would like to modify Cilium to permit overlapping matches for CiliumBGPAdvertisement.advertisements with the following logic:

  • When overlapping entries define communities (standard, wellKnown, or large), treat them as an append operation. All matches within the same type (standard vs. large) are combined.
  • When overlapping entries define localPreference, whether alone or alongside communities, and their localPreference values differ, throw an error and stop processing the CR.

Example:

    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "101:101" ]
    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "202:202" ]

Under the proposed logic, Cilium would process this as if it were:

    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "101:101", "202:202" ]
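
For the localPreference rule, a conflicting pair of entries might look like the sketch below. The values 100 and 200 are made up for illustration; the selector is reused from the example above.

```yaml
# Hypothetical: both entries select the same Services but set different
# localPreference values. Under the proposed logic this would be rejected
# with an error instead of silently keeping the last entry.
- advertisementType: "Service"
  service:
    addresses:
    - ClusterIP
  selector:
    matchExpressions:
    - {key: somekey, operator: NotIn, values: ['never-used-value']}
  attributes:
    localPreference: 100
- advertisementType: "Service"
  service:
    addresses:
    - ClusterIP
  selector:
    matchExpressions:
    - {key: somekey, operator: NotIn, values: ['never-used-value']}
  attributes:
    localPreference: 200
```

If both entries set the same localPreference value, there would be no conflict and processing would continue.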

How can we reproduce the issue?

Cilium Configuration:

---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-cp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-control-plane
  bgpInstances:
  - name: "control-plane"
    localASN: 65001
    peers:
    - name: "frr"
      peerASN: 65000
      peerAddress: 10.0.1.1
      peerConfigRef:
        name: "cilium-cp-frr-peer"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-worker
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-worker
  bgpInstances:
  - name: "worker"
    localASN: 65002
    peers:
    - name: "frr"
      peerASN: 65000
      peerAddress: 10.0.2.1
      peerConfigRef:
        name: "cilium-cp-frr-peer"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: "cilium-cp-frr-peer"
spec:
  gracefulRestart:
    enabled: true
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: "bgp-advertisements"
  labels:
    advertise: "bgp"
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "101:101" ]
    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "202:202" ]
    - advertisementType: "Service"
      service:
        addresses:
        - ClusterIP
      selector:
        matchExpressions:
        - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "303:303" ]
---

Cilium Version

$  cilium version
cilium-cli: v0.16.7-40-g9316d0ac compiled with go1.22.2 on linux/amd64
cilium image (default): v1.15.5
cilium image (stable): v1.16.3
cilium image (running): 1.17.0-dev

I am running a local build of Cilium using the BGP Control Plane's Kind dev environment:

git log
commit d32ff87a6aa69b64f027fae40083d9148d283d1e (HEAD -> dswaffordcw/bgp-adverts-additive, upstream/main, main)
Author: André Martins <andre@cilium.io>
Date:   Thu Sep 12 11:07:02 2024 +0200

Kernel Version

Linux hostname 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

$  k version
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.15-enhanced-describe-dirty", GitCommit:"ac2e2baa7d4039cc4c68f2e869e4edbe2d60b305", GitTreeState:"dirty", BuildDate:"2023-03-02T00:33:46Z", GoVersion:"go1.20.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.0", GitCommit:"7c48c2bd72b9bf5c44d21d7338cc7bea77d0ad2a", GitTreeState:"clean", BuildDate:"2024-05-13T22:00:36Z", GoVersion:"go1.22.2", Compiler:"gc", Platform:"linux/amd64"}

Regression

No response

Sysdump

No response

Relevant log output

FRR (from BGP Control Plane's Dev Environment)

router0# show ip bgp sum

IPv4 Unicast Summary (VRF default):
BGP router identifier 10.0.0.1, local AS number 65000 vrf-id 0
BGP table version 3
RIB entries 5, using 960 bytes of memory
Peers 2, using 1434 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
10.0.1.2        4          0         0         0        0    0    0    never       Active        0 N/A
10.0.2.2        4      65002         4         4        0    0    0 00:00:01            3        3 N/A

Total number of neighbors 2
router0# show ip bgp
BGP table version is 6, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 65000
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*= 10.2.0.1/32      10.0.1.2                               0 65001 i
*>                  10.0.2.2                               0 65002 i
*= 10.2.0.10/32     10.0.1.2                               0 65001 i
*>                  10.0.2.2                               0 65002 i
*= 10.2.236.92/32   10.0.1.2                               0 65001 i
*>                  10.0.2.2                               0 65002 i

Displayed  3 routes and 6 total paths
router0# show ip bgp 10.2.0.1
BGP routing table entry for 10.2.0.1/32, version 5
Paths: (2 available, best #2, table default)
  Advertised to non peer-group peers:
  10.0.1.2 10.0.2.2
  65001
    10.0.1.2 from 10.0.1.2 (10.0.1.2)
      Origin IGP, valid, external, multipath
      Community: 303:303
      Last update: Mon Nov  4 00:04:01 2024
  65002
    10.0.2.2 from 10.0.2.2 (10.0.2.2)
      Origin IGP, valid, external, multipath, best (Older Path)
      Community: 303:303
      Last update: Mon Nov  4 00:03:59 2024

Anything else?

No response

Labels

  • area/bgp: Impacts the Border Gateway Protocol feature.
  • area/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • info-completed: The GH issue has received a reply from the author.
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, eg via Slack.