Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Nov 3, 2023

To provide modules access to the evolving set of local node's addresses,
add Table[NodeAddress] that derives from the low-level Table[*Device] and
applies the Cilium-specific heuristics to pick which addresses are considered
host IPs and which are used for NodePort and BPF masquerading.

To allow user to expand the set of addresses used for NodePort, add the
configuration flag "--nodeport-addresses" for specifying from which CIDRs
the NodePort addresses are allowed. This mirrors exactly the same kube-proxy
flag. If user does not specify this, then the default behavior remains, which
is to pick the first IPv4 and IPv6 address of each native device.

// NodeAddress is an IP address assigned to a network interface on a Cilium node                                                                                                               
 // that is considered a "host" IP address.                                                                                                                                                     
 type NodeAddress struct {                                                                                                                                                                      
         Addr netip.Addr                                                                                                                                                                        
                                                                                                                                                                                                
         // NodePort is true if this address is to be used for NodePort.                                                                                                                        
         // If --nodeport-addresses is set, then all addresses on native                                                                                                                        
         // devices that are contained within the specified CIDRs are chosen.                                                                                                                   
         // If it is not set, then only the primary IPv4 and/or IPv6 address                                                                                                                    
         // of each native device is used.                                                                                                                                                      
         NodePort bool                                                                                                                                                                          
                                                                                                                                                                                                
         // Primary is true if this is the primary IPv4 or IPv6 address of this device.                                                                                                         
         // This is mainly used to pick the address for BPF masquerading.                                                                                                                       
         Primary bool                                                                                                                                                                           
                                                                                                                                                                                                
         // DeviceName is the name of the network device from which this address                                                                                                                
         // is derived from.                                                                                                                                                                    
         DeviceName string                                                                                                                                                                      
 } 

This PR is one of several that aims to replace the global maps in pkg/node/address.go containing the NodePort and BPF masquerade addresses with a table that can be watched for changes and allow dynamic reconfiguration when the addresses
change. The next PR will adapt NodeAddressing to access Table[NodeAddress] and removes the global maps from pkg/node.
Further PRs down the line will switch from NodeAddressing to Table[NodeAddress] and start reconciling on changes.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 3, 2023
@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Nov 3, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 3, 2023
@joamaki joamaki force-pushed the pr/joamaki/node-address-table branch from 1483f90 to cc0f561 Compare November 3, 2023 08:25
@joamaki joamaki marked this pull request as ready for review November 3, 2023 09:47
@joamaki joamaki requested review from a team as code owners November 3, 2023 09:47
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bug when there's a primary address with a scope > scopeMax, followed by secondary addresses which have smaller scope. But I might just be missing context on what that scope means and that might be an impossible case.

Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, I think having a table for this will be useful. I'm neither an expert on statedb, nor on node addressing

@joamaki joamaki force-pushed the pr/joamaki/node-address-table branch from cc0f561 to 5b6d68b Compare November 6, 2023 15:42
@joamaki joamaki requested review from bimmlerd and gandro November 6, 2023 15:45
Copy link
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (as mentioned before however, I'm a low-confidence reviewer both for statedb and node addresses)

@joamaki joamaki force-pushed the pr/joamaki/node-address-table branch from 5b6d68b to fb34639 Compare November 7, 2023 07:45
@joamaki
Copy link
Contributor Author

joamaki commented Nov 7, 2023

/test

@joamaki joamaki requested a review from gentoo-root November 7, 2023 08:21
@joamaki joamaki force-pushed the pr/joamaki/node-address-table branch from fb34639 to ea9d9b2 Compare November 7, 2023 08:38
Add support for "*cidr.CIDR" and "[]*cidr.CIDR" in config
cells:

  type ExampleConfig {
    ExampleCIDRs []*cidr.CIDR
  }

  func (def ExampleConfig) Flags(flags *pflag.FlagSet) {
    flags.StringSlice("example-cidrs", def.ExampleCIDRs, "Set the example CIDRs")
  }
  // --example-cidrs=1.2.3.4/24,2001:db8::/64

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add 'Secondary' to 'DeviceAddress' struct to allow readers to sort
the addresses by scope & secondary when picking relevant addresses.
'Secondary' is the same as IF_A_SECONDARY.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
To provide modules access to the evolving set of local node's addresses,
add Table[NodeAddress] that derives from the low-level Table[*Device] and
applies the Cilium-specific heuristics to pick which addresses are considered
host IPs and which are used for NodePort and BPF masquerading.

To allow user to expand the set of addresses used for NodePort, add the
configuration flag "--nodeport-addresses" for specifying from which CIDRs
the NodePort addresses are allowed. This mirrors exactly the same kube-proxy
flag. If user does not specify this, then the default behavior remains, which
is to pick the first IPv4 and IPv6 address of each native device.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
The addition of the check to see if routes exists without corresponding
device to the devices controller tests caused it to stop the test too
early before the test case had passed.

As the tests were faulty they didn't catch the bug where the 'Selected'
status wasn't rechecked and thus the 'veth-with-default-route' test
passed when it shouldn't have. Fix this by always checking if selected
status updates.

Fixes: 60fd85a ("devices: Flush routes on device delete")
Signed-off-by: Jussi Maki <jussi@isovalent.com>
@joamaki joamaki force-pushed the pr/joamaki/node-address-table branch from ea9d9b2 to 96e1022 Compare November 7, 2023 11:47
Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now! I think the netip stuff would be nice, but won't block the PR on it.

@joamaki
Copy link
Contributor Author

joamaki commented Nov 8, 2023

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 8, 2023
@joamaki joamaki merged commit e128789 into cilium:main Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants