Skip to content

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Nov 7, 2023

This works towards making the cilium-agent responsive to node address changes by moving from global variables and direct use of netlink.AddrList to using statedb tables that can be queried and watched for changes.

Reimplement NodeAddressing to query against Table[NodeAddress] (derived from Table[*Device] which is populated from netlink subscriptions to links, addresses and routes) and LocalNodeStore instead of using the pkg/node globals or doing a netlink.AddrList. Eventual goal is to remove NodeAddressing and instead have the modules query Table[NodeAddress] directly and reconcile if it changes.

Remove the ipv{4,6}NodePortAddrs globals and the related getters and init, replacing uses of them with either NodeAddressing or queries to Table[NodeAddress].

See commit messages for more details.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 7, 2023
@joamaki joamaki force-pushed the pr/joamaki/node-addressing-via-address-table branch from fbae7b5 to c23775a Compare November 8, 2023 12:22
@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Nov 8, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 8, 2023
@joamaki
Copy link
Contributor Author

joamaki commented Nov 8, 2023

/test

@joamaki joamaki changed the title [DRAFT] Implement NodeAddressing on top of Table[NodeAddress] Implement NodeAddressing on top of Table[NodeAddress] Nov 8, 2023
@joamaki joamaki force-pushed the pr/joamaki/node-addressing-via-address-table branch from c23775a to 9a29a49 Compare November 8, 2023 13:30
@joamaki
Copy link
Contributor Author

joamaki commented Nov 8, 2023

/test

@joamaki joamaki marked this pull request as ready for review November 8, 2023 13:58
@joamaki joamaki requested review from a team as code owners November 8, 2023 13:58
@joamaki
Copy link
Contributor Author

joamaki commented Nov 8, 2023

Assigning to @bimmlerd to push this forward while I'm on vacation.

@bimmlerd bimmlerd force-pushed the pr/joamaki/node-addressing-via-address-table branch from 9a29a49 to 7f0a827 Compare November 10, 2023 16:23
@bimmlerd
Copy link
Member

bimmlerd commented Nov 10, 2023

/test (rebased to fix conflicts)

@bimmlerd bimmlerd force-pushed the pr/joamaki/node-addressing-via-address-table branch from 7f0a827 to 917df73 Compare November 13, 2023 10:08
@bimmlerd
Copy link
Member

bimmlerd commented Nov 13, 2023

/test (legitimate failures in ci-runtime)

@bimmlerd bimmlerd force-pushed the pr/joamaki/node-addressing-via-address-table branch 3 times, most recently from 17418f9 to a7ff7df Compare November 13, 2023 12:47
@bimmlerd
Copy link
Member

bimmlerd commented Nov 13, 2023

/ci-runtime (rebased to fix conflicts, and fixed another test in ci-runtime)

@bimmlerd bimmlerd force-pushed the pr/joamaki/node-addressing-via-address-table branch from a7ff7df to c56ef0f Compare November 13, 2023 13:43
@bimmlerd
Copy link
Member

/test

As a step towards removing the global address maps in pkg/node
(ipv4NodeportAddrs, ipv4MasqAddrs etc.), reimplement NodeAddressing
as queries against the Table[NodeAddress].

The goal is to eventually remove NodeAddressing and have modules
query the Table[NodeAddress] directly and to react to changes, but
as an intermediary step without having to refactor whole bunch of
uses of NodeAddressing, this still serves the purpose of allowing
removal of some global variables and removing the other ways of
extracting the device and address information from the system.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Use a single set of methods for both IPv4 and IPv6 to remove
code duplication.

Add DirectRouting() method for looking up the direct routing device
and IP address.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Remove the ipv{4,6}NodePortAddrs global variables and associated
getters and inits from pkg/node and replace their uses with Table[NodeAddress]
and NodeAddressing (which queries from the former).

This paves the path towards being able to react to the local node's address
changes.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
@bimmlerd bimmlerd force-pushed the pr/joamaki/node-addressing-via-address-table branch from 79f9939 to 1cc056d Compare November 22, 2023 16:29
@bimmlerd
Copy link
Member

/test

@joamaki joamaki requested a review from thorn3r November 27, 2023 08:55
Copy link
Contributor

@thorn3r thorn3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 👍

Copy link
Member

@aditighag aditighag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for pkg/endpoint changes.

@aditighag aditighag requested a review from a team November 27, 2023 21:58
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Nov 28, 2023
@ysksuzuki ysksuzuki removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 28, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 28, 2023
@joamaki joamaki added this pull request to the merge queue Nov 28, 2023
Merged via the queue into cilium:main with commit 7df3194 Nov 28, 2023
@joamaki joamaki deleted the pr/joamaki/node-addressing-via-address-table branch November 28, 2023 08:47
pippolo84 added a commit to pippolo84/cilium that referenced this pull request Jun 20, 2025
This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in cilium#29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see cilium#34861).

The same unintended consequence became more frequent after
cilium#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: cilium#28953
Related: cilium#34861
Related: cilium#29033
Related: cilium#32381
Fixes: cilium#37817

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
pippolo84 added a commit to pippolo84/cilium that referenced this pull request Jun 23, 2025
This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in cilium#29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see cilium#34861).

The same unintended consequence became more frequent after
cilium#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: cilium#28953
Related: cilium#34861
Related: cilium#29033
Related: cilium#32381
Fixes: cilium#37817

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Jun 23, 2025
This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in #29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see #34861).

The same unintended consequence became more frequent after
#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Jun 23, 2025
This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in #29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see #34861).

The same unintended consequence became more frequent after
#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
viktor-kurchenko pushed a commit that referenced this pull request Jun 25, 2025
[ upstream commit dfd8f93 ]

This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in #29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see #34861).

The same unintended consequence became more frequent after
#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817

[ Backporter's notes:
  - resolved conflict in: daemon/k8s/init.go
]

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
wanglei4687 pushed a commit to wanglei4687/cilium that referenced this pull request Jun 26, 2025
This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in cilium#29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see cilium#34861).

The same unintended consequence became more frequent after
cilium#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: cilium#28953
Related: cilium#34861
Related: cilium#29033
Related: cilium#32381
Fixes: cilium#37817

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
viktor-kurchenko pushed a commit that referenced this pull request Jun 26, 2025
[ upstream commit dfd8f93 ]

This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in #29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see #34861).

The same unintended consequence became more frequent after
#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817

[ Backporter's notes:
  - resolved conflict in: daemon/k8s/init.go
]

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
github-merge-queue bot pushed a commit that referenced this pull request Jun 26, 2025
[ upstream commit dfd8f93 ]

This reverts commit 2fc52eb.

The commit was added with the intent to have an uniform behavior when
Cilium is started on a single stack cluster with ipv6 support enabled,
despite the KPR setting being enabled or disabled. Specifically, the
added check was stopping the agent startup in case of a missing IPv6
address on the node, even when KPR was disabled.

However, the IPv6 detection for NodePort, responsible for the startup
error in case of missing IPv6 address with KPR enabled, has been removed
in #29033. Therefore, there is no
reason anymore to stop the agent at startup when the IPv6 support is
enabled and the node is missing an IPv6 address.  Moreover, the commit
above led to some unintended consequences in environments where KPR is
disabled and the IPv6 address is set on the node after the Cilium agent
is started (see #34861).

The same unintended consequence became more frequent after
#32381, where we moved later in the
daemon startup process the first call to SetDefaultPrefix. That
function, among other things, was in charge of setting up a temporary
IPv6 Unique Local Address in case the node does not have one yet. Since
now the agent does this later (specifically, while configuring IPAM),
the check introduced in 2fc52eb sees the IPv6 node address as nil and
stops the agent.

Therefore, revert the check to avoid the unintended regressions.

Related: #28953
Related: #34861
Related: #29033
Related: #32381
Fixes: #37817

[ Backporter's notes:
  - resolved conflict in: daemon/k8s/init.go
]

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants