
Conversation

mergify[bot] commented Jul 24, 2025

With W-ECMP, when an interface goes down, we mark the singleton nexthop as INACTIVE. We then process its dependents (the NHGs that contain this singleton nexthop) and attempt to mark them INACTIVE as well.

During this process, we compare all singleton nexthops in the nexthop group with the singleton nexthop that went down using nexthop_same() in zebra_nhg_set_valid().

However, there's a weight mismatch issue:

  • The standalone singleton nexthop has weight = 1
  • The same singleton nexthop, when part of an NHG, has weight = 255

This weight mismatch causes nexthop_same() to return FALSE, preventing a proper match; a minimal sketch of the problem follows.
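A minimal, self-contained sketch of the mismatch (a toy model with made-up field names, not FRR's actual struct nexthop from lib/nexthop.h):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the fields that matter here; this is not FRR's
 * real struct nexthop. */
struct toy_nexthop {
	uint32_t ifindex;
	uint32_t gate;    /* IPv4 gateway in host order, for brevity */
	uint8_t weight;
};

/* Models the problem: equality that also compares weight. */
static bool toy_same(const struct toy_nexthop *a, const struct toy_nexthop *b)
{
	return a->ifindex == b->ifindex && a->gate == b->gate &&
	       a->weight == b->weight;
}

/* Models the fix: equality that ignores weight. */
static bool toy_same_no_weight(const struct toy_nexthop *a,
			       const struct toy_nexthop *b)
{
	return a->ifindex == b->ifindex && a->gate == b->gate;
}

int main(void)
{
	/* Standalone singleton: weight 1 (as in NHG 76 below). */
	struct toy_nexthop singleton = { .ifindex = 46, .gate = 0x16400012, .weight = 1 };
	/* Same nexthop as a W-ECMP group member: weight 255 (as in NHG 74). */
	struct toy_nexthop member = { .ifindex = 46, .gate = 0x16400012, .weight = 255 };

	printf("weight-inclusive compare: %s\n",
	       toy_same(&singleton, &member) ? "match" : "no match");
	printf("weight-ignoring compare:  %s\n",
	       toy_same_no_weight(&singleton, &member) ? "match" : "no match");
	return 0;
}

The weight-inclusive compare reports "no match", so the group member is never flagged; the weight-ignoring compare matches, which is the behavior the fix needs.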

Testing:

Before fix:

ID: 76 (zebra)
     RefCnt: 20
     Uptime: 00:00:59
     VRF: default
     Valid, Installed
     Interface Index: 46
           via 22.64.0.18, swp43 (vrf default), weight 1          <<<< weight 1 for swp43
     Dependents: (74)
ID: 74 (zebra)
     RefCnt: 19
     Uptime: 00:00:59
     VRF: default
     Valid, Installed
     Depends: (75) (76) (77) (78)
           via 22.64.0.16, swp42 (vrf default), weight 255
           via 22.64.0.18, swp43 (vrf default), weight 255       <<<<< weight 255 for swp43
           via 22.64.0.20, swp44 (vrf default), weight 255
           via 22.64.0.22, swp45 (vrf default), weight 255

root@leaf:mgmt:~# ip link set swp43 down                  <<<<< trigger bring down swp43
root@leaf:mgmt:~#
root@leaf:mgmt:~# vtysh -c "show nexthop-group rib 74"
ID: 74 (zebra)
     RefCnt: 1 Time to Deletion: 00:02:54                        <<<< marked for deletion
     Uptime: 00:02:53
     VRF: default
     Valid, Installed
     Depends: (75) (76) (77) (78)
           via 22.64.0.16, swp42 (vrf default), weight 255
           via 22.64.0.18, swp43 (vrf default), weight 255        <<< swp43 not marked inactive (nexthop_same fails due to wt check)
           via 22.64.0.20, swp44 (vrf default), weight 255
           via 22.64.0.22, swp45 (vrf default), weight 255

After fix:

root@leaf:mgmt:/var/log/frr# vtysh -c "show nexthop-group rib 69"
ID: 69 (zebra)
     RefCnt: 19
     Uptime: 00:01:11
     VRF: default
     Valid, Installed
     Depends: (70) (71) (72)
           via 22.64.0.16, swp42 (vrf default), weight 255
           via 22.64.0.20, swp44 (vrf default), weight 255
           via 22.64.0.22, swp45 (vrf default), weight 255

root@leaf:mgmt:/var/log/frr# ip link set swp44 down                    <<< trigger bring swp44 down
root@leaf:mgmt:/var/log/frr#
root@leaf:mgmt:/var/log/frr# vtysh -c "show nexthop-group rib 69"      <<< NHG 69 not marked for deletion
ID: 69 (zebra)
     RefCnt: 19
     Uptime: 00:02:41
     VRF: default
     Valid, Installed
     Depends: (70) (71) (72)
           via 22.64.0.16, swp42 (vrf default), weight 255
           via 22.64.0.20, swp44 (vrf default) inactive, weight 255           <<< swp44 marked as inactive
           via 22.64.0.22, swp45 (vrf default), weight 255

Ticket: #


This is an automatic backport of pull request #18947 done by Mergify.

Karthikeya Venkat Muppalla added 3 commits July 24, 2025 03:33
Add a nexthop_same_no_weight() API that compares whether two nexthops are
the same while ignoring the weight. This is used in the W-ECMP case; a
sketch of one possible shape follows.
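A plausible shape for the helper, shown only as a sketch (the actual implementation added to lib/nexthop.c may differ):

bool nexthop_same_no_weight(const struct nexthop *nh1, const struct nexthop *nh2)
{
	struct nexthop tmp1, tmp2;

	if (nh1 == nh2)
		return true;
	if (!nh1 || !nh2)
		return false;

	/* Compare shallow copies with the weights normalized, so only the
	 * remaining fields decide equality. */
	tmp1 = *nh1;
	tmp2 = *nh2;
	tmp1.weight = tmp2.weight = 0;

	return nexthop_same(&tmp1, &tmp2);
}

Normalizing the weight on copies and deferring to nexthop_same() keeps all the other equality rules in one place.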

Ticket: #

Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
(cherry picked from commit 6d0224d)
…for wecmp

Issue:
------

When an interface goes down, singleton nexthops are marked INACTIVE, but
their individual entries within dependent NHGs are not marked INACTIVE,
because the weight comparison in nexthop_same() fails.

Similarly, when an interface comes up, singleton nexthops are marked
ACTIVE, but their individual entries within dependent NHGs are not marked
ACTIVE, for the same reason.

There is a weight mismatch issue:
- The standalone singleton nexthop has weight = 1
- The same singleton nexthop when part of an NHG has weight = 255
This weight mismatch causes nexthop_same() to return FALSE, preventing proper matching.

Fix:
----
Skip the weight comparison when marking dependent nexthops, so that the
individual nexthops within the nexthop group are marked correctly, by
using nexthop_same_no_weight() instead of nexthop_same(). An illustrative
call-site sketch follows.
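The change is roughly of the following shape in zebra_nhg_set_valid(); the variable names (dependent_nhe, singleton_nh, valid) and the flag handling are assumptions for illustration, not code copied from zebra/zebra_nhg.c:

struct nexthop *nexthop;

for (ALL_NEXTHOPS_PTR(&dependent_nhe->nhg, nexthop)) {
	/* Before the fix this used nexthop_same(), so the group member
	 * (weight 255) never matched the standalone singleton (weight 1). */
	if (!nexthop_same_no_weight(nexthop, singleton_nh))
		continue;

	if (valid)
		SET_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE);
	else
		UNSET_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE);
}

With the weight ignored, the member nexthop for the affected interface is found and flagged, so the dependent NHG can be updated in place instead of being deleted and recreated.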

Testing:
--------
Before fix:
-----------
root@leaf11:~# ip route | grep nhid
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 20 proto bgp metric 20
21.0.0.0/8 nhid 20 proto bgp metric 20
22.0.0.0/8 nhid 20 proto bgp metric 20
root@leaf11:~#
root@leaf11:~# ip nexthop | grep group
id 20 group 21,254/22,254/23,254/24,254 proto zebra
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 20"
ID: 20 (zebra)
     RefCnt: 6
     Uptime: 00:00:46
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::146f:98ff:fe54:d087, leaf11-eth0 (vrf default), weight 254
           via fe80::181a:f0ff:feab:f2f1, leaf11-eth1 (vrf default), weight 254
           via fe80::4804:4dff:fe6f:2dfc, leaf11-eth2 (vrf default), weight 254
           via fe80::6444:f3ff:fe4d:7360, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# ip link set leaf11-eth1 down  --> Bring down leaf11-eth1 link
root@leaf11:~#
root@leaf11:~# ip route | grep nhid         --> routes moved to nhid 39
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 39 proto bgp metric 20
21.0.0.0/8 nhid 39 proto bgp metric 20
22.0.0.0/8 nhid 39 proto bgp metric 20
root@leaf11:~#
root@leaf11:~# ip nexthop | grep group
id 20 group 21,254/23,254/24,254 proto zebra
id 39 group 21,254/23,254/24,254 proto zebra  --> new nhid 39 created
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 20"
ID: 20 (zebra)
     RefCnt: 1 Time to Deletion: 00:02:46
     Uptime: 00:01:06
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::146f:98ff:fe54:d087, leaf11-eth0 (vrf default), weight 254
           via fe80::181a:f0ff:feab:f2f1, leaf11-eth1 (vrf default), weight 254 --> even though leaf11-eth1 is down, it is not marked as inactive
           via fe80::4804:4dff:fe6f:2dfc, leaf11-eth2 (vrf default), weight 254
           via fe80::6444:f3ff:fe4d:7360, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 39"   --> new nhid 39 is created
ID: 39 (zebra)
     RefCnt: 6
     Uptime: 00:00:19
     VRF: default(No AFI)
     Nexthop Count: 3
     Valid, Installed
     Depends: (21) (23) (24)
           via fe80::146f:98ff:fe54:d087, leaf11-eth0 (vrf default), weight 254
           via fe80::4804:4dff:fe6f:2dfc, leaf11-eth2 (vrf default), weight 254
           via fe80::6444:f3ff:fe4d:7360, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# ip link set leaf11-eth1 up     ---> bring up leaf11-eth1
root@leaf11:~#
root@leaf11:~# ip route | grep nhid           ---> all routes moved from nhid 39 to nhid 20 again
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 20 proto bgp metric 20
21.0.0.0/8 nhid 20 proto bgp metric 20
22.0.0.0/8 nhid 20 proto bgp metric 20
root@leaf11:~#
root@leaf11:~# ip nexthop | grep group
id 20 group 21,254/22,254/23,254/24,254 proto zebra
id 39 group 21,254/23,254/24,254 proto zebra
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 20"
ID: 20 (zebra)
     RefCnt: 6
     Uptime: 00:00:39
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::146f:98ff:fe54:d087, leaf11-eth0 (vrf default), weight 254
           via fe80::181a:f0ff:feab:f2f1, leaf11-eth1 (vrf default), weight 254
           via fe80::4804:4dff:fe6f:2dfc, leaf11-eth2 (vrf default), weight 254
           via fe80::6444:f3ff:fe4d:7360, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 39"
ID: 39 (zebra)
     RefCnt: 1 Time to Deletion: 00:02:16       ---> nhid marked for deletion
     Uptime: 00:01:21
     VRF: default(No AFI)
     Nexthop Count: 3
     Valid, Installed
     Depends: (21) (23) (24)
           via fe80::146f:98ff:fe54:d087, leaf11-eth0 (vrf default), weight 254
           via fe80::4804:4dff:fe6f:2dfc, leaf11-eth2 (vrf default), weight 254
           via fe80::6444:f3ff:fe4d:7360, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#

After fix:
----------
root@leaf11:~# ip route show | grep nhid
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 20 proto bgp metric 20
21.0.0.0/8 nhid 20 proto bgp metric 20
22.0.0.0/8 nhid 20 proto bgp metric 20
root@leaf11:~# ip nexthop show | grep group
id 20 group 21,254/22,254/23,254/24,254 proto zebra
root@leaf11:~# vtysh -c "show nexthop-group rib 20"
ID: 20 (zebra)
     RefCnt: 6
     Uptime: 00:01:52
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::1c71:39ff:febc:5328, leaf11-eth2 (vrf default), weight 254
           via fe80::5c54:11ff:fef3:3fe3, leaf11-eth1 (vrf default), weight 254
           via fe80::8c06:3bff:fe29:752f, leaf11-eth0 (vrf default), weight 254
           via fe80::f89e:7fff:fea3:8c7, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# ip link set leaf11-eth1 down             ---> leaf11-eth1 link is down
root@leaf11:~# vtysh -c "show nexthop-group rib 20"     ---> NHG retained
ID: 20 (zebra)
     RefCnt: 6
     Uptime: 00:02:23
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::1c71:39ff:febc:5328, leaf11-eth2 (vrf default), weight 254
           via fe80::5c54:11ff:fef3:3fe3, leaf11-eth1 (vrf default) inactive, weight 254  <<< leaf11-eth1 NH is marked inactive
           via fe80::8c06:3bff:fe29:752f, leaf11-eth0 (vrf default), weight 254
           via fe80::f89e:7fff:fea3:8c7, leaf11-eth3 (vrf default), weight 254
root@leaf11:~# ip route show | grep nhid               ---> routes still pointing to the same NHG 20
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 20 proto bgp metric 20
21.0.0.0/8 nhid 20 proto bgp metric 20
22.0.0.0/8 nhid 20 proto bgp metric 20
root@leaf11:~#
root@leaf11:~# ip nexthop show | grep group            ---> no additional NHGs are created
id 20 group 21,254/23,254/24,254 proto zebra
root@leaf11:~#
root@leaf11:~# ip link set leaf11-eth1 up              ---> leaf11-eth1 link is up
root@leaf11:~#
root@leaf11:~# vtysh -c "show nexthop-group rib 20"    ---> NHG retained
ID: 20 (zebra)
     RefCnt: 6
     Uptime: 00:02:54
     VRF: default(No AFI)
     Nexthop Count: 4
     Valid, Installed
     Depends: (21) (22) (23) (24)
           via fe80::1c71:39ff:febc:5328, leaf11-eth2 (vrf default), weight 254
           via fe80::5c54:11ff:fef3:3fe3, leaf11-eth1 (vrf default), weight 254    <<< leaf11-eth1 NH is marked active
           via fe80::8c06:3bff:fe29:752f, leaf11-eth0 (vrf default), weight 254
           via fe80::f89e:7fff:fea3:8c7, leaf11-eth3 (vrf default), weight 254
root@leaf11:~#
root@leaf11:~# ip route show | grep nhid               ---> routes still pointing to the same NHG 20
11.0.0.0/8 nhid 13 dev lo proto 196 metric 20
12.0.0.0/8 nhid 20 proto bgp metric 20
21.0.0.0/8 nhid 20 proto bgp metric 20
22.0.0.0/8 nhid 20 proto bgp metric 20
root@leaf11:~#
root@leaf11:~# ip nexthop show | grep group           ---> no additional NHGs are created
id 20 group 21,254/22,254/23,254/24,254 proto zebra
root@leaf11:~#

Ticket: #

Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
(cherry picked from commit ed3aab6)

# Conflicts:
#	zebra/zebra_nhg.c
Add tests for nexthop group (NHG) stability with Weighted ECMP in
high-density CLOS topologies. The tests exercise the weight-mismatch issue
in dependent NHGs, where link-state changes previously caused unnecessary
NHG deletion and recreation.

Test topology:
- 2-layer CLOS with 2 spines and 2 leafs
- 64 parallel links per leaf (32 to each spine)
- 1000 test routes injected via sharpd

Test scenarios:
- Single link down/up
- Partial links down/up (16 out of 32 links to spine1)

Key verification points:
- NHG ID remains stable during all link state changes
- No new NHG creation during local link events
- All 1000 routes continue using same NHG ID
- No unnecessary route reprogramming occurs

Ticket: #

Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
(cherry picked from commit 5a792e9)
mergify bot commented Jul 24, 2025

Cherry-pick of ed3aab6 has failed:

On branch mergify/bp/stable/10.0/pr-18947
Your branch is ahead of 'origin/stable/10.0' by 1 commit.
  (use "git push" to publish your local commits)

You are currently cherry-picking commit ed3aab64d.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   zebra/zebra_nhg.c

no changes added to commit (use "git add" and/or "git commit -a")

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@mergify mergify bot added the conflicts label Jul 24, 2025
@frrbot frrbot bot added the libfrr, tests (Topotests, make check, etc) and zebra labels Jul 24, 2025
@Jafaral Jafaral closed this Jul 24, 2025
@Jafaral Jafaral deleted the mergify/bp/stable/10.0/pr-18947 branch July 31, 2025 15:05