Skip to content

Conversation

chiragshah6
Copy link
Member

Currently when upper layer protocols like ospf and static route sends the same route with different nexthops sequentially,
zebra creates two different route_entry (re) and two distinct nhgs.
First the static route and corresponding selected and installed, subsequently, the ospf route is installed
Once zebra rib selects the ospf as best the static as non best, then the static route is not uninstalled from dplane.
When networking restart happens at that point only the best NHG is installed.
If user does query of nexthop-group nhe state compare prior to and after networking restart, the behavior is not symmetric.
To make consistent, when zebra rib selects the best nhg to installed, adding a bit of logic to uninstall old route_entry's nhg if it is not selected as best by any of the route_entries then uninstall the nhg from kernel.
This requires to keep best installed re's count in nhg struct.

Initial route points to static, route and NHG 34 is installed:

DUT# show ip route nexthop-group

   S>* 20.8.8.0/30 [115/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:26:31
OSFPv2 come up installs the route, route and new NHG 48 is installed:

DUT# show ip route nexthop-group
    B>* 20.6.6.0/30 [20/0] (37) via 20.3.3.2, swp1s2, weight 1, 00:29:24
    S   20.6.6.0/30 [75/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:29:30
    O>* 20.8.8.0/30 [110/8] (48) via 20.1.1.1, swp1s0, weight 1, 00:01:48
    S   20.8.8.0/30 [115/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:29:30

Old NHG 34 is not removed kernel:

   DUT# show nexthop-group rib 34
   ID: 34 (zebra)
        RefCnt: 2
        Uptime: 00:33:27
        VRF: default
        Valid, Installed    <<<<<  installed
        Interface Index: 37
              via 20.7.7.2, swp31s0 (vrf default), weight

After fix:
NHG 34 is retained in zebra from unistalled from dplane
DUT# show nexthop-group rib 34

    ID: 34 (zebra)
         RefCnt: 2
         Uptime: 00:27:06
         VRF: default
         Valid
         Interface Index: 37
               via 20.7.7.2, swp31s0 (vrf default), weight 1

Signed-off-by: Chirag Shah chirag@nvidia.com

@mjstapp
Copy link
Contributor

mjstapp commented May 27, 2025

hmm - I'd like to hear more about this:
"Once zebra rib selects the ospf as best the static as non best, then the static route is not uninstalled from dplane."

that seems ... unexpected, I'd expect that the route "update" code was uninstalling the non-best route

@mjstapp mjstapp self-requested a review May 27, 2025 13:23
@chiragshah6
Copy link
Member Author

chiragshah6 commented May 27, 2025

hmm - I'd like to hear more about this: "Once zebra rib selects the ospf as best the static as non best, then the static route is not uninstalled from dplane."

that seems ... unexpected, I'd expect that the route "update" code was uninstalling the non-best route

@mjstapp When an existing route's re (lets say static in the example captured in commit message) becomes non-best, its associated re becomes non best but nhe refcount is greater than 0 so it won't explicitly uninstall.
zebra_nhg_uninstall_kernel -> dplane_nexthop_delete (which uninstalls) . This is the code path followed but given refcount is non zero, the dplane_nexthop_delete won't be called.
I have captured detail explanation in commit/PR with example.

@mjstapp
Copy link
Contributor

mjstapp commented May 27, 2025

ah, ok - it's not the "static route" that isn't uninstalled, it's the nhg(s) used by the static route.

@mjstapp When an existing route's re (lets say static in the example captured in commit message) becomes non-best, its associated re becomes non best but nhe refcount is greater than 0 so it won't explicitly uninstall. zebra_nhg_uninstall_kernel -> dplane_nexthop_delete (which uninstalls) . This is the code path followed but given refcount is non zero, the dplane_nexthop_delete won't be called. I have captured detail explanation in commit/PR with example.

@chiragshah6
Copy link
Member Author

chiragshah6 commented May 27, 2025

ah, ok - it's not the "static route" that isn't uninstalled, it's the nhg(s) used by the static route.

@mjstapp When an existing route's re (lets say static in the example captured in commit message) becomes non-best, its associated re becomes non best but nhe refcount is greater than 0 so it won't explicitly uninstall. zebra_nhg_uninstall_kernel -> dplane_nexthop_delete (which uninstalls) . This is the code path followed but given refcount is non zero, the dplane_nexthop_delete won't be called. I have captured detail explanation in commit/PR with example.

That's Correct! Any of the upper layer protocol's re (route_entry) pointing to a particular NHG. If all the re become non-selected then uninstall NHG from kernel but retain in zebra as NHG is being reference by non selected res.

@chiragshah6 chiragshah6 requested a review from riw777 May 30, 2025 12:29
@chiragshah6 chiragshah6 force-pushed the zebra_nhg1 branch 2 times, most recently from 60373e2 to 50937a9 Compare May 30, 2025 16:15
@chiragshah6 chiragshah6 force-pushed the zebra_nhg1 branch 3 times, most recently from 04d2e1d to 9d10f36 Compare June 9, 2025 03:01
@chiragshah6
Copy link
Member Author

@mjstapp / @riw777 kindly please help review and let me know if further query. I have addressed all review comments and concerns.

Copy link
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that this is something we should do. Adding yet-another refcount in this area just seems awkward. Is there some functional issue that the existing behavior leads to that needs to be fixed? Is there just some cosmetic concern?

@@ -2121,6 +2124,8 @@ static void rib_process_result(struct zebra_dplane_ctx *ctx)
if (re) {
UNSET_FLAG(re->status, ROUTE_ENTRY_FAILED);
SET_FLAG(re->status, ROUTE_ENTRY_INSTALLED);
/* Increment the installed re count for the nhe */
re->nhe->installed_re_count++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no no - we really can't reach-around into the nhe/nhg struct from the rib module.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason? can you elaborate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a PR #18722 which also introduces nhe->re pointer and taking action to reprogram routes if certain nhg failed to install upon interface flap.

Please elaborate your thought process.

@@ -3526,7 +3526,11 @@ void zebra_nhg_uninstall_kernel(struct nhg_hash_entry *nhe)
}
}

zebra_nhg_handle_uninstall(nhe);
/* do not call delete of the NHG if there are re's referencing it. */
if (free_nhg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ... the logic here just seems incorrect: this code places the decision-making for the fate of the nhg object entirely outside that object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the free_nhg ensures to not call free but rather simply uninstalled. Given the current code is doing uninstall and free in same function. In my opinion that can be re organized such uninstalled can be independent function and free should be called independently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐴

if (free_nhg)
zebra_nhg_handle_uninstall(nhe);
else
zlog_debug("%s skip nhe %d free as there are re referencing it", __func__, nhe->id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unprotected debug

@@ -3553,6 +3557,15 @@ void zebra_nhg_dplane_result(struct zebra_dplane_ctx *ctx)
"Failed to uninstall Nexthop ID (%u) from the kernel",
id);

nhe = zebra_nhg_lookup_id(id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the relationship between this new code and the specific problem of adding another refcount to nhgs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have explained before, the objective is to uninstalled the nexthop group which is not selected as best by any of the route.
So when NHG was triggered uninstalled, the dplane result has to unset the installed flag.
Please follow the code from install/uninstall, its a seamless change to mark uninstalled in dplan result for delete (uninstalled) request.

@chiragshah6 chiragshah6 requested a review from mjstapp June 9, 2025 17:25
@chiragshah6
Copy link
Member Author

I'm not convinced that this is something we should do. Adding yet-another refcount in this area just seems awkward. Is there some functional issue that the existing behavior leads to that needs to be fixed? Is there just some cosmetic concern?

@mjstapp there isn't any functional issue, I have tried to explained the rational using example in the commit message.
There is a NHG left in dplane/kernel which is not used by any route. That is the problem we are trying to solve. The existing code does not handle it. Though one can say its a corner case but it is a bug why NHG should remained installed in kernel when Zebra decided to switch its best re to another NHG.
Please take a look at the scenario outlined should be distinctly identify the issue. What other way you are thinking of solving this corner case? Kindly suggest.

@chiragshah6 chiragshah6 force-pushed the zebra_nhg1 branch 2 times, most recently from 76a603d to edc10cf Compare June 10, 2025 03:57
Currently when upper layer protocols like ospf and static route
sends the same route with different nexthops sequentially,
zebra creates two different route_entry (re) and two distinct nhgs.
First the static route and corresponding selected and installed,
subsequently, the ospf route is installed
Once zebra rib selects the ospf as best the static as non best,
then the static route is not uninstalled from dplane.
When networking restart happens at that point only the best NHG is installed.
If user does query of nexthop-group nhe state compare prior to and after
networking restart, the behavior is not symmetric.
To make consistent, when zebra rib selects the best nhg to installed,
adding a bit of logic to uninstall old route_entry's nhg if it is not selected
as best by any of the route_entries then uninstall the nhg from kernel.
This requires to keep best installed re's count in nhg struct.

Ticket: #4417509

system:
Before fix:

Initial route points to static, route and NHG 34 is installed:
DUT# show ip route nexthop-group
S>* 20.8.8.0/30 [115/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:26:31

OSFPv2 come up installs the route, route and new NHG 48 is installed:

DUT# show ip route nexthop-group
B>* 20.6.6.0/30 [20/0] (37) via 20.3.3.2, swp1s2, weight 1, 00:29:24
S   20.6.6.0/30 [75/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:29:30
O>* 20.8.8.0/30 [110/8] (48) via 20.1.1.1, swp1s0, weight 1, 00:01:48
S   20.8.8.0/30 [115/0] (34) via 20.7.7.2, swp31s0, weight 1, 00:29:30

Old NHG 34 is not removed kernel:
DUT# show nexthop-group rib 34
ID: 34 (zebra)
     RefCnt: 2
     Uptime: 00:33:27
     VRF: default
     Valid, Installed    <<<<<  installed
     Interface Index: 37
           via 20.7.7.2, swp31s0 (vrf default), weight

After fix:

NHG 34 is retained in zebra from unistalled from dplane
DUT# show nexthop-group rib 34
ID: 34 (zebra)
     RefCnt: 2
     Uptime: 00:27:06
     VRF: default
     Valid
     Interface Index: 37
           via 20.7.7.2, swp31s0 (vrf default), weight 1

Signed-off-by: Chirag Shah <chirag@nvidia.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>
@riw777
Copy link
Member

riw777 commented Jun 17, 2025

Not needed.

@riw777 riw777 closed this Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants