Skip to content

Conversation

mjstapp
Copy link
Contributor

@mjstapp mjstapp commented Oct 29, 2024

Replace the "lib/table" based per-NS collection of interfaces with a dedicated rbtree. Convert all the code that uses the if_table directly to use new iterator apis.

Mark Stapp added 6 commits October 28, 2024 11:09
Make an MTYPE used in zifs internal/static

Signed-off-by: Mark Stapp <mjs@cisco.com>
Add new per-NS interface typerb tree, using external linkage, to
replace the use of the if_table table.
Add apis to iterate the per-NS collection - which is not public.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Replace use of the old if_table with the new per-NS ifp
iteration apis.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Add an include to an isis header so it's self-contained.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Remove use of the per-NS if_table from zebra/interface
module. Use new add, lookup, and iteration apis.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Use the new per-NS interface iteration apis in the evpn
module.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Mark Stapp added 3 commits October 29, 2024 13:49
Finish removing the table route_node from the ifp struct.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Replace use of the old if_table with the new per-NS ifp
iterator apis in the zebra vxlan code.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Finish removing the if_table from the zebra NS struct.

Signed-off-by: Mark Stapp <mjs@cisco.com>
@mjstapp
Copy link
Contributor Author

mjstapp commented Oct 29, 2024

pushed to fix shutdown-time cleanup


zif = ifp->info;
if (!zif || zif->zif_type != ZEBRA_IF_VXLAN)
goto done;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why goto here? Especially since we have other returns directly in the middle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just an artifact of moving the code from the inline loop to the callback I guess. we need to be able to have "special" continue path for this clause, and then the "normal" continue path for non-error cases at the end of the callback?


if (in_param->bridge_vlan_aware) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where did this test go to? I'm confused about the fairly big change in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh - yeah, there are a couple of examples of this. this clause only uses info from the top-level call - it doesn't need to be part of the iteration at all, so I moved it to the caller, the function that starts the iteration.

@donaldsharp donaldsharp merged commit ac6314d into FRRouting:master Nov 12, 2024
11 checks passed
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Dec 4, 2024
To fix a statistical issue. The original fix was done in FRRouting/frr#17297. However to accommodate 8.5.4 the patch in the PR was added.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Dec 10, 2024
Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
dgsudharsan added a commit to dgsudharsan/sonic-buildimage that referenced this pull request Dec 23, 2024
…et#21095)

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
    at ../zebra/interface.c:312
mssonicbld added a commit to mssonicbld/sonic-buildimage-msft that referenced this pull request Jan 7, 2025
<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Azure#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
Azure#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
Azure#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
Azure#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
Azure#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
Azure#6  route_next (node=<optimized out>) at ../lib/table.c:436
Azure#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
Azure#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
Azure#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
Azure#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
Azure#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
Azure#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
Azure#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
Azure#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
mssonicbld added a commit to mssonicbld/sonic-buildimage that referenced this pull request Jan 13, 2025
<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
sonic-net#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
sonic-net#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
sonic-net#6  route_next (node=<optimized out>) at ../lib/table.c:436
sonic-net#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
sonic-net#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
sonic-net#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
sonic-net#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
sonic-net#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
sonic-net#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
sonic-net#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
sonic-net#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
mssonicbld added a commit to sonic-net/sonic-buildimage that referenced this pull request Jan 15, 2025
<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "fixes #xxxx", or
 "closes #xxxx" or "resolves #xxxx"

 Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
 at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025
…et#21095)

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
sonic-net#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
yanjundeng pushed a commit to yanjundeng/sonic-buildimage that referenced this pull request Apr 23, 2025
…et#21095)

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
sonic-net#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
sonic-net#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
sonic-net#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
sonic-net#6  route_next (node=<optimized out>) at ../lib/table.c:436
sonic-net#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
sonic-net#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
sonic-net#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
sonic-net#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
sonic-net#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
sonic-net#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
sonic-net#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
sonic-net#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request May 8, 2025
New patches that were added:
Patch	FRR Pull request
0086-isisd-lib-add-some-codepoints-usually-shared-with-other-vendors.patch	FRRouting/frr#17957
0087-staticd-Add-support-for-SRv6-uA-behavior.patch	FRRouting/frr#18198

Removed patches:
Patch	FRR commit / Pull request
0025-bgp-community-memory-leak-fix.patch	FRRouting/frr@e613e12
0028-zebra-fix-parse-attr-problems-for-encap.patch	FRRouting/frr@ba5a353 FRRouting/frr@569f9e4 FRRouting/frr@bd4fca1
0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch	FRRouting/frr@a8efa99
0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch	FRRouting/frr@705fed7
0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch	FRRouting/frr@5f379be
0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch	FRRouting/frr@ccfe452
0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch	FRRouting/frr@a07df6f
0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch	FRRouting/frr@ed7005d
0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch	FRRouting/frr@9edf45b
0038-zebra-Actually-display-I-O-buffer-sizes.patch	FRRouting/frr@8d8f12b
0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch	FRRouting/frr@33dccbe
0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch	FRRouting/frr@f4ba472
0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch	FRRouting/frr@920bf45
0042-zebra-Use-built-in-data-structure-counter.patch	FRRouting/frr@a23a938
0043-zebra-Use-the-ctx-queue-counters.patch	FRRouting/frr@34670c4
0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch	FRRouting/frr@3af381b
0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch	FRRouting/frr@8926ac1
0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch	FRRouting/frr@98b11de
0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch	FRRouting/frr@b47a92e
0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch	FRRouting/frr@4395fcd
0049-bgpd-backpressure-Improve-debuggability.patch	FRRouting/frr@186db96
0050-bgpd-backpressure-Avoid-use-after-free.patch	FRRouting/frr@40965e5
0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch	FRRouting/frr@c4bbb5b
0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch	FRRouting/frr@6cf5b79
0055-bgpd-lib-Include-SID-structure-in-seg6local-nexthop.patch	FRRouting/frr@0402551
0059-Fix-BGP-reset-on-suppress-fib-pending-configuration.patch	FRRouting/frr#17487
0060-bgpd-Validate-both-nexthop-information-NEXTHOP-and-N.patch	FRRouting/frr@a0d2734
0061-dont-print-warning-if-not-a-daemon.patch	FRRouting/frr@cecf571
0062-zebra-lib-use-internal-rbtree-per-ns.patch	FRRouting/frr#17297
0064-SRv6-BGP-SID-reachability.patch	FRRouting/frr#14810
0065-zebra-display-srv6-encapsulation-source-address-when-configured.patch	FRRouting/frr@890b67d
0066-lib-fix-srv6-locator-flags-propagated-to-isis.patch	FRRouting/frr@03d2ad0
0067-Add-support-for-SRv6-SID-Manager.patch	FRRouting/frr#15604
0068-bgpd-Extend-BGP-to-communicate-with-the-SRv6-SID-Manager-to-allocate-release-SRv6-SIDs.patch	FRRouting/frr#15676
0069-lib-nexthop-code-should-use-uint16_t-for-nexthop-cou.patch	FRRouting/frr@0bc79f5
0070-Allow-16-bit-size-for-nexthops.patch	FRRouting/frr@9f8968f
0071-zebra-Only-notify-dplane-work-pthread-when-needed.patch	FRRouting/frr#17062
0072-Fix-up-improper-handling-of-nexthops-for-nexthop-tra.patch	FRRouting/frr#17076
0073-remove-in6addr-cmp.patch	FRRouting/frr#17312
0074-bgp-best-port-reordering.patch	FRRouting/frr#15572
0075-bgp-mp-info-changes.patch	FRRouting/frr#16961
0076-Optimizations-and-problem-fixing-for-large-scale-ecmp-from-bgp.patch	FRRouting/frr#17229
0077-frr-vtysh-dependencies-for-srv6-static-patches.patch	FRRouting/frr@fd8edc3
0078-vtysh-de-conditionalize-and-reorder-install-node.patch	FRRouting/frr@e26c580
0079-staticd-add-support-for-srv6.patch	FRRouting/frr#16894
0081-bgpd-Optimize-evaluate-paths-for-a-peer-going-down.patch	FRRouting/frr@9f55368

Realigned patches:
Patch
0001-Reduce-severity-of-Vty-connected-from-message.patch
0002-Allow-BGP-attr-NEXT_HOP-to-be-0.0.0.0-due-to-allevia.patch
0003-nexthops-compare-vrf-only-if-ip-type.patch
0004-frr-remove-frr-log-outchannel-to-var-log-frr.log.patch
0005-Add-support-of-bgp-l3vni-evpn.patch
0006-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch
0007-ignore-route-from-default-table.patch
0008-Use-vrf_id-for-vrf-not-tabled_id.patch
0010-bgpd-Change-log-level-for-graceful-restart-events.patch
0021-Disable-ipv6-src-address-test-in-pceplib.patch
0022-cross-compile-changes.patch
0054-build-dplane-fpm-sonic-module.patch
0056-zebra-do-not-send-local-routes-to-fpm.patch
0057-Adding-changes-to-write-ip-nht-resolve-via-default-c.patch
0058-When-the-file-is-config-replayed-we-cannot-handle-th.patch
0061-Set-multipath-to-514-and-disable-bgp-vnc-for-optimiz.patch
0063-Patch-to-send-tag-value-associated-with-route-via-ne.patch
0080-SRv6-vpn-route-and-sidlist-install.patch
0082-Revert-bgpd-upon-if-event-evaluate-bnc-with-matching.patch
0083-staticd-add-cli-to-support-steering-of-ipv4-traffic-over-srv6-sid-list.patch
0084-lib-Return-duplicate-prefix-list-entry-test.patch
0085-This-error-happens-when-we-try-to-write-to-a-socket.patch

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants