Skip to content

Conversation

pguibert6WIND
Copy link
Member

If the 'show bgp ipv4 vpn' command is interrupted during the dump of the whole entries, then a heap use after free may appear.

May 15 16:08:14 vRR-DUT bgpd[1647]: [N9HHH-F8H1M] %ADJCHANGE: neighbor 3.3.1.232(taastpap-scale-1) in vrf default Up
May 15 16:08:14 vRR-DUT bgpd[1647]: [GJPH1-W8PZV] Resetting peer 3.3.1.232 due to change in addpath config
May 15 16:08:14 vRR-DUT bgpd[1647]: [YAF85-253AP][EC 100663299] buffer_flush_available: write error on fd 70: Broken pipe
May 15 16:08:14 vRR-DUT bgpd[1647]: [THHDB-YPEY6][EC 100663299] vtysh_flush: write error to fd 70, closing
May 15 16:08:14 vRR-DUT bgpd[1647]: =================================================================
May 15 16:08:14 vRR-DUT bgpd[1647]: ==1647==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b001d65238 at pc 0x7efd46b83fa0 bp 0x7ffd4d598>
May 15 16:08:14 vRR-DUT bgpd[1647]: READ of size 1 at 0x62b001d65238 thread T0
May 15 16:08:14 vRR-DUT bgpd[1647]: #0 0x7efd46b83f9f in vty_out (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f9f)
May 15 16:08:14 vRR-DUT bgpd[1647]: #1 0x5570c0a4e7d2 in aspath_print_vty /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_aspath.c:2308
May 15 16:08:14 vRR-DUT bgpd[1647]: #2 0x5570c08a8ce6 in route_vty_out /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:9565
May 15 16:08:14 vRR-DUT bgpd[1647]: #3 0x5570c08b5b65 in bgp_show_table /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11751
May 15 16:08:14 vRR-DUT bgpd[1647]: #4 0x5570c08b677f in bgp_show_table_rd /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11915
May 15 16:08:14 vRR-DUT bgpd[1647]: #5 0x5570c08b6d1c in bgp_show /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11964
May 15 16:08:14 vRR-DUT bgpd[1647]: #6 0x5570c08bcb03 in show_ip_bgp_magic /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:13134
May 15 16:08:14 vRR-DUT bgpd[1647]: #7 0x5570c086f616 in show_ip_bgp bgpd/bgp_route_clippy.c:514
May 15 16:08:14 vRR-DUT bgpd[1647]: #8 0x7efd469b38b6 (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b38b6)
May 15 16:08:14 vRR-DUT bgpd[1647]: #9 0x7efd469b3c43 in cmd_execute_command (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b3c43)
May 15 16:08:14 vRR-DUT bgpd[1647]: #10 0x7efd469b4915 in cmd_execute (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b4915)
May 15 16:08:14 vRR-DUT bgpd[1647]: #11 0x7efd46b8581a (/lib/x86_64-linux-gnu/libfrr.so.0+0x38581a)
May 15 16:08:14 vRR-DUT bgpd[1647]: #12 0x7efd46b8a6ff (/lib/x86_64-linux-gnu/libfrr.so.0+0x38a6ff)
May 15 16:08:14 vRR-DUT bgpd[1647]: #13 0x7efd46b90f90 (/lib/x86_64-linux-gnu/libfrr.so.0+0x390f90)
May 15 16:08:14 vRR-DUT bgpd[1647]: #14 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
May 15 16:08:14 vRR-DUT bgpd[1647]: #15 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
May 15 16:08:14 vRR-DUT bgpd[1647]: #16 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
May 15 16:08:14 vRR-DUT bgpd[1647]: #17 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
May 15 16:08:14 vRR-DUT bgpd[1647]: #18 0x7efd46429e3f in __libc_start_main_impl ../csu/libc-start.c:392
May 15 16:08:14 vRR-DUT bgpd[1647]: #19 0x5570c06bc384 in _start (/usr/bin/bgpd+0x272384)
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x62b001d65238 is located 56 bytes inside of 26536-byte region [0x62b001d65200,0x62b001d6b9a8)
May 15 16:08:14 vRR-DUT bgpd[1647]: freed by thread T0 here:
May 15 16:08:14 vRR-DUT bgpd[1647]: #0 0x7efd470b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
May 15 16:08:14 vRR-DUT bgpd[1647]: #1 0x7efd46a6f930 in qfree (/lib/x86_64-linux-gnu/libfrr.so.0+0x26f930)
May 15 16:08:14 vRR-DUT bgpd[1647]: #2 0x7efd46b92467 in vty_close (/lib/x86_64-linux-gnu/libfrr.so.0+0x392467)
May 15 16:08:14 vRR-DUT bgpd[1647]: #3 0x7efd46b8fd4c (/lib/x86_64-linux-gnu/libfrr.so.0+0x38fd4c)
May 15 16:08:14 vRR-DUT bgpd[1647]: #4 0x7efd46b83f60 in vty_out (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f60)
May 15 16:08:14 vRR-DUT bgpd[1647]: #5 0x5570c0a4e7d2 in aspath_print_vty /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_aspath.c:2308
May 15 16:08:14 vRR-DUT bgpd[1647]: #6 0x5570c08a8ce6 in route_vty_out /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:9565
May 15 16:08:14 vRR-DUT bgpd[1647]: #7 0x5570c08b5b65 in bgp_show_table /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11751
May 15 16:08:14 vRR-DUT bgpd[1647]: #8 0x5570c08b677f in bgp_show_table_rd /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11915
May 15 16:08:14 vRR-DUT bgpd[1647]: #9 0x5570c08b6d1c in bgp_show /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11964
May 15 16:08:14 vRR-DUT bgpd[1647]: #10 0x5570c08bcb03 in show_ip_bgp_magic /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:13134
May 15 16:08:14 vRR-DUT bgpd[1647]: #11 0x5570c086f616 in show_ip_bgp bgpd/bgp_route_clippy.c:514
May 15 16:08:14 vRR-DUT bgpd[1647]: #12 0x7efd469b38b6 (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b38b6)
May 15 16:08:14 vRR-DUT bgpd[1647]: #13 0x7efd469b3c43 in cmd_execute_command (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b3c43)
May 15 16:08:14 vRR-DUT bgpd[1647]: #14 0x7efd469b4915 in cmd_execute (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b4915)
May 15 16:08:14 vRR-DUT bgpd[1647]: #15 0x7efd46b8581a (/lib/x86_64-linux-gnu/libfrr.so.0+0x38581a)
May 15 16:08:14 vRR-DUT bgpd[1647]: #16 0x7efd46b8a6ff (/lib/x86_64-linux-gnu/libfrr.so.0+0x38a6ff)
May 15 16:08:14 vRR-DUT bgpd[1647]: #17 0x7efd46b90f90 (/lib/x86_64-linux-gnu/libfrr.so.0+0x390f90)
May 15 16:08:14 vRR-DUT bgpd[1647]: #18 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
May 15 16:08:14 vRR-DUT bgpd[1647]: #19 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
May 15 16:08:14 vRR-DUT bgpd[1647]: #20 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
May 15 16:08:14 vRR-DUT bgpd[1647]: #21 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
May 15 16:08:14 vRR-DUT bgpd[1647]: previously allocated by thread T0 here:
May 15 16:08:14 vRR-DUT bgpd[1647]: #0 0x7efd470b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
May 15 16:08:14 vRR-DUT bgpd[1647]: #1 0x7efd46a6f7de in qcalloc (/lib/x86_64-linux-gnu/libfrr.so.0+0x26f7de)
May 15 16:08:14 vRR-DUT bgpd[1647]: #2 0x7efd46b8c5ba in vty_new (/lib/x86_64-linux-gnu/libfrr.so.0+0x38c5ba)
May 15 16:08:14 vRR-DUT bgpd[1647]: #3 0x7efd46b8f0fe (/lib/x86_64-linux-gnu/libfrr.so.0+0x38f0fe)
May 15 16:08:14 vRR-DUT bgpd[1647]: #4 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
May 15 16:08:14 vRR-DUT bgpd[1647]: #5 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
May 15 16:08:14 vRR-DUT bgpd[1647]: #6 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
May 15 16:08:14 vRR-DUT bgpd[1647]: #7 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
May 15 16:08:14 vRR-DUT bgpd[1647]: SUMMARY: AddressSanitizer: heap-use-after-free (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f9f) in vty_out
May 15 16:08:14 vRR-DUT bgpd[1647]: Shadow bytes around the buggy address:
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a49f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
May 15 16:08:14 vRR-DUT bgpd[1647]: =>0x0c56803a4a40: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: 0x0c56803a4a90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
May 15 16:08:14 vRR-DUT bgpd[1647]: Shadow byte legend (one shadow byte represents 8 application bytes):
May 15 16:08:14 vRR-DUT bgpd[1647]: Addressable: 00
May 15 16:08:14 vRR-DUT bgpd[1647]: Partially addressable: 01 02 03 04 05 06 07
May 15 16:08:14 vRR-DUT bgpd[1647]: Heap left redzone: fa
May 15 16:08:14 vRR-DUT bgpd[1647]: Freed heap region: fd
May 15 16:08:14 vRR-DUT bgpd[1647]: Stack left redzone: f1
May 15 16:08:14 vRR-DUT bgpd[1647]: Stack mid redzone: f2
May 15 16:08:14 vRR-DUT bgpd[1647]: Stack right redzone: f3
May 15 16:08:14 vRR-DUT bgpd[1647]: Stack after return: f5
May 15 16:08:14 vRR-DUT bgpd[1647]: Stack use after scope: f8
May 15 16:08:14 vRR-DUT bgpd[1647]: Global redzone: f9
May 15 16:08:14 vRR-DUT bgpd[1647]: Global init order: f6
May 15 16:08:14 vRR-DUT bgpd[1647]: Poisoned by user: f7
May 15 16:08:14 vRR-DUT bgpd[1647]: Container overflow: fc
May 15 16:08:14 vRR-DUT bgpd[1647]: Array cookie: ac
May 15 16:08:14 vRR-DUT bgpd[1647]: Intra object redzone: bb
May 15 16:08:14 vRR-DUT bgpd[1647]: ASan internal: fe
May 15 16:08:14 vRR-DUT bgpd[1647]: Left alloca redzone: ca
May 15 16:08:14 vRR-DUT bgpd[1647]: Right alloca redzone: cb
May 15 16:08:14 vRR-DUT bgpd[1647]: Shadow gap: cc
May 15 16:08:14 vRR-DUT bgpd[1647]: ==1647==ABORTING

The vty_out() function detects a BROKEN PIPE error durint the call of vty_flush(), and closes the vty pointer. Consequently, the next vty_out() calls may use an invalid vty pointer.

Propose to delay the removal of the vty pointer at the next scheduling. Until that event, keep the vty structure at CLOSE state. The show function will eventually detect the invalid vty status, and will anticipate the end of the show command.

@pguibert6WIND pguibert6WIND force-pushed the delay_vty_closure_upon_error branch 4 times, most recently from 3ec1edc to 3b64ef3 Compare June 16, 2025 16:52
@pguibert6WIND
Copy link
Member Author

ci:rerun

Copy link
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this problem isn't specific to bgp in any way, is it? that is, any frr code that makes more than one output call could encounter this error, couldn't it?
this approach seems to require every possible code path to look for an error condition - and any path that doesn't do that is exposed to a problem. would it be possible to prevent vty_out() and friends from acting badly if the connection is closed?

@pguibert6WIND
Copy link
Member Author

this problem isn't specific to bgp in any way, is it? that is, any frr code that makes more than one output call could encounter this error, couldn't it? this approach seems to require every possible code path to look for an error condition - and any path that doesn't do that is exposed to a problem.

Hi Mark.
thanks for your feedback.

In a pinch, BGP patch is not really needed, since now, vty_out() is protected: erroneous I/O lead to create an event to handle the vty_close() call.

Today, all the show can not be interrupted. This BGP patch is here to avoid spending 10s of seconds writing a full route into an erroneous socket.

would it be possible to prevent vty_out() and friends from acting badly if the connection is closed?

This needs to be checked on all calls to vty_close(), I think.

@@ -12471,6 +12471,12 @@ static int bgp_show_table(struct vty *vty, struct bgp *bgp, afi_t afi, safi_t sa
else
json_paths = NULL;

if (vty->status == VTY_CLOSE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This memleaks json_paths.

@donaldsharp
Copy link
Member

Mark is correct, this modification requires that every single code path that uses vty become conversant with this error condition happening and as I point out it creates a whole another level of worry about leaked memory that needs to be taken care of.

I agree with Mark's proposal. This is a problem with the vty subsystem and exposing this code change outside of vty.c just opens up a whole can of worms associated with changing every place that uses vty_out().

@pguibert6WIND pguibert6WIND force-pushed the delay_vty_closure_upon_error branch from 3b64ef3 to 8c442a6 Compare June 19, 2025 10:13
@pguibert6WIND pguibert6WIND marked this pull request as draft June 19, 2025 10:13
@pguibert6WIND pguibert6WIND force-pushed the delay_vty_closure_upon_error branch from 8c442a6 to 3094f07 Compare June 19, 2025 13:06
@frrbot frrbot bot added the tests Topotests, make check, etc label Jun 19, 2025
@github-actions github-actions bot added size/L and removed size/M labels Jun 19, 2025
@pguibert6WIND pguibert6WIND force-pushed the delay_vty_closure_upon_error branch 4 times, most recently from 9efd996 to 49b9d44 Compare June 20, 2025 10:19
If the 'show bgp ipv4 vpn' command is interrupted during the dump of the
whole entries, then a heap use after free may appear.

> May 15 16:08:14 vRR-DUT bgpd[1647]: [N9HHH-F8H1M] %ADJCHANGE: neighbor 3.3.1.232(taastpap-scale-1) in vrf default Up
> May 15 16:08:14 vRR-DUT bgpd[1647]: [GJPH1-W8PZV] Resetting peer 3.3.1.232 due to change in addpath config
> May 15 16:08:14 vRR-DUT bgpd[1647]: [YAF85-253AP][EC 100663299] buffer_flush_available: write error on fd 70: Broken pipe
> May 15 16:08:14 vRR-DUT bgpd[1647]: [THHDB-YPEY6][EC 100663299] vtysh_flush: write error to fd 70, closing
> May 15 16:08:14 vRR-DUT bgpd[1647]: =================================================================
> May 15 16:08:14 vRR-DUT bgpd[1647]: ==1647==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b001d65238 at pc 0x7efd46b83fa0 bp 0x7ffd4d598>
> May 15 16:08:14 vRR-DUT bgpd[1647]: READ of size 1 at 0x62b001d65238 thread T0
> May 15 16:08:14 vRR-DUT bgpd[1647]:     #0 0x7efd46b83f9f in vty_out (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f9f)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#1 0x5570c0a4e7d2 in aspath_print_vty /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_aspath.c:2308
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#2 0x5570c08a8ce6 in route_vty_out /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:9565
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#3 0x5570c08b5b65 in bgp_show_table /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11751
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#4 0x5570c08b677f in bgp_show_table_rd /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11915
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#5 0x5570c08b6d1c in bgp_show /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11964
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#6 0x5570c08bcb03 in show_ip_bgp_magic /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:13134
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#7 0x5570c086f616 in show_ip_bgp bgpd/bgp_route_clippy.c:514
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#8 0x7efd469b38b6  (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b38b6)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#9 0x7efd469b3c43 in cmd_execute_command (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b3c43)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#10 0x7efd469b4915 in cmd_execute (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b4915)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#11 0x7efd46b8581a  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38581a)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#12 0x7efd46b8a6ff  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38a6ff)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#13 0x7efd46b90f90  (/lib/x86_64-linux-gnu/libfrr.so.0+0x390f90)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#14 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#15 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#16 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#17 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#18 0x7efd46429e3f in __libc_start_main_impl ../csu/libc-start.c:392
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#19 0x5570c06bc384 in _start (/usr/bin/bgpd+0x272384)
> May 15 16:08:14 vRR-DUT bgpd[1647]: 0x62b001d65238 is located 56 bytes inside of 26536-byte region [0x62b001d65200,0x62b001d6b9a8)
> May 15 16:08:14 vRR-DUT bgpd[1647]: freed by thread T0 here:
> May 15 16:08:14 vRR-DUT bgpd[1647]:     #0 0x7efd470b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#1 0x7efd46a6f930 in qfree (/lib/x86_64-linux-gnu/libfrr.so.0+0x26f930)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#2 0x7efd46b92467 in vty_close (/lib/x86_64-linux-gnu/libfrr.so.0+0x392467)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#3 0x7efd46b8fd4c  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38fd4c)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#4 0x7efd46b83f60 in vty_out (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f60)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#5 0x5570c0a4e7d2 in aspath_print_vty /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_aspath.c:2308
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#6 0x5570c08a8ce6 in route_vty_out /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:9565
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#7 0x5570c08b5b65 in bgp_show_table /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11751
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#8 0x5570c08b677f in bgp_show_table_rd /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11915
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#9 0x5570c08b6d1c in bgp_show /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:11964
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#10 0x5570c08bcb03 in show_ip_bgp_magic /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_route.c:13134
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#11 0x5570c086f616 in show_ip_bgp bgpd/bgp_route_clippy.c:514
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#12 0x7efd469b38b6  (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b38b6)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#13 0x7efd469b3c43 in cmd_execute_command (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b3c43)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#14 0x7efd469b4915 in cmd_execute (/lib/x86_64-linux-gnu/libfrr.so.0+0x1b4915)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#15 0x7efd46b8581a  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38581a)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#16 0x7efd46b8a6ff  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38a6ff)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#17 0x7efd46b90f90  (/lib/x86_64-linux-gnu/libfrr.so.0+0x390f90)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#18 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#19 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#20 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#21 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> May 15 16:08:14 vRR-DUT bgpd[1647]: previously allocated by thread T0 here:
> May 15 16:08:14 vRR-DUT bgpd[1647]:     #0 0x7efd470b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#1 0x7efd46a6f7de in qcalloc (/lib/x86_64-linux-gnu/libfrr.so.0+0x26f7de)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#2 0x7efd46b8c5ba in vty_new (/lib/x86_64-linux-gnu/libfrr.so.0+0x38c5ba)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#3 0x7efd46b8f0fe  (/lib/x86_64-linux-gnu/libfrr.so.0+0x38f0fe)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#4 0x7efd46b75257 in event_call (/lib/x86_64-linux-gnu/libfrr.so.0+0x375257)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#5 0x7efd46a42ecf in frr_run (/lib/x86_64-linux-gnu/libfrr.so.0+0x242ecf)
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#6 0x5570c06bf804 in main /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:545
> May 15 16:08:14 vRR-DUT bgpd[1647]:     FRRouting#7 0x7efd46429d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> May 15 16:08:14 vRR-DUT bgpd[1647]: SUMMARY: AddressSanitizer: heap-use-after-free (/lib/x86_64-linux-gnu/libfrr.so.0+0x383f9f) in vty_out
> May 15 16:08:14 vRR-DUT bgpd[1647]: Shadow bytes around the buggy address:
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a49f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> May 15 16:08:14 vRR-DUT bgpd[1647]: =>0x0c56803a4a40: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   0x0c56803a4a90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> May 15 16:08:14 vRR-DUT bgpd[1647]: Shadow byte legend (one shadow byte represents 8 application bytes):
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Addressable:           00
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Partially addressable: 01 02 03 04 05 06 07
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Heap left redzone:       fa
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Freed heap region:       fd
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Stack left redzone:      f1
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Stack mid redzone:       f2
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Stack right redzone:     f3
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Stack after return:      f5
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Stack use after scope:   f8
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Global redzone:          f9
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Global init order:       f6
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Poisoned by user:        f7
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Container overflow:      fc
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Array cookie:            ac
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Intra object redzone:    bb
> May 15 16:08:14 vRR-DUT bgpd[1647]:   ASan internal:           fe
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Left alloca redzone:     ca
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Right alloca redzone:    cb
> May 15 16:08:14 vRR-DUT bgpd[1647]:   Shadow gap:              cc
> May 15 16:08:14 vRR-DUT bgpd[1647]: ==1647==ABORTING

The vty_out() function detects a BROKEN PIPE error during the call of
vtysh_flush() call, and closes the vty pointer. Consequently, the next
vty_out() calls may use an invalid vty pointer.

The call to vtysh_flush() has been added in a previous commit: this call
does a syncronous write for a large buffer, which triggers the error.

Propose to delay the removal of the vty pointer at the next scheduling.
Until that event, keep the vty structure at CLOSE state. The show
function will eventually detect the invalid vty status, and will
anticipate the end of the show command.

Fixes: 9112fb3 ("lib: Memory spike reduction for sh cmds at scale")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The 'show bgp <address-family>' command may take a long time when
dumping a full route equivalent of entries. It is recommended to
interrupt if the underlying vty session is closed. Propose to leave the
cli handler by detecting it. The user experience will be better.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Declare vty_status enumerate, outside of the struct vty structure, to
define the status.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Because vty_close() is called, it is not necessary to set the status
before the call.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The VTY_CLOSE status is only set by vty_close(), vty_closed() functions.
Avoid setting this value in other places of code.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The vty_close() call is performed, because in the end, the expectation
is to close the vty structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Simplify code by avoiding calling vty_close, because it is already
scheduled.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
This API will be used when it will be possible to call the event library
to flush the vty structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Simplify code so that one single call of vty_close() is done.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
When leaving a process, the vty structure must absolutely be freed.
Otherwise, memory leaks will happen. Identify all the cases by a comment
marked above the call.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Delay the removal of the vty socket in vty library, except at the
termination hooks, because the threads may be stopped too.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The closure of the vty session will be handled by an event thread that
is scheduled at the end of the current thread.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The buffer_flush_available() may detect a socket error, and this may be
a problem for next calls. Protect this call by sending a close event.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
@pguibert6WIND pguibert6WIND force-pushed the delay_vty_closure_upon_error branch from 49b9d44 to 971c1d8 Compare June 20, 2025 11:42
@mwinter-osr
Copy link
Member

CI:rerun

@eqvinox eqvinox self-requested a review June 30, 2025 08:39
@eqvinox
Copy link
Contributor

eqvinox commented Jun 30, 2025

Hmm. After looking at this for a few minutes, my initial thought is that we should add an on_close callback on the VTY so that bgpd can put a function there that aborts the dump as soon as the VTY is closed. It feels like that is a simpler solution than keeping the VTY around for longer, especially in regard to long-term maintenance and code complexity.

But I haven't fully looked at this yet, this is only my preliminary thoughts after scrolling through this PR.

@eqvinox
Copy link
Contributor

eqvinox commented Jun 30, 2025

Actually this looks like a regression from 9112fb3. That change is likely not correct, as I see it right now it is in general just straight up not permissible/possible to call vtysh_flush() from vty_out().

@eqvinox
Copy link
Contributor

eqvinox commented Jun 30, 2025

#19109 hopefully reverts the underlying root cause so this shouldn't be needed anymore

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants