Description
After upgrading FRR to 10.2.1 on both the bare-metal host and the VMs, we hit the following problem.
When the VM is rebooted or systemd-networkd is restarted, the BGP session stays stuck in Idle until we either bring the bridge interface down and up or restart FRR on the bare-metal host.
With FRR 9.0.2/9.0.5 on the bare-metal host, the BGP peer re-establishes automatically without any issue.
FRR log (VM):
Jan 15 19:03:01 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:01 vm-01 bgpd[3231]: [H4B4J-DCW2R][EC 33554455] ens4 [Error] bgp_read_packet error: Connection reset by peer
Jan 15 19:03:11 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:11 vm-01 bgpd[3231]: [H4B4J-DCW2R][EC 33554455] ens4 [Error] bgp_read_packet error: Connection reset by peer
Jan 15 19:03:21 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:31 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:31 vm-01 bgpd[3231]: [H4B4J-DCW2R][EC 33554455] ens4 [Error] bgp_read_packet error: Connection reset by peer
Jan 15 19:03:41 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:41 vm-01 bgpd[3231]: [H4B4J-DCW2R][EC 33554455] ens4 [Error] bgp_read_packet error: Connection reset by peer
Jan 15 19:03:51 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Jan 15 19:03:51 vm-01 bgpd[3231]: [H4B4J-DCW2R][EC 33554455] ens4 [Error] bgp_read_packet error: Connection reset by peer
Jan 15 19:04:01 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected
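The log entries above repeat at a regular cadence. A minimal sketch for measuring that cadence from a saved log, assuming the syslog line layout shown above (the helper name `retry_gaps` is hypothetical, and the timestamps are assumed to fall on the same day):

```python
import re

# Matches "<HH:MM:SS> <host> bgpd[<pid>]: [<log-id>][EC <code>]" as seen in the log above.
LINE_RE = re.compile(
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}) \S+ bgpd\[\d+\]: "
    r"\[[0-9A-Z-]+\]\[EC (?P<ec>\d+)\]"
)

def retry_gaps(lines, error_code):
    """Return gaps in seconds between consecutive log lines carrying error_code."""
    stamps = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and int(m.group("ec")) == error_code:
            seconds = int(m.group("h")) * 3600 + int(m.group("m")) * 60 + int(m.group("s"))
            stamps.append(seconds)
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# Three lines copied from the log above.
sample = [
    "Jan 15 19:03:01 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected",
    "Jan 15 19:03:11 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected",
    "Jan 15 19:03:21 vm-01 bgpd[3231]: [TXY0T-CYY6F][EC 100663299] Can't get remote address and port: Transport endpoint is not connected",
]
print(retry_gaps(sample, 100663299))
```

A steady 10-second gap would be consistent with the `timers connect 10` setting in the configs below, i.e. each connect attempt fails and is retried on the next timer expiry.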
Packet sniffer (VM):
root@vm-01:~# tcpdump -i ens4 'tcp port 179' -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens4, link-type EN10MB (Ethernet), snapshot length 262144 bytes
19:04:40.697365 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [S], seq 3840576242, win 62580, options [mss 8940,sackOK,TS val 2973803148 ecr 0,nop,wscale 14], length 0
19:04:40.697490 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44616: Flags [S.], seq 1164270323, ack 3840576243, win 62496, options [mss 8940,sackOK,TS val 1447149278 ecr 2973803148,nop,wscale 14], length 0
19:04:40.697526 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [.], ack 1, win 4, options [nop,nop,TS val 2973803148 ecr 1447149278], length 0
19:04:40.697603 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [P.], seq 1:184, ack 1, win 4, options [nop,nop,TS val 2973803148 ecr 1447149278], length 183: BGP
19:04:40.697684 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44616: Flags [.], ack 184, win 4, options [nop,nop,TS val 1447149278 ecr 2973803148], length 0
19:04:40.697687 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44616: Flags [R.], seq 1, ack 184, win 4, options [nop,nop,TS val 1447149278 ecr 2973803148], length 0
19:04:40.697720 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [R], seq 3840576426, win 0, length 0
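To see which side tears the connection down, the capture can be reduced to endpoints and TCP flags. The following is only a sketch that assumes the tcpdump text format shown above; `parse_packets` and `first_reset` are hypothetical helper names.

```python
import re

# Matches "<ts> IP6 <src>.<sport> > <dst>.<dport>: Flags [<flags>]" from tcpdump -n output.
PKT_RE = re.compile(
    r"^(?P<ts>\S+) IP6? (?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]+)\]"
)

def parse_packets(lines):
    """Return one dict per packet line; non-packet lines are skipped."""
    packets = []
    for line in lines:
        m = PKT_RE.match(line.strip())
        if m:
            packets.append(m.groupdict())
    return packets

def first_reset(packets):
    """Return the first packet with the RST flag set, or None."""
    for pkt in packets:
        if "R" in pkt["flags"]:
            return pkt
    return None

# Four packet lines copied from the failing capture above.
capture = [
    "19:04:40.697603 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [P.], seq 1:184, ack 1, win 4, options [nop,nop,TS val 2973803148 ecr 1447149278], length 183: BGP",
    "19:04:40.697684 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44616: Flags [.], ack 184, win 4, options [nop,nop,TS val 1447149278 ecr 2973803148], length 0",
    "19:04:40.697687 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44616: Flags [R.], seq 1, ack 184, win 4, options [nop,nop,TS val 1447149278 ecr 2973803148], length 0",
    "19:04:40.697720 IP6 fe80::666f:5bff:fef0:82d7.44616 > fe80::182d:37ff:fe5d:3c0e.179: Flags [R], seq 3840576426, win 0, length 0",
]
rst = first_reset(parse_packets(capture))
print(rst["src"], rst["sport"], rst["flags"])
```

In this capture the first reset comes from the side listening on port 179, immediately after it acknowledges the 183-byte BGP message, which matches the "Connection reset by peer" errors logged on the VM.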
BGP session (abnormal):
vm-01# show ip bgp summary
IPv4 Unicast Summary:
BGP router identifier 100.111.91.240, local AS number 65500 VRF default vrf-id 0
BGP table version 3
RIB entries 4, using 512 bytes of memory
Peers 1, using 24 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
ens4 4 65500 0 59 0 0 0 never Active 0 N/A
After bringing the bridge interface down and up, or restarting FRR on the physical machine:
root@vm-01:~# tcpdump -i ens4 'tcp port 179' -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on ens4, link-type EN10MB (Ethernet), snapshot length 262144 bytes
19:05:42.900870 IP6 fe80::666f:5bff:fef0:82d7.60170 > fe80::182d:37ff:fe5d:3c0e.179: Flags [S], seq 1472385439, win 62580, options [mss 8940,sackOK,TS val 2973865351 ecr 0,nop,wscale 14], length 0
19:05:42.901019 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.60170: Flags [S.], seq 3060964090, ack 1472385440, win 62496, options [mss 8940,sackOK,TS val 1447211481 ecr 2973865351,nop,wscale 14], length 0
19:05:42.901081 IP6 fe80::666f:5bff:fef0:82d7.60170 > fe80::182d:37ff:fe5d:3c0e.179: Flags [.], ack 1, win 4, options [nop,nop,TS val 2973865352 ecr 1447211481], length 0
19:05:42.901237 IP6 fe80::666f:5bff:fef0:82d7.60170 > fe80::182d:37ff:fe5d:3c0e.179: Flags [P.], seq 1:184, ack 1, win 4, options [nop,nop,TS val 2973865352 ecr 1447211481], length 183: BGP
19:05:42.901287 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.60170: Flags [F.], seq 1, ack 1, win 4, options [nop,nop,TS val 1447211481 ecr 2973865352], length 0
19:05:42.901316 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.60170: Flags [R], seq 3060964091, win 0, length 0
19:05:52.901889 IP6 fe80::666f:5bff:fef0:82d7.44206 > fe80::182d:37ff:fe5d:3c0e.179: Flags [S], seq 4072060639, win 62580, options [mss 8940,sackOK,TS val 2973875352 ecr 0,nop,wscale 14], length 0
19:05:53.928066 IP6 fe80::666f:5bff:fef0:82d7.44206 > fe80::182d:37ff:fe5d:3c0e.179: Flags [S], seq 4072060639, win 62580, options [mss 8940,sackOK,TS val 2973876379 ecr 0,nop,wscale 14], length 0
19:05:54.227238 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44206: Flags [S.], seq 3255217763, ack 4072060640, win 62496, options [mss 8940,sackOK,TS val 1447222508 ecr 2973876379,nop,wscale 14], length 0
19:05:54.227268 IP6 fe80::666f:5bff:fef0:82d7.44206 > fe80::182d:37ff:fe5d:3c0e.179: Flags [.], ack 1, win 4, options [nop,nop,TS val 2973876678 ecr 1447222508], length 0
19:05:54.227432 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44206: Flags [F.], seq 1, ack 1, win 4, options [nop,nop,TS val 1447222807 ecr 2973876678], length 0
19:05:54.227460 IP6 fe80::666f:5bff:fef0:82d7.44206 > fe80::182d:37ff:fe5d:3c0e.179: Flags [P.], seq 1:184, ack 2, win 4, options [nop,nop,TS val 2973876678 ecr 1447222807], length 183: BGP
19:05:54.227553 IP6 fe80::182d:37ff:fe5d:3c0e.179 > fe80::666f:5bff:fef0:82d7.44206: Flags [R], seq 3255217765, win 0, length 0
19:05:54.576228 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [S], seq 1796987062, win 62580, options [mss 8940,sackOK,TS val 1447223156 ecr 0,nop,wscale 14], length 0
19:05:54.576269 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [S.], seq 3311269946, ack 1796987063, win 62496, options [mss 8940,sackOK,TS val 2973877027 ecr 1447223156,nop,wscale 14], length 0
19:05:54.576328 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [.], ack 1, win 4, options [nop,nop,TS val 1447223156 ecr 2973877027], length 0
19:05:54.576400 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [P.], seq 1:170, ack 1, win 4, options [nop,nop,TS val 1447223156 ecr 2973877027], length 169: BGP
19:05:54.576429 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [.], ack 170, win 4, options [nop,nop,TS val 2973877027 ecr 1447223156], length 0
19:05:54.576699 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [P.], seq 1:184, ack 170, win 4, options [nop,nop,TS val 2973877027 ecr 1447223156], length 183: BGP
19:05:54.576744 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [.], ack 184, win 4, options [nop,nop,TS val 1447223156 ecr 2973877027], length 0
19:05:54.576798 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [P.], seq 170:189, ack 184, win 4, options [nop,nop,TS val 1447223156 ecr 2973877027], length 19: BGP
19:05:54.576829 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [P.], seq 184:203, ack 189, win 4, options [nop,nop,TS val 2973877027 ecr 1447223156], length 19: BGP
19:05:54.619076 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [.], ack 203, win 4, options [nop,nop,TS val 1447223199 ecr 2973877027], length 0
19:05:55.677175 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [P.], seq 203:725, ack 189, win 4, options [nop,nop,TS val 2973878128 ecr 1447223199], length 522: BGP
19:05:55.677311 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [.], ack 725, win 4, options [nop,nop,TS val 1447224257 ecr 2973878128], length 0
19:05:55.877183 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [P.], seq 189:506, ack 725, win 4, options [nop,nop,TS val 1447224457 ecr 2973878128], length 317: BGP
19:05:55.920065 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [.], ack 506, win 4, options [nop,nop,TS val 2973878371 ecr 1447224457], length 0
19:06:04.021284 IP6 fe80::666f:5bff:fef0:82d7.179 > fe80::182d:37ff:fe5d:3c0e.56812: Flags [P.], seq 725:875, ack 506, win 4, options [nop,nop,TS val 2973886472 ecr 1447224457], length 150: BGP
19:06:04.021397 IP6 fe80::182d:37ff:fe5d:3c0e.56812 > fe80::666f:5bff:fef0:82d7.179: Flags [.], ack 875, win 4, options [nop,nop,TS val 1447232601 ecr 2973886472], length 0
^C
29 packets captured
29 packets received by filter
0 packets dropped by kernel
root@vm-01:~#
root@vm-01:~# vtysh
Hello, this is FRRouting (version 10.2.1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
vm-01# show ip bgp summary
IPv4 Unicast Summary:
BGP router identifier 100.111.91.240, local AS number 65500 VRF default vrf-id 0
BGP table version 5
RIB entries 6, using 768 bytes of memory
Peers 1, using 24 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
ens4 4 65500 7 74 5 0 0 00:00:23 2 3 FRRouting/10.2.1
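For scripted checks, the neighbor row of `show ip bgp summary` can be classified by its State/PfxRcd column. This is a rough sketch that assumes the column layout shown above, where a numeric value in that column means the session is established; `peer_state` is a hypothetical helper, not part of FRR.

```python
def peer_state(row):
    """Return (neighbor, state) from one neighbor row of `show ip bgp summary`.

    Columns: Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
    """
    cols = row.split()
    neighbor, state_or_pfx = cols[0], cols[9]
    # A prefix count in State/PfxRcd is taken to mean the session is up.
    state = "Established" if state_or_pfx.isdigit() else state_or_pfx
    return neighbor, state

# Rows copied from the two summaries above (stuck, then recovered).
print(peer_state("ens4 4 65500 0 59 0 0 0 never Active 0 N/A"))
print(peer_state("ens4 4 65500 7 74 5 0 0 00:00:23 2 3 FRRouting/10.2.1"))
```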
Version
OS/kernel: Ubuntu 22.04/6.2.0-39-generic
FRR version: 10.2.1
How to reproduce
Both the bare-metal host (BM) and the VM run FRR 10.2.1.
BM's FRR config:
hostname bm-01
log file /var/log/frr/bgpd.log informational
log syslog informational
no zebra nexthop kernel enable
service integrated-vtysh-config
!
interface eno5
ipv6 nd ra-interval 4
ipv6 nd ra-lifetime 10
no ipv6 nd suppress-ra
!
interface ens3f0
ipv6 nd ra-interval 4
ipv6 nd ra-lifetime 10
no ipv6 nd suppress-ra
!
interface br1
ipv6 nd ra-interval 4
ipv6 nd ra-lifetime 10
no ipv6 nd suppress-ra
exit
!
router bgp 65500 vrf vm
bgp router-id 100.111.11.228
neighbor vm_fabric peer-group
neighbor vm_fabric remote-as 65500
neighbor vm_fabric description Internal VM Network
neighbor vm_fabric bfd
neighbor vm_fabric bfd profile bfd_template
neighbor vm_fabric timers connect 10
neighbor vm_fabric capability extended-nexthop
neighbor br1 interface peer-group vm_fabric
!
address-family ipv4 unicast
redistribute kernel route-map route_filter
redistribute connected route-map route_filter
neighbor vm_fabric default-originate
neighbor vm_fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
exit-address-family
!
address-family ipv6 unicast
redistribute kernel route-map v6_route_filter
redistribute connected route-map v6_route_filter
neighbor vm_fabric activate
neighbor vm_fabric default-originate
neighbor vm_fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
exit-address-family
exit
!
router bgp 64705
bgp router-id 100.111.11.228
neighbor fabric peer-group
neighbor fabric remote-as 64705
neighbor fabric description Interal Fabric Network
neighbor fabric bfd
neighbor fabric bfd profile bfd_template
neighbor fabric timers connect 10
neighbor fabric capability extended-nexthop
neighbor eno5 interface peer-group fabric
neighbor ens3f0 interface peer-group fabric
!
address-family ipv4 unicast
redistribute kernel route-map route_filter
redistribute connected route-map route_filter
neighbor fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
import vrf vm
exit-address-family
!
address-family ipv6 unicast
redistribute kernel route-map v6_route_filter
redistribute connected route-map v6_route_filter
neighbor fabric activate
neighbor fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
import vrf vm
exit-address-family
exit
!
access-list block_default seq 5 permit 0.0.0.0/0 exact-match
!
ipv6 access-list v6_block_default seq 5 permit ::/0 exact-match
!
route-map route_filter deny 10
match ip address block_default
exit
!
route-map route_filter permit 20
exit
!
route-map v6_route_filter deny 10
match ipv6 address v6_block_default
exit
!
route-map v6_route_filter permit 20
exit
!
bfd
profile bfd_template
detect-multiplier 6
transmit-interval 500
receive-interval 500
exit
!
exit
!
VM's FRR config:
log file /var/log/frr/bgpd.log informational
log syslog informational
service integrated-vtysh-config
no zebra nexthop kernel enable
!
interface ens4
ipv6 nd ra-interval 4
ipv6 nd ra-lifetime 10
no ipv6 nd suppress-ra
!
router bgp 65500
bgp router-id 100.111.91.240
neighbor fabric peer-group
neighbor ens4 peer-group fabric
neighbor fabric remote-as 65500
neighbor fabric description Interal Fabric Network
neighbor fabric bfd
neighbor fabric bfd profile bfd_template
neighbor fabric timers connect 10
neighbor fabric capability extended-nexthop
neighbor ens4 interface peer-group fabric
!
address-family ipv4 unicast
redistribute kernel route-map route_filter
redistribute connected route-map route_filter
neighbor fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
exit-address-family
!
address-family ipv6 unicast
redistribute kernel route-map v6_route_filter
redistribute connected route-map v6_route_filter
neighbor fabric activate
neighbor fabric soft-reconfiguration inbound
maximum-paths 64
maximum-paths ibgp 64
exit-address-family
!
access-list block_default seq 5 permit 0.0.0.0/0 exact-match
!
ipv6 access-list v6_block_default seq 5 permit ::/0 exact-match
!
route-map route_filter deny 10
match ip address block_default
!
route-map route_filter permit 20
!
route-map v6_route_filter deny 10
match ipv6 address v6_block_default
exit
!
route-map v6_route_filter permit 20
exit
!
bfd
profile bfd_template
detect-multiplier 6
transmit-interval 500
receive-interval 500
exit
!
exit
!
line vty
!
Expected behavior
After rebooting the VM or restarting systemd-networkd, the BGP session re-establishes automatically.
Actual behavior
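The BGP session remains stuck in Idle and does not re-establish until the bridge interface is brought down and up or FRR is restarted on the bare-metal host.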
Additional context
No response
Checklist
- I have searched the open issues for this bug.
- I have not included sensitive information in this report.