Memory leak seen in bgpd on FRR 8.5.1 after stress test of announcing/withdrawing routes #15459

@saiarcot895

Description

On FRR 8.5.1, a stress test that repeatedly announces and withdraws 6400 routes causes bgpd memory usage to grow quickly: bgpd takes about 11MB more after each loop. Eventually the device runs out of memory.

The memory increase appears to come from a growing number of community objects: the count rose by 50k-100k after every loop.
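
As a rough sanity check on the figures above, the sketch below works out how many bytes each additional community object would account for if the entire 11MB per loop came from those objects. That attribution is an assumption, not something confirmed in this report, and the free-memory figure and helper names are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope arithmetic for the leak figures reported above.
# Assumption: the whole per-loop growth is attributable to community objects.

MB = 1_000_000  # decimal megabytes (assumption; the report just says "MB")


def bytes_per_object(leak_bytes, objects_low, objects_high):
    """Return (min, max) average bytes per newly created object."""
    return leak_bytes / objects_high, leak_bytes / objects_low


def loops_until_exhausted(free_bytes, leak_bytes_per_loop):
    """Return how many full loops fit into the given amount of free memory."""
    return free_bytes // leak_bytes_per_loop


if __name__ == "__main__":
    low, high = bytes_per_object(11 * MB, 50_000, 100_000)
    print(f"average bytes per community object: {low:.0f}-{high:.0f}")

    # Hypothetical device with 2000MB of free memory.
    print("loops until exhaustion:", loops_until_exhausted(2000 * MB, 11 * MB))
```

Under these assumptions each extra object accounts for roughly 110-220 bytes, which is at least plausible for a small heap object plus allocator overhead.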

This may be partially related to #14828, which saw a memory leak in the large-community objects.

Version

str2-7260cx3-acs-9# show version
FRRouting 8.5.1 (str2-7260cx3-acs-9) on Linux(5.10.0-23-2-amd64).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--localstatedir=/var/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--disable-rpki' '--disable-scripting' '--enable-pim6d' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

How to reproduce

Set up a topology with 4 upstream peers in a different ASN. Have all 4 upstream peers advertise the same 6400 prefixes to this device.

Then, from each of the peers, withdraw and re-announce all of the routes. One withdraw-and-announce pass across the peers is considered one loop. Repeat 10-100 times.
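
The per-loop growth can be tracked with something like the sketch below, which samples the resident set size of bgpd from `/proc/<pid>/status` on Linux before and after each loop. This is not the tooling used for this report: the function names are hypothetical, and `run_one_loop` is a placeholder for whatever drives the peers through one withdraw/announce pass.

```python
import re


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_kb(pid):
    """Read the current resident set size of a process in kB (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def per_loop_growth_kb(samples_kb):
    """Return the difference between each pair of consecutive samples."""
    return [after - before for before, after in zip(samples_kb, samples_kb[1:])]


def measure(pid, run_one_loop, loops):
    """Sample RSS once up front and once after every loop.

    run_one_loop is a caller-supplied callable (hypothetical here) that
    performs one withdraw/re-announce pass and returns when it is done.
    """
    samples = [read_rss_kb(pid)]
    for _ in range(loops):
        run_one_loop()
        samples.append(read_rss_kb(pid))
    return per_loop_growth_kb(samples)
```

A steady positive value in the returned list, rather than growth that levels off after the first few loops, would be consistent with the leak described here.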

Expected behavior

bgpd memory usage does not increase from one loop to the next.

Actual behavior

bgpd memory usage grows by a steady 10-11MB after each loop.

Additional context

No response

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
