Skip to content

p2p: Switch.OnStop uses stale and already mutated PeerSet.List() which could have data race consequences due to a leaky abstraction of using .List() #2158

@odeke-em

Description

@odeke-em

Bug Report

The code in p2p/Switch.OnStop invokes sw.peers.List() per

for _, p := range sw.peers.List() {

and if we look at the code for .List()

cometbft/p2p/peer_set.go

Lines 153 to 156 in f4d73cd

func (ps *PeerSet) List() []Peer {
ps.mtx.Lock()
defer ps.mtx.Unlock()
return ps.list
which returns a copy of the slice while under a lock but after handing it over, it is no longer under a lock and is modified by .Remove() which causes .OnStop() to read stale data.

I found this problem while working on the correct fix for #2157 in which the correct way to shorten a slice is to ensure you insert a nil at the last element before reslicing to ensure that there are no references to it and that the backing array can be garbage collected i.e.

endItem := slice[len(slice)-1]
slice[len(slice)-1] = nil // nil it so that it'll be garbage collected

I can reliably reproduce this issue when I apply the fix for the referenced issue

Remedy

Check and ensure that the peer is non-nil.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingp2p

    Type

    No type

    Projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions