Description
This ticket was created to track the work on issues identified by the sanitizer.
Related tickets: #1547, #1609, #1610.
Findings
- FINDING
P00
==============================
False positives or sanitizer collision.
SEVERITY: NONE.
- FINDING
P01-LastRspAckTime
==============================
Accessing CUDT::m_tsLastRspAckTime without lock.
SEVERITY: LOW
- FINDING
P02-SndBuffer_count
==============================
Unprotected access to CSndBuffer::m_iCount through getCurrBufSize
with simultaneous modification from CSndBuffer::addBuffer
SEVERITY: MEDIUM
FOCUS: An atomic might fix it, but the size is usually modified
together with the buffer contents and internal pointers, so this is
likely a data integrity violation.
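A minimal sketch of the lock-based direction (illustrative only; it uses std::mutex rather than SRT's own sync wrappers, and assumes m_BufLock is the guard used for the buffer contents, as the other CSndBuffer findings below suggest):
```cpp
#include <mutex>

// Sketch: the counter is read and written under the same mutex that
// protects the block list, so size and contents cannot diverge.
class CSndBuffer
{
public:
    int getCurrBufSize() const
    {
        std::lock_guard<std::mutex> lk(m_BufLock); // was unprotected
        return m_iCount;
    }

    void addBuffer(/* packet data */)
    {
        std::lock_guard<std::mutex> lk(m_BufLock);
        // ... update blocks and internal pointers ...
        ++m_iCount; // modified together with the contents
    }

private:
    mutable std::mutex m_BufLock;
    int m_iCount = 0;
};
```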
- FINDING
P03-CSndUList
==============================
The locking of CSndUList against CSndQueue via m_WindowCond/m_WindowLock
causes m_iLastEntry to be accessed without locking. CV synchronization
and access control use two different locks located in two different classes,
shared through a plain pointer. A larger rework is required.
SEVERITY: MEDIUM
FOCUS: This is a design flaw: two different mutexes in two different
classes (one of which accesses the data through a pointer) guard the
same data. A planned rework is already in progress.
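For illustration, the invariant to restore is that the condition variable, its mutex, and the data it signals about all live behind the same lock. A hedged sketch with hypothetical names (not the actual CSndUList/CSndQueue interface):
```cpp
#include <condition_variable>
#include <mutex>

// Sketch: one mutex owns both the CV handshake and the guarded data
// (m_iLastEntry stands in for the field named in the report).
class SendList
{
public:
    void insert(int entry)
    {
        std::lock_guard<std::mutex> lk(m_Lock);
        m_iLastEntry = entry;        // data modified under m_Lock...
        m_Cond.notify_one();         // ...and signaled under the same lock
    }

    int waitForEntry()
    {
        std::unique_lock<std::mutex> lk(m_Lock);
        m_Cond.wait(lk, [this] { return m_iLastEntry >= 0; });
        return m_iLastEntry;
    }

private:
    std::mutex m_Lock;               // single lock for CV and data
    std::condition_variable m_Cond;
    int m_iLastEntry = -1;
};
```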
- FINDING
P04-Cycle3
==============================
Report #8: cycled ordering between CUDT::m_ConnectionLock and CUDT::m_SendLock,
with CSndUList::m_ListLock involved.
Report #28: cycled ordering between CUDT::m_ConnectionLock and CUDTUnited::m_GlobControlLock,
with CRcvQueue::m_LSLock involved.
SEVERITY: HIGH
FOCUS: This is a series of lock cycles; all of them must be resolved
with the highest priority, as they pose a high deadlock risk.
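Two standard ways to break such cycles, sketched with stand-in mutexes (the real fix must of course pick one global acquisition order for the locks named above):
```cpp
#include <mutex>

std::mutex connLock; // stands in for CUDT::m_ConnectionLock
std::mutex sendLock; // stands in for CUDT::m_SendLock

void fixedOrderSection()
{
    // Option 1: define a single acquisition order (e.g. connLock
    // always before sendLock) and follow it in every code path.
    std::lock_guard<std::mutex> lk1(connLock);
    std::lock_guard<std::mutex> lk2(sendLock);
    // ... critical section ...
}

void deadlockFreeSection()
{
    // Option 2 (C++17): take both locks in one step; std::scoped_lock
    // applies a deadlock-avoidance algorithm internally.
    std::scoped_lock lk(connLock, sendLock);
    // ... critical section ...
}
```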
- FINDING
P05-m_dCongestionWindow
==============================
Unguarded write to CUDT::m_dCongestionWindow while it is concurrently read
(likely requires an atomic).
SEVERITY: LOW
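If an atomic is indeed sufficient here, the change is mechanical; a sketch (std::atomic<double> is lock-free on common 64-bit platforms, and the same pattern would apply to the other LOW-severity scalar findings below, e.g. P07, P09, P11-P13):
```cpp
#include <atomic>

// Sketch: the congestion window published atomically, so the
// concurrent writer and reader no longer constitute a data race.
std::atomic<double> m_dCongestionWindow{16.0};

void updateOnAck(double newWindow)
{
    m_dCongestionWindow.store(newWindow, std::memory_order_relaxed);
}

double congestionWindow()
{
    return m_dCongestionWindow.load(std::memory_order_relaxed);
}
```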
- FINDING
P06-LastSendTime
==============================
Unguarded read of CUDT::m_tsLastSndTime,
likely should be under CUDT::m_ConnectionLock
SEVERITY: MEDIUM
FOCUS: Although this in itself shouldn't be a big deal, the question of
exactly which data m_ConnectionLock guards deserves attention.
Theoretically it should guard data used during the connection
process, but there are also data used simultaneously by workers and the
main thread as part of the data transmission activities,
and these don't even have dedicated mutexes to protect them.
Not high priority because this looks like a problem inherited from UDT.
- FINDING
P07-SendInterval
==============================
Unguarded write to CUDT::m_tdSendInterval
SEVERITY: LOW
- FINDING
P08-m_pLastBlock
==============================
Unguarded READING AND WRITING of CSndBuffer::m_pLastBlock.
Simultaneous access from CSndQueue::worker and CRcvQueue::worker.
Likely to be guarded by CSndBuffer::m_BufLock.
SEVERITY: MEDIUM
FOCUS: Likely a mutex that isn't consistently used.
- FINDING
P09-m_iFlowWindowSize
==============================
Unguarded access to CUDT::m_iFlowWindowSize
Written by: ACK report
Read by: sender worker when preparing a packet to send.
Likely to be fixed by atomic.
SEVERITY: LOW
- FINDING
P11-m_iSndLastAck
==============================
Unguarded read and write of CUDT::m_iSndLastAck
Likely to be solved by atomic
SEVERITY: LOW
- FINDING
P12-ACK-data
==============================
Unguarded reading and writing of CUDT::m_iRTT and CUDT::m_iRTTVar,
plus other fields among those filled by an incoming ACK message.
Likely atomic.
SEVERITY: LOW
- FINDING
P13-zPayloadSize
==============================
Unguarded r/w of LiveCC::m_zSndAvgPayloadSize
Likely atomic
SEVERITY: LOW
- FINDING
P14-SndBuf_CurBlock
==============================
Unguarded WRITE (!!!) to CSndBuffer::m_pCurrBlock, read by CRcvQueue::worker with RecvAckLock and BufLock applied.
SEVERITY: MEDIUM
FOCUS: Likely the whole CSndBuffer should be under that protection,
although an atomic might fix it.
- FINDING
P15-SndBuf-BytesCount
==============================
Unguarded read of CSndBuffer::m_iBytesCount in CSndBuffer::getCurrBufSize().
May lead to data discrepancy, as this function also accesses CSndBuffer::m_pFirstBlock,
both fields being modified in CSndBuffer::ackData.
Likely the reading function requires taking a lock on m_BufLock.
SEVERITY: MEDIUM
FOCUS: Likely a series referring to guards for CSndBuffer.
- FINDING
P16-SndCurrSeqNo
==============================
Unguarded reading and writing of CUDT::m_iSndCurrSeqNo.
Important: in CUDT::packData this field is READ AND MODIFIED.
Likely atomic.
SEVERITY: MEDIUM
FOCUS: Looks solvable by an atomic, but the read-and-modify pattern is
more complicated (see the sketch below).
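The complication is exactly that a separate atomic load and store would still leave a race window between the read and the write; the read-and-modify must be a single atomic step. A sketch (plain increment shown; SRT's wrap-around sequence arithmetic would need the compare-exchange form instead):
```cpp
#include <atomic>

std::atomic<int> m_iSndCurrSeqNo{0};

int nextSeqNo()
{
    // WRONG even with atomics: two threads can read the same value.
    //   int seq = m_iSndCurrSeqNo.load();
    //   m_iSndCurrSeqNo.store(seq + 1);

    // Correct: read and modify in one atomic operation.
    return m_iSndCurrSeqNo.fetch_add(1, std::memory_order_relaxed) + 1;
}

int nextSeqNoWrapped()
{
    // For wrap-aware arithmetic, loop on compare_exchange instead.
    int expected = m_iSndCurrSeqNo.load(std::memory_order_relaxed);
    int desired;
    do
    {
        desired = (expected + 1) % 0x40000000; // stand-in for CSeqNo::incseq
    } while (!m_iSndCurrSeqNo.compare_exchange_weak(expected, desired,
                                                    std::memory_order_relaxed));
    return desired;
}
```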
- FINDING
P17-SndBuffer_Block
==============================
Simultaneous read and write of CSndBuffer::Block::m_iLength.
This looks like simply a data race between CSndBuffer::readData() and CSndBuffer::addBuffer().
Likely to be protected by CSndBuffer::m_BufLock, but this looks much more complicated.
Report #27 additionally contains a possible conflict between CChannel::sendto and CSndBuffer::addBuffer,
which seem to write into the same block, although the read and write appear to be ordered in practice
by data dependencies, just not sanctioned by atomics or mutexes.
SEVERITY: MEDIUM
FOCUS: Likely a series referring to guards for CSndBuffer.
- FINDING
P18-pCryptoControl
==============================
Unguarded access to CUDT::m_pCryptoControl in CRcvQueue::worker, while it might be
simultaneously deleted through the call to srt_close() from the main thread.
SEVERITY: MEDIUM
FOCUS: Likely there's no mutex employed to guard this particular data.
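One way to make this safe is shared ownership: a worker pins the object for the duration of its use, and srt_close() only drops its own reference. A hedged sketch with illustrative names, not the actual SRT code:
```cpp
#include <memory>
#include <mutex>

struct CCryptoControl { /* ... */ };

std::mutex ctrlLock;                           // guards the pointer itself
std::shared_ptr<CCryptoControl> m_pCryptoControl;

// Worker side: copy the pointer under the lock; the copy keeps the
// object alive even if close() resets the member concurrently.
std::shared_ptr<CCryptoControl> acquireCryptoControl()
{
    std::lock_guard<std::mutex> lk(ctrlLock);
    return m_pCryptoControl;
}

// Close side: drop the socket's reference; the object is destroyed
// only after the last worker releases its copy.
void closeCryptoControl()
{
    std::lock_guard<std::mutex> lk(ctrlLock);
    m_pCryptoControl.reset();
}
```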
- FINDING
P19-CloseGC
==============================
Access to CUDTUnited::m_bClosing is guarded inconsistently (mutexes applied, but different ones).
Likely atomic.
FOCUS: Likely to be merged with P21.
- FINDING
P21-bool-state-flags
==============================
The CUDT::m_bConnected field is read without protection.
Similar reports about m_bClosing or m_bBroken.
Likely fixable by an atomic, though it may be that CUDT::m_ConnectionLock
should be locked for reading this field.
SEVERITY: LOW
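A sketch of the atomic-flag variant (note this only removes the data race on each flag; any multi-step state transition that must be observed consistently would still need CUDT::m_ConnectionLock):
```cpp
#include <atomic>

// Sketch: the state flags as atomics (names from the report).
std::atomic<bool> m_bConnected{false};
std::atomic<bool> m_bBroken{false};
std::atomic<bool> m_bClosing{false};

bool stillUsable()
{
    // Each load is race-free, but the combination is only a snapshot.
    return m_bConnected.load() && !m_bBroken.load() && !m_bClosing.load();
}
```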
- FINDING
P22-GC-against-RcvUList
==============================
CRNode::m_bOnList accessed for writing without protection. The same node
is accessed for reading in a GC thread in the loop rolling over closed sockets.
This could be a bigger problem related to the containment in m_pRcvUList; that is,
the loop over CRcvQueue::m_pRcvUList might contain CUDTs that have already been
recycled into m_ClosedSockets. The case of m_bOnList is only a marker that the
node entry is invalid, as the entity has been removed from the container. The loop
in GC also checks the node itself. This is likely something old that has existed
since the beginning, so an atomic might solve it painlessly.
CRcvQueue::m_pHash contains CUDT entities that may belong to sockets that are recycled
and that GC is going to delete. What happens here:
- The CRcvQueue::worker (worker_ProcessAddressedPacket) extracts the CUDT by given ID in its m_pHash container.
- Then performs any kind of operations (in this case, checkTimers -> checkNAKTimer -> m_pRcvLossList->getLossLength()).
Simultaneously:
The socket of this entity is being deleted, so the destructor is called:
- for CUDT
- because the CUDTSocket is being destroyed
- because CUDTUnited::removeSocket is deleting RECYCLED sockets
Likely how to fix it: the m_pHash container must be taken out and the entities
collected some other way. This could also be a container of the same kind and
purpose, just in a different place, and the activities performed on its contents
must be done strictly under a mutex that protects those contents. For that purpose
the sockets can use a busy flag that prevents them from being deleted. The loop
in CRcvQueue::worker, when finding a socket that has been marked as closed,
should remove it from the container immediately. The GC thread should also
remove the objects from this container by itself, which is why the whole operation
on the container should be guarded by a dedicated mutex (see the sketch after
this finding).
In #38 there's a similar problem: the entity is extracted from the RcvUList list
while at the same time being deleted.
SEVERITY: HIGH
FOCUS: Socket deletion is done in a rather dangerous way, and the conditions
under which a socket is allowed to be physically deleted are unclear.
In specific situations this may lead to deleting a socket that is being
operated on in another thread, which may end up in a use-after-free access
and a crash.
- FINDING
P23-QueueClosingFlag
==============================
m_bClosing flag in the queues.
Likely fixable by atomic.
SEVERITY: LOW
- FINDING
P24-iRexmitCount
==============================
Unguarded reading of m_iReXmitCount in CRcvQueue::worker.
Likely needs guard of m_RcvAckLock, although atomic might help, too.
SEVERITY: LOW
Findings Related to Socket Groups
See below.
Attachments
The details are collected in directories in the attached archive.