Large amounts of repeated DB queries when using multiple federation sender workers #9113
Description
wake_destinations_needing_catchup performs massive amounts of repeated SELECT queries when using multiple federation_sender workers. Using a single federation_sender works well.
Here are the debug logs from the federation senders, from startup up until the issue starts occurring. I don't think there's any value in providing more of the log, as the output just repeats itself after this point.
The actual destination in the query seems to vary, though; I've seen three destinations so far: toofat.ru, wc20.tencapsule.com and fpoe.info. The output of a sample query for each of the three destinations follows below:
synapse_prod=# SELECT destination FROM destinations WHERE destination IN ( SELECT destination FROM destination_rooms WHERE destination_rooms.stream_ordering > destinations.last_successful_stream_ordering ) AND destination > 'toofat.ru' AND ( retry_last_ts IS NULL OR retry_last_ts + retry_interval < 1610625475512 ) ORDER BY destination LIMIT 25;
destination
-------------
wolffy.eu
(1 row)
synapse_prod=# SELECT destination FROM destinations WHERE destination IN ( SELECT destination FROM destination_rooms WHERE destination_rooms.stream_ordering > destinations.last_successful_stream_ordering ) AND destination > 'fpoe.info' AND ( retry_last_ts IS NULL OR retry_last_ts + retry_interval < 1610625676424 ) ORDER BY destination LIMIT 25;
destination
---------------------
komputernerds.com
matrix.insw.cz
matrix.noppesict.nl
supersandro.de
thedreamer.nl
toofat.ru
wolffy.eu
(7 rows)
synapse_prod=# SELECT destination FROM destinations WHERE destination IN ( SELECT destination FROM destination_rooms WHERE destination_rooms.stream_ordering > destinations.last_successful_stream_ordering ) AND destination > 'wc20.tencapsule.com' AND ( retry_last_ts IS NULL OR retry_last_ts + retry_interval < 1610631393252 ) ORDER BY destination LIMIT 25;
destination
-------------
wolffy.eu
(1 row)
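For anyone who wants to poke at the query shape without a Synapse database, here is a minimal sketch using an in-memory SQLite database. The schema is an assumption that keeps only the columns the query touches, the sample rows are made up, and `make_db` and `page_through` are hypothetical helpers, not Synapse's actual code. It only illustrates how a `destination > ?` cursor combined with `ORDER BY destination LIMIT 25` would page through the table when the cursor is advanced after every batch.

```python
import sqlite3

# The query from the report, with the cursor and the timestamp as parameters.
CATCHUP_QUERY = """
SELECT destination FROM destinations
WHERE destination IN (
    SELECT destination FROM destination_rooms
    WHERE destination_rooms.stream_ordering
          > destinations.last_successful_stream_ordering
)
AND destination > ?
AND (retry_last_ts IS NULL OR retry_last_ts + retry_interval < ?)
ORDER BY destination
LIMIT 25
"""


def make_db():
    # Assumed, simplified schema: only the columns the query references.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE destinations (
            destination TEXT PRIMARY KEY,
            last_successful_stream_ordering INTEGER,
            retry_last_ts INTEGER,
            retry_interval INTEGER
        );
        CREATE TABLE destination_rooms (
            destination TEXT,
            room_id TEXT,
            stream_ordering INTEGER
        );
        """
    )
    # Made-up rows: every destination has one room event it has not yet received.
    for name in ["fpoe.info", "toofat.ru", "wc20.tencapsule.com", "wolffy.eu"]:
        conn.execute("INSERT INTO destinations VALUES (?, 0, NULL, NULL)", (name,))
        conn.execute(
            "INSERT INTO destination_rooms VALUES (?, '!room:example', 5)", (name,)
        )
    return conn


def page_through(conn, now_ms, max_queries=100):
    """Hypothetical paging loop; returns (destinations seen, queries issued)."""
    cursor, seen, queries = "", [], 0
    while queries < max_queries:
        rows = conn.execute(CATCHUP_QUERY, (cursor, now_ms)).fetchall()
        queries += 1
        if not rows:
            break
        seen.extend(row[0] for row in rows)
        cursor = rows[-1][0]  # advance the cursor past the last row returned
    return seen, queries


conn = make_db()
seen, queries = page_through(conn, now_ms=1610625475512)
print(seen, queries)
```

A loop like this issues one extra query after the final page and then stops, so on its own it would not account for the endless repetition described above.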
Steps to reproduce
- Enable two or more federation_sender workers
- Start the main process and all workers
- Wait ~30 seconds
Of course, this is probably specific to me and the contents of my db, so naturally I'll provide any further information as needed. Just let me know! :)
Version information
- Homeserver: queersin.space
If not matrix.org:
- Version: 1.25 (note: problem occurred on 1.24 as well)
- Install method: Via apt from the matrix.org repo
- Platform: An lxc container running Ubuntu 20.04.1