This repository was archived by the owner on Apr 26, 2024. It is now read-only.
replication connection died #9010
Closed
Description
It looks like the connection between synapse and redis died, and synapse didn't notice.
In the logs, we saw:
2021-01-03 19:19:16,659 - synapse.metrics.background_process_metrics - 216 - ERROR - send-cmd-55720- Background process 'send-cmd' threw an exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/synapse/metrics/background_process_metrics.py", line 212, in run
    result = await result
  File "/usr/local/lib/python3.8/site-packages/synapse/replication/tcp/redis.py", line 185, in _async_send_command
    await make_deferred_yieldable(
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.8/site-packages/txredisapi.py", line 533, in handle_reply
    raise r
txredisapi.ConnectionError: Lost connection
2021-01-03 19:19:16,696 - synapse.metrics - 576 - INFO - - Collecting gc 1
2021-01-03 19:19:16,857 - twisted - 254 - INFO - send-cmd-55721- Discarding dead connection.
Note that there are two TCP connections between each synapse process and redis (one for sending, one for receiving - and the one causing the error above is the sending side). Nevertheless, it's plausible to imagine that both sockets drop at a similar time, and this is reflected in the "number of clients" graph from redis:
(the synchrotron was restarted at 19:47: note the extra connection).
No replication data was received after this point. My hypothesis is that the TCP socket died, and the synapse side didn't notice due to the lack of TCP keepalives or other outgoing data.
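To illustrate the hypothesis, here is a minimal sketch of what enabling TCP keepalives on a socket looks like using only the Python standard library. It is not Synapse's or txredisapi's code: the helper name `enable_keepalive` and the timing values are assumptions chosen for illustration, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform-dependent, so they are only applied where the `socket` module exposes them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for `sock` (hypothetical helper).

    With keepalives enabled, the kernel probes an idle connection and
    reports an error on the socket if the peer stops answering, so a
    receive-only connection cannot stay silently dead forever.
    """
    # Portable switch: ask the kernel to send keepalive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket tuning; these constants are not defined on
    # every platform, so each one is guarded.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between successive probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes tolerated before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

In a Twisted-based process the equivalent switch would normally be flipped on the transport, via `ITCPTransport.setTcpKeepAlive`, rather than on a raw socket. Whether that alone would have caught this particular failure is untested here.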