Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Synapse potentially cached a failed sync request #9507

@Half-Shot

Description

@Half-Shot

Running 4a32600

Last night my database suffered an issue which meant that it was briefly unavailable, and it dropped all connections for a period. Synapse seemed to largely recover as sync requests to some accounts are working, and federation/event creation seems to be working fine.

2021-02-27 13:10:17,830 - synapse.util.caches.response_cache - 191 - INFO - GET-84929- [sync]: using incomplete cached result for [(@Half-Shot:half-shot.uk, 0, 's44209228_293425298_29572_27619593_543941_1697_969558_19742008_52', '183', False, 'HEWQINNMGI')]
2021-02-27 13:16:43,781 - synapse.access.http.8018 - 299 - INFO - GET-84929- 86.16.94.189 - 8018 - {@Half-Shot:half-shot.uk} Processed request: 384.703sec/1.250sec (0.252sec, 0.000sec) (0.000sec/0.000sec/0) 1877428B 200 "GET /_matrix/client/r0/sync?filter=183&timeout=0&since=s44209228_293425298_29572_27619593_543941_1697_969558_19742008_52 HTTP/1.0" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Element/1.7.21 Chrome/85.0.4183.121 Electron/10.2.0 Safari/537.36" [0 dbevts]

(ignore the absolutely awful sync duration, it's sadly a new normal atm)

However, sync requests for my Element android device fail with a 500 each time:

2021-02-27 13:18:55,369 - synapse.util.caches.response_cache - 189 - INFO - GET-85063- [sync]: using completed cached result for [(@Half-Shot:half-shot.uk, 0, None, '181', False, 'DHNXFYGPRT')]
2021-02-27 13:18:55,369 - twisted - 257 - CRITICAL - GET-85063- Unhandled error in Deferred:
2021-02-27 13:18:55,370 - twisted - 257 - CRITICAL - GET-85063- 
Traceback (most recent call last):
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 532, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/databases/state/bg_updates.py", line 121, in _get_state_groups_from_groups_txn
    txn.execute(sql % (where_clause,), args)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 294, in execute
    self._do_execute(self.txn.execute, sql, *args)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 320, in _do_execute
    return func(sql, *args)
psycopg2.OperationalError: SSL SYSCALL error: EOF detected


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/synapse/lib/python3.9/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 313, in _wait_for_sync_for_user
    result = await self.current_sync_for_user(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 344, in current_sync_for_user
    return await self.generate_sync_result(sync_config, since_token, full_state)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 1000, in generate_sync_result
    res = await self._generate_sync_entry_for_rooms(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 1444, in _generate_sync_entry_for_rooms
    await concurrently_execute(handle_room_entries, room_entries, 10)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/util/async_helpers.py", line 174, in _concurrently_execute_inner
    await maybe_awaitable(func(next(it)))
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 1432, in handle_room_entries
    res = await self._generate_room_entry(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 1829, in _generate_room_entry
    batch = await self._load_filtered_recents(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/handlers/sync.py", line 516, in _load_filtered_recents
    loaded_recents = await filter_events_for_client(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/visibility.py", line 84, in filter_events_for_client
    event_id_to_state = await storage.state.get_state_for_events(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/state.py", line 467, in get_state_for_events
    group_to_state = await self.stores.state._get_state_for_groups(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/databases/state/store.py", line 257, in _get_state_for_groups
    group_to_state_dict = await self._get_state_groups_from_groups(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/databases/state/store.py", line 162, in _get_state_groups_from_groups
    res = await self.db_pool.runInteraction(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 660, in runInteraction
    result = await self.runWithConnection(
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 743, in runWithConnection
    return await make_deferred_yieldable(
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/threadpool.py", line 250, in inContext
    result = inContext.theWork()
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/threadpool.py", line 266, in <lambda>
    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 306, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/python/compat.py", line 464, in reraise
    raise exception.with_traceback(traceback)
  File "/opt/synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 297, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 738, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 548, in new_transaction
    conn.rollback()
  File "/opt/synapse/lib/python3.9/site-packages/synapse/storage/database.py", line 160, in rollback
    self.conn.rollback()
  File "/opt/synapse/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 70, in rollback
    raise ConnectionLost()
twisted.enterprise.adbapi.ConnectionLost

2021-02-27 13:18:55,679 - synapse.http.server - 90 - ERROR - GET-85063- Failed handle request via 'SyncRestServlet': <XForwardedForRequest at 0x7ffb2e2e55e0 method='GET' uri='/_matrix/client/r0/sync?filter=181&set_presence=offline&timeout=0' clientproto='HTTP/1.0' site=8018>
2021-02-27 13:18:55,703 - synapse.access.http.8018 - 299 - INFO - GET-85063- 86.16.94.189 - 8018 - {@Half-Shot:half-shot.uk} Processed request: 0.336sec/-0.000sec (0.009sec, 0.004sec) (0.000sec/0.000sec/0) 55B 500 "GET /_matrix/client/r0/sync?filter=181&set_presence=offline&timeout=0 HTTP/1.0" "Element/1.1.0 (Linux; U; Android 11; Pixel 3a Build/RQ1A.210105.00; Flavour GooglePlay; MatrixAndroidSDK_X 0.0.1)" [0 dbevts]

Given that I am 100% certain that my other devices are actively getting new events and hitting the DB successfully on the same process, I am confused as to why the Element android device might be seeing this over and over. It feels like the failure was cached, especially due to the quick response time.

Metadata

Metadata

Assignees

No one assigned

    Labels

    S-MajorMajor functionality / product severely impaired, no satisfactory workaround.T-DefectBugs, crashes, hangs, security vulnerabilities, or other reported issues.X-RegressionSomething broke which worked on a previous releaseX-Release-BlockerMust be resolved before making a release

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions