psycopg2.errors.UniqueViolation: could not create unique index "receipts_graph_unique_index" when upgrading from <1.68.0 to >=1.70.0 #14406
Description
Seen as #14123
When upgrading Synapse from a version older than 1.68.0 to 1.70.0 or newer, the receipts_graph_unique_index background update may fail with:
2022-10-10 16:50:14,874 - synapse.storage.background_updates - 428 - INFO - background_updates-0- Starting update batch on background update 'receipts_graph_unique_index'
2022-10-10 16:50:14,877 - synapse.storage.background_updates - 620 - INFO - background_updates-0- Adding index receipts_graph_unique_index to receipts_graph
2022-10-10 16:50:14,949 - synapse.storage.background_updates - 299 - ERROR - background_updates-0- Error doing update
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 291, in run_background_updates
result = await self.do_next_background_update(sleep)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 421, in do_next_background_update
await self._do_background_update(desired_duration_ms)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 464, in _do_background_update
items_updated = await update_handler(progress, batch_size)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 621, in updater
await self.db_pool.runWithConnection(runner)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 976, in runWithConnection
return await make_deferred_yieldable(
File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 244, in inContext
result = inContext.theWork() # type: ignore[attr-defined]
File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
inContext.theWork = lambda: context.call( # type: ignore[attr-defined]
File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
return func(*args, **kw)
File "/usr/local/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
result = func(conn, *args, **kw)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 969, in inner_func
return func(db_conn, *args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/background_updates.py", line 572, in create_index_psql
c.execute(sql)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 389, in execute
self._do_execute(self.txn.execute, sql, *args)
File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 432, in _do_execute
return func(sql, *args, **kwargs)
psycopg2.errors.UniqueViolation: could not create unique index "receipts_graph_unique_index"
DETAIL: Key (room_id, receipt_type, user_id)=(!watercooler-v9:maunium.net, m.read, @cat:feline.support) is duplicated.
The series of schema deltas involved is:

21/receipts.sql
- Creates the receipts_graph(room_id, receipt_type, user_id, event_ids, data) table with CONSTRAINT receipts_graph_uniqueness UNIQUE (room_id, receipt_type, user_id), i.e. there is one receipt for each user for each room.

72/07thread_receipts.sql.postgres (1.68.0)
- Adds a thread_id column to receipts_graph.
- Adds a per-thread unique constraint: CONSTRAINT receipts_graph_uniqueness_thread UNIQUE (room_id, receipt_type, user_id, thread_id), i.e. there is one receipt for each user for each thread in a room.
- At this point, per-thread receipts don't work yet, because the original (room_id, receipt_type, user_id) constraint is too restrictive.

72/08thread_receipts.sql (1.68.0)
- Queues up the receipts_graph_unique_index background update, which adds a (room_id, receipt_type, user_id) constraint WHERE thread_id IS NULL, i.e. there is one non-thread receipt for each user for each room.

73/08thread_receipts_non_null.sql.postgres (1.70.0)
- Drops the original, non-thread-aware receipts_graph_uniqueness constraint, allowing thread receipts to work.

SQLite takes a similar, equally confusing path.
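To illustrate the intended end state, here is a sketch of the two thread-aware uniqueness rules, using Python's sqlite3 as a stand-in for Postgres (the table is simplified and the room/user IDs are made up; the index definitions mirror the constraints described above, not the literal delta files):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts_graph "
    "(room_id TEXT, receipt_type TEXT, user_id TEXT, thread_id TEXT, event_ids TEXT)"
)
# One receipt per (room, type, user) per thread:
conn.execute(
    "CREATE UNIQUE INDEX receipts_graph_uniqueness_thread "
    "ON receipts_graph (room_id, receipt_type, user_id, thread_id)"
)
# One *non-thread* receipt per (room, type, user): a partial unique index
# covering only rows with a NULL thread_id.
conn.execute(
    "CREATE UNIQUE INDEX receipts_graph_unique_index "
    "ON receipts_graph (room_id, receipt_type, user_id) "
    "WHERE thread_id IS NULL"
)

row = ("!room:example.org", "m.read", "@alice:example.org")
conn.execute("INSERT INTO receipts_graph VALUES (?, ?, ?, '$thread1', '[]')", row)
conn.execute("INSERT INTO receipts_graph VALUES (?, ?, ?, NULL, '[]')", row)  # fine

try:
    # A second NULL-thread receipt for the same key violates the partial index,
    # just as the duplicated rows in this report make CREATE UNIQUE INDEX fail.
    conn.execute("INSERT INTO receipts_graph VALUES (?, ?, ?, NULL, '[]')", row)
except sqlite3.IntegrityError as e:
    print("IntegrityError:", e)
```

Note that the plain four-column index alone does not deduplicate NULL-thread receipts, because NULLs compare as distinct in unique indexes; that is why the partial index exists at all.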
The window where there is no unique constraint
Since background updates run asynchronously, the receipts_graph_unique_index background update may run after the last schema delta, leaving a window during which there is no unique constraint on (room_id, receipt_type, user_id) for rows with a NULL thread_id.
Unsafe upserts
But that isn't the bug. We have logic to deal with this window. See
synapse/synapse/storage/database.py
Lines 90 to 100 in c3a4780
UNIQUE_INDEX_BACKGROUND_UPDATES = {
    "user_ips": "user_ips_device_unique_index",
    "device_lists_remote_extremeties": "device_lists_remote_extremeties_unique_idx",
    "device_lists_remote_cache": "device_lists_remote_cache_unique_idx",
    "event_search": "event_search_event_id_idx",
    "local_media_repository_thumbnails": "local_media_repository_thumbnails_method_idx",
    "remote_media_cache_thumbnails": "remote_media_repository_thumbnails_method_idx",
    "event_push_summary": "event_push_summary_unique_index2",
    "receipts_linearized": "receipts_linearized_unique_index",
    "receipts_graph": "receipts_graph_unique_index",
}
When one of these background updates is in progress, all our simple_upsert* operations are done manually, without relying on unique constraints. And we don't upsert into receipts_graph with handwritten SQL anywhere.
Emulated upsert internals
The emulated upsert first tries an UPDATE, then an INSERT if the UPDATE modified 0 rows.
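That pattern can be sketched as follows (a simplified, hypothetical stand-in for Synapse's simple_upsert_txn, not the real implementation; sqlite3 replaces Postgres and the table omits thread_id for brevity):

```python
import sqlite3

def emulated_upsert(conn, room_id, receipt_type, user_id, event_ids):
    # Try to UPDATE an existing row for this (room, type, user) key.
    cur = conn.execute(
        "UPDATE receipts_graph SET event_ids = ? "
        "WHERE room_id = ? AND receipt_type = ? AND user_id = ?",
        (event_ids, room_id, receipt_type, user_id),
    )
    if cur.rowcount == 0:
        # No row matched, so INSERT. Without a unique constraint or a table
        # lock, a concurrent transaction can reach this branch too, and both
        # will insert, leaving a duplicate key.
        conn.execute(
            "INSERT INTO receipts_graph (room_id, receipt_type, user_id, event_ids) "
            "VALUES (?, ?, ?, ?)",
            (room_id, receipt_type, user_id, event_ids),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts_graph "
    "(room_id TEXT, receipt_type TEXT, user_id TEXT, event_ids TEXT)"
)
emulated_upsert(conn, "!room:example.org", "m.read", "@alice:example.org", '["$a"]')
emulated_upsert(conn, "!room:example.org", "m.read", "@alice:example.org", '["$b"]')
rows = conn.execute("SELECT event_ids FROM receipts_graph").fetchall()
print(rows)  # [('["$b"]',)] — the second call updated rather than inserted
```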
The default isolation level in Synapse is REPEATABLE READ, which does not prevent the race where two upserts try to insert the same row at the same time.
But we've already thought of this and lock the entire table when doing the emulated upsert:
synapse/synapse/storage/database.py
Lines 1301 to 1302 in c3a4780
if lock:
    self.engine.lock_table(txn, table)
Except the locking is controlled by a parameter... and we've left it as False:
synapse/synapse/storage/databases/main/receipts.py
Lines 856 to 868 in 7d59a51
self.db_pool.simple_upsert_txn(
    txn,
    table="receipts_graph",
    keyvalues=keyvalues,
    values={
        "event_ids": json_encoder.encode(event_ids),
        "data": json_encoder.encode(data),
    },
    where_clause=where_clause,
    # receipts_graph has a unique constraint on
    # (user_id, room_id, receipt_type), so no need to lock
    lock=False,
)
In summary, there's a window during which there is no non-thread unique constraint on receipts_graph, and a race where we try to insert two new rows at the same time for the same (room_id, receipt_type, user_id).
The same probably applies to receipts_linearized.
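Putting the pieces together, the failure can be reproduced deterministically by simulating the interleaving of two emulated upserts by hand (a single-connection sqlite3 sketch with made-up IDs, standing in for two concurrent Postgres transactions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts_graph "
    "(room_id TEXT, receipt_type TEXT, user_id TEXT, thread_id TEXT, event_ids TEXT)"
)
key = ("!room:example.org", "m.read", "@alice:example.org")

# Both emulated upserts run their UPDATE step first; neither finds a row.
for _ in range(2):
    cur = conn.execute(
        "UPDATE receipts_graph SET event_ids = '[]' "
        "WHERE room_id = ? AND receipt_type = ? AND user_id = ? "
        "AND thread_id IS NULL",
        key,
    )
    assert cur.rowcount == 0

# ...so both fall through to INSERT, leaving duplicate (room, type, user) rows.
for _ in range(2):
    conn.execute("INSERT INTO receipts_graph VALUES (?, ?, ?, NULL, '[]')", key)

# The background update's CREATE UNIQUE INDEX now fails over the duplicates,
# mirroring the UniqueViolation in this report.
try:
    conn.execute(
        "CREATE UNIQUE INDEX receipts_graph_unique_index "
        "ON receipts_graph (room_id, receipt_type, user_id) "
        "WHERE thread_id IS NULL"
    )
except sqlite3.IntegrityError as e:
    print("could not create unique index:", e)
```

The flaw in the comment on lock=False is that the "unique constraint" it relies on is exactly the one that is missing during the background-update window, so nothing stops both INSERTs from succeeding.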