Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Conversation

Fizzadar
Copy link
Contributor

@Fizzadar Fizzadar commented Feb 18, 2022

Probably fixes #11620.

Let me know if you'd prefer this in two PRs!

Signed-off-by: Nick Mills-Barrett nick@beeper.com

Pull Request Checklist

@Fizzadar Fizzadar requested a review from a team as a code owner February 18, 2022 14:57
Comment on lines +298 to +300
"complete_appservice_txn",
_complete_appservice_txn,
isolation_level=IsolationLevel.READ_COMMITTED,
Copy link
Contributor

@squahtx squahtx Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realise one of the transactions in question was non-trivial! That complicates things a little since last_txn can now change out from under us in between _get_last_txn and the upsert. Previously, we'd get a SerializationFailure and retry until we get a consistent view throughout the transaction.

I think this is okay though since last_txn_id is only used for a debug check, which will now be less reliable - if two complete_appservice_txn transactions run simultaneously with the same txn_id, neither of them may trip the debug check (whereas before, one of them would have to retry and trip the check). It's best to leave a comment noting that last_txn_id can change out from under us before the upsert because we're in READ COMMITTED mode, or something to that effect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative is to use a RETURNING clause on the upsert to get the last_txn_id. The downside is you'd need a different path for SQLite (which doesn't support RETURNING).

If we then also replace the DELETE with a DELETE WHERE txn_id <= ? then I don't think this even needs to be a transaction, and we can do the two queries with db_autocommit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought - should txn_id be generated using a sequence? The debug check and manual generation of TXN ID seems unncessary to me (this would potentially open up running multiple appservice workers with additional work I think?).

Currently I believe this is safe but only because there is only a single process-per-AS working on transactions (enforced by https://github.com/matrix-org/synapse/blob/develop/synapse/appservice/scheduler.py#L181).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason the debug check trips on matrix.org from time to time, perhaps around the time synapse workers are restarting(?). It's not immediately clear to me why it happens.

I can't see why we don't use a sequence for txn_id. Something to do with sqlite support maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly historical reasons? The AS codepaths look old on blame to me! Not super familiar with the rest of synapse but sequences in use/defined in https://github.com/matrix-org/synapse/blob/develop/synapse/storage/util/sequence.py so potentially would be beneficial to switch to those. Need more context to make that call though, and probably beyond the scope of this PR!

@squahtx squahtx self-assigned this Feb 18, 2022
@erikjohnston erikjohnston added the X-Awaiting-Changes A contributed PR which needs changes and re-review before it can be merged label Feb 23, 2022
@Fizzadar
Copy link
Contributor Author

This has now been fixed in #12209!

@Fizzadar Fizzadar closed this Apr 26, 2022
@Fizzadar Fizzadar deleted the as-transaction-isolation branch April 26, 2022 12:43
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
X-Awaiting-Changes A contributed PR which needs changes and re-review before it can be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SerializationFailure on appservice outgoing transaction (updating last_txn)
3 participants