This repository was archived by the owner on Apr 26, 2024. It is now read-only.
_room_pdu_linearizer starves other incoming transactions #9490

@richvdh

Description

I had two transactions arrive at about the same time, each containing one PDU for the same room, each of which had a prev_events ref to a single unknown event.

The first transaction required a significant backfill/state resolution operation, requiring over 90 minutes of activity.

The second transaction waited patiently in the _room_pdu_linearizer for the first one to finish, and once it finally got to the front of the queue, was processed within 20 seconds. So I'm now 90 minutes behind in that room, and once you get behind it's very hard to catch up.
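
To make the head-of-line blocking concrete, here is a minimal sketch of the behaviour described above. It is not Synapse's actual linearizer: it assumes a plain `asyncio.Lock` per room as a stand-in, and the names `handle_pdu`, `demo`, and the room ID are hypothetical. The sleep durations are scaled-down placeholders for the 90-minute and 20-second operations.

```python
import asyncio
from collections import defaultdict


async def handle_pdu(room_locks, room_id, txn_id, work_seconds, log):
    # Hypothetical handler: the whole PDU is processed while holding the
    # per-room lock, so a slow PDU blocks every later PDU for that room.
    async with room_locks[room_id]:
        log.append(f"{txn_id} start")
        await asyncio.sleep(work_seconds)  # stands in for backfill/state res
        log.append(f"{txn_id} done")


async def demo():
    # One lock per room ID (assumed stand-in for the per-room linearizer).
    room_locks = defaultdict(asyncio.Lock)
    log = []
    await asyncio.gather(
        handle_pdu(room_locks, "!room:example.org", "txn1", 0.2, log),
        handle_pdu(room_locks, "!room:example.org", "txn2", 0.01, log),
    )
    return log


starvation_log = asyncio.run(demo())
print(starvation_log)
```

In this sketch the fast transaction cannot start until the slow one has released the lock, regardless of how little work it needs.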

My questions are:

  • Can we somehow round-robin between transactions for a room, rather than having a single transaction hold the lock for 90 minutes?
  • Is this lock even required? If I understand correctly, when you have multiple federation_inbound workers, we can end up with each worker doing its own backfill here.
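
As a rough illustration of the round-robin idea in the first question, the sketch below runs one bounded step of each pending transaction in turn, so a short transaction is not stuck behind a long one. It rests on a big assumption that the issue does not confirm: that the work for a PDU can be split into steps that are safe to interleave with steps from other transactions in the same room. `run_round_robin`, `demo_round_robin`, and the transaction IDs are hypothetical names, not Synapse APIs.

```python
import asyncio
from collections import deque


async def run_round_robin(transactions):
    """Run one step of each transaction in turn; return the finish order.

    `transactions` maps a hypothetical transaction ID to a list of
    zero-argument callables, each returning an awaitable for one bounded
    unit of work (for example, fetching one batch of missing events).
    """
    queue = deque((txn_id, deque(steps)) for txn_id, steps in transactions.items())
    finished = []
    while queue:
        txn_id, steps = queue.popleft()
        if steps:
            await steps.popleft()()  # run exactly one step, then yield the turn
        if steps:
            queue.append((txn_id, steps))  # more work left: back of the queue
        else:
            finished.append(txn_id)
    return finished


async def demo_round_robin():
    slow = [lambda: asyncio.sleep(0.01)] * 5  # long backfill, five steps
    fast = [lambda: asyncio.sleep(0.01)]      # quick transaction, one step
    return await run_round_robin({"txn1": slow, "txn2": fast})


finish_order = asyncio.run(demo_round_robin())
print(finish_order)
```

Here the quick transaction finishes after its single step while the long one keeps going. Steps still run one at a time per room, so this trades total latency of the slow transaction for fairness; it does not address the separate question of duplicate backfill across workers.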

Labels

T-Enhancement: New features, changes in functionality, improvements in performance, or user-facing enhancements.
