-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Don't waste time processing backfill events that we never show to the client #15632
Description
As discovered while working on #15617 and a natural extension of #15585 to process previously failed backfill events in the background (follow-up to #13623),
It turns out, when we make a /messages
request and trigger a backfill, Synapse will happily waste time trying hard to process events (_process_pulled_events
) in the foreground that we end up never showing to clients. The one particularly bad culprit is the org.matrix.dummy_event
which Synapse automatically puts in the DAG to resolve outstanding forward extremities. Since it has 10x random prev_events
, it takes a while to fetch the state for each prev_event
and persist. We determine whether some events should be shown to clients in the _check_filter_send_to_client(...)
function
Lines 482 to 519 in ad50510
def _check_filter_send_to_client( | |
event: EventBase, | |
clock: Clock, | |
retention_policy: RetentionPolicy, | |
sender_ignored: bool, | |
) -> _CheckFilter: | |
"""Apply checks for sending events to client | |
Returns: | |
True if might be allowed to be sent to clients, False if definitely not. | |
""" | |
if event.type == EventTypes.Dummy: | |
return _CheckFilter.DENIED | |
if not event.is_state() and sender_ignored: | |
return _CheckFilter.DENIED | |
# Until MSC2261 has landed we can't redact malicious alias events, so for | |
# now we temporarily filter out m.room.aliases entirely to mitigate | |
# abuse, while we spec a better solution to advertising aliases | |
# on rooms. | |
if event.type == EventTypes.Aliases: | |
return _CheckFilter.DENIED | |
# Don't try to apply the room's retention policy if the event is a state | |
# event, as MSC1763 states that retention is only considered for non-state | |
# events. | |
if not event.is_state(): | |
max_lifetime = retention_policy.max_lifetime | |
if max_lifetime is not None: | |
oldest_allowed_ts = clock.time_msec() - max_lifetime | |
if event.origin_server_ts < oldest_allowed_ts: | |
return _CheckFilter.DENIED | |
return _CheckFilter.MAYBE_ALLOWED |
org.matrix.dummy_event
can be particularly bad because they usually include a lot of disparate prev_events
which take a while for us to work out the state for (long _get_state_groups_from_groups
times).
Potential solution
We should use this same _check_filter_send_to_client(...)
criteria when determining whether we should backfill a given event in the background or foreground.
We should be mindful an event that is filtered from the client, might still be used as a prev_event
from another event in the backfill response that we will show to clients. And if we kick off the function to process org.matrix.dummy_event
in the background, while the other event in the foreground, we will probably end up duplicating the work depending on the race.
We don't really have these same race -> duplicate work concerns with #15585 (events which previously failed to backfill) because it's a "fool me once" sort of situation and they are already potentially disconnected from what we were trying to backfill anyway given they failed before.
Also of interest, filter_events_for_server(...)