Make sure synapse_rate_limit_reject_affected_hosts does what it says it does #13670
Description
Spawning from #13541 (comment)
The synapse_rate_limit_reject_affected_hosts gauge is always evaluating to 0. The raw data in Prometheus also shows 0, for reference.
This is despite seeing individual requests being rejected (synapse_rate_limit_reject_total), which should mean at least 1 affected host. But this could be a mismatch in how the gauges were being reported, because we were accidentally registering them twice (#13641).
Now that we fixed the duplicate metric registering issue in #13649 and the fix was put on matrix.org this morning, we're seeing both at 0. This could mean that the previous rejections we were seeing were all from the UsernameAvailabilityRestServlet, which we are no longer tracking, and that we're not rejecting any requests in the federation servlets.
It is a bit suspicious though.
How can we know if it's right?
In order to confirm that synapse_rate_limit_reject_affected_hosts is working, it would be nice to see a non-zero value.
The reject_limit is 50, which I think means there have to be more than 50 requests within the 1 second federation_rc_window_size to start rejecting.
We do see the rate of slept requests go above 70 sometimes which I would expect to trigger this 🤔
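To make the expectation concrete, here is a minimal toy model of the interpretation above (more than reject_limit requests from one host inside the window leads to a rejection). It is a sketch under stated assumptions, not Synapse's actual ratelimiter: the ToyRejectTracker class and its method names are hypothetical, and the real rejection condition may well be different. It only illustrates the invariant we would like to observe: whenever the reject counter increases, the affected-hosts value should be at least 1.

```python
from collections import defaultdict, deque

REJECT_LIMIT = 50      # the reject_limit value quoted above
WINDOW_SIZE_S = 1.0    # the 1 second window quoted above


class ToyRejectTracker:
    """Hypothetical model for illustration only; NOT Synapse's ratelimiter."""

    def __init__(self, reject_limit=REJECT_LIMIT, window_size_s=WINDOW_SIZE_S):
        self.reject_limit = reject_limit
        self.window_size_s = window_size_s
        self._recent = defaultdict(deque)  # host -> timestamps of recent requests
        self.reject_total = 0              # counter-style: only ever increases
        self._rejected_hosts = set()       # hosts with at least one rejection

    def on_request(self, host, now):
        """Record a request; return True if accepted, False if rejected."""
        window = self._recent[host]
        # Drop timestamps that have fallen out of the window.
        while window and now - window[0] >= self.window_size_s:
            window.popleft()
        window.append(now)
        # Assumed condition: reject once the host exceeds the limit in the window.
        if len(window) > self.reject_limit:
            self.reject_total += 1
            self._rejected_hosts.add(host)
            return False
        return True

    def affected_hosts(self):
        """Gauge-style value: number of distinct hosts that were rejected."""
        return len(self._rejected_hosts)


tracker = ToyRejectTracker()
# 51 requests from a single host at the same instant: the 51st exceeds the limit.
results = [tracker.on_request("example.org", now=0.0) for _ in range(51)]
print(results.count(False), tracker.reject_total, tracker.affected_hosts())
```

Under this model, a non-zero reject total alongside an affected-hosts value of 0 cannot happen, which is why seeing that combination in the real metrics looked suspicious.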
Dev notes
The synapse_rate_limit_reject_affected_hosts metric was originally added in #13541 and updated in #13649.