Description
Proposal
When a remote-write endpoint is unavailable for a long time and then comes back, the data Prometheus has buffered may be older than the endpoint will accept via remote write, due to various internal implementation reasons.
If this happens, Prometheus keeps trying to send data that will never be accepted until the endpoint returns, at which point the data is rejected. Because of the volume of samples, and because the code path for rejecting samples is typically shorter than the one for accepting them, this can saturate network bandwidth on the Prometheus side, and both bandwidth and memory on the receiving side.
A parameter in the remote-write configuration that allows samples older than some pre-defined limit to be discarded, rather than sent, would help in this scenario.
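
To make the requested behaviour concrete, here is a rough sketch of the age check in Python. It is not Prometheus code: `Sample`, `drop_old_samples` and the `age_limit` setting are hypothetical names used only for illustration, and treating a zero limit as "disabled" is an assumption.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Sample:
    """Hypothetical stand-in for a queued remote-write sample."""
    timestamp_ms: int  # sample time, milliseconds since the Unix epoch
    value: float


def drop_old_samples(samples, now_ms, age_limit):
    """Return (kept, dropped_count) for a batch about to be sent.

    age_limit is the hypothetical configured limit (a timedelta).
    Assumption: a zero or negative limit disables the check.
    """
    limit_ms = int(age_limit.total_seconds() * 1000)
    if limit_ms <= 0:
        return list(samples), 0

    # Anything strictly older than the cutoff is discarded, not sent.
    cutoff_ms = now_ms - limit_ms
    kept = [s for s in samples if s.timestamp_ms >= cutoff_ms]
    return kept, len(samples) - len(kept)
```

With a limit of one hour, a sample that is two hours old would be dropped before sending, while samples from the last minute would still go out. Running the check at send time (and again before each retry) is what would stop a long outage from turning into a burst of requests the endpoint is certain to reject.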