-
Notifications
You must be signed in to change notification settings - Fork 9.8k
Fix wlog.Watcher increased replay time on startup #13188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Julien Levesy <jlevesy@gmail.com>
Signed-off-by: Julien Levesy <jlevesy@gmail.com>
I'll roll out that branch on my dev environment on Monday, see if it makes the expected changes :) |
This feels like two PRs to me: one allowing the agent to participate in Notify calls, and one changing the non-tail behaviour to not need a Notify. If I'm mistaken, please modify the description to highlight how they are related. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll ask someone who's more involved in the agent work to double check, but this looks good to me.
tsdb/wlog/watcher.go
Outdated
@@ -487,6 +485,8 @@ func (w *Watcher) watch(segmentNum int, tail bool) error { | |||
readTicker.Reset(readTimeout) | |||
|
|||
case <-w.readNotify: | |||
level.Debug(w.logger).Log("msg", "Watcher is reading the WAL due to notify") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to leave out this log line. It would become way too noisy during normal remote write execution, and we want debug logging to still be useful for finding other issues.
If you're needing insight into WAL reads due to notify vs timeout we could add a metric instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, that's a debugging leftover sorry. Will clean up!
It is indeed. It was the outcome of the discussion happening in #13111 so I packed them into a single PR. But I can open two separate PRs if you want! |
4f4d999
to
fcf68b4
Compare
Signed-off-by: Julien Levesy <jlevesy@gmail.com>
fcf68b4
to
9cb9c9e
Compare
I find it easier to think about self-contained changes, plus we get the benefit of seeing that CI passes on each independent change. |
I don't have a preference either way; these are relatively small changes and you've already isolated them nicely in separate commits. |
Looks reasonable from my standpoint. The only edge case I see is abusing to SetWriteNotified could lead to a nil panic but that seems like an abuse and likely not worth the mutex. |
@mattdurham fair point, we should probably at least have a comment on the interface that |
Signed-off-by: Julien Levesy <jlevesy@gmail.com>
What does this PR do?
agent.DB
notify the remote storage (via awlog.Notifier
) when a commit happens.wlog.Watcher.watch
method read a segment synchronously when it is not tailing the last segment instead of waiting. It includes a test that makes sure we're able to replay within a single timeout interval a generated set of segments. I'm not exactly sure that's what you expected, but I can't see how a real benchmark would help here.Those two fixes commit are the outcome of the discussion happening in #13111
Fixes #13111