-
Notifications
You must be signed in to change notification settings - Fork 3.6k
fix: improve TCP DNS client interactions with TCP actor #32635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments but will probably address myself to not have to wait for American TZ-morning.
@@ -186,7 +189,9 @@ import akka.pattern.{ BackoffOpts, BackoffSupervisor } | |||
log.debug("DNS response truncated, falling back to TCP") | |||
inflightRequests.get(msg.id) match { | |||
case Some((_, msg)) => | |||
tcpRequests = tcpRequests.incl(msg.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only clear this on failure, so for the happy path the set will just accumulate ids, so needs a remove also on success. But, to avoid that, could we make it a boolean flag on the value in inFlightRequest
instead and avoid the extra set?
@@ -38,41 +41,50 @@ import akka.util.ByteString | |||
case _: Tcp.Connected => | |||
log.debug("Connected to TCP address [{}]", ns) | |||
val connection = sender() | |||
context.become(ready(connection)) | |||
writer = new Writing(connection, log) | |||
context.become(ready()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixing a mutable field with the become-style already present here is not great for readability. Let's use just context become closing over state instead.
def maybeWriteMessage(msg: Message): Writer = { | ||
writeMessage(msg) | ||
|
||
new Buffering(connection, log) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we really want to allow just the one in flight write message between this actor and the TCP manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's actually how we show the ack based protocol in the docs, so might be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the TCP manager only allows one write at a time. I originally went for the "NACK-based with write-suspending" approach (assume the write will be accepted until you get the backpressure signal, then resend the ones that were backpressured), but the complexity there got a bit out of hand and I wasn't sure that actually gave us that much.
Superseeded by #32636 |
The TCP DNS client responds to a backpressure signal (a
CommandFailed(Write(...))
) from the TCP connection actor by shutting itself down (which is, in its own way a form of backpressure), but means that two requests requiring TCP in fast enough succession (the second arriving before the connection actor has been able to hand the data off to the socket) will cause the DNS client to stop before it can receive a response to the first. DNS resolutions (e.g. through Discovery) will typically retry after some time, but if there are two retries being resolved that failed sufficiently close to each other the scheduler resolution (default 10 millis) will tend to mean that they retry at effectively the same time: it's thus unlikely that this retry loop will ever be broken.This change implements ack-based throttling, it will not issue a subsequent TCP write until the connection actor has written the data to the socket and sent back an ack (this does not require any response from the other end, to be clear). Since there typically should not be that many in-flight DNS resolutions happening (the main use is for service discovery), the impact on applications should be minimal.
Also improves the logging within the TCP DNS client.