fix: improve TCP DNS client interactions with TCP actor #32635

Closed · wants to merge 2 commits

Conversation

leviramsey (Contributor):

The TCP DNS client responds to a backpressure signal (a `CommandFailed(Write(...))`) from the TCP connection actor by shutting itself down (which is, in its own way, a form of backpressure). This means that two requests requiring TCP in fast enough succession (the second arriving before the connection actor has been able to hand the first write off to the socket) will cause the DNS client to stop before it can receive a response to the first. DNS resolutions (e.g. through Discovery) will typically retry after some time, but if two resolutions failed sufficiently close to each other, the scheduler resolution (default 10 ms) will tend to make them retry at effectively the same time: it's thus unlikely that this retry loop will ever be broken.

This change implements ack-based throttling: the client will not issue a subsequent TCP write until the connection actor has written the previous data to the socket and sent back an ack (to be clear, this does not require any response from the remote end). Since there typically should not be that many in-flight DNS resolutions happening (the main use is service discovery), the impact on applications should be minimal.
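
For context, this follows the ack-based write back-pressure pattern from the Akka IO TCP documentation; a minimal sketch of that pattern is below (the `WriteAck` event and the `AckThrottledWriter` name are illustrative, not the identifiers used in this PR):

```scala
import akka.actor.{ Actor, ActorRef }
import akka.io.Tcp
import akka.util.ByteString

// Illustrative ack event; the PR uses its own identifiers.
case object WriteAck extends Tcp.Event

// At most one Tcp.Write is in flight; everything else is buffered until the
// connection actor acks the previous write.
class AckThrottledWriter(connection: ActorRef) extends Actor {
  def receive: Receive = idle

  def idle: Receive = {
    case data: ByteString =>
      connection ! Tcp.Write(data, WriteAck)
      context.become(awaitingAck(Vector.empty))
  }

  def awaitingAck(buffered: Vector[ByteString]): Receive = {
    case data: ByteString =>
      context.become(awaitingAck(buffered :+ data))
    case WriteAck =>
      buffered match {
        case next +: rest =>
          connection ! Tcp.Write(next, WriteAck)
          context.become(awaitingAck(rest))
        case _ =>
          context.become(idle)
      }
  }
}
```

The ack is exchanged only with the local connection actor: it confirms the previous write was handed off to the socket, not that the DNS server has responded.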

Also improves the logging within the TCP DNS client.

johanandren (Contributor) left a comment:

Left some comments but will probably address them myself, to not have to wait for American TZ morning.

```diff
@@ -186,7 +189,9 @@ import akka.pattern.{ BackoffOpts, BackoffSupervisor }
       log.debug("DNS response truncated, falling back to TCP")
       inflightRequests.get(msg.id) match {
         case Some((_, msg)) =>
+          tcpRequests = tcpRequests.incl(msg.id)
```
johanandren (Contributor):

We only clear this on failure, so for the happy path the set will just accumulate ids; it needs a remove on success as well. But, to avoid that, could we make it a boolean flag on the value in inflightRequests instead and avoid the extra set?
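
A possible shape of that suggestion, purely as a sketch (the actual value type of `inflightRequests` is not visible in this excerpt, so the tuple layout and names here are hypothetical):

```scala
// Hypothetical: carry a "fell back to TCP" flag in the in-flight entry itself,
// e.g. Map[Short, (ActorRef, Message, Boolean)], instead of a separate tcpRequests set.
inflightRequests.get(msg.id) match {
  case Some((originalSender, originalMsg, _)) =>
    // mark this request instead of adding msg.id to a tcpRequests set
    inflightRequests = inflightRequests.updated(msg.id, (originalSender, originalMsg, true))
  case None => // unknown id, nothing to do
}
```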

```diff
@@ -38,41 +41,50 @@ import akka.util.ByteString
     case _: Tcp.Connected =>
       log.debug("Connected to TCP address [{}]", ns)
       val connection = sender()
-      context.become(ready(connection))
+      writer = new Writing(connection, log)
+      context.become(ready())
```
johanandren (Contributor):

Mixing a mutable field with the become-style already present here is not great for readability. Let's just use `context.become` closing over state instead.
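
A minimal sketch of what that could look like, with the current Writer carried as a behavior parameter instead of a `var` (Writer, Writing, Message and maybeWriteMessage are the names from this diff; everything else is illustrative):

```scala
// Hypothetical sketch: all state transitions go through context.become,
// closing over the current Writer instead of mutating a writer field.
def uninitialized: Receive = {
  case _: Tcp.Connected =>
    log.debug("Connected to TCP address [{}]", ns)
    val connection = sender()
    context.become(ready(new Writing(connection, log)))
}

def ready(writer: Writer): Receive = {
  case msg: Message =>
    // the Writer returned by maybeWriteMessage becomes part of the next behavior
    context.become(ready(writer.maybeWriteMessage(msg)))
}
```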

```scala
    def maybeWriteMessage(msg: Message): Writer = {
      writeMessage(msg)

      new Buffering(connection, log)
```
johanandren (Contributor):

I'm not sure we really want to allow just the one in-flight write message between this actor and the TCP manager.

johanandren (Contributor):

OK, that's actually how we show the ack-based protocol in the docs, so it might be fine.

leviramsey (Contributor, Author):

Yeah the TCP manager only allows one write at a time. I originally went for the "NACK-based with write-suspending" approach (assume the write will be accepted until you get the backpressure signal, then resend the ones that were backpressured), but the complexity there got a bit out of hand and I wasn't sure that actually gave us that much.

johanandren (Contributor):

Superseded by #32636

johanandren closed this on Feb 3, 2025