Endpoint actor state is corrupted: Highest SEQ so far was -1 but cumulative ACK is 2 #22156

@2m

Description

When node A is restarted and reconnects to node B with a new UID, the endpoint actor state on node B becomes corrupted:

Error encountered while processing system message acknowledgement buffer: [14 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}] ack: ACK[378933, {}]
akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:304)
akka.actor.Actor$class.aroundReceive(Actor.scala:496)
akka.remote.ReliableDeliverySupervisor.aroundReceive(Endpoint.scala:203)
akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
akka.actor.ActorCell.invoke(ActorCell.scala:495)
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
akka.dispatch.Mailbox.run(Mailbox.scala:224)
akka.dispatch.Mailbox.exec(Mailbox.scala:234)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: Highest SEQ so far was 14 but cumulative ACK is 378933
akka.remote.AckedSendBuffer.acknowledge(AckedDelivery.scala:103)
akka.remote.ReliableDeliverySupervisor$$anonfun$receive$1.applyOrElse(Endpoint.scala:300)
akka.actor.Actor$class.aroundReceive(Actor.scala:496)
akka.remote.ReliableDeliverySupervisor.aroundReceive(Endpoint.scala:203)
akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
akka.actor.ActorCell.invoke(ActorCell.scala:495)
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
akka.dispatch.Mailbox.run(Mailbox.scala:224)
akka.dispatch.Mailbox.exec(Mailbox.scala:234)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
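
For readers unfamiliar with the check that fails here, the following is a minimal sketch of the invariant behind the message, assuming a simplified send buffer. `SendBuffer`, `Ack`, and `AckMismatchDemo` are hypothetical names for illustration and are not Akka's actual `AckedSendBuffer` implementation; only the exception text is taken from the trace above.

```scala
// Simplified, hypothetical model of a sender-side acknowledgement buffer.
// It is NOT Akka's AckedSendBuffer; it only mirrors the invariant in the trace.
final case class Ack(cumulativeAck: Long)

final case class SendBuffer(maxSeq: Long = -1L, unacked: Vector[Long] = Vector.empty) {

  // Record a newly sent sequence number; sequence numbers must strictly increase.
  def buffer(seq: Long): SendBuffer = {
    require(seq > maxSeq, s"Sequence numbers must increase: $seq <= $maxSeq")
    copy(maxSeq = seq, unacked = unacked :+ seq)
  }

  // Drop everything up to and including the cumulative ACK.
  // An ACK for a sequence number that was never sent violates the invariant.
  def acknowledge(ack: Ack): SendBuffer = {
    if (ack.cumulativeAck > maxSeq)
      throw new IllegalArgumentException(
        s"Highest SEQ so far was $maxSeq but cumulative ACK is ${ack.cumulativeAck}")
    copy(unacked = unacked.filter(_ > ack.cumulativeAck))
  }
}

object AckMismatchDemo {
  // Buffer holds SEQ 0..14, but the remote side acknowledges 378933,
  // as it would if the ACK belonged to a different incarnation's sequence space.
  def reproduce(): String = {
    val buf = (0L to 14L).foldLeft(SendBuffer())(_ buffer _)
    scala.util.Try(buf.acknowledge(Ack(378933L))).failed.get.getMessage
  }
}
```

In this model, an ACK can only exceed the highest sent SEQ when the buffer and the ACK come from different sequence spaces, which is consistent with a buffer being reused or reset across a UID change.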

There is also evidence that the TooLongIdle cleanup can affect communication with a node that has the same address but a different UID:

10.30.10.29:9196|2017-01-16 13:07:38,867 ERROR [][][][][][akka.actor.default-dispatcher-23][] a.r.Remoting - Association to [akka.tcp://sys@nodeA] with UID [-1778088587] irrecoverably failed. Quarantining address.
Remote system has been silent for too long. (more than 120.0 hours)

10.30.10.29:9196|2017-01-16 13:07:38,910 ERROR [][][][][][akka.actor.default-dispatcher-71][] a.r.Remoting - Association to [akka.tcp://sys@nodeA] with UID [1685131585] irrecoverably failed. Quarantining address.
Error encountered while processing system message acknowledgement buffer: [-1 {}] ack: ACK[1187893, {}]
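
One possible reading of these two log lines, offered only as an assumption and not confirmed against the Akka source, is that cleanup triggered for the old UID touches state that the new UID is still using. The toy model below shows how that could happen if per-peer state were keyed by address alone, and how keying by address plus UID would avoid it. `Peer` and `AssociationState` are hypothetical names.

```scala
// Hypothetical sketch; the names are illustrative and not taken from Akka.
final case class Peer(address: String, uid: Int)

// Tracks the highest sent SEQ per key; -1 means "nothing sent yet".
final class AssociationState[K] {
  private var highestSeq = Map.empty[K, Long]

  def recordSent(key: K, seq: Long): Unit =
    highestSeq = highestSeq.updated(key, seq)

  def highest(key: K): Long =
    highestSeq.getOrElse(key, -1L)

  // Idle cleanup: forget everything stored under this key.
  def cleanup(key: K): Unit =
    highestSeq -= key
}
```

With `AssociationState[String]` keyed by address, calling `cleanup` for the old incarnation also resets the new incarnation's state to `-1`, so a later ACK for `1187893` would no longer match. With `AssociationState[Peer]`, cleanup for UID `-1778088587` leaves the state for UID `1685131585` intact.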
