Closed
Description
Problem: Timeout when Akka Cluster tries to create the Cluster extension.
Frequency: random
java.util.concurrent.TimeoutException: Futures timed out after [60000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:116)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:116)
at akka.cluster.Cluster.liftedTree1$1(Cluster.scala:172)
at akka.cluster.Cluster.<init>(Cluster.scala:171)
at akka.cluster.Cluster$.createExtension(Cluster.scala:42)
at akka.cluster.Cluster$.createExtension(Cluster.scala:37)
at akka.actor.ActorSystemImpl.registerExtension(ActorSystem.scala:711)
at akka.actor.ExtensionId$class.apply(Extension.scala:79)
at akka.cluster.Cluster$.apply(Cluster.scala:37)
......
at akka.cluster.ClusterActorRefProvider.createRemoteWatcher(ClusterActorRefProvider.scala:66)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:186)
at akka.cluster.ClusterActorRefProvider.init(ClusterActorRefProvider.scala:58)
at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)
Investigation
The timeout happens in class Cluster, while it waits for the reply to GetClusterCoreRef:
private[cluster] val clusterCore: ActorRef = {
  implicit val timeout = system.settings.CreationTimeout
  try {
    Await.result((clusterDaemons ? InternalClusterAction.GetClusterCoreRef).mapTo[ActorRef], timeout.duration)
  } catch {
That request is handled by ClusterCoreSupervisor:
private[cluster] final class ClusterCoreSupervisor extends Actor with ActorLogging
  with RequiresMessageQueue[UnboundedMessageQueueSemantics] {
  ...
  val coreDaemon = context.watch(context.actorOf(Props(classOf[ClusterCoreDaemon], publisher).
    withDispatcher(context.props.dispatcher), name = "daemon"))
  ...
  def receive = {
    case InternalClusterAction.GetClusterCoreRef ⇒ sender() ! coreDaemon
  }
  ...
}
ClusterCoreSupervisor replies with the ClusterCoreDaemon ref (coreDaemon).
The problem is that ClusterCoreDaemon itself requires the Cluster extension to be initialized.
Check the definition of ClusterCoreDaemon:
private[cluster] class ClusterCoreDaemon(publisher: ActorRef) extends Actor with ActorLogging
  with RequiresMessageQueue[UnboundedMessageQueueSemantics] {
  import InternalClusterAction._
  val cluster = Cluster(context.system)
The line val cluster = Cluster(context.system) calls system.registerExtension, which is the same call the thread in the stack trace above is still blocked inside.
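The circular wait described above can be reproduced without Akka. The following is only a minimal sketch of the pattern, not the Akka implementation: the object CircularInitSketch, the method createExtension, and the latch standing in for the extension registry are all hypothetical names, and it assumes the daemon cannot reply until the extension constructor has returned.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical model of the suspected cycle; none of these names exist in Akka.
object CircularInitSketch {

  // Stand-in for the Cluster constructor: it blocks on a reply from the daemon.
  def createExtension(creationTimeout: FiniteDuration): Either[String, String] = {
    import ExecutionContext.Implicits.global

    // Stand-in for the extension registry: opens only once the constructor returns.
    val registered = new CountDownLatch(1)

    // Stand-in for ClusterCoreDaemon: it needs the registered extension
    // before it can produce its reply.
    val daemonReply: Future[String] = Future {
      val patience = creationTimeout * 2
      if (registered.await(patience.toMillis, TimeUnit.MILLISECONDS)) "coreDaemon"
      else throw new IllegalStateException("extension was never registered")
    }

    // Stand-in for Await.result(clusterDaemons ? GetClusterCoreRef, timeout.duration).
    val result =
      try Right(Await.result(daemonReply, creationTimeout))
      catch { case _: TimeoutException => Left("timed out") }

    // Registration completes only after the constructor body has finished,
    // which is too late for the Await above.
    registered.countDown()
    result
  }
}
```

Under these assumptions, CircularInitSketch.createExtension(200.millis) can only end in the timeout branch, because the reply depends on a step that runs after the blocking wait. Whether the real ClusterCoreDaemon is started early enough to hit this cycle is exactly what the issue is trying to establish.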
Update on 4/22
The updated analysis is in comment #17253 (comment)