Port exhaustion problem with Akka.Cluster #2575

@Blind-Striker

Hi,

We created an Akka Cluster infrastructure for SMS, email, and push notifications. The system has three kinds of nodes: client, sender, and lighthouse. The client role is used by our Web application and API application (both hosted in IIS). The lighthouse and sender roles are hosted as Windows services. We also run 4 more console instances of the same Windows service in the sender role.

We've been experiencing port exhaustion on our Web Server for about 2 weeks. The Web Server starts consuming ports quickly, and after a while we cannot do any SQL operations.
Sometimes we have no choice but to do an IIS reset. The problem occurs when more than one node is in the sender role. We diagnosed it and found the source of the problem.

---------------
HOST                  OPEN    WAIT
SRV_NOTIFICATION      3429    0
SRV_LOCAL             198     0
SRV_UNDEFINED_IPV4    23      0
SRV_DATABASE          15      0
SRV_AUTH              4       0
SRV_API               6       0
SRV_UNDEFINED_IPV6    19      0
SRV_INBOUND           12347   5

TotalPortsInUse   : 17286
MaxUserPorts      : 64510
TcpTimedWaitDelay : 30
03/23/2017 09:30:10
---------------

SRV_NOTIFICATION is the server where the lighthouse and sender nodes run. SRV_INBOUND is our Web Server. After checking this table, we checked which ports on the Web Server were assigned.
We got results like the listing below. netstat showed more than 12,000 connections like these:

TCP    192.168.1.10:65531     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65532     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65533     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65534     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]

192.168.1.10 Web Server
192.168.1.10:3564 API
192.168.1.101:17527 Lighthouse

The connections are opened but never closed.
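For anyone who wants to repeat this check, here is a minimal sketch of how the netstat output can be tallied per remote endpoint and state. It is not part of our system: the function name `tally_connections` is hypothetical, and it assumes each connection appears on one line in the column layout of the listing above (protocol, local address, remote address, state, PID, process).

```python
from collections import Counter


def tally_connections(netstat_text):
    """Count TCP connections per (remote endpoint, state).

    Assumes one connection per line, laid out as:
    TCP <local> <remote> <state> <pid> [process]
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        remote, state = fields[2], fields[3]
        counts[(remote, state)] += 1
    return counts


# Sample rows copied from the listing above.
SAMPLE = """\
TCP    192.168.1.10:65531     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65532     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65533     192.168.1.101:17527    ESTABLISHED     5716   [w3wp.exe]
TCP    192.168.1.10:65534     192.168.1.10:3564      ESTABLISHED     5716   [w3wp.exe]
"""

if __name__ == "__main__":
    # Print the busiest remote endpoints first.
    for (remote, state), n in tally_connections(SAMPLE).most_common():
        print(f"{remote:<24}{state:<14}{n}")
```

Run against the full netstat dump instead of `SAMPLE`, a tally like this should show which remote endpoints (here the API and Lighthouse ports) hold the bulk of the ESTABLISHED connections.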

After deployments, our Web and API applications leave and rejoin the cluster; they are configured with fixed ports. We monitor our cluster with an application created by @cgstevens. Even though we implemented graceful shutdown logic for the actor system, the Web and API applications sometimes can't leave the cluster, so we have to remove the nodes manually and restart the actor system.

We have reproduced the problem in our development environment and recorded the video below:

https://drive.google.com/file/d/0B5ZNfLACId3jMWUyOWliMUhNWTQ/view

Our HOCON configuration for the nodes is below:

WEB and API

<akka>
	<hocon><![CDATA[
			akka{
				loglevel = DEBUG
				
				actor{
					provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
					
					deployment {
						/coordinatorRouter {
							router = round-robin-group
							routees.paths = ["/user/NotificationCoordinator"]
							cluster {
									enabled = on
									max-nr-of-instances-per-node = 1
									allow-local-routees = off
									use-role = sender
							}
						}
						
						/decidingRouter {
							router = round-robin-group
							routees.paths = ["/user/NotificationDeciding"]
							cluster {
									enabled = on
									max-nr-of-instances-per-node = 1
									allow-local-routees = off
									use-role = sender
							}
						}
					}
					
					serializers {
							wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
					}
					
					serialization-bindings {
					 "System.Object" = wire
					}
	
					debug{
						receive = on
						autoreceive = on
						lifecycle = on
						event-stream = on
						unhandled = on
					}
				}
				
				remote {
					helios.tcp {
							transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
							applied-adapters = []
							transport-protocol = tcp
							hostname = "192.168.1.10"
							port = 3564
					}
				}
				
				cluster {
						seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
						roles = [client]
				}
			}
		]]>
	</hocon>
</akka>

Lighthouse

<akka>
		<hocon>
			<![CDATA[
					lighthouse{
							actorsystem: "notificationSystem"
						}
			
					akka {
						actor { 
							provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
							
							serializers {
								wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
							}
						
							serialization-bindings {
								"System.Object" = wire
							}
						}
						
						remote {
							log-remote-lifecycle-events = DEBUG
							helios.tcp {
								transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
								applied-adapters = []
								transport-protocol = tcp
								#will be populated with a dynamic host-name at runtime if left uncommented
								#public-hostname = "192.168.1.100"
								hostname = "192.168.1.101"
								port = 17527
							}
						}            
						
						loggers = ["Akka.Logger.NLog.NLogLogger,Akka.Logger.NLog"]
						
						cluster {
							seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
							roles = [lighthouse]
						}
					}
			]]>
		</hocon>
	</akka>

Sender

<akka>
    <hocon><![CDATA[
				akka{
					# stdout-loglevel = DEBUG
					loglevel = DEBUG
					# log-config-on-start = on
				
					loggers = ["Akka.Logger.NLog.NLogLogger, Akka.Logger.NLog"]
				
					actor{
						debug {  
							# receive = on 
							# autoreceive = on
							# lifecycle = on
							# event-stream = on
							# unhandled = on
						}         
					
						provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"           
						
						serializers {
							wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
						}
						
						serialization-bindings {
						 "System.Object" = wire
						}
						
						deployment{							
							/NotificationCoordinator/LoggingCoordinator/DatabaseActor{
								router = round-robin-pool
								resizer{
									enabled = on
									lower-bound = 3
									upper-bound = 5
								}
							}							
							
							/NotificationDeciding/NotificationDecidingWorkerActor{
								router = round-robin-pool
								resizer{
									enabled = on
									lower-bound = 3
									upper-bound = 5
								}
							}
							
							/ScheduledNotificationCoordinator/SendToProMaster/JobToProWorker{
								router = round-robin-pool
								resizer{
									enabled = on
									lower-bound = 3
									upper-bound = 5
								}
							}
						}
					}
					
				 remote{							
							log-remote-lifecycle-events = DEBUG
							log-received-messages = on
							
							helios.tcp{
								transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
								applied-adapters = []
								transport-protocol = tcp
								#will be populated with a dynamic host-name at runtime if left uncommented
								#public-hostname = "POPULATE STATIC IP HERE"
								hostname = "192.168.1.101"
								port = 0
						}
					}
					
					cluster {
						seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
						roles = [sender]
					}
				}
			]]></hocon>
  </akka>

Cluster.Monitor

	<akka>
		<hocon>
			<![CDATA[
					akka {
						stdout-loglevel = INFO
						loglevel = INFO
						log-config-on-start = off 
						
						actor {
							provider = "Akka.Remote.RemoteActorRefProvider, Akka.Remote"				
							
							serializers {
								wire = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
							}
							serialization-bindings {
								"System.Object" = wire
							}

							deployment {								
								/clustermanager {
									dispatcher = akka.actor.synchronized-dispatcher
								}
							}
						}
						
						remote {
							log-remote-lifecycle-events = INFO
							log-received-messages = off
							log-sent-messages = off
							
							helios.tcp {                                
								transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote"
								applied-adapters = []
								transport-protocol = tcp
								#will be populated with a dynamic host-name at runtime if left uncommented
								#public-hostname = "127.0.0.1"
								hostname = "192.168.1.101"
								port = 0
							}
						}            

						cluster {							
						seed-nodes = ["akka.tcp://notificationSystem@192.168.1.101:17527"]
							roles = [ClusterManager]
							
							client {
								initial-contacts = ["akka.tcp://notificationSystem@192.168.1.101:17527/system/receptionist"]
							}
						}
					}
			]]>
		</hocon>
	</akka>
