-
Notifications
You must be signed in to change notification settings - Fork 1.1k
HashedWheelTimerScheduler and IScheduler disposal guarantees #2294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HashedWheelTimerScheduler and IScheduler disposal guarantees #2294
Conversation
@@ -517,12 +517,13 @@ namespace Akka.Actor | |||
public static Akka.Actor.DeployableDecider From(Akka.Actor.Directive defaultDirective, System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<System.Type, Akka.Actor.Directive>> pairs) { } | |||
public static Akka.Actor.LocalOnlyDecider From(System.Func<System.Exception, Akka.Actor.Directive> localOnlyDecider) { } | |||
} | |||
public class DedicatedThreadScheduler : Akka.Actor.SchedulerBase, Akka.Actor.IDateTimeOffsetNowTimeProvider, Akka.Actor.ITimeProvider | |||
public class DedicatedThreadScheduler : Akka.Actor.SchedulerBase, Akka.Actor.IDateTimeOffsetNowTimeProvider, Akka.Actor.ITimeProvider, System.IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Public API change: DedicatedThreadScheduler
now implements IDisposable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the version to 1.2.0 if we have API breaking changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, this is an internal component which can be extended but we don't know of any implementations out in the wild. Plus the old API is frankly kind of dangerous (as I explained in my comment below.)
What I've decided to do was change the SchedulerBase
constructor that supports the new Activator.CreateInstance
arguments , but I've left the old DedicatedThreadScheduler
constructor intact in case anyone was instantiating one of those for some reason. I figured that was a reasonable compromise.
TODOS:
|
IScheduler
implementations which support IDisposable
will be disposed before ActorSystem.WhenTerminated
completes.IScheduler
implementations which support IDisposable
will be disposed before ActorSystem.WhenTerminated
completes.
Benchmark ResultsAkka.Tests.Performance.Actor.Scheduler.DefaultSchedulerPerformanceTests+SchedulerThroughputStressTestTests to see how many concurrent jobs we can schedule and what the effects are on throughput System InfoNBench=NBench, Version=0.3.0.0, Culture=neutral, PublicKeyToken=null
OS=Microsoft Windows NT 6.2.9200.0
ProcessorCount=2
CLR=4.0.30319.42000,IsMono=False,MaxGcGeneration=2
WorkerThreads=32767, IOThreads=2 NBench SettingsRunMode=Iterations, TestMode=Measurement
NumberOfIterations=1, MaximumRunTime=00:00:01
Concurrent=True
Tracing=True DataTotals
Per-second Totals
Raw DataTotalBytesAllocated
TotalCollections [Gen0]
TotalCollections [Gen1]
TotalCollections [Gen2]
[Counter] ScheduledJobs
[Counter] ScheduleInvokes
|
Biggest surprise to me on this benchmark isn't the speed (roughly 2 million msg /s) but the garbage collection. Feels like this code is blowing through a lot more gen 2 GC than it should.... And an incredible amount of memory allocation (277mb.) The number of schedule attempts per second is pretty much fixed by the way I wrote the benchmark; not terribly important here to measure maximum speed IMHO. Rather to see if the changes we make result in an improvement or not. |
Implemented first cut of HashedWheelTimerScheduler - waiting for benchmark data. A little worried that it's sleep detector isn't sensitive enough, but it's tough to tell when debugging. Probably going to need to go over it with a profiler. Passed all of the scheduler specs save one though (since this is a behavior I have not yet implemented) |
HashedWheelTimerScheduler Benchmark 1Akka.Tests.Performance.Actor.Scheduler.DefaultSchedulerPerformanceTests+SchedulerThroughputStressTestTests to see how many concurrent jobs we can schedule and what the effects are on throughput System InfoNBench=NBench, Version=0.3.0.0, Culture=neutral, PublicKeyToken=null
OS=Microsoft Windows NT 6.2.9200.0
ProcessorCount=2
CLR=4.0.30319.42000,IsMono=False,MaxGcGeneration=2
WorkerThreads=32767, IOThreads=2 NBench SettingsRunMode=Iterations, TestMode=Measurement
NumberOfIterations=1, MaximumRunTime=00:00:01
Concurrent=True
Tracing=True DataTotals
Per-second Totals
Raw DataTotalBytesAllocated
TotalCollections [Gen0]
TotalCollections [Gen1]
TotalCollections [Gen2]
[Counter] ScheduledJobs
[Counter] ScheduleInvokes
|
Big drop in allocations (277mb --> 100 mb) and GEN2 GC. Slight increase in scheduling throughput, which I think is the result of decreased contention around the Bit of a drop in actual scheduled calls per second, which could be the result of the fact that we're now scheduling tasks to be executed on the |
Looking at the MNTR logs, seems like there's still some issues with my implementation:
|
HashedWheelTimerScheduler Benchmark 2Akka.Tests.Performance.Actor.Scheduler.DefaultSchedulerPerformanceTests+SchedulerThroughputStressTestTests to see how many concurrent jobs we can schedule and what the effects are on throughput System InfoNBench=NBench, Version=0.3.0.0, Culture=neutral, PublicKeyToken=null
OS=Microsoft Windows NT 6.2.9200.0
ProcessorCount=2
CLR=4.0.30319.42000,IsMono=False,MaxGcGeneration=2
WorkerThreads=32767, IOThreads=2 NBench SettingsRunMode=Iterations, TestMode=Measurement
NumberOfIterations=1, MaximumRunTime=00:00:01
Concurrent=True
Tracing=True DataTotals
Per-second Totals
Raw DataTotalBytesAllocated
TotalCollections [Gen0]
TotalCollections [Gen1]
TotalCollections [Gen2]
[Counter] ScheduledJobs
[Counter] ScheduleInvokes
|
I've run the benchmark 3-4 times now and this is the lowest recorded result; wanted to sanity check the figures. Biggest change I made? All scheduled work happens on the scheduler thread instead of being executed as a new As long as the scheduled operations don't run for very long (most don't) this design achieves significantly higher throughput with a fraction of the allocations and garbage collections. Memory consumption is about the same: it's determined largely by the number of scheduled jobs, and many more jobs are queued in this design. ThroughputThis spec always runs for 20 seconds. This implementation has fundamentally the same contention overhead as the previous HashedWheelTimerScheduler, but this implementation is able to improve upon 42,817.91 schedules added per second all the way up to 48,237.54 per second. At 12-13% increase. Moving away from the In in this benchmark we also improve upon total scheduled actions executed throughput from the previous HashedWheelTimerScheduler implementation from 1,930,100.87 to 3,995,133.22 per second. More than double the performance. I think this is because:
I'm going to remove the MemoryMemory allocation didn't noticeably change at all - consumption increased slightly due to the increase in work scheduled. I'd have to run a profiler on both versions to understand why, but my back-of-the-envelope math works out to the fact that the GCThe biggest difference between this design and the other two (the first one being the original Each item here is for all three implementations in order:
The original ConclusionThe |
88aa0a7
to
a7870f4
Compare
IScheduler
implementations which support IDisposable
will be disposed before ActorSystem.WhenTerminated
completes.
@akkadotnet/developers this is now ready to be reviewed once CI has a chance to run. |
a7870f4
to
604b667
Compare
public class DefaultSchedulerPerformanceTests | ||
{ | ||
private ActorSystem _actorSystem; | ||
public const string ScheduledOps = "ScheduleInvokes"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need these public consts? Could we make them as private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a benchmark. Doesn't really matter.
LGTM, may I merge it? |
@alexvaluyskiy I'm going to implement some of your remarks and do a bit of cleanup first. |
@@ -0,0 +1,670 @@ | |||
using System; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copyright header ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh! yep, will add that
…s which support `IDisposable` will be disposed before `ActorSystem.WhenTerminated` completes. close akkadotnet#1593 - significantly improved upon the `DedicatedThreadScheduler` performance Signed-off-by: Aaron Stannard <aaron@petabridge.com>
604b667
to
cae9fbb
Compare
Addressed all remarks, approved the API, etc... this should be ready to go now. |
Closes #2288