Cleaning up racy specs part1 #2721
Conversation
@@ -40,11 +40,13 @@ public void ActorProviderConfig_should_resolve_cluster_alias()

     private void ConfigureAndVerify(string alias, Type actorProviderType)
     {
-        var config = ConfigurationFactory.ParseString(@"akka.actor.provider = " + alias);
+        var config = ConfigurationFactory.ParseString(@"akka.actor.provider = " + alias)
+            .WithFallback(ConfigurationFactory.ParseString("akka.remote.dot-netty.tcp.port = 0")); // use a random port to avoid issues with async and parallelization
Resolved via port 0 binding (this has no side effects when using a local ActorSystem, since that configuration never gets read); see the standalone sketch after this hunk.
         using (var system = ActorSystem.Create(nameof(ActorRefProvidersConfigSpec), config))
         {
             var ext = (ExtendedActorSystem) system;
             ext.Provider.GetType().ShouldBe(actorProviderType);
+            system.Terminate().Wait(TimeSpan.FromSeconds(3)); // force the system to cleanup and shutdown
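For illustration, a self-contained sketch of the port-0 approach described in the comment above (the provider string, hostname, and system names are my own, not from the PR): each system asks the OS for any free port, so specs running in parallel can't collide on a fixed one.

    using System;
    using Akka.Actor;
    using Akka.Configuration;

    class PortZeroSketch
    {
        static void Main()
        {
            // Binding dot-netty to port 0 lets the OS assign a free port per system.
            var config = ConfigurationFactory.ParseString(@"
                akka.actor.provider = ""Akka.Remote.RemoteActorRefProvider, Akka.Remote""
                akka.remote.dot-netty.tcp.hostname = localhost
                akka.remote.dot-netty.tcp.port = 0");

            // Two remoting-enabled systems in one process: no port clash.
            using (var sys1 = ActorSystem.Create("SketchSysA", config))
            using (var sys2 = ActorSystem.Create("SketchSysB", config))
            {
                Console.WriteLine("both systems bound without conflicting");
            }
        }
    }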
IMHO, Dispose on an ActorSystem should probably call Terminate() and block until it's had a chance to run part of its shutdown process.
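A hypothetical sketch of that suggestion (this is not what Akka.NET's Dispose actually does; the wrapper type is invented for illustration): disposal delegates to Terminate() with a bounded wait, so a using block doesn't exit before shutdown has started.

    using System;
    using Akka.Actor;

    sealed class BlockingSystemScope : IDisposable
    {
        public ActorSystem System { get; }

        public BlockingSystemScope(ActorSystem system) => System = system;

        // Block (bounded) until the system has had a chance to run its shutdown.
        public void Dispose() => System.Terminate().Wait(TimeSpan.FromSeconds(3));
    }

Usage would mirror the spec above: using (var scope = new BlockingSystemScope(ActorSystem.Create(...))) { ... }.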
More fixes to some of the cluster specs
@@ -20,6 +20,12 @@ namespace Akka.Cluster.Tests
 {
     public class ClusterSpec : AkkaSpec
     {
+        /*
Added portability note explaining some of the semantic differences with the specs.
@@ -257,7 +261,7 @@ public void A_cluster_must_cancel_LeaveAsync_task_if_CancellationToken_fired_bef

             // Subsequent LeaveAsync() tasks expected to complete immediately (not cancelled)
             var task3 = _cluster.LeaveAsync();
-            task3.Should(t => t.IsCompleted && !t.IsCanceled, "Task should be completed, but not cancelled.");
+            AwaitCondition(() => task3.IsCompleted && !task3.IsCanceled, null, "Task should be completed, but not cancelled.");
Needed to be awaited.
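A minimal, self-contained sketch of the pattern behind these three fixes (the class and test names are mine; it relies on the same AwaitCondition(Func&lt;bool&gt;, TimeSpan?, string) TestKit overload the diff uses): poll the predicate instead of asserting the state at a single instant.

    using System;
    using System.Threading.Tasks;
    using Akka.TestKit.Xunit2;
    using Xunit;

    public class AwaitConditionSketch : TestKit
    {
        [Fact]
        public void Eventually_completed_task_passes_under_polling()
        {
            // Stands in for _cluster.LeaveAsync(): done soon, but not instantly.
            var task = Task.Delay(TimeSpan.FromMilliseconds(100));

            // A one-shot task.IsCompleted.Should().BeTrue() here would race the
            // scheduler; AwaitCondition re-checks until the TestKit timeout expires.
            AwaitCondition(() => task.IsCompleted && !task.IsCanceled,
                null, "Task should be completed, but not cancelled.");
        }
    }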
@@ -214,7 +218,7 @@ public void A_cluster_must_return_completed_LeaveAsync_task_if_member_already_re
             });

             // LeaveAsync() task expected to complete immediately
-            _cluster.LeaveAsync().IsCompleted.Should().BeTrue();
+            AwaitCondition(() => _cluster.LeaveAsync().IsCompleted);
Needed to be awaited.
@@ -237,7 +241,7 @@ public void A_cluster_must_cancel_LeaveAsync_task_if_CancellationToken_fired_bef

             // Cancelling the first task
             cts.Cancel();
-            task1.Should(t => t.IsCanceled, "Task should be cancelled.");
+            AwaitCondition(() => task1.IsCanceled, null, "Task should be cancelled");
Needed to be awaited.
#else
[Fact(Skip = "Fails flakily on .NET 4.5")]
#endif
//#if CORECLR
Should be able to see this spec pass again
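For context, a self-contained sketch of the conditional-skip pattern being unwound here (the class name and body are mine; it assumes the test project defines a CORECLR compilation symbol for its .NET Core target):

    using Xunit;

    public class ConditionalSkipSketch
    {
        // The same method compiles as a runnable [Fact] under .NET Core and a
        // skipped one under .NET 4.5, without maintaining two copies of the test.
    #if CORECLR
        [Fact]
    #else
        [Fact(Skip = "Fails flakily on .NET 4.5")]
    #endif
        public void Spec_that_only_runs_on_coreclr()
        {
            Assert.True(true); // placeholder body
        }
    }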
@akkadotnet/core ready for review.
@@ -59,7 +59,7 @@ public class DowningProviderSpec : AkkaSpec
                 loglevel = WARNING
                 actor.provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
                 remote {
-                    netty.tcp {
+                    dot-netty.tcp {
Good catch
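A small sketch of why the rename matters (it assumes Akka.NET's Config.GetInt(path, default) overload and HOCON's fall-through to the default when a path is undefined): keys under the old netty.tcp path are never read by the DotNetty transport, so the spec had been silently running on the default port.

    using Akka.Configuration;

    class ConfigPathSketch
    {
        static void Main()
        {
            var typo = ConfigurationFactory.ParseString("akka.remote.netty.tcp.port = 0");
            var corrected = ConfigurationFactory.ParseString("akka.remote.dot-netty.tcp.port = 0");

            // The DotNetty transport reads akka.remote.dot-netty.tcp.*; the typo'd
            // config never defines that path, so the lookup falls back to -1 here.
            System.Console.WriteLine(typo.GetInt("akka.remote.dot-netty.tcp.port", -1));      // -1
            System.Console.WriteLine(corrected.GetInt("akka.remote.dot-netty.tcp.port", -1)); // 0
        }
    }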
@@ -127,8 +127,8 @@ public void Tail_chopping_group_router_must_return_response_from_second_actor_af
         [Fact]
         public void Tail_chopping_group_router_must_throw_exception_if_no_result_will_arrive_within_the_given_time()
         {
-            var actor1 = Sys.ActorOf(Props.Create(() => new TailChopTestActor(500.Milliseconds())), "Actor3");
-            var actor2 = Sys.ActorOf(Props.Create(() => new TailChopTestActor(500.Milliseconds())), "Actor4");
+            var actor1 = Sys.ActorOf(Props.Create(() => new TailChopTestActor(1500.Milliseconds())), "Actor3");
I think we should document all timeout changes in the tests.
@alexvaluyskiy that's what commit histories are for - in this case the issue was just that the delay (500ms) and the timeout (300ms) were sufficiently close together that a busy system could receive the unwanted "ack" message before the delay elapsed. Nothing was technically wrong with the spec or the code it was testing otherwise.
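A toy illustration of the margin involved (the 300ms and 500ms figures come from the comment above, the 1500ms figure from the diff; the arithmetic framing is mine): widening the reply delay grows the jitter window a busy system would need to cover to produce a spurious failure.

    using System;

    class TailChopTimingSketch
    {
        static void Main()
        {
            var withinTimeout = TimeSpan.FromMilliseconds(300); // router gives up after this
            foreach (var replyDelayMs in new[] { 500, 1500 })   // actor delay: before vs. after the fix
            {
                var margin = TimeSpan.FromMilliseconds(replyDelayMs) - withinTimeout;
                Console.WriteLine(
                    $"reply delay {replyDelayMs}ms -> jitter must exceed " +
                    $"{margin.TotalMilliseconds}ms before the ack beats the timeout");
            }
        }
    }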
Mentioned this already on #2728, but I'm splitting off some of the other racy specs into their own issue. Figured it'd be better to get some fixes in immediately rather than trying to do it all at once.
* cleaning up racy spec for ActorRefProvider resolution
* fixed LeaveAsync cancellation task specs
* fixed downingproviderspec config
* fixed TailChoppingSpec - increase the sleep time to make timeouts less likely on busy systems
* added another necessary awaitAssert to ClusterSpec.ClusterSpec.A_cluster_must_cancel_LeaveAsync_task_if_CancellationToken_fired_before_node_
I've looked through some of these, and the raciness in most of these specs comes down to sockets not being cleaned up quickly enough or timeouts that are too narrowly defined. There might be some legitimate bugs with the specs here too, so I'm going to go through each one and submit some patches for each in the name of making our build process run more smoothly.
Current list:
The common thread with most of these issues is that they use the real socket system underneath Akka.Remote, so I suspect it's a problem with how that's being used / not giving it enough time.
I'm also going to see if these failures might be related to #2575 - which I was planning on fixing anyway. Stay tuned.