-
Notifications
You must be signed in to change notification settings - Fork 3.4k
inctimer: Fix bug where timer fired immediately #16955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inctimer: Fix bug where timer fired immediately #16955
Conversation
8c28cb6
to
47531c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
47531c0
to
68123f3
Compare
test-me-please |
This commit fixes cilium#15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to cilium#16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
test-1.16-netnext Edit: net-next hit a variant of #16399 Since this PR fixes another CI flake, this should not block this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor nit and one curious question. The documentation is as Michi would say, "amazing man" 🚀
This fixes a bug where the `IncTimer.After` would fire immediately due to a rare race. See the comment in the diff as to how the race could occur. This commit also adds a unit test which has a high likelihood of triggering the bug in the old code. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
68123f3
to
65cd5fb
Compare
test-me-please |
This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
The failure in net-next is unrelated, very similar to #16928: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.16-net-next/1140/testReport/junit/Suite-k8s-1/16/K8sPolicyTest_Basic_Test_Validate_to_entities_policies_Validate_toEntities_All/
Marking this ready to merge as soon as remaining reviews are in, as this is a bug fix and thus should be exempt from the merge freeze. |
[ upstream commit 31176f7 ] This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
[ upstream commit 31176f7 ] This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
[ upstream commit 31176f7 ] This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
[ upstream commit 31176f7 ] This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
[ upstream commit 31176f7 ] This commit fixes #15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to #16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Marking for backport to v1.9 for the same reasons as #14380 (comment). |
This commit fixes cilium#15442 (and variants), where the `done` channel used to indicate completion to the test driver could be closed twice. This happened because at the end of a test, most mock client will start returning `io.EOF`. Due to cilium#16955, this sometimes caused the peer manager to reconnect immediately and create a new mock client, which would then attempt to re-run the test-logic again. This commit addresses this issue by ensuring that all mock clients within a test share the same state (i.e. the `i` counter and `once` instance). This way, each mock client instance will continue the work of its predecessor instead of replaying the whole test sequence. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This fixes a bug where the
IncTimer.After
would fire immediately dueto a rare race. See the comment in the diff as to how the race
could occur. This commit also adds a unit test which has a high
likelihood of triggering the bug in the old code.
Discovered this will debugging #15442 (which is not fully fixed by this PR,
but the reason why that other unit test faililure surfaced sporadically was
because of this bug here)