-
Notifications
You must be signed in to change notification settings - Fork 742
tso: fix tso proxy error propagation and add test #9268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: artem_danilov <artem.danilov@airbnb.com>
Hi @Tema. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Now you can start all CI jobs with |
@JmPotato @bufferflies I've found a bug while writing tests for #9219 PTAL |
/retest |
1 similar comment
/retest |
Signed-off-by: artem_danilov <artem.danilov@airbnb.com>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #9268 +/- ##
==========================================
+ Coverage 75.97% 76.11% +0.14%
==========================================
Files 470 472 +2
Lines 73051 73264 +213
==========================================
+ Hits 55503 55768 +265
+ Misses 14101 14050 -51
+ Partials 3447 3446 -1
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Signed-off-by: artem_danilov <artem.danilov@airbnb.com>
@JmPotato @bufferflies I've done with sysbench tests as well. Now it looks good with only subsecond impact on leader transfer. PTAL
|
@@ -74,10 +75,10 @@ func (s *TSODispatcher) DispatchRequest(serverCtx context.Context, req Request, | |||
key := req.getForwardedHost() | |||
val, loaded := s.dispatchChs.Load(key) | |||
if !loaded { | |||
val = tsoRequestProxyQueue{requestCh: make(chan Request, maxMergeRequests+1)} | |||
val = &tsoRequestProxyQueue{requestCh: make(chan Request, maxMergeRequests+1)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the context was not actually stored in map due to this bug, so the context never was return from this method except the first request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest LGTM.
s.assertReceiveError(re, "pd is not leader of cluster") | ||
|
||
// verify fails faster than 3 second timeout, otherwise the unavailable time will be too long. | ||
re.Less(time.Since(start), time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the comment suggests, should this be changed to re.Less(time.Since(start), 3*time.Second)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout is 3 seconds indeed, but the test verifies that it is below timeout. I don't really want to hardcode timeout for this test, it just verifies that it is "instant" failure, not based on timeout. Please let me know if change is still required.
s.assertReceiveError(re, "Canceled") | ||
|
||
// verify fails faster than 3 second timeout, otherwise the unavailable time will be too long. | ||
re.Less(time.Since(start), time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm
s.serverCancel() | ||
|
||
start := time.Now() | ||
s.assertReceiveError(re, "Canceled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.assertReceiveError(re, "Canceled") | |
s.assertReceiveError(re, "Cancelled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is go sdk error: "rpc error: code = Canceled desc = context canceled"
where "Canceled" is grpc status. Single L is a preferred spelling in American English.
tests/server/tso/tso_proxy_test.go
Outdated
s.verifyProxyIsHealthyWith(re, s.proxyClient) | ||
} | ||
|
||
func (s *tsoProxyTestSuite) verifyProxyIsHealthyWith(re *require.Assertions, client pdpb.PD_TsoClient) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing the first parameter and creating it in this function?
Signed-off-by: artem_danilov <artem.danilov@airbnb.com>
@JmPotato @bufferflies I've addressed/replied the comments |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bufferflies, JmPotato The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
ref tikv#9188 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
In response to a cherrypick label: new pull request created to branch |
ref tikv#9188 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: ref #9188
What is changed and how does it work?
#9219 fixed error propagation, but had a bug that it propagates an error only for the first request on the stream. This PR fixes the bug and adds test coverage.
Check List
Tests
Code changes
Side effects
Related changes
pingcap/docs
/pingcap/docs-cn
:pingcap/tiup
:Release note