-
Notifications
You must be signed in to change notification settings - Fork 860
Fix: Adding a retry mechanism in case the addMoreGameServers function call fails. #4214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
/gcbrun |
Build Failed 😭 Build Id: b17dfa63-46dd-49e0-84c0-c2cb31558f88 Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Succeeded 🥳 Build Id: 74c02649-f424-490b-ad05-ee6459b3b3e5 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
/gcbrun |
Build Succeeded 🥳 Build Id: bf8d4493-7f6a-4ab0-833e-8133900ef684 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it.
/gcbrun |
Build Succeeded 🥳 Build Id: 7a741d83-08db-4428-b04e-9924722a854c The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
What type of PR is this?
/kind bug
Which issue(s) this PR fixes:
Closes #4162
Fix GameServer Not Created on addMoreGameServers Failure
What this PR does / Why we need it
affected version <= 1.50.0
The core of my pull request is that I identified an issue where, if
all GameServers
are deleted and the addMoreGameServers function fails, no further GameServers are created. To resolve this, I added logic to reprocess the GameServerSet usingc.workerqueue.EnqueueAfter
.Note
In the case of partial GameServer deletion (not all GameServers), the GameServer informer’s resync triggers an Update event. This leads to a call to the computeReconciliationAction function, which calculates the number of GameServers to add (numServersToAdd), followed by a call to the addMoreGameServers function.
Without this fix, since no GameServers exist, there are no further GameServer informer updates or resyncs triggered. As a result, the addMoreGameServers function is never called again, and no new GameServers are created.
POC
Imagine that the Agones Controller and Agones Extension are deployed on top of Kubernetes, and there is one GameServerSet (test-gss) with two GameServers (test-gss-1 and test-gss-2).
Delete the two GameServers (test-gss-1 and test-gss-2) while the Agones Extension is in the TLS certs changing state.
Note
Delete all deployed GameServers at once.
# Agones Extension Pod http: TLS handshake error from ip:port: EOF http: TLS handshake error from ip:port: EOF http: TLS handshake error from ip:port: EOF [...]
# my cli terminal kubectl delete pod test-gss-1 test-gss-2 -n default
In a normal scenario, the addMoreGameServers function is triggered when a GameServer transitions to the Shutdown state or when its Pod is deleted.
but we can see this error message when invoke
addMoreGameServers
Due to the above error, the addMoreGameServers function is no longer invoked.
Solution
The solution is simple: if the addMoreGameServers function call fails, it should be reprocessed using
c.workerqueue.EnqueueAfter
.Limitation
A potential drawback is that retries may continue indefinitely until success.
If you have any suggestions for a better approach, I'd appreciate your comments.
Special notes for your reviewer:
N/A