Description
Is your feature request related to a problem? Please describe.
The current Health Failure Strategy and lifecycle for the sdkserver and game server container works, but it has edge cases, and is eventually consistent - which can also be fun (but not always in a good way).
This means that as new ways for Pods to fail come up, we have to build out new features to capture them. That is not straightforward, and it often puts more load on the K8s control plane.
Describe the solution you'd like
Once Agones supports Kubernetes 1.29+ (or maybe earlier behind a feature flag?) we could move the sdkserver to the new Sidecar container model.
Then we can set the Pod's `restartPolicy` to `Never` by default, and the sdkserver sidecar's `restartPolicy` to `Always`, which would simplify things greatly.
- A lot of https://github.com/googleforgames/agones/blob/main/pkg/gameservers/health.go would likely disappear, especially around setting `GameServerReadyContainerIDAnnotation` to know if the container is ready or not.
- The sdkserver only shutting down on the `Shutdown` state also goes away, since the sidecar container can be spun up before the game server container and stay alive for the entire duration of the Pod.
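As a rough illustration of the layout described above, a Pod using the Kubernetes sidecar container model might look like the sketch below. This is not the Pod spec Agones generates today: the Pod name, container names, and images are placeholders, and it assumes a cluster where the `SidecarContainers` feature is enabled (on by default from Kubernetes 1.29).

```yaml
# Sketch only: all names and images are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-gameserver
spec:
  # Pod-level policy: the game server container is never restarted in place.
  restartPolicy: Never
  initContainers:
    # An init container with restartPolicy: Always is treated as a sidecar.
    # It starts before the regular containers, is restarted if it exits,
    # and is stopped only after the regular containers have terminated.
    - name: sdkserver
      image: example.com/agones-sdk:dev
      restartPolicy: Always
  containers:
    - name: game-server
      image: example.com/my-game-server:dev
```

With this shape, the sidecar's lifetime brackets the game server container's, which is what would let the sdkserver stay available for the whole life of the Pod without extra tracking.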
There is one thing that is definitely tricky here: the way we have things set up now, if a GameServer has not yet reached Ready, we let it restart. Once it is Ready or beyond, we do not. Maybe we should revisit whether we still need this pattern, for the sake of simplification, especially since if the Pod crashes before being Ready, a new one will be recreated 🤔 Although this breaks backward compatibility, and that is less than ideal.
Describe alternatives you've considered
Leave things as they are; they seem to mostly work!
Additional context
N/A