[Feature Proposal] - Improve allocator HA to favor packing sessions into multi sessions servers #4190

@miai10

Description

Is your feature request related to a problem? Please describe.
The current HA solution for the allocator consists of a service in front of at least 2 allocator replicas, each of which processes the GameServerAllocations (GSAs) it receives at intervals.

This approach is not well suited to high allocation rates and high-capacity servers: the allocators compete to update the same server CRD, which generates update failures and many retries. This eventually spreads sessions across multiple servers and increases allocation time. Even with one session per server, the allocators can compete for the same server.
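To make the contention concrete, here is a minimal sketch of the optimistic-concurrency conflict, assuming the server object carries a version number that must match on update (similar in spirit to a Kubernetes `resourceVersion`). The `Store`, `GameServer`, and `Conflict` names are hypothetical and are not Agones types.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when an update carries a stale resource version."""


@dataclass
class GameServer:
    # Hypothetical, simplified stand-in for a multi-session server object.
    name: str
    capacity: int
    sessions: int = 0
    resource_version: int = 0


class Store:
    """Hypothetical in-memory store with optimistic concurrency control."""

    def __init__(self, servers):
        self._servers = {s.name: s for s in servers}

    def get(self, name):
        # Return a copy so that each allocator works on its own snapshot.
        s = self._servers[name]
        return GameServer(s.name, s.capacity, s.sessions, s.resource_version)

    def update(self, snapshot):
        current = self._servers[snapshot.name]
        if snapshot.resource_version != current.resource_version:
            raise Conflict(f"stale version for {snapshot.name}")
        snapshot.resource_version += 1
        self._servers[snapshot.name] = snapshot


store = Store([GameServer("gs-1", capacity=10)])

# Two allocators read the same server before either one writes.
seen_by_a = store.get("gs-1")
seen_by_b = store.get("gs-1")

seen_by_a.sessions += 1
store.update(seen_by_a)  # first writer succeeds

seen_by_b.sessions += 1
try:
    store.update(seen_by_b)  # second writer holds a stale version
    conflicted = False
except Conflict:
    conflicted = True
```

The second allocator has to re-read and retry, or pick another server. Under a high allocation rate, that retry path is what adds latency and spreads sessions.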

Using a single allocator (especially with batching enabled) and a fine-tuned batch wait time gives better results, but the HA policy is then downgraded to just restarting the pod when needed.
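As an illustration of why a single batching allocator packs well, the sketch below assigns a batch of pending requests to the fullest servers first. This is an assumption about what "packing" could look like, not the actual Agones batching code; `Server` and `pack_batch` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical multi-session server with a fixed session capacity.
    name: str
    capacity: int
    sessions: int = 0

    @property
    def free(self):
        return self.capacity - self.sessions


def pack_batch(servers, requests):
    """Assign requests to the fullest servers first.

    Returns (assignments, unassigned), where assignments maps a request id
    to a server name and unassigned lists the requests that did not fit.
    """
    assignments = {}
    pending = list(requests)
    # Fullest servers first; the name is only a deterministic tie-break.
    for server in sorted(servers, key=lambda s: (-s.sessions, s.name)):
        while pending and server.free > 0:
            assignments[pending.pop(0)] = server.name
            server.sessions += 1
    return assignments, pending


servers = [
    Server("gs-a", capacity=4, sessions=3),
    Server("gs-b", capacity=4, sessions=1),
    Server("gs-c", capacity=4, sessions=0),
]
assignments, unassigned = pack_batch(servers, ["r1", "r2", "r3", "r4", "r5"])
```

Because one process sees the whole batch, it can fill `gs-a` and `gs-b` before touching `gs-c`, with one writer per server and therefore no update conflicts between allocators.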

The scenarios we need to support would be:

  • the node hosting the allocator goes down
  • planned maintenance, when the allocator is restarted for an update
  • the allocator crashes

Describe the solution you'd like
The ideal solution would be a design in which multiple allocators run concurrently and share the load when needed.

Describe alternatives you've considered
Possible solutions:

  1. master/slave allocators, implemented using a leader election scheme and the readiness check of each pod
  2. pub/sub instead of a channel for consuming the allocations
  3. a cache and list of server state shared between allocators
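Option 1 could be sketched roughly as follows, assuming a time-limited lease that one allocator holds and renews, with each pod's readiness check reporting ready only while it holds the lease. `LeaseLock` and `is_ready` are hypothetical names; a real implementation would more likely rely on a Kubernetes Lease object than on this in-memory stand-in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    expires_at: float = 0.0


class LeaseLock:
    """Hypothetical in-memory lease used to elect a single active allocator."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.lease = Lease()

    def try_acquire_or_renew(self, candidate, now):
        # The lease is free if nobody holds it or the holder let it expire.
        free = self.lease.holder is None or now >= self.lease.expires_at
        if free or self.lease.holder == candidate:
            self.lease = Lease(candidate, now + self.ttl)
            return True
        return False


def is_ready(lock, pod, now):
    # Readiness check: only the current leader reports ready, so the
    # service would route allocation requests to that pod alone.
    return lock.try_acquire_or_renew(pod, now)


lock = LeaseLock(ttl=5.0)
timeline = [
    is_ready(lock, "allocator-a", now=0.0),  # a becomes leader
    is_ready(lock, "allocator-b", now=1.0),  # b stays on standby
    is_ready(lock, "allocator-a", now=4.0),  # a renews until t=9
    # allocator-a crashes here and stops renewing
    is_ready(lock, "allocator-b", now=8.0),  # lease not yet expired
    is_ready(lock, "allocator-b", now=9.0),  # b takes over
]
```

The trade-off in this sketch is a failover gap of up to one lease TTL, in exchange for having a single writer and no update conflicts.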

Link to the Agones Feature Proposal (if any)
None

Discussion Link (if any)
There have been discussions here: #4176 (comment)
