whybeyoung
Collaborator

[PD Feature] Support TP16 and Optimize Dynamic Connection Establishment

Motivation

Mooncake's handling of multi-node prefill and decode setups currently falls short in two respects: it needs to support dynamic connection establishment, and it needs to work reliably in multi-node TP=16 scenarios.

Thanks to PR 5351, authored by @yuan-luo.

Modifications

  1. Changed port assignment from static computation to dynamic acquisition of available ports.
  2. Enabled the load balancer to forward the prefill_addr to both prefill and decode nodes. This address is now used as a key to register and query the corresponding IP and ZMQ port information.
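
The two modifications above can be illustrated with a minimal sketch. It is not the code in this PR: `get_free_port`, `BootstrapRegistry`, and the example address `10.0.0.1:30000` are hypothetical names chosen for illustration, and the registry is assumed to be a simple in-memory table. The sketch shows one common way to obtain a free port (binding to port 0 and letting the OS choose) and how `prefill_addr` could serve as the key for registering and querying an IP and ZMQ port.

```python
import socket
from typing import Dict, Optional, Tuple


def get_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return sock.getsockname()[1]


class BootstrapRegistry:
    """Hypothetical in-memory registry keyed by prefill_addr."""

    def __init__(self) -> None:
        # prefill_addr -> (ip, zmq_port)
        self._table: Dict[str, Tuple[str, int]] = {}

    def register(self, prefill_addr: str, ip: str, zmq_port: int) -> None:
        """Record the IP and ZMQ port announced for a prefill address."""
        self._table[prefill_addr] = (ip, zmq_port)

    def query(self, prefill_addr: str) -> Optional[Tuple[str, int]]:
        """Return (ip, zmq_port) for a prefill address, or None if unknown."""
        return self._table.get(prefill_addr)


# Prefill side: acquire a port dynamically and register it under prefill_addr.
registry = BootstrapRegistry()
zmq_port = get_free_port()
registry.register("10.0.0.1:30000", "10.0.0.1", zmq_port)

# Decode side: look up the endpoint using the same prefill_addr key.
endpoint = registry.query("10.0.0.1:30000")
```

Note that a port found this way can be taken by another process between the probe and the later bind. Binding the real socket directly to an OS-chosen port, for example with pyzmq's `bind_to_random_port`, avoids that window.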

Checklist

  • Dynamic port assignment implemented
  • prefill_addr forwarding and registration logic added
  • Multi-node TP=16 setup tested
  • Add relevant unit/integration tests
  • Update documentation if necessary
