Add NetworkTopology plugin score doc #4213
1. If it is the first scheduling of a job, all HyperNodes that need to be scored will be given a score of 0 and returned. The HyperNode that is successfully scheduled in the end will be recorded as the `JobAllocatedHyperNode` attribute of the job.
2. If it is not the first scheduling of a job, calculate the LCAHyperNode (Lowest Common Ancestor HyperNode) between all HyperNodes that need to be scored and the `JobAllocatedHyperNode` of the job. The lower the tier of the calculated LCAHyperNode, the higher the score. If there is only one highest score, return the scoring result.
3. If there is more than one HyperNode with the highest score in the scoring result of step 2, calculate the distribution of the tasks that have been successfully scheduled for the job among these HyperNodes. The greater the distribution quantity, the higher the score.
4. The HyperNode that is successfully scheduled in the end in steps 2 and 3 will also be recorded as the `JobAllocatedHyperNode` attribute of the job.
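
The four steps above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the plugin's actual code: the `HyperNode` struct, `lcaTier`, `ScoreHyperNodes`, the `maxTier` parameter, the concrete score formula (`maxTier - tier + 1`), and the fractional tie-break bonus are all hypothetical. The sketch also assumes that tier 1 is the lowest tier and that a HyperNode counts as its own ancestor.

```go
package main

import "fmt"

// HyperNode is a simplified, hypothetical stand-in for a topology node.
type HyperNode struct {
	Name   string
	Tier   int        // assumption: 1 is the lowest tier
	Parent *HyperNode // nil for the root
}

// lcaTier returns the tier of the lowest common ancestor of a and b,
// or -1 if they share none. A HyperNode counts as its own ancestor.
func lcaTier(a, b *HyperNode) int {
	seen := map[*HyperNode]bool{}
	for n := a; n != nil; n = n.Parent {
		seen[n] = true
	}
	for n := b; n != nil; n = n.Parent {
		if seen[n] {
			return n.Tier
		}
	}
	return -1
}

// ScoreHyperNodes scores candidates against the job's previously allocated
// HyperNode. scheduled maps a HyperNode name to the number of the job's
// tasks already placed there; allocated is nil on the first scheduling.
func ScoreHyperNodes(candidates []*HyperNode, allocated *HyperNode,
	scheduled map[string]int, maxTier int) map[string]float64 {
	scores := make(map[string]float64, len(candidates))
	for _, c := range candidates {
		scores[c.Name] = 0
	}
	// Step 1: first scheduling, every candidate scores 0.
	if allocated == nil {
		return scores
	}
	// Step 2: the lower the tier of the LCA, the higher the score.
	best := 0.0
	for _, c := range candidates {
		if tier := lcaTier(c, allocated); tier > 0 {
			scores[c.Name] = float64(maxTier - tier + 1)
		}
		if scores[c.Name] > best {
			best = scores[c.Name]
		}
	}
	// Step 3: on a tie for the highest score, favor the HyperNode that
	// already holds more of the job's scheduled tasks.
	var tied []string
	total := 0
	for _, c := range candidates {
		if scores[c.Name] == best {
			tied = append(tied, c.Name)
			total += scheduled[c.Name]
		}
	}
	if len(tied) > 1 && total > 0 {
		for _, name := range tied {
			scores[name] += float64(scheduled[name]) / float64(total)
		}
	}
	return scores
}

func main() {
	root := &HyperNode{Name: "s2", Tier: 2}
	s0 := &HyperNode{Name: "s0", Tier: 1, Parent: root}
	s1 := &HyperNode{Name: "s1", Tier: 1, Parent: root}
	// The job was previously allocated to s0, so s0 should outrank s1.
	fmt.Println(ScoreHyperNodes([]*HyperNode{s0, s1}, s0, nil, 2))
}
```

Step 4 (recording the chosen HyperNode as `JobAllocatedHyperNode`) is left to the caller in this sketch, since it happens after scheduling succeeds rather than during scoring.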
- AddNodeOrderFn: score for nodes.(take effect in soft limit,take effect in )
take effect in?
Ok, I have modified it, you can check again.
- AddNodeOrderFn: score for nodes.(take effect in soft limit,take effect in )
1. To score all nodes, you need to first obtain the HyperNode to which the node belongs and the `JobAllocatedHyperNode` of the job to which the task belongs.
you need -> we need
Ok, done
@@ -570,8 +570,14 @@ Allocate resources for queue -> hyperJob -> Job -> Task.
- AddJobGroupReadyFn: check whether hyperJob minAvailable is met.(phase 2)

- AddHyperNodeOrderFn: score for hyperNodes.(take effect in hard limit, closest tiers have higher score)
1. If it is the first scheduling of a job, all HyperNodes that need to be scored will be given a score of 0 and returned. The HyperNode that is successfully scheduled in the end will be recorded as the `JobAllocatedHyperNode` attribute of the job.
Suggested change:
- 1. If it is the first scheduling of a job, all HyperNodes that need to be scored will be given a score of 0 and returned. The HyperNode that is successfully scheduled in the end will be recorded as the `JobAllocatedHyperNode` attribute of the job.
+ 1. If a Job is being scheduled for the very first time, all HyperNodes that need to be scored will get a score of 0 and then return right away. The name of the HyperNode where the Job eventually gets scheduled successfully will be recorded in the Job's annotations under the key JobAllocatedHyperNode.
Please also change to "plugin: network-topology-aware" in line 568.
Ok, Done
Signed-off-by: wangbin <994903808@qq.com>
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Monokaix
/lgtm
Merged commit 0b0024d into volcano-sh:network-topology
What this PR does / why we need it:
Add NetworkTopology plugin score doc