Volcano scheduler: Supports network topology aware scheduling when pods rescheduled #3971
Conversation
Force-pushed from 95a93df to d715c8a
if jobHyperNode != "" {
	LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
	hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
	if hyperNodeInfo.Tier()+1 > highestAllowedTier {
We shouldn't add 1 here. For example, suppose s0 and s1 are both tier-1 hypernodes and are siblings, and the current jobHyperNode is s1. Then the LCAHyperNode is s4, their parent at tier 2; tier 2 > highestAllowedTier already holds, so there is no need to add 1.
ok, I have done it
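For reference, a minimal sketch of the check with the +1 dropped, using the reviewer's example where s0 and s1 are tier-1 siblings and s4 is their tier-2 parent; the skip action is a placeholder, since the surrounding loop is not part of the quoted snippet:

if jobHyperNode != "" {
	LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
	hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
	// Comparing Tier() directly is sufficient: for jobHyperNode s1 and a
	// candidate under s0, the LCA s4 is already tier 2 > highestAllowedTier.
	if hyperNodeInfo.Tier() > highestAllowedTier {
		continue // placeholder: skip this candidate
	}
}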
LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
if hyperNodeInfo.Tier()+1 > highestAllowedTier {
	klog.V(4).InfoS("Skip search for higher tier cause highest allowed tier reached", "jobName", job.UID, "highestAllowedTier", highestAllowedTier, "tier", index+1)
Can the log here be distinguished from the previous one? It could be written like this: "Current tier of LCAHyperNode is larger than highestAllowedTier, skipping".
index+1? This should be hyperNodeInfo.Tier().
ok, I have done it.
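A sketch of a log line that addresses both comments, i.e. a message distinguishable from the earlier one and hyperNodeInfo.Tier() instead of index+1 (the exact wording in the merged code may differ):

klog.V(4).InfoS("Current tier of LCAHyperNode is larger than highestAllowedTier, skipping",
	"jobName", job.UID, "highestAllowedTier", highestAllowedTier, "tier", hyperNodeInfo.Tier())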
Force-pushed from d715c8a to 4522e7f
Force-pushed from 4522e7f to 196b72b
/assign @Monokaix
Force-pushed from bff6064 to d1fd45c
@@ -819,7 +819,10 @@ func (sc *SchedulerCache) AddPodGroupV1beta1(obj interface{}) {
		klog.Errorf("Failed to convert podgroup from %T to %T", ss, podgroup)
		return
	}

	if podgroup.GetAnnotations() == nil {
		klog.V(3).Infof("Add PodGroup(%s) into cache, spec(%#v)", ss.Name, ss.Spec)
Is this to solve the nil map problem? Why add a log here that prints the whole podgroup inside the if?
Yes, it is to solve the nil map problem. I have removed the log.
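The guard itself, with the debug log removed, would then look roughly like the following sketch; the body of the if is an assumption, since the replacement code is not shown in the quoted diff:

// Initialize the annotations map so later writes (e.g. the LCA hypernode
// annotation) cannot hit a nil map.
if podgroup.GetAnnotations() == nil {
	podgroup.SetAnnotations(map[string]string{})
}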
@@ -290,6 +302,10 @@ func (alloc *Action) allocateResourceForTasksWithTopology(tasks *util.PriorityQu
		klog.V(4).InfoS("Find available hyperNodes for job", "jobName", job.UID, "tier", selectedTier, "hyperNodes", hyperNodes)
	}
	stmt, hyperNode := alloc.selectBestHyperNode(jobStmtsByTier[selectedTier], job)
	jobNewHyperNode := LCAHyperNodeMap[hyperNode]
	if jobNewHyperNode != jobHyperNode {
		job.PodGroup.GetAnnotations()[api.TopologyAllocateLCAHyperNode] = jobHyperNode
Should be jobNewHyperNode?
Yes, thanks, I will modify it.
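A sketch of the corrected assignment, recording the newly computed LCA hypernode instead of the old value:

jobNewHyperNode := LCAHyperNodeMap[hyperNode]
if jobNewHyperNode != jobHyperNode {
	// Record the new LCA hypernode of the job, not the previous one.
	job.PodGroup.GetAnnotations()[api.TopologyAllocateLCAHyperNode] = jobNewHyperNode
}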
@@ -42,4 +42,7 @@ const (

	// topologyDecisionAnnotation is the key of topology decision about pod request resource
	topologyDecisionAnnotation = "volcano.sh/topology-decision"

	// TopologyAllocateLCAHyperNode is the key to the lowest common ancestor of the network topology to which the tasks assigned to a job belong.
Better to rewrite it as: "TopologyAllocateLCAHyperNode is the key recording the lowest common ancestor hypernode of a job".
All right. I have done it
pkg/scheduler/cache/cache.go (Outdated)
@@ -1564,6 +1565,9 @@ func (sc *SchedulerCache) UpdateJobStatus(job *schedulingapi.JobInfo, updatePG b
	if err != nil {
		return nil, err
	}
	sc.Mutex.Lock()
Can we wrap it as a func?
That's OK, I have done it.
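A rough sketch of the suggested wrapper; the helper name and body are hypothetical, since the locked section is only partially visible in the quoted diff:

// updatePodGroupAnnotationLocked is a hypothetical helper that takes the
// cache lock, updates the cached podgroup, and releases the lock via defer.
func (sc *SchedulerCache) updatePodGroupAnnotationLocked(job *schedulingapi.JobInfo) {
	sc.Mutex.Lock()
	defer sc.Mutex.Unlock()
	// ... copy the job's topology annotation back onto the cached podgroup ...
}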
@@ -290,6 +302,10 @@ func (alloc *Action) allocateResourceForTasksWithTopology(tasks *util.PriorityQu
		klog.V(4).InfoS("Find available hyperNodes for job", "jobName", job.UID, "tier", selectedTier, "hyperNodes", hyperNodes)
	}
	stmt, hyperNode := alloc.selectBestHyperNode(jobStmtsByTier[selectedTier], job)
	jobNewHyperNode := LCAHyperNodeMap[hyperNode]
selectBestHyperNode may fail to find a real hypernode and return the empty string "", so we should exclude the case where the best hypernode is an empty string.
OK, I have modified it.
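A sketch of the empty-string guard the reviewer asked for (the exact handling in the merged code may differ):

stmt, hyperNode := alloc.selectBestHyperNode(jobStmtsByTier[selectedTier], job)
// selectBestHyperNode returns "" when no suitable hypernode is found, so
// only update the job's LCA annotation for a real hypernode.
if hyperNode != "" {
	jobNewHyperNode := LCAHyperNodeMap[hyperNode]
	if jobNewHyperNode != jobHyperNode {
		job.PodGroup.GetAnnotations()[api.TopologyAllocateLCAHyperNode] = jobNewHyperNode
	}
}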
Force-pushed from d1fd45c to 5760f43
Force-pushed from ff29b61 to 9ad3afc
if jobHyperNode != "" {
	LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
	hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
	if hyperNodeInfo.Tier() > highestAllowedTier {
I tested a case where the two hypernodes are isolated from each other and have no LCAHyperNode; this should also be skipped here. In that case the lookup in ssn.HyperNodes returns nil and calling Tier() panics. We should also add a UT to cover this.
Yes, done.
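A sketch of the nil guard for isolated hypernodes that share no LCA; the skip action is a placeholder for the real control flow, and a unit test should cover this path as suggested:

LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
hyperNodeInfo, ok := ssn.HyperNodes[LCAHyperNode]
if !ok || hyperNodeInfo == nil {
	// Isolated hypernodes have no common ancestor; calling Tier() on a nil
	// entry would panic, so skip this candidate instead.
	continue // placeholder: skip
}
if hyperNodeInfo.Tier() > highestAllowedTier {
	continue // placeholder: skip
}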
Force-pushed from 45e4b1a to 8fb1eae
Force-pushed from 9068dfa to 54188de
Force-pushed from 44bee2a to 283be11
	MinimalBindCheck: true,
},
{
	Name: "hard network topology constrain and job rescheduler, can not allocate job when LCAHyperNode is empty",
Same as above.
done
Force-pushed from 1fb58f1 to a06eb75
pkg/scheduler/api/node_info.go (Outdated)
@@ -90,6 +91,15 @@ type NodeInfo struct {
	ImageStates map[string]*k8sframework.ImageStateSummary
}

// PodGroupOldState recored podgroup old state
type PodGroupOldState struct {
	// Status recored podgroup status during schedule
Typo: recored; records is enough.
ok, done
Force-pushed from 10b47d1 to 7b97ae6
…ds rescheduled Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>
Force-pushed from 7b97ae6 to 67957a1
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Monokaix
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/lgtm
By the way,
What type of PR is this?
/kind feature
/area scheduling
What this PR does / why we need it:
Volcano scheduler: Supports network topology aware scheduling when pods rescheduled
Which issue(s) this PR fixes:
#3877