Skip to content

Conversation

ecosysbin
Copy link
Contributor

@ecosysbin ecosysbin commented Jan 14, 2025

What type of PR is this?
/kind feature
/area scheduling

What this PR does / why we need it:
Volcano scheduler: Supports network topology aware scheduling when pods rescheduled

Which issue(s) this PR fixes:
#3877

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 14, 2025
if jobHyperNode != "" {
LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
if hyperNodeInfo.Tier()+1 > highestAllowedTier {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't add 1 here, for example, s0 and s1 both are tier1 hypernode and they are sibling hypernodes, and current jobHyperNode is s1. Then LCAHyperNode is s4, which are s0 and s1's parent, tier 2 > highestAllowedTier, there is no need to add 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I have done it

LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
if hyperNodeInfo.Tier()+1 > highestAllowedTier {
klog.V(4).InfoS("Skip search for higher tier cause highest allowed tier reached", "jobName", job.UID, "highestAllowedTier", highestAllowedTier, "tier", index+1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the log here be distinguished from the previous? It could be written like this: Current tier of LCAHyperNode is larger than highestAllowedTier, skipping

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index+1? Should be hypernodeInfo.Tier()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I have done it.

@volcano-sh-bot volcano-sh-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 15, 2025
@ecosysbin
Copy link
Contributor Author

/assign @Monokaix

@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from bff6064 to d1fd45c Compare January 15, 2025 04:51
@@ -819,7 +819,10 @@ func (sc *SchedulerCache) AddPodGroupV1beta1(obj interface{}) {
klog.Errorf("Failed to convert podgroup from %T to %T", ss, podgroup)
return
}

if podgroup.GetAnnotations() == nil {
klog.V(3).Infof("Add PodGroup(%s) into cache, spec(%#v)", ss.Name, ss.Spec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it to solve the nil map problem? Why to add a log to print the whole podgroup in if here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it to solve the nil map problem? yes, I have clear the log.

@@ -290,6 +302,10 @@ func (alloc *Action) allocateResourceForTasksWithTopology(tasks *util.PriorityQu
klog.V(4).InfoS("Find available hyperNodes for job", "jobName", job.UID, "tier", selectedTier, "hyperNodes", hyperNodes)
}
stmt, hyperNode := alloc.selectBestHyperNode(jobStmtsByTier[selectedTier], job)
jobNewHyperNode := LCAHyperNodeMap[hyperNode]
if jobNewHyperNode != jobHyperNode {
job.PodGroup.GetAnnotations()[api.TopologyAllocateLCAHyperNode] = jobHyperNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be jobNewHyperNode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks, l will modify it

@@ -42,4 +42,7 @@ const (

// topologyDecisionAnnotation is the key of topology decision about pod request resource
topologyDecisionAnnotation = "volcano.sh/topology-decision"

// TopologyAllocateLCAHyperNode is the key to the lowest common ancestor of the network topology to which the tasks assigned to a job belong.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to rewrite as TopologyAllocateLCAHyperNode is the key recording the lowest common ancestor hypernode of job

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right. I have done it

@@ -1564,6 +1565,9 @@ func (sc *SchedulerCache) UpdateJobStatus(job *schedulingapi.JobInfo, updatePG b
if err != nil {
return nil, err
}
sc.Mutex.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap it as a func?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that’s ok. I have done it

@@ -290,6 +302,10 @@ func (alloc *Action) allocateResourceForTasksWithTopology(tasks *util.PriorityQu
klog.V(4).InfoS("Find available hyperNodes for job", "jobName", job.UID, "tier", selectedTier, "hyperNodes", hyperNodes)
}
stmt, hyperNode := alloc.selectBestHyperNode(jobStmtsByTier[selectedTier], job)
jobNewHyperNode := LCAHyperNodeMap[hyperNode]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

selectBestHyperNode may fail to return a true hypernode, then it will return empty string "", so we may exclude BestHyperNode is an empty string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I have modify it.

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 15, 2025
@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from ff29b61 to 9ad3afc Compare January 15, 2025 08:25
if jobHyperNode != "" {
LCAHyperNode := ssn.HyperNodes.GetLCAHyperNode(hyperNodeName, jobHyperNode)
hyperNodeInfo := ssn.HyperNodes[LCAHyperNode]
if hyperNodeInfo.Tier() > highestAllowedTier {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested a case where the two hypernodes are isolated from each other, they don't have LCAHyperNode, which should also be skipped at this point, at this point fetching from ssn.HyperNodes will be nil and calling Tier() will be panic. We should also add UT to cover this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes ,have done

@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from 45e4b1a to 8fb1eae Compare January 15, 2025 09:36
@volcano-sh-bot volcano-sh-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 15, 2025
@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from 9068dfa to 54188de Compare January 15, 2025 09:54
@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 15, 2025
@ecosysbin ecosysbin force-pushed the rescheduler branch 3 times, most recently from 44bee2a to 283be11 Compare January 15, 2025 15:23
MinimalBindCheck: true,
},
{
Name: "hard network topology constrain and job rescheduler, can not allocate job when LCAHyperNode is empty",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ecosysbin ecosysbin force-pushed the rescheduler branch 2 times, most recently from 1fb58f1 to a06eb75 Compare January 16, 2025 03:03
@@ -90,6 +91,15 @@ type NodeInfo struct {
ImageStates map[string]*k8sframework.ImageStateSummary
}

// PodGroupOldState recored podgroup old state
type PodGroupOldState struct {
// Status recored podgroup status during schedule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo recored, records is enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, done

@ecosysbin ecosysbin force-pushed the rescheduler branch 10 times, most recently from 10b47d1 to 7b97ae6 Compare January 16, 2025 15:38
…ds rescheduled

Signed-off-by: ecosysbin <14729934+ecosysbin@user.noreply.gitee.com>
@Monokaix
Copy link
Member

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 17, 2025
Copy link
Member

@hwdef hwdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 17, 2025
@volcano-sh-bot volcano-sh-bot merged commit 4c100b1 into volcano-sh:network-topology Jan 17, 2025
10 checks passed
@hwdef
Copy link
Member

hwdef commented Jan 17, 2025

By the way, PodGroupOldState is a weird name

@volcano-sh-bot volcano-sh-bot added kind/feature Categorizes issue or PR as related to a new feature. area/scheduling labels Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/scheduling kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants