feat(pod): added pod power #2127
Conversation
Codecov Report
Additional details and impacted files:

@@ Coverage Diff @@
##           reboot    #2127      +/-   ##
===========================================
- Coverage   91.74%   81.66%   -10.08%
===========================================
  Files          34       37       +3
  Lines        2991     3861     +870
===========================================
+ Hits         2744     3153     +409
- Misses        193      633     +440
- Partials       54       75      +21

View full report in Codecov by Sentry.
Most of the implementation looks good to me. A few comments, though:
@@ -95,6 +99,7 @@ func NewPowerCollector(monitor PowerDataProvider, logger *slog.Logger) *PowerCollector
	zone   = "zone"
	cntrID = "container_id"
	vmID   = "vm_id"
	podID  = "pod_id"
Just noticed that we need pod_id in kepler_container_.* as well.
- podInformer for fetching and caching pods
- pod energy/power calculation
- power collector shows pod energy/power
- tests

Signed-off-by: Vimal Kumar <vimal78@gmail.com>
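The "pod energy/power calculation" item above follows the same attribution scheme the monitor already uses for processes and VMs: a pod's CPU-time delta is the sum of its containers' deltas, and power is attributed proportionally. A minimal sketch, with simplified stand-in types (the real structs live in kepler's resource and monitor packages):

```go
package main

import "fmt"

// Container is a trimmed stand-in for the PR's container type
// (hypothetical field subset for illustration).
type Container struct {
	PodID        string
	CPUTimeDelta float64
}

// podCPUDeltas aggregates container CPU-time deltas per pod.
func podCPUDeltas(containers []Container) map[string]float64 {
	deltas := make(map[string]float64)
	for _, c := range containers {
		deltas[c.PodID] += c.CPUTimeDelta
	}
	return deltas
}

// podPower attributes node power to a pod by its CPU-time share.
func podPower(nodePower, podDelta, nodeDelta float64) float64 {
	if nodeDelta == 0 {
		return 0
	}
	return nodePower * podDelta / nodeDelta
}

func main() {
	containers := []Container{
		{PodID: "pod-a", CPUTimeDelta: 1.5},
		{PodID: "pod-a", CPUTimeDelta: 2.5},
		{PodID: "pod-b", CPUTimeDelta: 1.0},
	}
	deltas := podCPUDeltas(containers)
	fmt.Println(podPower(100, deltas["pod-a"], 5.0)) // pod-a's share of 100W
}
```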
@@ -195,6 +207,7 @@ func (c *PowerCollector) Collect(ch chan<- prometheus.Metric) {
	c.collectProcessMetrics(ch, snapshot.Processes)
	c.collectContainerMetrics(ch, snapshot.Containers)
Could you please add pod_id to the container metrics as well, so that we can "join" containers and pods?
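The point of the request above: if kepler_container_* series carry the same pod_id label as kepler_pod_* series, PromQL can join them on that label. A sketch of the label plumbing, with illustrative (not the PR's exact) type and field names:

```go
package main

import "fmt"

// ContainerInfo is an illustrative stand-in for the collector's container data.
type ContainerInfo struct {
	ID    string
	Name  string
	PodID string // empty for containers not managed by a pod
}

// containerLabels builds the label set for a container power sample.
// Including pod_id here is what makes the container/pod join possible.
func containerLabels(c ContainerInfo, zone string) map[string]string {
	return map[string]string{
		"container_id":   c.ID,
		"container_name": c.Name,
		"zone":           zone,
		"pod_id":         c.PodID,
	}
}

func main() {
	l := containerLabels(ContainerInfo{ID: "c1", Name: "app", PodID: "p1"}, "package")
	fmt.Println(l["pod_id"]) // "p1"
}
```

With that label in place, a query along the lines of `kepler_container_watts * on(pod_id) group_left ...` (metric names hypothetical) can correlate container power with its pod.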
	pi.manager = mgr

	opts := zap.Options{
		Development: true, // enables DebugLevel by default
nit: can we set this to true only if pi.logger's current log level == Debug?
@@ -311,6 +318,11 @@ func (pm *PowerMonitor) firstReading(newSnapshot *Snapshot) error {
		return fmt.Errorf(vmPowerError, err)
	}

	// First read for pods
	if err := pm.firstPodRead(newSnapshot); err != nil {
		return fmt.Errorf(containerPowerError, err)
Suggested change:
-		return fmt.Errorf(containerPowerError, err)
+		return fmt.Errorf(podPowerError, err)
@@ -30,6 +30,7 @@ func TestCollectionLoop(t *testing.T) {
	resourceInformer := &MockResourceInformer{}
	resourceInformer.SetExpectations(t, tr)
	resourceInformer.On("Refresh").Return(nil)
	// resourceInformer.On("Pods").Return(nil)
Remove this commented-out line?
	containersNoPod := []string{}
	if ri.podInformer != nil {
		for _, container := range containersRunning {
Missing test? In a follow-up PR, could you please add a test for this functionality?
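A follow-up test for this branch would feed a fake pod informer and assert that containers whose pod lookup fails end up in the no-pod bucket. A minimal sketch of the partition being tested, with illustrative names standing in for the informer lookup:

```go
package main

import "fmt"

// splitByPod mimics the containersNoPod loop under review: containers whose
// pod lookup fails are collected separately from those with a known pod.
// podByContainer stands in for the pod informer's container->pod mapping.
func splitByPod(podByContainer map[string]string, running []string) (withPod, noPod []string) {
	for _, id := range running {
		if _, ok := podByContainer[id]; ok {
			withPod = append(withPod, id)
		} else {
			noPod = append(noPod, id)
		}
	}
	return withPod, noPod
}

func main() {
	withPod, noPod := splitByPod(map[string]string{"c1": "p1"}, []string{"c1", "c2"})
	fmt.Println(withPod, noPod) // [c1] [c2]
}
```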
	pod := &Pod{
		ID:        podInfo.ID,
		Name:      podInfo.Name,
		Namespace: podInfo.Namespace,
	}
	container.Pod = pod
	_, seen := podsRunning[pod.ID]
	// reset CPU Time of the pod if it is getting added to the running list for the first time
	// in the subsequent iteration, the CPUTimeDelta should be incremented by container's CPUTimeDelta
	resetCPUTime := !seen
	podsRunning[pod.ID] = ri.updatePodCache(container, resetCPUTime)
	}
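The reset-then-accumulate pattern above can be sketched in isolation: on a pod's first appearance in the running set its delta is overwritten with the first container's delta, and subsequent containers of the same pod add to it. Simplified, with assumed field names:

```go
package main

import "fmt"

// podEntry is a trimmed pod-cache entry holding only the field the
// reset logic touches.
type podEntry struct {
	CPUTimeDelta float64
}

// updatePodCache mirrors the snippet above: reset on first sighting of the
// pod in this refresh cycle, accumulate on every later container.
func updatePodCache(cache map[string]*podEntry, podID string, containerDelta float64, reset bool) *podEntry {
	e, ok := cache[podID]
	if !ok {
		e = &podEntry{}
		cache[podID] = e
	}
	if reset {
		e.CPUTimeDelta = containerDelta
	} else {
		e.CPUTimeDelta += containerDelta
	}
	return e
}

func main() {
	cache := map[string]*podEntry{}
	seen := map[string]bool{}
	// two containers of the same pod in one refresh cycle
	for _, c := range []struct {
		pod   string
		delta float64
	}{{"p1", 2}, {"p1", 3}} {
		updatePodCache(cache, c.pod, c.delta, !seen[c.pod])
		seen[c.pod] = true
	}
	fmt.Println(cache["p1"].CPUTimeDelta) // 5
}
```

Without the reset, a stale delta from the previous refresh would be carried over into the new cycle's sum.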
Missing tests. Could you please add tests in a follow-up PR?
@@ -237,6 +289,22 @@ func (ri *resourceInformer) Refresh() error {
	ri.vms.Running = vmsRunning
	ri.vms.Terminated = vmsTerminated

	// Find terminated pods
	totalPodsDelta := float64(0)
we don't need to find totalPodsDelta, do we?
	// Resource usage tracking
	CPUTotalTime float64 // total cpu time used by the VM so far
	CPUTimeDelta float64 // cpu time used by the VM since last refresh
Suggested change:
-	CPUTimeDelta float64 // cpu time used by the VM since last refresh
+	CPUTimeDelta float64 // cpu time used by the Pod since last refresh
	Namespace string

	// Resource usage tracking
	CPUTotalTime float64 // total cpu time used by the VM so far
Suggested change:
-	CPUTotalTime float64 // total cpu time used by the VM so far
+	CPUTotalTime float64 // total cpu time used by the Pod so far
Merged commit 2e3afdf into sustainable-computing-io:reboot