Member

@voelzmo voelzmo commented May 12, 2025

How to categorize this PR?

/area auto-scaling
/area monitoring
/kind enhancement

What this PR does / why we need it:
This PR sets the minAllowed CPU value for prometheus-shoot to 150m (150 millicores). We have seen far too frequent evictions of prometheus-shoot instances caused by very small absolute CPU changes (e.g. ~20 changes in 24 hours, with the value moving between 70m and 130m in intermediate steps of ~20m and back again).
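
As a rough illustration of why a floor helps, the sketch below clamps a sequence of CPU recommendations to a minimum and counts how often the effective value changes. The function names (`applyMinAllowed`, `countChanges`) and the sample sequence are hypothetical, chosen only to mirror the 70m to 130m oscillation described above; this is not the VPA's actual code.

```go
package main

import "fmt"

// applyMinAllowed is a hypothetical helper: it raises a recommendation
// (in millicores) to the configured floor if it falls below it.
func applyMinAllowed(recommended, minAllowed int64) int64 {
	if recommended < minAllowed {
		return minAllowed
	}
	return recommended
}

// countChanges counts how often the effective (clamped) value differs
// from the previous one across a sequence of recommendations.
func countChanges(recommendations []int64, minAllowed int64) int {
	changes := 0
	for i := 1; i < len(recommendations); i++ {
		prev := applyMinAllowed(recommendations[i-1], minAllowed)
		curr := applyMinAllowed(recommendations[i], minAllowed)
		if curr != prev {
			changes++
		}
	}
	return changes
}

func main() {
	// Illustrative sequence only: 70m up to 130m in ~20m steps and back.
	recs := []int64{70, 90, 110, 130, 110, 90, 70}

	fmt.Println(countChanges(recs, 0))   // no floor: every step is a change
	fmt.Println(countChanges(recs, 150)) // floor of 150m: the value never moves
}
```

With a floor above the whole oscillation range, every recommendation in the sample maps to the same effective value, which is the trade-off this PR accepts: more reserved CPU in exchange for fewer changes.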

We want to address this upstream by adjusting how the VPA computes the upper and lower bounds around the target recommendation. In theory, these bounds are the mechanism that should prevent evictions for very small changes. In practice, however, the lowerBound and upperBound are far too close to the target, and sometimes even identical to it.

This PR is therefore a temporary fix that spends more resources in order to avoid overly frequent evictions.
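
The role of the bounds can be sketched as follows. This is a simplified model, assuming a pod only becomes an eviction candidate when its current request lies outside the range from lowerBound to upperBound; the real updater applies further conditions. The `bounds` type, the `outsideRange` function, and all numbers are hypothetical.

```go
package main

import "fmt"

// bounds is a hypothetical, simplified stand-in for a VPA CPU
// recommendation, with all values in millicores.
type bounds struct {
	lower, target, upper int64
}

// outsideRange reports whether the current request lies outside
// [lower, upper]. In this simplified model, that is what makes a pod
// a candidate for eviction.
func outsideRange(current int64, b bounds) bool {
	return current < b.lower || current > b.upper
}

func main() {
	current := int64(70) // current CPU request, illustrative

	// Bounds that hug the target: a 20m shift already leaves the range.
	tight := bounds{lower: 88, target: 90, upper: 92}
	// Wider bounds: the same shift stays inside the range.
	wide := bounds{lower: 50, target: 90, upper: 200}

	fmt.Println(outsideRange(current, tight)) // true
	fmt.Println(outsideRange(current, wide))  // false
}
```

When lowerBound and upperBound collapse onto the target, any change in the target places the current request outside the range, so the bounds no longer dampen small fluctuations.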

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:
The PR also adds 100M for memory to the minAllowed structure in order to keep the existing behavior. Currently, no values are defined for VPAMinAllowed, so it defaults to memory: 100M.

Release note:

Set minAllowed CPU to `150m` for prometheus-shoot to avoid frequent evictions

@gardener-prow gardener-prow bot added area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. labels May 12, 2025
@gardener-prow gardener-prow bot requested review from oliver-goetz and rfranzke May 12, 2025 16:24
@gardener-prow gardener-prow bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label May 12, 2025
Member

@istvanballok istvanballok left a comment

/lgtm

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label May 13, 2025
Contributor

gardener-prow bot commented May 13, 2025

LGTM label has been added.

Git tree hash: 4ccfc4c144fdf84c9acb3dd0002a6a55233cb51f

Member

@rfranzke rfranzke left a comment

/approve

Contributor

gardener-prow bot commented May 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gardener-prow gardener-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 13, 2025
@gardener-prow gardener-prow bot merged commit ad61d15 into gardener:master May 13, 2025
19 checks passed
Member Author

voelzmo commented May 13, 2025

/cherry-pick release-v1.118
/cherry-pick release-v1.117
/cherry-pick release-v1.116

@gardener-ci-robot
Contributor

@voelzmo: new pull request created: #12069

In response to this:

/cherry-pick release-v1.118
/cherry-pick release-v1.117
/cherry-pick release-v1.116

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@istvanballok
Member

The robot seems to process only one cherry-pick instruction per PR comment.

@istvanballok
Member

/cherry-pick release-v1.117

@istvanballok
Member

/cherry-pick release-v1.116

@gardener-ci-robot
Contributor

@istvanballok: new pull request created: #12079

In response to this:

/cherry-pick release-v1.117

@gardener-ci-robot
Contributor

@istvanballok: new pull request created: #12080

In response to this:

/cherry-pick release-v1.116
