-
Notifications
You must be signed in to change notification settings - Fork 9.8k
Add paginated feature to list rules api #14017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8617a47
to
9dfdb47
Compare
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
9dfdb47
to
b674217
Compare
I tagged people that have discussed #12120 as reviewers. |
@roidelapluie @codesome @gotjosh @juliusv could you have a look? Or at least let us know if you have noticed this PR, but you don't have time to get to it so that less qualified reviewers might give it a try? |
I won't have time to review this PR and I don't feel qualified enough for this part of the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally got around to taking a closer look at this one! Overall the approach looks pretty good to me, I left a number of questions and comments about the details though.
It's of course not perfect in the face of rule reloads happening while you paginate, but from reading #12120 I think that is understood.
Once the design + implementation settles, we'll also need https://github.com/prometheus/prometheus/blob/main/docs/querying/api.md#rules to be updated with documentation about the new functionality.
web/api/v1/api.go
Outdated
maxAlert, err := strconv.ParseInt(r.URL.Query().Get("maxAlerts"), 10, 32) | ||
if err != nil || maxAlert < 0 { | ||
return nil, &parsePaginationError{ | ||
err: fmt.Errorf("maxAlerts need to be a valid number and larger than 0: %w", err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it says that maxAlerts needs to be larger than 0, but the check above only checks that it's at least 0. Which one is intended?
If you want to allow 0, then maybe strconv.ParseUInt()
would be better. Then you can omit the explicit maxAlert < 0
check completely as that's covered by the function.
web/api/v1/api.go
Outdated
|
||
if r.URL.Query().Get("maxRuleGroups") != "" { | ||
maxRuleGroups, err := strconv.ParseInt(r.URL.Query().Get("maxRuleGroups"), 10, 32) | ||
if err != nil || maxRuleGroups < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved
web/api/v1/api.go
Outdated
@@ -1515,6 +1607,77 @@ func parseExcludeAlerts(r *http.Request) (bool, error) { | |||
return excludeAlerts, nil | |||
} | |||
|
|||
func parseListRulesPaginationRequest(r *http.Request) (*listRulesPaginationRequest, *parsePaginationError) { | |||
var ( | |||
returnMaxAlert = int32(-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency:
returnMaxAlert = int32(-1) | |
returnMaxAlerts = int32(-1) |
web/api/v1/api.go
Outdated
func parseListRulesPaginationRequest(r *http.Request) (*listRulesPaginationRequest, *parsePaginationError) { | ||
var ( | ||
returnMaxAlert = int32(-1) | ||
returnMaxRuleGroups = int32(-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably just use uint64 for these, even if we never expect to exceed an int32. Just feels more natural nowadays, avoids parsing issues when a user sets a ridiculously high limit, and you need to have one less cast (if changing to strconv.ParseUInt()
below.
web/api/v1/api.go
Outdated
parameter: "maxAlerts", | ||
} | ||
} | ||
returnMaxAlert = int32(maxAlert) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need two variable names for each parameter (e.g. maxAlert
and returnMaxAlerts
), at least after making the variable type adjustments I mentioned above. I'd just omit the ones with the return
prefix, set the three variables to their default values at the top, and then overwrite them directly from the parsed parameters. That's a bit simpler to read as well I think.
web/api/v1/api.go
Outdated
} | ||
|
||
func TruncateGroupInfos(groupInfos []*RuleGroup, maxRuleGroups int) ([]*RuleGroup, string) { | ||
resultNumber := 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For simplicity I'd probably omit this extra tracking variable and just use len(returnGroupDescs)
instead in both places where it's needed.
web/api/v1/api.go
Outdated
continue | ||
} | ||
|
||
// Return the next token if there is more aggregation group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Return the next token if there is more aggregation group | |
// Return the next token if there are more aggregation groups |
web/api/v1/api.go
Outdated
return nil, nil | ||
} | ||
|
||
func TruncateGroupInfos(groupInfos []*RuleGroup, maxRuleGroups int) ([]*RuleGroup, string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now reading this function again after adding the other comments about it, I think it'd make sense to write it in a different way that would make it simpler and more efficient:
- Do a special-case early exit for the
maxRuleGroups < 0 || len(groupInfos) <= maxRuleGroups
case (just returngroupInfos
as is and""
for the token). - Then: Instead of looping over all groups and appending them individually, just return
groupInfos[:maxRuleGroups]
for the rule groups and the token forgroupInfos[maxRuleGroups-1]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the issue is that we need to truncate the result of filtering the rule groups based on the file names and group names. There is a small edge case that blocks the optmized version: a list of rule groups for a page can be the last one or not depending if the subsequent rule groups are filtered out or not.
E.g.:
/rules?rule_groups[]=a1,b1,c1&maxAlerts=1
in the rule groups we have
- a1
- b1
- c1
- d1
When we are iterating over the rule groups, there is no way to know if c1 is going to produce a page with nextToke=""
before we evaluate d1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, thanks for the explanation!
web/api/v1/api.go
Outdated
@@ -168,6 +169,17 @@ type apiFuncResult struct { | |||
|
|||
type apiFunc func(r *http.Request) apiFuncResult | |||
|
|||
type listRulesPaginationRequest struct { | |||
MaxAlerts int32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd just stick to either int
or int64
for these fields here, less conversions / casts needed later on and no need to artificially limit the value range.
* Reduce number of variables * Reduce type convesion Signed-off-by: Raphael Silva <rapphil@gmail.com>
Addressing comments for pull request
* Remove maxAlerts parameter. * Reuse existing API responses by using omitempty in some fields Signed-off-by: Raphael Silva <rapphil@gmail.com>
* Eliminate the need to sort the rule groups. Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Ok, to simplify this PR I made the following changes:
Things that I'm not happy right now about this PR and would like input:
|
I added a bit more unit tests. |
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
45b39d7
to
b9c0c84
Compare
Signed-off-by: Raphael Silva <rapphil@gmail.com>
I ended updating the new parameters so that they are consistent with the existing ones where we use snake case instead of camel case. |
Thanks, I am on PTO but in between cocktails I will review (joking on cocktails, non sober merging is not a great practice 😜). |
Nice, at least rules API is snake case, so let's keep that? https://prometheus.io/docs/prometheus/latest/querying/api/#rules
Interesting, I think what you suggested make sense (return all from the token) |
Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
badfccf
to
3936f2a
Compare
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
de715ac
to
cd75574
Compare
@bwplotka Thanks for your review. The code looks a lot cleaner now and all the corners cases are being handled. |
Signed-off-by: Raphael Silva <rapphil@gmail.com>
docs/querying/api.md
Outdated
@@ -764,6 +764,8 @@ URL query parameters: | |||
- `file[]=<string>`: only return rules with the given filepath. If the parameter is repeated, rules with any of the provided filepaths are returned. When the parameter is absent or empty, no filtering is done. | |||
- `exclude_alerts=<bool>`: only return rules, do not return active alerts. | |||
- `match[]=<label_selector>`: only return rules that have configured labels that satisfy the label selectors. If the parameter is repeated, rules that match any of the sets of label selectors are returned. Note that matching is on the labels in the definition of each rule, not on the values after template expansion (for alerting rules). Optional. | |||
- `max_groups=<number>`: The `max_groups` parameter allows you to specify the maximum number of rule groups to return in a single response. If the total number of rule groups exceeds the specified `max_groups` value, the response will include a `nextToken` property. You can use the value of this `nextToken` property in subsequent requests in the `next_token` parameter to paginate over the remaining rule groups. The `nextToken` property will not be present in the final response, indicating that you have retrieved all the available rule groups. Please note that there are no guarantees regarding the consistency of the response if the rule groups are being modified during the pagination process. If a rule group is removed while you are paginating over the rule groups, an error might be raised if the removed rule group coincides with the next token. | |||
- `next_token`: the pagination token that was returned in previous request when the `max_groups` property is set. The pagination token is used to iteratively paginate over a large number of rule groups. To use the `next_token` parameter, the `max_groups` parameter also need to be present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if such token will no longer exists or will be wrong? (400 will be returned, looking on code, so should we document that?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
web/api/v1/api.go
Outdated
@@ -1,5 +1,5 @@ | |||
// Copyright 2016 The Prometheus Authors | |||
// Licensed under the Apache License, Version 2.0 (the "License"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an incidental change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argh. fixing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, thanks for improvements and sorry for delays.
I want to double check if we want to rename those params to:
group_limit
group_next_token
This is for consistency with limit
params on other APIs. The group
prefix is to be clear we paginate per group not rule, and to be future proof if we ever want to add per rule pagination. WDYT? 🤗
web/api/v1/api.go
Outdated
if paginationRequest != nil && paginationRequest.NextToken != "" && !foundToken { | ||
err := fmt.Errorf("invalid nextToken '%v'. were rule groups changed?", paginationRequest.NextToken) | ||
return invalidParamError(err, "next_token") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if paginationRequest != nil && paginationRequest.NextToken != "" && !foundToken { | |
err := fmt.Errorf("invalid nextToken '%v'. were rule groups changed?", paginationRequest.NextToken) | |
return invalidParamError(err, "next_token") | |
} | |
if paginationRequest != nil && paginationRequest.NextToken != "" && !foundToken { | |
return invalidParamError(fmt.Errorf("invalid nextToken %; did rule groups change?", paginationRequest.NextToken), "next_token") | |
} |
web/api/v1/api.go
Outdated
parsedMaxGroups int64 = -1 | ||
err error | ||
) | ||
maxGroups := r.URL.Query().Get("max_groups") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about group_limit
naming instead? This might be consistent with other limit parameters in our API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one counter argument is that limit
never supported tokens
/pagination, but maybe it does not matter and limit would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consistency for the win :) . Actually can we say the APIs that have limit
can support pagination in the future?
web/api/v1/api.go
Outdated
err error | ||
) | ||
maxGroups := r.URL.Query().Get("max_groups") | ||
nextToken := r.URL.Query().Get("next_token") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next_token
is great if we will never paginate over rules (: Are we sure about this? Otherwise maybe group_next_token
and group_limit
would be a nice, explicit pair?
web/api/v1/api.go
Outdated
for _, grp := range ruleGroups { | ||
if paginationRequest != nil && paginationRequest.NextToken != "" && !foundToken { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Code might actually simpler if we don't have a separate struct but groupNextToken
and groupLimit
variables 🙈 parse can return two vars or so (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will try to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. it saved a few lines :)
* Eliminate need for extra structs to store pagination parameters Signed-off-by: Raphael Silva <rapphil@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
Before merging I wanted to have one "last" discussion to double check the hardest thing in programming: naming (: https://cloud-native.slack.com/archives/C01AUBA4PFE/p1729235985519149 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good now besides the last comment!
Co-authored-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: Raphael Philipe Mendes da Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for prompt review changes after feedback; LGTM!
* master: (667 commits) NHCB scrape: refactor state handling and speed up scrape test (prometheus#15193) feat(tools): add debug printouts to rules unit testing (prometheus#15196) docs: add keep_firing_for in alerting rules api: Add rule group pagination to list rules api (prometheus#14017) Update scrape/scrape.go benchmark, rename parser omtext_with_nhcb goimports run Better docstring on test Remove omcounterdata.txt as redundant Fix failing benchmarks Add unit test to show that current wrapper is sub-optimal Rename convert_classic_histograms to convert_classic_histograms_to_nhcb More followup to prometheus#15164 Follow up prometheus#15178 Followup to prometheus#15164 test(cmd/prometheus): speed up test execution by t.Parallel() when possible feat: normalize "le" and "quantile" labels values upon ingestion scrape: provide a fallback format (prometheus#15136) Disallowing configure AM with the v1 api (prometheus#13883) feat: ProtobufParse.formatOpenMetricsFloat: improve float formatting by using strconv.AppendFloat instead of fmt.Sprint ... # Conflicts: # go.mod # go.sum
This is to implement the proposal here: #12120
There is 3 parameters added:
MaxAlerts: Limit number of alerts returned for each rules, if there is more alerts for a rule, we will return hasMore = true
MaxRuleGroups: Limit number of rule groups returned, if there is more rule groups to return, we will return a NextToken in the response
NextToken: The paginate token