-
Notifications
You must be signed in to change notification settings - Fork 74.8k
TF-TRT CombinedNMS: fix top_k parameter max value #47698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging @bixia1 for review and @DEKHTIARJonathan for visibility. |
// TRT has a limitation: top_k <=4096. | ||
if (top_k > 4096) { | ||
if (AllowNmsTopkOverride()) { | ||
top_k = 4096; | ||
keep_top_k = std::min(top_k, keep_top_k); | ||
} else { | ||
return errors::InvalidArgument( | ||
"TRT NMS plugin allow top_k<=4096, where top_k = max(num_boxes, " | ||
"max_total_size). You can override this by setting " | ||
"TF_TRT_ALLOW_NMS_TOPK_OVERRIDE=1 environment variable, but this can " | ||
"result in a loss of accuracy."); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you agree, but I think making the env for user to over the top_k value is more flexible. If that is what we will do, this code will look like this:
std::optional<int64> user_request_top_k = read from env
if (user_request_top_k) {
if (top_k <= user_request_top_k.value()) {
issue a VLOG message saying the user request is not used because it is bigger than the natural top_k.
} else {
issue a VLOG to tell the "natural top_k" and user requested top_k values, and the use requested one is use
top_k = user_requested_top_k.value();
keep_top_k = min(top_k, keep_top_k)
}
}
if (top_k > 4096)
return errors::InvalidaArgument(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe what we need here is a way for the user to opt-in for a conversion in case the TRT converted op is not strictly equivalent with the TF op. While your suggestion can be used to reach the same goal, I do not see the need for flexibly overriding top_k, and therefore I would choose the current simpler solution.
Why would the user want to override top_k with a concrete value? More importantly, how would he/she decide what value to use? The TF-TRT converter does this job: it sets a value for top_k based on the input parameters. If that value is not compatible with TRT, then we either skip conversion, or not (this case only with the users consent). If we do not skip conversion, then the next best thing (in terms of correctness of the results) is to set the max value allowed by the plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this comment.
Can this value affect performance? That is the only reason I can think of to allow users change the value.
Fixes #46453
The TRT plugin has a limitation on the max value of the top_k input parameter. This PR modifies the converter to refuse conversion if top_k > 4096.
In some cases it might be desirable to do the conversion, even if the top_k<=4096 restriction would lead to loss of accuracy. The user can set TF_TRT_ALLOW_NMS_TOPK_OVERRIDE=1 environment variable can be set to opt-in for conversion in this case.