
Caret object: Inconsistent grid creation with documentation #2609

@Rek27

Description

Problem: According to the documentation, the tree depth hyperparameter should be in the range 4-10 (optimal), and on CPU it can be any integer up to 16. The problem shows up in the function in catboost.caret that generates the tuning grid. For random search, depth is sampled from 1 to tuneLength, so anyone running random search with tuneLength > 16 can end up with depth values above 16 and get NaN as the metric value (in my case, Accuracy).

catboost.caret

...
$grid
function (x, y, len = 5, search = "grid")
{
  if (search == "grid") {
    grid <- expand.grid(depth = c(2, 4, 6), learning_rate = exp(-(0:len)),
      iterations = 100, l2_leaf_reg = 1e-06, rsm = 0.9,
      border_count = 255)
  }
  else {
    grid <- data.frame(depth = sample.int(len, len, replace = TRUE),
      learning_rate = runif(len, min = 1e-06, max = 1),
      iterations = rep(100, len), l2_leaf_reg = sample(c(0.1,
        0.001, 1e-06), len, replace = TRUE), rsm = sample(c(1,
        0.9, 0.8, 0.7), len, replace = TRUE), border_count = sample(c(255),
        len, replace = TRUE))
  }
  return(grid)
}
...
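
To see the effect in isolation, here is a minimal R sketch that repeats only the depth draw from the random-search branch above. `len` stands in for tuneLength, and the value 30 is an arbitrary choice for illustration, not something taken from my actual run.

```r
# Repeats only the depth draw from the random-search branch of $grid.
# `len` plays the role of tuneLength; 30 is an arbitrary illustrative value.
set.seed(1)
len <- 30

# Same call as in the grid function: integers drawn from 1..len.
depth <- sample.int(len, len, replace = TRUE)

# The draw is bounded by len, not by the documented CPU limit of 16.
print(range(depth))

# Candidates that exceed the documented limit.
too_deep <- depth[depth > 16]
print(length(too_deep))
```

Any candidate in `too_deep` would presumably be one of the rows that comes back with NaN.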

Shouldn't depth be capped at 16 at most, rather than depend on tuneLength?
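
As a rough sketch of what I mean, and not the library's actual code, the grid function could sample depth from a fixed range. The name `capped_grid` and the `max_depth` argument are hypothetical; everything else mirrors the function printed above, except that `border_count` is held constant.

```r
# Hypothetical variant of the $grid function with a capped depth range.
# `capped_grid` and `max_depth` are illustrative names, not part of catboost.
capped_grid <- function(x, y, len = 5, search = "grid", max_depth = 16L) {
  if (search == "grid") {
    # Grid branch unchanged from the original.
    grid <- expand.grid(depth = c(2, 4, 6), learning_rate = exp(-(0:len)),
                        iterations = 100, l2_leaf_reg = 1e-06, rsm = 0.9,
                        border_count = 255)
  } else {
    grid <- data.frame(
      # Draw depth from 1..max_depth instead of 1..len.
      depth = sample.int(max_depth, len, replace = TRUE),
      learning_rate = runif(len, min = 1e-06, max = 1),
      iterations = rep(100, len),
      l2_leaf_reg = sample(c(0.1, 0.001, 1e-06), len, replace = TRUE),
      rsm = sample(c(1, 0.9, 0.8, 0.7), len, replace = TRUE),
      # Constant 255 (in R, sample(c(255), ...) would draw from 1:255).
      border_count = rep(255, len)
    )
  }
  grid
}

# Random search with a large tuneLength still stays within the limit.
set.seed(42)
g <- capped_grid(NULL, NULL, len = 50, search = "random")
print(range(g$depth))
```

As a possible workaround until this is changed, one could copy the model list, replace its `grid` element with a function like this, and pass the copy as `method` to `caret::train`. That usage is an assumption on my side and is not verified against catboost 1.2.2.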

catboost version: 1.2.2
Operating System: Windows 10 x64
CPU: AMD Ryzen 5 PRO 5650U
GPU: not using
