352 questions
0 votes, 1 answer, 44 views
How do I increment a variable after each training iteration using the CatBoost Classifier in Python?
As the title says, I'm trying to increment a variable after each training iteration using the CatBoost Classifier, to update a progress bar in a GUI, and I can't seem to find anything about it on the ...
0 votes, 0 answers, 29 views
Error when using an ONNX model from CatBoost in MQL5
I have a Python script that uses CatBoost to create and save two models in ONNX format: a classification model and a regression model.
The first model contains direction (up/down) and probability, ...
1 vote, 1 answer, 51 views
Concatenating TF-IDF Data and Categorical Data for CatBoost Model
I've been trying to concatenate TF-IDF data with categorical data. However, when concatenating, the categorical data is automatically converted to float by default. Since CatBoost doesn't support ...
0 votes, 1 answer, 91 views
CatBoost error when loading pool from disk
I am creating a CatBoost pool from a pandas DataFrame (columns have strings as names, not sure if that's relevant) and then quantizing it and saving it to disk using this code:
import catboost as cb
...
0 votes, 0 answers, 76 views
Log and load a model with a third-party Maven library in MLflow
I have used catboost-spark from Maven, installed at the cluster level in Databricks, for my classification case (CatBoostClassifier).
I have to use Apache Spark libraries and not Python.
The issue I found ...
0 votes, 0 answers, 30 views
R CatBoost support for incremental training
Does anyone know when the R catboost.train function will support the init_model argument (for incremental training)? Is there a way I can contribute the code, if that's what it will take?
(The Python ...
0 votes, 0 answers, 30 views
Converting CatBoost 'approx' Scores to Probabilities for PMML Deployment
I'm deploying a CatBoost model for default prediction and need to output the probability of default using PMML. According to the CatBoost documentation, when using CatBoostClassifier to save a PMML ...
0 votes, 0 answers, 112 views
XGBoost and LGBM model size depends on training data size for a given set of params, whereas CatBoost's doesn't
I am comparing models in a walk-forward cross-validation setup, under Python 3.11. For a given set of hyperparameters, XGBoost and LGBM model sizes (when pickled or saved using the library saving ...
0 votes, 2 answers, 217 views
Using Optuna for CatBoost with batches: got nan on second trial
I am trying to tune CatBoost's hyperparameters using Optuna. I need to train my CatBoost model in batches, because the training data is too big.
Here is my code:
def expand_embeddings(df, embedding_col=...
0 votes, 1 answer, 188 views
More efficient way to stream data to AWS Batch Transform Job
I have a SageMaker process for training and running inference on data in SageMaker:
processing job: read input CSV files from S3 and clean up the data, output CSV files to S3
processing job: read in ...
0 votes, 1 answer, 102 views
CatBoost crashes when launching Optuna
I want to tune a CatBoost regressor using Optuna and a local GPU. The dataset is not very large: the training sample contains about 120k records and only 16 features (including categorical ones). I run ...
0 votes, 0 answers, 35 views
Converting Pandas DF to Spark Pool Data
I am trying to train a CatBoostClassifier model with catboost_spark, starting from a pandas DataFrame. All of the examples I've found create a data pool based on dummy data that uses Vector or VectorAssembler ...
1 vote, 0 answers, 61 views
Unable to initialize Spark CatBoostClassifier with parameters
I am trying to create a CatBoostClassifier using catboost_spark. In the regular CatBoost package, parameters such as learning_rate, loss_function, num_leaves, etc. can be included when creating the ...
0 votes, 0 answers, 44 views
DiscriminationThreshold() for CatBoost
visualizer = DiscriminationThreshold(estimator, is_fitted=True, exclude="queue_rate", random_state=42)
visualizer.fit(X_val, y_val)
visualizer.show()
When I use the above code with ...
1 vote, 1 answer, 563 views
Problem with importing `catboost` package
I was trying to install catboost, and everything was going well until I decided to upgrade to Python 3.12. After the upgrade, I encountered an error when I tried to import it:
numpy.dtype size changed,...