Description
pycaret version checks
- I have checked that this issue has not already been reported here.
- I have confirmed this bug exists on the latest version of pycaret.
- I have confirmed this bug exists on the master branch of pycaret (pip install -U git+https://github.com/pycaret/pycaret.git@master).
Issue Description
Hello! I checked and did not find this issue reported before; I'm sorry if I missed it.
Also, I apologize for my broken English.
My pipeline ran perfectly with the previous version of pycaret, but when I updated to pycaret 3.0.0 it stopped working.
To make it reproducible, here are the basic steps I followed:
I created a new venv with Python 3.10.10 in a local folder and installed the following requirements.txt, in this order:
pandas
numpy
sqlalchemy
scikit-learn
seaborn
matplotlib
datetime
streamlit
pyodbc
sqlalchemy<2.0
mlflow
xgboost
pycaret[full]
When I ran my pipeline, compare_models selected the CatBoost regressor as the best estimator by R2. I tuned the model, finalized it, and saved it to a "models" subfolder.
When I loaded the model with load_model(), it loaded correctly, but when I ran predict_model() I got this error:
"ValueError: If estimator is not a Pipeline, you must run setup() first."
I repeated the same steps, but instead of compare_models I used create_model("lightgbm"), and it worked fine with that algorithm, which makes me think the problem is specific to the CatBoost regressor.
Reproducible Example
# import libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from pycaret.regression import load_model, predict_model
from pycaret.regression import setup, compare_models, tune_model, plot_model, finalize_model, save_model, create_model, evaluate_model
# setup model
# (train, test and seed are not defined in this snippet; they come from the rest of the pipeline)
session = setup(
    data=train,
    target='Monto_EUR',
    log_experiment=True,
    use_gpu=False,
    session_id=seed,
    normalize=True,
    normalize_method='zscore',
)
# compare models
model = compare_models(exclude=['dummy','ada','en','lar','llar','lasso','rf','et'], sort='r2')
# tune model
model = tune_model(model, optimize='mape', n_iter=100, choose_better=True)
# finalize_model
model = finalize_model(model)
# save_model
save_model(model=model, model_name='models/predictor_model')
# load model
model = load_model('models/predictor_model')
# predict test
predictions = predict_model(estimator=model, data=test)
Expected Behavior
predict_model() should return a DataFrame with the prediction labels.
Actual Results
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_13632\4067720024.py in ()
1 # predict test
----> 2 predictions = predict_model(estimator=model, data=test)
c:\Users\alniquia\OneDrive - Telefonica\Documents\Projects\CalculadoraCostos\env\lib\site-packages\pycaret\regression\functional.py in predict_model(estimator, data, round, verbose)
1925 experiment = _EXPERIMENT_CLASS()
1926
-> 1927 return experiment.predict_model(
1928 estimator=estimator,
1929 data=data,
c:\Users\alniquia\OneDrive - Telefonica\Documents\Projects\CalculadoraCostos\env\lib\site-packages\pycaret\regression\oop.py in predict_model(self, estimator, data, round, verbose)
2219 """
2220
-> 2221 return super().predict_model(
2222 estimator=estimator,
2223 data=data,
c:\Users\alniquia\OneDrive - Telefonica\Documents\Projects\CalculadoraCostos\env\lib\site-packages\pycaret\internal\pycaret_experiment\supervised_experiment.py in predict_model(self, estimator, data, probability_threshold, encoded_labels, raw_score, round, verbose, ml_usecase, preprocess)
4925 pipeline.steps = pipeline.steps[:-1]
4926 elif not self._setup_ran:
-> 4927 raise ValueError(
4928 "If estimator is not a Pipeline, you must run setup() first."
4929 )
ValueError: If estimator is not a Pipeline, you must run setup() first.
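Since the error is raised in the branch taken when the estimator is not recognized as a Pipeline, it may help to look at what load_model() actually returned. The sketch below is only a debugging aid, not PyCaret's own check: `describe_estimator` and `_StandInPipeline` are hypothetical names, the step names are made up, and the helper is duck-typed (it only looks for a `steps` attribute and lists the class hierarchy) so that it runs without PyCaret or scikit-learn installed.

```python
def describe_estimator(obj):
    """Summarise an object (e.g. the result of load_model()) for debugging.

    Assumption: a scikit-learn-style pipeline exposes a ``steps`` list of
    (name, step) pairs. This does NOT reproduce PyCaret's isinstance check;
    it only reports what the object looks like.
    """
    steps = getattr(obj, "steps", None)
    info = {
        # Fully qualified class name of the object itself.
        "type": f"{type(obj).__module__}.{type(obj).__qualname__}",
        # Every class in the inheritance chain, to see whether any
        # Pipeline class appears in it.
        "mro": [f"{c.__module__}.{c.__qualname__}" for c in type(obj).__mro__],
        "has_steps": isinstance(steps, list),
        "step_names": [],
        "final_step_type": None,
    }
    if info["has_steps"] and steps:
        info["step_names"] = [name for name, _ in steps]
        info["final_step_type"] = type(steps[-1][1]).__qualname__
    return info


class _StandInPipeline:
    """Hypothetical stand-in so the sketch runs without PyCaret installed."""

    def __init__(self, steps):
        self.steps = steps


# Made-up step names and objects, purely for illustration.
demo = _StandInPipeline([("scaler", object()), ("model", dict())])
summary = describe_estimator(demo)
print(summary["type"])
print(summary["step_names"], summary["final_step_type"])
```

In the reproducible example above, the call would be `describe_estimator(model)` right after `model = load_model('models/predictor_model')`. Comparing the output for the CatBoost model with the output for the LightGBM model might show whether the two saved objects differ in type.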
Installed Versions
System:
python: 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
executable: c:\Users\alniquia\OneDrive - Telefonica\Documents\Projects\CalculadoraCostos\env\Scripts\python.exe
machine: Windows-10-10.0.19044-SP0
PyCaret required dependencies:
pip: 23.0.1
setuptools: 60.10.0
pycaret: 3.0.0
IPython: 7.34.0
ipywidgets: 7.7.4
tqdm: 4.64.1
numpy: 1.23.5
pandas: 1.5.3
jinja2: 3.1.2
scipy: 1.9.3
joblib: 1.2.0
sklearn: 1.2.2
pyod: 1.0.9
imblearn: 0.10.1
category_encoders: 2.6.0
lightgbm: 3.3.5
numba: 0.56.4
requests: 2.28.2
matplotlib: 3.6.3
scikitplot: 0.3.7
yellowbrick: 1.5
plotly: 5.13.1
kaleido: 0.2.1
statsmodels: 0.13.5
sktime: 0.16.1
tbats: 1.1.2
pmdarima: 2.0.3
psutil: 5.9.4
PyCaret optional dependencies:
shap: 0.41.0
interpret: 0.3.2
umap: 0.5.3
pandas_profiling: 4.1.1
explainerdashboard: 0.4.2.1
autoviz: 0.1.58
fairlearn: 0.7.0
xgboost: 1.7.4
catboost: 1.1.1
kmodes: 0.12.2
mlxtend: 0.21.0
statsforecast: 1.5.0
tune_sklearn: 0.4.5
ray: 2.3.1
hyperopt: 0.2.7
optuna: 3.1.0
skopt: 0.9.0
mlflow: 1.30.0
gradio: 3.23.0
fastapi: 0.95.0
uvicorn: 0.21.1
m2cgen: 0.10.0
evidently: 0.2.7
fugue: 0.8.2
streamlit: 1.20.0
prophet: Not installed