
Conversation

@celestinoxp (Contributor) commented Mar 4, 2024

Some details have not been updated to support the latest scikit-learn 1.4 code:

  • confirm/update the code
  • fix errors
  • make sure the tests are testing the metrics correctly (do we need to create more tests?)

Closes #3932

@celestinoxp (Contributor Author)

@Yard1 @moezali1 @tvdboom @glemaitre @ogrisel @thomasjpfan @lorentzenchr @adrinjalali

Something is wrong with the AUC metrics... I have no idea how to fix this pull request...

[screenshot]

@ogrisel commented Mar 5, 2024

Could you please provide a minimal reproducer on synthetic data that ideally only involves scikit-learn? Working on crafting such a reproducer will likely help you understand what's going on.
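
For illustration, such a reproducer might look roughly like this (a sketch on synthetic data using only scikit-learn; the dataset, estimator, and scorer settings below are assumptions for the sake of the example, not code from this PR):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# scikit-learn 1.4: response_method replaces needs_proba / needs_threshold
auc_scorer = make_scorer(
    roc_auc_score,
    response_method=("decision_function", "predict_proba"),
)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, scoring=auc_scorer)
print(scores)

If the scorer misbehaves with plain scikit-learn estimators as well, the bug is on the scikit-learn side; if it only fails inside a PyCaret pipeline, the bug is more likely in PyCaret.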

@celestinoxp (Contributor Author)

> Could you please provide a minimal reproducer on synthetic data that ideally only involves scikit-learn? Working on crafting such a reproducer will likely help you understand what's going on.

from pycaret.datasets import get_data
juice = get_data('juice')
from pycaret.classification import *
exp_name = setup(data=juice, target='Purchase')
best_model = compare_models()

@celestinoxp (Contributor Author)

@ngupta23 can you help?

@@ -115,10 +116,11 @@ def __init__(
            if scorer
            else pycaret.internal.metrics.make_scorer_with_error_score(
                score_func,
-               needs_proba=target == "pred_proba",
-               needs_threshold=target == "threshold",
+               response_method=None,
@thomasjpfan commented Mar 12, 2024

If this is calling scikit-learn's make_scorer under the covers, then you can pass in the response_method directly here.

if target == "pred"
    response_method = "predict"
elif target == "pred_proba":
    response_method = "predict_proba"
else:  # threshold
    response_method = "decision_function"

...

else pycaret.internal.metrics.make_scorer_with_error_score(
    score_func,
    response_method=response_method,
    greater_is_better=greater_is_better,
    error_score=0.0,
)
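
As a self-contained illustration of that mapping (a rough sketch; the helper name and the assumption that target takes exactly the values "pred", "pred_proba", and "threshold" are mine, not taken from the PyCaret code):

from sklearn.metrics import make_scorer, roc_auc_score

def response_method_for(target):
    # Hypothetical helper: translate PyCaret's `target` value into the
    # `response_method` argument expected by scikit-learn 1.4's make_scorer.
    if target == "pred":
        return "predict"
    elif target == "pred_proba":
        return "predict_proba"
    else:  # "threshold"
        return "decision_function"

# e.g. an AUC-style metric scored on predicted probabilities
auc_scorer = make_scorer(
    roc_auc_score,
    response_method=response_method_for("pred_proba"),
    greater_is_better=True,
)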

@celestinoxp (Contributor Author)

I tested it, but it is still not working. logs.log shows:

2024-03-12 18:16:58,428:WARNING:C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\metrics.py:196: FitFailedWarning: Metric 'make_scorer(roc_auc_score, response_method=('decision_function', 'predict_proba'), average=weighted, multi_class=ovr)' failed and error score 0.0 has been returned instead. If this is a custom metric, this usually means that the error is in the metric code. Full exception below:
Traceback (most recent call last):
  File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\metrics.py", line 188, in _score
    return super()._score(
  File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 345, in _score
    y_pred = method_caller(
  File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 87, in _cached_call
    result, _ = _get_response_values(
  File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\utils\_response.py", line 210, in _get_response_values
    y_pred = prediction_method(X)
  File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\pipeline.py", line 341, in predict_proba
    Xt = transform.transform(Xt)
  File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\utils\_set_output.py", line 295, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\preprocess\transformers.py", line 233, in transform
    X = to_df(X, index=getattr(y, "index", None))
  File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\utils\generic.py", line 103, in to_df
    data = pd.DataFrame(data, index, columns)
  File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\frame.py", line 822, in __init__
    mgr = ndarray_to_mgr(
  File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 319, in ndarray_to_mgr
    values = _prep_ndarraylike(values, copy=copy_on_sanitize)
  File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 575, in _prep_ndarraylike
    values = np.array([convert(v) for v in values])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

  warnings.warn(

2024-03-12 18:16:58,428:WARNING:C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1561: UserWarning: Note that pos_label (set to 'MM') is ignored when average != 'binary' (got 'weighted'). You may use labels=[pos_label] to specify a single positive class.
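
For what it's worth, the ValueError at the bottom of that traceback is the generic NumPy/pandas error raised when an array is built from ragged rows; a tiny illustration of the same message (my guess at what to_df is hitting, not a confirmed diagnosis):

import numpy as np

rows = [np.zeros(3), np.zeros(2)]  # rows of different lengths
try:
    np.array(rows)
except ValueError as exc:
    # "setting an array element with a sequence. The requested array has an
    # inhomogeneous shape after 1 dimensions. ..."
    print(exc)

So whatever predict_proba returns at that point seems to reach pandas as a ragged structure rather than a regular 2-D array.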

@celestinoxp (Contributor Author)

@thomasjpfan Can you help investigate whether the problem is in pycaret or in scikit-learn? I'm running tests on my laptop, but I'm not sure where the error is.

@thomasjpfan

I do not have the bandwidth to investigate.

@celestinoxp (Contributor Author)

> I do not have the bandwidth to investigate.

But can you talk to someone on the scikit-learn side for support?

@thomasjpfan commented Mar 13, 2024

You need to debug to see if it is a pycaret bug or a scikit-learn bug. If it is a scikit-learn bug, then open an issue with a minimal reproducer that only involves scikit-learn.

#3935 (comment) is not a valid reproducer for scikit-learn because it is still using pycaret.

@celestinoxp (Contributor Author)

@Aloqeely can you help fix the bugs in pycaret?

@Aloqeely

Sorry, I am not familiar with PyCaret. Good luck!

@moezali1 moezali1 requested a review from Yard1 April 25, 2024 19:47
Yard1 added 3 commits April 27, 2024 21:03
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
@Yard1 Yard1 changed the title [WIP] fix auc metric Fix AUC metric Apr 28, 2024
@Yard1 Yard1 merged commit 9ee0cf4 into pycaret:master Apr 28, 2024
@CMobley7 commented Aug 1, 2024

@Yard1, this problem appears to still exist for multiclass classification. If you use the simple example below, 7 of the 16 models return 0.0000 for AUC: lr, qda, lda, gbc, ada, ridge, and svm. Also, a custom metric like the one below works for binary classification if you add **kwargs, as I suggested in #3973, but gives 0.0000 for multiclass classification. I think this may be related to the same issue.

Simple Example

from pycaret.datasets import get_data
from pycaret.classification import ClassificationExperiment

data = get_data('iris')
exp = ClassificationExperiment()
exp.setup(data, target='species', session_id=123)
exp.compare_models()
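
To separate a possible PyCaret issue from a scikit-learn one, the multiclass AUC scorer can also be checked with scikit-learn alone (a sketch on synthetic data, following ogrisel's earlier suggestion; everything here is illustrative, and the scorer is simplified to predict_proba only while PyCaret's log above shows ovr / weighted with a fallback to decision_function):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import cross_val_score

# Synthetic 3-class data
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Multiclass AUC on probabilities (ovr / weighted, as in the warning above)
auc_scorer = make_scorer(
    roc_auc_score,
    response_method="predict_proba",
    multi_class="ovr",
    average="weighted",
)

print(cross_val_score(LogisticRegression(max_iter=1000), X, y, scoring=auc_scorer))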

Custom Metric

from pycaret.datasets import get_data
from pycaret.classification import ClassificationExperiment
from sklearn.metrics import fbeta_score

def f2_score(y_true, y_pred, **kwargs):
    """
    Calculate the F2 score.

    Args:
        y_true (1d array-like): The true labels.
        y_pred (1d array-like): The predicted labels.
        **kwargs: Additional arguments for fbeta_score.

    Returns:
        float: The F2 score.
    """
    return fbeta_score(y_true, y_pred, beta=2, **kwargs)

data = get_data('iris')
exp = ClassificationExperiment()
exp.setup(data, target='species', session_id=123)
exp.add_metric(id="f2", name="F2", score_func=f2_score, target="pred", average="macro")
exp.compare_models()

I also tried leaving average="macro" out of add_metric and updating f2_score to check whether y_true had more than 2 unique values; if it did, average="macro" was passed into fbeta_score along with **kwargs. This didn't fix the 0.0000 issue either.
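
For reference, that last variant would look roughly like this (my reconstruction of the description above, not the exact code that was run):

import numpy as np
from sklearn.metrics import fbeta_score

def f2_score(y_true, y_pred, **kwargs):
    # Only pass average="macro" when the problem is multiclass;
    # binary classification keeps fbeta_score's default average="binary".
    if len(np.unique(y_true)) > 2:
        kwargs.setdefault("average", "macro")
    return fbeta_score(y_true, y_pred, beta=2, **kwargs)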

@celestinoxp (Contributor Author)

@CMobley7 can you open a PR fixing this?

@paolodep36

This problem still exists, and it is very annoying.

Successfully merging this pull request may close these issues:

  • All result AUC = 0 with compare_model