Why do I get better results with libfm? #28

@ibayer

Why do I get better results with libfm?

Be careful if you use a regression model with a categorical target, such as the 1-5 star ratings of the MovieLens dataset.

libFM automatically clips the predicted values to the highest / lowest target value in the training data.
This makes sense if you predict ratings with a regression model and evaluate with RMSE.

For example, if the regression score is > 5, it's certainly better to predict a 5 star rating than to use the raw regression value.
With fastFM you have to do the clipping yourself, because clipping is not always a good idea.

But it's easy to do if you need it.

    # clip predictions to the range of the target values
    y_pred[y_pred > y_true.max()] = y_true.max()
    y_pred[y_pred < y_true.min()] = y_true.min()
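To see the effect of the clipping step above on RMSE, here is a minimal, self-contained sketch using only NumPy. The helper names `clip_to_training_range` and `rmse` and all the numbers are made up for illustration; they are not part of fastFM or libFM. The bounds are taken from the training targets, which is how the clipping is described above.

```python
import numpy as np


def clip_to_training_range(y_pred, y_train):
    """Clip predictions to the [min, max] range of the training targets."""
    return np.clip(y_pred, y_train.min(), y_train.max())


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical data: 1-5 star ratings and raw regression scores.
y_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_test = np.array([5.0, 1.0, 3.0, 4.0])
y_pred = np.array([5.6, 0.4, 3.2, 3.9])  # two scores fall outside [1, 5]

y_clipped = clip_to_training_range(y_pred, y_train)

# Clipping moves out-of-range scores to the nearest valid rating,
# which cannot increase the error when the true ratings lie in [1, 5].
print("RMSE raw:    ", rmse(y_test, y_pred))
print("RMSE clipped:", rmse(y_test, y_clipped))
```

`np.clip` returns a new array, so the raw predictions stay available if you want to compare both variants.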

Why do I not get exactly the same results with fastFM as with libFM?

FMs are non-linear models that use random initialization. This means that the solver might end up in a different local optimum if the initialization changes. We can use a random seed in fastFM to make individual runs comparable, but that doesn't help if you compare results between different implementations. You should therefore always expect small differences between fastFM and libFM predictions.
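The role of the seed can be illustrated with a small NumPy sketch. It is not the initialization code of fastFM or libFM; `init_factors` is a hypothetical helper that assumes the factor matrix is drawn from a zero-mean normal distribution. The same seed reproduces the same starting point, while a different seed (or a different random number generator, as in another implementation) gives a different one.

```python
import numpy as np


def init_factors(n_features, rank, init_stdev=0.1, seed=None):
    """Draw a (rank x n_features) factor matrix from N(0, init_stdev^2).

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, init_stdev, size=(rank, n_features))


V_a = init_factors(n_features=6, rank=2, seed=123)
V_b = init_factors(n_features=6, rank=2, seed=123)  # same seed as V_a
V_c = init_factors(n_features=6, rank=2, seed=456)  # different seed

same_seed_matches = np.array_equal(V_a, V_b)
different_seed_matches = np.array_equal(V_a, V_c)

print("same seed, same start:     ", same_seed_matches)
print("different seed, same start:", different_seed_matches)
```

In fastFM the seed is, to my knowledge, passed through the `random_state` argument of the estimators. Fixing it makes fastFM runs repeatable, but it cannot align fastFM's starting point with libFM's.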
