[R] Best iteration index from early stopping is discarded when model is saved to disk #5209

@DavorJ

These values are predicted after xgboost::xgb.train:
247367.2 258693.3 149572.2 201675.8 250493.9 292349.2 414828.0 296503.2 260851.9 190413.3

These values are predicted after saving the same model with xgboost::xgb.save and reloading it with xgboost::xgb.load:
247508.8 258658.2 149252.1 201692.6 250458.1 292313.4 414787.2 296462.5 260879.0 190430.1

They are close, but not identical. On a set of 25k samples, the differences between the two sets of predictions range from -1317.094 to 1088.859. Compared against the true labels, the MAE and RMSE of the two sets of predictions do not differ much.
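
To make the comparison concrete, here is a minimal base-R sketch that computes the element-wise differences for the ten predictions quoted above. The variable names and the `mae`/`rmse` helpers are hypothetical and are not part of xgboost. The 2^-23 reference value assumes predictions are held in single precision, which is an assumption rather than something established in this report.

```r
# The ten predictions quoted above, copied verbatim from this issue.
pred_trained <- c(247367.2, 258693.3, 149572.2, 201675.8, 250493.9,
                  292349.2, 414828.0, 296503.2, 260851.9, 190413.3)
pred_reloaded <- c(247508.8, 258658.2, 149252.1, 201692.6, 250458.1,
                   292313.4, 414787.2, 296462.5, 260879.0, 190430.1)

# Element-wise difference: in-memory model minus reloaded model.
d <- pred_trained - pred_reloaded
print(range(d))

# Largest difference relative to the size of the prediction.
rel_diff <- abs(d) / abs(pred_trained)
print(max(rel_diff))

# Reference point (assumption: single-precision storage): the relative
# spacing of 32-bit floats near 1 is 2^-23.
print(2^-23)

# Hypothetical helpers for scoring predictions against true labels.
mae  <- function(pred, label) mean(abs(pred - label))
rmse <- function(pred, label) sqrt(mean((pred - label)^2))
```

With the true labels in a vector, say `label`, calling `mae(pred_trained, label)` and `mae(pred_reloaded, label)` would reproduce the comparison described above. Putting `max(rel_diff)` next to `2^-23` gives a rough sense of whether the gap is on the scale of rounding alone.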

So I suspect this is due to rounding errors during save/load, since the MAE/RMSE barely differ. Still, I find this strange: storing the model in binary form should not introduce rounding errors, should it?

Does anyone have a clue?

PS: Uploading and documenting the training process does not seem important here. I can provide details if necessary, or build a simulation with dummy data to demonstrate the problem.
