Add example for other model (DeepScaleR-1.5B-Preview) and dataset in docs/guides/eval.md.
(The default eval config tests Qwen2.5-Math-1.5B-Instruct on AIME.)
Test and add eval link in the training guides: docs/guides/dpo|grpo|sft.md.
I don't think the training guides need a conversion guide, since the eval guide already has one.
Support Pass@1 accuracy averaged over n samples.
Link docs/guides/eval.md from the front-page README once all of the above are done.
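
For the Pass@1 item, here is a minimal sketch of what "averaged over n samples" could mean: for each problem, take the fraction of its n sampled answers that are correct, then average those fractions over all problems. The function name `pass_at_1_avg` and the nested-list input format are assumptions for illustration only, not the existing eval API.

```python
from typing import Sequence


def pass_at_1_avg(correct: Sequence[Sequence[bool]]) -> float:
    """Pass@1 averaged over n samples per problem (hypothetical helper).

    correct[i][j] is True if the j-th sampled answer to problem i is right.
    """
    if not correct:
        raise ValueError("need at least one problem")

    per_problem = []
    for samples in correct:
        if not samples:
            raise ValueError("each problem needs at least one sample")
        # Fraction of correct samples estimates Pass@1 for this problem.
        per_problem.append(sum(samples) / len(samples))

    # Average the per-problem estimates over the dataset.
    return sum(per_problem) / len(per_problem)


if __name__ == "__main__":
    # Two problems, two samples each: 1/2 and 2/2 correct.
    print(pass_at_1_avg([[True, False], [True, True]]))
```

With n = 1 this reduces to plain accuracy, so the default behavior would stay the same if n defaults to 1.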