Hi! In our experience, vLLM rollouts take 70% of the GRPO iteration time (when performed in BF16).
Has anyone tried a more aggressive precision reduction for rollouts (such as FP8) to gain speed? I wonder whether any workable methods exist for online FP8 quantization in the rollout phase.
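
For a rough sense of what is at stake, here is a back-of-the-envelope Amdahl's-law estimate. It is only a sketch: the 70% rollout share is the figure quoted above, while the rollout speedup factors are hypothetical placeholders, not measured FP8 results. The function name `overall_speedup` is made up for illustration.

```python
def overall_speedup(rollout_fraction: float, rollout_speedup: float) -> float:
    """Amdahl's-law estimate of the end-to-end iteration speedup.

    rollout_fraction: share of iteration time spent in rollouts (0..1).
    rollout_speedup:  factor by which the rollout phase alone gets faster.
    """
    if not 0.0 <= rollout_fraction <= 1.0:
        raise ValueError("rollout_fraction must be in [0, 1]")
    if rollout_speedup <= 0.0:
        raise ValueError("rollout_speedup must be positive")
    # Time outside rollouts is unchanged; rollout time shrinks by the factor.
    new_time = (1.0 - rollout_fraction) + rollout_fraction / rollout_speedup
    return 1.0 / new_time


ROLLOUT_FRACTION = 0.70  # figure quoted in this post (BF16 rollouts)

# Hypothetical rollout-only speedups, NOT measurements.
for factor in (1.25, 1.5, 2.0):
    total = overall_speedup(ROLLOUT_FRACTION, factor)
    print(f"rollouts {factor:.2f}x faster -> iteration {total:.2f}x faster")
```

Under these assumptions, a 2x faster rollout phase would give roughly a 1.54x faster iteration, and even an arbitrarily fast rollout phase would cap out at about 3.3x, since the remaining 30% of the iteration is untouched.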