Currently, the buffer size used for refit batching is controlled by a manually specified parameter, refit_buffer_size_gb.
We should instead compute the buffer size from the memory remaining at the time of the refit, and remove the parameter.
The buffer size would then adapt automatically to different GPUs and different models, and users would no longer hit OOM errors caused by an improper refit_buffer_size_gb setting.
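
A minimal sketch of the idea, not the project's actual implementation: derive the buffer size from the free GPU memory at refit time, then group tensors into batches that fit in it. The function names (`query_free_gpu_bytes`, `compute_refit_buffer_bytes`, `batch_by_buffer`) and the `usable_fraction` default of 0.3 are hypothetical and chosen only for illustration. The one real API used is `torch.cuda.mem_get_info`, which returns `(free, total)` in bytes.

```python
from typing import Iterable, List, Optional, Tuple


def query_free_gpu_bytes(device: Optional[int] = None) -> int:
    """Return the free memory (in bytes) on the given CUDA device."""
    # Imported lazily so the pure helpers below work without PyTorch.
    import torch

    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes


def compute_refit_buffer_bytes(free_bytes: int, usable_fraction: float = 0.3) -> int:
    """Take a fraction of the free memory as the refit buffer.

    usable_fraction is an assumed safety margin, not a value from the project.
    """
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    return int(free_bytes * usable_fraction)


def batch_by_buffer(
    sizes: Iterable[Tuple[str, int]], buffer_bytes: int
) -> List[List[str]]:
    """Greedily group (name, nbytes) pairs so each batch fits in buffer_bytes.

    A single tensor larger than the buffer is placed in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, nbytes in sizes:
        # Start a new batch when adding this tensor would exceed the buffer.
        if current and used + nbytes > buffer_bytes:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += nbytes
    if current:
        batches.append(current)
    return batches
```

At refit time this could be used as `buffer = compute_refit_buffer_bytes(query_free_gpu_bytes())`, so the batch size follows whatever memory is actually available on that GPU for that model instead of a user-supplied value.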