We are attempting to deploy this framework on Volta GPUs, which do not support Flash-Attn. I noticed there are Llama model variants without RmPad that don't require flash-attn. Is it possible to use those non-RmPad models? Are there any other modifications required related to the RmPad behavior?
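For context, this is roughly what we are hoping the non-RmPad path boils down to, assuming the framework ultimately loads the policy model through HuggingFace `transformers` (the model id and whether the framework actually exposes `attn_implementation` in its config are assumptions on our side):

```python
# Minimal sketch of loading a Llama model without flash-attn on Volta (sm_70).
# "eager" (or "sdpa") avoids the FlashAttention-2 kernel requirement.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",     # placeholder model id
    attn_implementation="eager",    # no flash-attn needed
    torch_dtype="auto",
)
```

If the non-RmPad models follow a similar path internally, we would expect this to work on Volta, but we are not sure whether other parts of the training loop still assume the RmPad (remove-padding) layout.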