We are attempting to deploy this framework on Volta GPUs, which do not support Flash-Attn. I noticed there are Llama model variants without RmPad that don't require flash-attn. Is it possible to use those non-RmPad models? Are there any other modifications required related to the RmPad behavior?
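For context, this is roughly what we are hoping the non-RmPad path boils down to, assuming the framework ultimately loads the policy model through HuggingFace `transformers` (the model id and whether the framework actually exposes `attn_implementation` in its config are assumptions on our side):

```python
# Minimal sketch of loading a Llama model without flash-attn on Volta (sm_70).
# "eager" (or "sdpa") avoids the FlashAttention-2 kernel requirement.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",     # placeholder model id
    attn_implementation="eager",    # no flash-attn needed
    torch_dtype="auto",
)
```

If the non-RmPad models follow a similar path internally, we would expect this to work on Volta, but we are not sure whether other parts of the training loop still assume the RmPad (remove-padding) layout.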