Description
Until a recent fix (#763), the CLI generate command
created a train_merlinite-7b-Q4_K_M_.....jsonl file containing duplicate instructions. This meant that
instead of the default 100 instructions, training was happening on thousands of instructions.
When training with 100 instructions as intended, the process is significantly faster, so we should update the docs to
reflect this.
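To check whether an existing training file was produced before the fix, one option is to compare the total number of records with the number of distinct records. The snippet below is a minimal sketch, not part of the CLI: the helper name `count_records` is hypothetical, and it treats each whole JSON line as one record so that it does not have to assume any particular field names in the file.

```python
import json
from collections import Counter
from pathlib import Path


def count_records(jsonl_path):
    """Return (total, unique) record counts for a JSONL file.

    Hypothetical helper: two lines count as duplicates when they parse
    to the same JSON object, regardless of key order.
    """
    counts = Counter()
    with Path(jsonl_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Serialize with sorted keys so key order does not matter.
            counts[json.dumps(record, sort_keys=True)] += 1
    return sum(counts.values()), len(counts)
```

Calling `count_records` on the generated train file and getting a total far above the unique count (for example, thousands of records but only about 100 distinct ones) would suggest the file still contains the duplicates described above.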
I've verified that following the Colab notebook with 100 items in the training set takes 5 minutes,
and that training on a Linux server (not a laptop) with no GPU and enough RAM takes 50 minutes (although num epochs defaults to 1 there vs 5 in the notebook).
These times come from a single run each, but they indicate that training durations are much lower than when they were documented.
I haven't tested on a Mac or Kaggle, but I assume the documented times for those can be scaled down as well.