I use this command to run evaluation:
tune run eleuther_eval --config eleuther_evaluation \
> tasks="[hellaswag, wikitext]" \
> model._component_=torchtune.models.llama3.llama3_8b \
> quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
> quantizer.groupsize=128 \
> checkpointer._component_=torchtune.training.FullModelTorchTuneCheckpointer \
> checkpointer.checkpoint_dir="/QAT/output/llama3-8B" \
> checkpointer.output_dir="/QAT/output/llama3-8B" \
> checkpointer.checkpoint_files=[meta_model_2-8da4w.pt] \
> checkpointer.model_type=LLAMA3 \
> tokenizer._component_=torchtune.models.llama3.llama3_tokenizer \
> tokenizer.path=/QAT/Meta-Llama-3-8B/original/tokenizer.model
But I get this:
2024-10-09:08:30:57,790 INFO [_logging.py:101] Running EleutherEvalRecipe with resolved config:
batch_size: 8
checkpointer:
_component_: torchtune.training.FullModelTorchTuneCheckpointer
checkpoint_dir: /QAT/output/llama3-8B
checkpoint_files:
- meta_model_2-8da4w.pt
model_type: LLAMA3
output_dir: /QAT/output/llama3-8B
device: cuda
dtype: bf16
enable_kv_cache: true
limit: null
max_seq_length: 4096
model:
_component_: torchtune.models.llama3.llama3_8b
quantizer:
_component_: torchtune.training.quantization.Int8DynActInt4WeightQuantizer
groupsize: 128
seed: 1234
tasks:
- hellaswag
- wikitext
tokenizer:
_component_: torchtune.models.llama3.llama3_tokenizer
max_seq_len: null
path: /QAT/Meta-Llama-3-8B/original/tokenizer.model
Traceback (most recent call last):
File "/usr/local/bin/tune", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/torchtune/_cli/tune.py", line 49, in main
parser.run(args)
File "/usr/local/lib/python3.10/dist-packages/torchtune/_cli/tune.py", line 43, in run
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/torchtune/_cli/run.py", line 196, in _run_cmd
self._run_single_device(args, is_builtin=is_builtin)
File "/usr/local/lib/python3.10/dist-packages/torchtune/_cli/run.py", line 102, in _run_single_device
runpy.run_path(str(args.recipe), run_name="__main__")
File "/usr/lib/python3.10/runpy.py", line 289, in run_path
return _run_module_code(code, init_globals, run_name,
File "/usr/lib/python3.10/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/recipes/eleuther_eval.py", line 576, in <module>
sys.exit(recipe_main())
File "/usr/local/lib/python3.10/dist-packages/torchtune/config/_parse.py", line 99, in wrapper
sys.exit(recipe_main(conf))
File "/usr/local/lib/python3.10/dist-packages/recipes/eleuther_eval.py", line 571, in recipe_main
recipe.setup(cfg=cfg)
File "/usr/local/lib/python3.10/dist-packages/recipes/eleuther_eval.py", line 494, in setup
for k, v in model_state_dict.items():
NameError: name 'model_state_dict' is not defined
I read the code at https://github.com/pytorch/torchtune/blob/main/recipes/eleuther_eval.py, and I cannot find where model_state_dict is defined. Is this a bug?
I have also tried this config file, but I get the same error:
model:
_component_: torchtune.models.llama3.llama3_8b
checkpointer:
_component_: torchtune.training.FullModelTorchTuneCheckpointer
checkpoint_dir: /QAT/output/llama3-8B/
checkpoint_files: [
meta_model_2-8da4w.pt
]
output_dir: /QAT/output/llama3-8B/
model_type: LLAMA3
# Tokenizer
tokenizer:
_component_: torchtune.models.llama3.llama3_tokenizer
path: /QAT/Meta-Llama-3-8B/original/tokenizer.model
max_seq_len: null
# Environment
device: cuda
dtype: bf16
seed: 42 # It is not recommended to change this seed, b/c it matches EleutherAI's default seed
# EleutherAI specific eval args
tasks: ["hellaswag"]
limit: null
max_seq_length: 8192
batch_size: 8
# Quantization specific args
quantizer:
_component_: torchtune.training.quantization.Int8DynActInt4WeightQuantizer
groupsize: 256
Can anyone help? Thanks very much!