I think we may need to generalize that BOS check. I found another place where it fails:
Lines 61 to 67 in b6269f7
message = tokenizer.apply_chat_template(
    [user_message],
    tokenize=False,
    add_generation_prompt=True,
    add_special_tokens=False,
)
user_message["token_ids"] = tokenizer(message, return_tensors="pt")["input_ids"][0]
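With the deepscaler base model, running eval adds a double BOS: the chat template already emits the BOS token, and the follow-up tokenizer call prepends another one. The chat template doesn't have any obvious indicator of this, because it's a more complicated Jinja template: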
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B/blob/main/tokenizer_config.json#L34
I think we may need some kind of apply_safe_chat_template(), or something similar that we reuse throughout the repo and that remembers the decision about how to handle this BOS token. Do you think you can look into a general fix?
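A rough sketch of what such a helper could look like, to make the idea concrete. This is only an assumption about the shape of the fix, not existing repo code: the name apply_safe_chat_template and its signature are hypothetical. Instead of inspecting the Jinja template, it tokenizes the rendered text as usual and then collapses a duplicated leading BOS, so it should not depend on how complicated the template is. It assumes a Hugging Face style tokenizer exposing apply_chat_template, __call__ and bos_token_id.

```python
from typing import Any, Dict, List


def apply_safe_chat_template(
    tokenizer: Any,
    messages: List[Dict[str, str]],
    add_generation_prompt: bool = True,
) -> List[int]:
    """Hypothetical helper: render the chat template, tokenize the result,
    and make sure the sequence starts with at most one BOS token."""
    # Render the template to text first, as the existing snippet does.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=add_generation_prompt,
    )
    # Tokenize with the tokenizer's default behavior (which may prepend BOS).
    token_ids = list(tokenizer(text)["input_ids"])

    bos_id = getattr(tokenizer, "bos_token_id", None)
    if bos_id is None:
        # Tokenizer has no BOS token, so there is nothing to deduplicate.
        return token_ids

    # Drop extra leading BOS tokens until at most one remains.
    # Assumption: a doubled leading BOS is never intentional.
    while len(token_ids) >= 2 and token_ids[0] == bos_id and token_ids[1] == bos_id:
        token_ids.pop(0)
    return token_ids
```

The snippet above could then call this helper and wrap the result in a tensor where needed. Checking the token ids on every call avoids having to cache a per-tokenizer decision, at the cost of a small amount of repeated work; a cached flag would be a reasonable alternative if that matters.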