This repository was archived by the owner on Jul 30, 2025. It is now read-only.
Hi, I see the mention of running this model on llama.cpp in the README. Did you manage to get it to run and quantize with good output? I'm trying to evaluate whether this model can be used for speculative decoding for Llama 2 7B.
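For context on what I mean by speculative decoding: the small model drafts a run of tokens cheaply, and the large target model verifies them, keeping the longest agreeing prefix plus one correction. A minimal toy sketch of that accept/verify loop (both "models" below are invented stand-ins, not TinyLlama or the llama.cpp API):

```python
def draft_propose(prefix, k):
    # Hypothetical cheap draft model: usually agrees with the target,
    # but (for illustration) gets the last proposed token wrong.
    out = [(prefix[-1] + i + 1) % 10 for i in range(k)]
    out[-1] = (out[-1] + 5) % 10
    return out

def target_next(prefix):
    # Hypothetical expensive target model: its ground-truth next token.
    return (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    # Draft k tokens, then verify them against the target in order,
    # keeping the agreeing prefix and substituting the target's token
    # at the first disagreement.
    accepted = []
    for tok in draft_propose(prefix, k):
        expected = target_next(prefix + accepted)
        if tok == expected:
            accepted.append(tok)      # target agrees: keep the draft token
        else:
            accepted.append(expected) # correct the mismatch and stop
            break
    return accepted
```

The point of the quality question above is that the more often the draft agrees with the target, the more tokens get accepted per target pass.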
```shell
python convert.py ../TinyLlama-1.1B-step-50K-105b/
./main -m ../TinyLlama-1.1B-step-50K-105b/ggml-model-f32.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -ngl 0 --temp 0
```
This results in the following output. Both the f16 and f32 conversions produce it, and adding an `<s>` token at the beginning didn't help either:
```
(...)
Building a website can be done in 10 simple steps:\nStep 1:12000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
(...)
```
Running with huggingface/torch gives a more reasonable result, although the output quickly becomes repetitive:
```
<s> Building a website can be done in 10 simple steps:
Step 1: Create a website.
Step 2: Add a logo.
Step 3: Add a contact form.
Step 4: Add a blog.
Step 5: Add a social media links.
Step 6: Add a contact page.
Step 7: Add a contact form.
Step 8: Add a contact form.
Step 9: Add a contact form.
```
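The repetition here looks like the usual failure mode of pure greedy decoding (`--temp 0`): once the argmax successors form a cycle, the same phrase repeats forever. A toy illustration (the successor table below is invented, not the model's actual logits):

```python
# Toy "argmax successor" table standing in for a model's greedy choices.
# Once decoding enters the cycle, greedy decoding can never leave it.
NEXT = {
    "Step": "N:",
    "N:": "Add",
    "Add": "a",
    "a": "contact",
    "contact": "form.",
    "form.": "Step",   # cycles back -> "Step N: Add a contact form." repeats
}

def greedy_decode(start, n):
    # Always pick the single highest-probability successor (temp 0).
    out = [start]
    for _ in range(n - 1):
        out.append(NEXT[out[-1]])
    return out
```

Sampling with a nonzero temperature or a repetition penalty is the usual way out of such loops, so the huggingface/torch result may just reflect greedy settings rather than a conversion bug.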