convert_hf_to_gguf.py : conversion from hf weights to Q6_0 #483

Nexesenex · 2025-06-02T01:48:02Z

This quantization script was obtained by a sort of "cross multiplication" between the Python code for q5_0 and the C code for q5_0 and q6_0. The Python code for the q6_0 conversion was then worked out by trial and error, with the help of a 7xB parameters AI model.
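For readers unfamiliar with the `_0` family, here is a minimal, unpacked sketch of what a q6_0-style block quantizer could look like in plain Python. It is not the code from this PR: the scale rule `max / -32` is an assumption made by analogy with q5_0's `max / -16`, the helper names are hypothetical, and the packing of the 6-bit codes into bytes is deliberately left out.

```python
import struct

QK6_0 = 32  # weights per block (assumed, same as q5_0)

def to_fp16(v: float) -> float:
    """Round a Python float to the nearest fp16 value."""
    return struct.unpack("<e", struct.pack("<e", v))[0]

def quantize_block(block):
    """Quantize 32 floats to (fp16 scale, list of 6-bit codes). Unpacked sketch."""
    assert len(block) == QK6_0
    vmax = max(block, key=abs)   # signed value with the largest magnitude
    d = vmax / -32.0             # assumed scale, by analogy with q5_0's max / -16
    inv_d = 1.0 / d if d else 0.0
    # Shift to [0.5, 64.5], truncate, and clamp to the 6-bit range [0, 63].
    codes = [min(63, int(v * inv_d + 32.5)) for v in block]
    return to_fp16(d), codes

def dequantize_block(d, codes):
    """Map 6-bit codes back to floats using the stored scale."""
    return [(q - 32) * d for q in codes]
```

With this scheme the value of largest magnitude maps to code 0, and every other weight lands within about one quantization step `|d|` of its original value after a round trip.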

It was an interesting experiment!

Tested on Llama 3.2 instruct 1B and Qwen 2.5 instruct 1.5B.
The bitrate of this q6_0 conversion is a straight 6.50 BPW.
Perplexity is equivalent (within ±0.5%) to a regular q6_0 quant made from an fp16 GGUF.
Inference works as intended in my Croco.cpp.
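The 6.50 BPW figure can be checked with a little arithmetic. The sketch below assumes a q6_0 block holds 32 weights as one fp16 scale, 16 bytes of low 4-bit parts and 8 bytes of high 2-bit parts; the constant names are illustrative, not taken from the script.

```python
QK6_0 = 32                 # weights per block (assumed)
SCALE_BYTES = 2            # one fp16 scale per block
LOW_BYTES = QK6_0 // 2     # low 4 bits of each weight, two weights per byte
HIGH_BYTES = QK6_0 // 4    # high 2 bits of each weight, four weights per byte

block_bytes = SCALE_BYTES + LOW_BYTES + HIGH_BYTES
bits_per_weight = block_bytes * 8 / QK6_0

print(block_bytes, bits_per_weight)  # 26 6.5
```

Under that layout each block takes 26 bytes, which gives 6.5 bits per weight when every tensor is stored as q6_0.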

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

convert_hf_to_gguf.py

Nexesenex added 2 commits June 2, 2025 03:38

Direct conversion from fp16 to Q6_0

441c035

forgotten comma

732607f

ikawrakow reviewed Jun 2, 2025

More precise infos

ce418b3

Nexesenex force-pushed the conv_q6_0 branch from 9fe4e52 to ce418b3 Compare June 2, 2025 11:50

ikawrakow approved these changes Jun 3, 2025

ikawrakow merged commit 4f8b05a into ikawrakow:main Jun 3, 2025

Nexesenex deleted the conv_q6_0 branch June 3, 2025 13:12
