Conversation

ariG23498 (Collaborator)
We fine-tuned unsloth/Llama-3.2-3B with Layer Skip SFT on the https://huggingface.co/datasets/WillHeld/top_v2 dataset. You can find the fine-tuned model here.

Benchmark Results:

Running the throughput benchmark script gives the following:

[------ Generation Speeds -------]
                     |  generation
16 threads: ----------------------
      no layer skip  |    522.1   
      layer skip 1   |    568.9   
      layer skip 2   |    353.0   
      layer skip 3   |    188.3   
      layer skip 4   |    170.6   
      layer skip 5   |    186.7   
      layer skip 6   |    203.2   
      layer skip 7   |    219.0   
      layer skip 8   |    235.3   
      layer skip 9   |    251.1   
      layer skip 10  |    260.7   
      layer skip 11  |    276.1   
      layer skip 12  |    291.9   
      layer skip 13  |    307.8   
      layer skip 14  |    323.1   
      layer skip 15  |    338.9   

Times are in milliseconds (ms).

With layer 4 as the early exit we get around a 67% reduction in generation latency (522.1 ms → 170.6 ms).
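For reference, the reported figure can be reproduced directly from the table above. This is a minimal sketch (helper name `latency_reduction` is ours, not from the benchmark script; the numbers are copied from the benchmark output):

```python
# Latencies in milliseconds, copied from the benchmark table above.
latencies_ms = {
    "no layer skip": 522.1,
    "layer skip 4": 170.6,  # fastest early-exit configuration in the sweep
}


def latency_reduction(baseline_ms: float, variant_ms: float) -> float:
    """Percentage reduction in generation latency versus the baseline."""
    return (1 - variant_ms / baseline_ms) * 100


reduction = latency_reduction(
    latencies_ms["no layer skip"], latencies_ms["layer skip 4"]
)
print(f"{reduction:.1f}% reduction")  # ~67.3% at layer 4
```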

We would love for the trl team to give us feedback on the technique. Any help or guidance would be appreciated.

CC: @mostafaelhoushi

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@mostafaelhoushi (Contributor) left a comment

Thanks Aritra! I just added some minor comments.
@qgallouedec (Member) left a comment

Nice, thanks for adding in the research projects!
@ariG23498 (Collaborator, Author)

Hi @qgallouedec, are we missing something? I am not sure whether the CI errors are caused by these changes; should I look into them further?

@qgallouedec qgallouedec changed the title [Research] Layer Skip SFT 🐇 [Research] Layer Skip SFT Mar 24, 2025
@qgallouedec (Member)

Merging, sorry for the delay.

@qgallouedec qgallouedec merged commit bfe2075 into huggingface:main Mar 24, 2025
8 of 13 checks passed
toslali-ibm pushed a commit to toslali-ibm/trl that referenced this pull request Mar 25, 2025
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
kashif pushed a commit to kashif/trl that referenced this pull request Mar 28, 2025
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025
Co-authored-by: Mostafa Elhoushi <m.elhoushi@ieee.org>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>