
Prevent finetuned models from always generating endlessly #734

@tomaarsen

Description

Hello!

I'm afraid I don't have a quick snippet for you to reproduce this, but I've noticed that various models that I've finetuned using SFT+RM+PPO & SFT+DPO endlessly generate texts until max_new_tokens is reached. This is quite frustrating as it always causes text to be cut off, and generally causes the generated text to be much longer than expected.

I'm wondering if you're familiar with this issue, and if you happen to know where it might lie? That is, whether the model fails to learn the stopping pattern well enough during training, or whether the generation config or the tokenizer's pad/eos tokens are set up incorrectly during inference.
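In case it helps narrow things down, here is a minimal sketch of what I understand the SFT data preparation should guarantee: every training example ends with the EOS token, so the model actually learns to emit it. The token ids below are made up for illustration, not from a real tokenizer.

```python
# Hypothetical sketch: ensure each tokenized SFT example ends with EOS,
# so the model is trained to predict EOS and can stop on its own.
EOS_TOKEN_ID = 2  # assumption: an example EOS id, not a real tokenizer's

def append_eos(input_ids, eos_token_id=EOS_TOKEN_ID):
    """Append EOS if the example does not already end with it."""
    if not input_ids or input_ids[-1] != eos_token_id:
        return input_ids + [eos_token_id]
    return input_ids

examples = [[5, 17, 42], [5, 17, 42, 2]]  # second one already ends with EOS
fixed = [append_eos(ex) for ex in examples]
assert all(ex[-1] == EOS_TOKEN_ID for ex in fixed)
```

If the data collator or chat template drops this EOS (or masks it out of the loss), the model would never be penalized for failing to stop, which would match the behavior I'm seeing.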

To give an example, I have an instruction dataset where answers very consistently range between ~10 and ~15 tokens. Finetuning on this dataset with SFT and/or SFT+RM+PPO has resulted in models that generate 50 tokens if you set max_new_tokens=50.
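To illustrate the symptom with a toy greedy decode loop (purely illustrative, no real model involved): if the model never predicts EOS, generation only ever stops when the max_new_tokens budget runs out.

```python
# Toy decode loop: generation stops either at EOS or at max_new_tokens.
EOS = 2  # assumption: made-up EOS id for illustration

def toy_generate(next_token_fn, max_new_tokens):
    out = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(out)
        out.append(tok)
        if tok == EOS:
            break
    return out

# A "model" that never emits EOS: always exhausts the budget (my symptom).
never_eos = toy_generate(lambda ctx: 7, max_new_tokens=50)
assert len(never_eos) == 50

# A "model" that emits EOS after ~12 tokens: stops early, as expected.
stops_early = toy_generate(lambda ctx: 7 if len(ctx) < 12 else EOS,
                           max_new_tokens=50)
assert len(stops_early) == 13 and stops_early[-1] == EOS
```

My finetuned models consistently behave like the first case, regardless of how short the training answers were.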

I'm open to any advice.

  • Tom Aarsen
