Bad dataset #65

@abacaj

If anyone is curious, here is my run on the Alpaca dataset using another decoder model (codegen-16B-nl). It appears the dataset isn't diverse: it contains multiple closely related answers. I believe a model trained on this dataset will not generalize well to new data.
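One rough way to check the "closely related answers" claim is to look for near-duplicate pairs among the answers. The snippet below is only a sketch under my own assumptions: it uses word-set Jaccard similarity, the function names and the threshold are hypothetical, and the sample answers are made up rather than taken from the Alpaca dataset.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def near_duplicate_pairs(
    answers: list[str], threshold: float = 0.8
) -> list[tuple[int, int, float]]:
    """Return (i, j, similarity) for every pair of answers at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(answers), 2):
        sim = jaccard(a, b)
        if sim >= threshold:
            pairs.append((i, j, sim))
    return pairs


# Made-up answers for illustration only; not taken from the Alpaca dataset.
sample_answers = [
    "The capital of France is Paris.",
    "The capital of France is Paris",
    "Photosynthesis converts light into chemical energy.",
]
duplicates = near_duplicate_pairs(sample_answers, threshold=0.6)
```

The pairwise comparison is quadratic in the number of answers, so on a full dataset it is probably more practical to run it on a random sample.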

The loss in the original Alpaca training script follows a pattern similar to the one used in OPT-IML, where the loss is computed based on the label.

image
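For anyone unfamiliar with computing the loss based on the label, here is a simplified sketch of the idea: prompt positions are masked out and only the response (label) tokens contribute to the cross-entropy. This is not the actual training script. The function names are hypothetical, the `-100` ignore value is an assumption that matches PyTorch's default `ignore_index` for cross-entropy, and the one-position shift that causal language models apply is left out for brevity.

```python
import math

IGNORE_INDEX = -100  # assumed marker for positions excluded from the loss


def masked_cross_entropy(logits: list[list[float]], labels: list[int]) -> float:
    """Mean cross-entropy over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt position: contributes nothing to the loss
        # Numerically stable log-sum-exp over the vocabulary.
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[label]  # negative log-probability of the target
        count += 1
    return total / count if count else 0.0


def build_labels(prompt_ids: list[int], response_ids: list[int]) -> list[int]:
    """Mask the prompt tokens so only the response tokens are scored."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)


# Toy example with a vocabulary of 3 tokens; all values are illustrative.
prompt_ids, response_ids = [0, 1], [2, 1]
labels = build_labels(prompt_ids, response_ids)
logits = [[0.0, 0.0, 0.0] for _ in labels]  # uniform predictions everywhere
loss = masked_cross_entropy(logits, labels)
```

With uniform predictions over 3 tokens, the loss on the two scored positions is `log(3)`, and the two masked prompt positions do not affect it.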

My run on codegen-16B-nl

image

Another user's run on LLaMA 7B

image

Some more discussion: https://twitter.com/abacaj/status/1637310768780648448
