2015aroras
Collaborator

We have all the intermediate checkpoints of our official runs in R2, but nobody can access them because the correct URLs are not published anywhere. This PR adds CSV files that map run steps to the corresponding 'directory' URLs. A user can download the relevant checkpoint files from these URLs.
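As a rough illustration of how such a mapping could be consumed, here is a minimal Python sketch. The column names `Step` and `Checkpoint Directory`, the sample rows, and the `example.invalid` URLs are assumptions made for the example, not the contents of the files in this PR; adjust them to match the actual CSV header.

```python
import csv
import io


def load_step_urls(csv_text, step_col="Step", url_col="Checkpoint Directory"):
    """Parse CSV text into a {step: directory_url} dictionary.

    The default column names are assumptions; pass the real header names
    if the CSV files use different ones.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row[step_col]): row[url_col].strip() for row in reader}


# Made-up sample rows standing in for one of the CSV files.
SAMPLE_CSV = """Step,Checkpoint Directory
0,https://example.invalid/OLMo-7B/step0-unsharded/
5000,https://example.invalid/OLMo-7B/step5000-unsharded/
"""

step_urls = load_step_urls(SAMPLE_CSV)
print(sorted(step_urls))   # the steps that have a published checkpoint
print(step_urls[5000])     # the 'directory' URL for step 5000
```

Looking up a step that is not in the mapping raises `KeyError`, which makes it obvious when a requested step has no published checkpoint.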

I was considering using Cloudflare redirects as an alternative solution. Specifically, we would have nice URLs like https://olmo-checkpoints.org/OLMo-7B/step5000/config.yaml. That solution still requires the user to know which step numbers are valid, so something like the CSV files I added would be necessary anyway. We can add redirects or other improvements later. For now, we should make the checkpoints available in some form.

@2015aroras 2015aroras requested review from dirkgr and epwalsh March 12, 2024 21:43
Member

@dirkgr dirkgr left a comment

This is great! Exactly what I thought we're missing.

Can you add a quick description somewhere that explains what exactly people will find in those directories, i.e., model.pt, optim.pt, and whatever else? Unfortunately those R2 URLs cannot be listed, so we have to describe it elsewhere. Or can we upload an index.html file to all of those directories?

Can we use http:// URLs in the OLMo code to load checkpoints? I thought this would not work if you can't list the directory, but now that I think about it, maybe it works?

@2015aroras
Collaborator Author

This is great! Exactly what I thought we're missing.

Can you add a quick description somewhere that explains what exactly people will find in those directories, i.e., model.pt, optim.pt, and whatever else? Unfortunately those R2 URLs cannot be listed, so we have to describe it elsewhere. Or can we upload an index.html file to all of those directories?

Can we use http:// URLs in the OLMo code to load checkpoints? I thought this would not work if you can't list the directory, but now that I think about it, maybe it works?

I'll check the http scenario. Regarding the description of directory contents, I did add a small bit to the README regarding the 4 files in a checkpoint dir. Did you have anything else in mind?

@dirkgr
Member

dirkgr commented Mar 12, 2024

Did you have anything else in mind?

No, that's all! Thanks!

@2015aroras
Collaborator Author

Running the OLMo-1B config with the load path set to an https URL worked fine (on a somewhat outdated branch). I've updated the README with a bit more detail about resuming from a checkpoint.

@2015aroras 2015aroras merged commit 0b835a8 into main Mar 13, 2024
@2015aroras 2015aroras deleted the shanea/expose-official-checkpoints branch March 13, 2024 20:28
@joellliu

joellliu commented Jun 10, 2024

Hi @epwalsh, it seems the URLs cannot be accessed now. For example, when I tried to download from the last link in OLMo-7B.csv, I got "error 404". Have the files expired, or have they been set to private? Thank you!

@2015aroras
Collaborator Author

The checkpoint directory URL itself cannot be accessed directly (it returns a 404), but the files within the directory can be. This is discussed briefly in the README.
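To make the distinction concrete, here is a minimal Python sketch that turns a 'directory' URL into a file URL and downloads it. The helper names and the `example.invalid` URL are hypothetical; the file names used (`config.yaml`, `model.pt`, `optim.pt`) are the ones mentioned earlier in this thread, and the README remains the reference for the full list.

```python
import urllib.request


def checkpoint_file_url(directory_url, filename):
    """Join a checkpoint 'directory' URL and a file name into a file URL.

    The directory URL alone is not fetchable; only the resulting file URL is.
    """
    return directory_url.rstrip("/") + "/" + filename


def download_checkpoint_file(directory_url, filename, dest_path):
    """Download one checkpoint file to dest_path (performs a network request)."""
    url = checkpoint_file_url(directory_url, filename)
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


# Hypothetical directory URL, used only to show the URL construction.
example_dir = "https://example.invalid/OLMo-7B/step5000-unsharded/"
print(checkpoint_file_url(example_dir, "config.yaml"))
```

Calling `download_checkpoint_file` with a directory URL taken from one of the CSV files and a file name such as `config.yaml` would fetch that single file; requesting the directory URL on its own is what produces the 404.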

@joellliu

The checkpoint directory cannot be directly accessed, but files within the directory can be. This is discussed briefly in the README

Oh I see. Thank you so much!

@mirandrom

Hi, thanks for making these available; really helpful for doing research on learning dynamics in LLMs.

I noticed there are gaps in the 7B (and to a lesser extent 1B) URLs.

Do these missing checkpoints exist and can they be made available? Thank you!
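For anyone who wants to enumerate such gaps, a minimal Python sketch follows. It assumes checkpoints are expected at a fixed step interval, which this thread does not confirm; the function name, the interval, and the sample steps are made up for illustration.

```python
def missing_steps(available_steps, interval):
    """List steps absent from available_steps, assuming a fixed interval.

    Only the range between the smallest and largest available step is
    checked; the interval is an assumption supplied by the caller.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    present = set(available_steps)
    if not present:
        return []
    first, last = min(present), max(present)
    return [s for s in range(first, last + 1, interval) if s not in present]


# Made-up step list with one gap at step 2000.
print(missing_steps([0, 1000, 3000, 4000], 1000))
```

The steps passed in would come from the step column of the relevant CSV file.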

Development

Successfully merging this pull request may close these issues.

How can I finetune Olmo-7B using this repo? Full OLMo checkpoints?