Description
What is the problem?
A common scenario for creating a derived dataset is to use a local clone (with data, possibly in some reckless clone shape) of a publicly available dataset, with the intention of then publishing the derived dataset publicly. When we do a local clone first, it records the local "url" within .gitmodules, requiring the user to change it to a public one before publishing. Not a big deal I guess, but still -- the user needs to know to do that, and how to do that (in an editor, or with the not immediately obvious
git config --file .gitmodules --replace-all submodule.{name}.url {public_url}
followed by save|commit, etc.).
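For reference, the manual fix described above can be sketched with plain git as follows. The submodule name sourcedata, the local path ../orig and the example.com URL are placeholders chosen for illustration, and the .gitmodules file is created in a scratch directory rather than in a real dataset.

```shell
set -eu
# Scratch directory standing in for the derived dataset (illustration only).
workdir=$(mktemp -d)
cd "$workdir"

# Simulate the entry a local clone would record: the url points to a local path.
git config --file .gitmodules submodule.sourcedata.path sourcedata
git config --file .gitmodules submodule.sourcedata.url ../orig

# Placeholder public URL (an assumption, not a real dataset location).
public_url="https://example.com/public/orig.git"

# The manual fix: overwrite the local url with the public one.
git config --file .gitmodules --replace-all submodule.sourcedata.url "$public_url"

# Show the resulting value.
git config --file .gitmodules --get submodule.sourcedata.url
```

In an actual dataset the edited .gitmodules would still need to be recorded, e.g. with datalad save or git commit, before publishing.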
I think we should provide some convenience to streamline such use cases.
Candidate approaches requiring development in DataLad:
- approach 1: add a new option
clone [--use-sibling-url [NAME]]
which would then work only for clones from local repos, and take their remote's URL (that of the tracked remote if no NAME is provided) as the url to use for submodule.{}.url
I guess original url would be recorded in submodule.{}.datalad-url
in a similar fashion to what is done for RIA remotes (only, AFAIK) in datalad/datalad/core/distributed/clone.py, line 357 in b4fafb0 (the ds.repo.call_git( call there).
Then it would be
datalad clone --use-sibling-url --reckless ephemeral ../orig sourcedata
... do whatever
datalad push|publish
- approach 2: make it possible to "add" a sibling which would give the current annex a reckless/ephemeral .git/annex/objects, thus possibly mutating what it already has. Then it would be
datalad clone -d . public_url sourcedata
datalad siblings add -n local --reckless ephemeral ...
... do whatever
datalad push|publish
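Until something like approach 1 exists, its intended effect on .gitmodules can be approximated with plain git. The following is only a sketch under assumptions: the repositories are empty stand-ins created in a scratch directory, the names orig, derived and sourcedata and the example.com URL are placeholders, and submodule.{}.datalad-url is used as proposed above, not as a confirmed convention for local clones.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the local clone of the public dataset, with a tracked remote.
git init -q orig
git -C orig remote add origin "https://example.com/public/orig.git"
branch=$(git -C orig symbolic-ref --short HEAD)
git -C orig config "branch.$branch.remote" origin

# Stand-in for the derived dataset whose .gitmodules recorded the local path.
mkdir derived
cd derived
src=../orig
git config --file .gitmodules submodule.sourcedata.path sourcedata
git config --file .gitmodules submodule.sourcedata.url "$src"

# Look up the remote tracked by the source's current branch, then its URL.
tracked=$(git -C "$src" config --get "branch.$branch.remote")
public_url=$(git -C "$src" remote get-url "$tracked")

# Record the public URL as the submodule url, and keep the local one
# under the proposed datalad-url key.
git config --file .gitmodules --replace-all submodule.sourcedata.url "$public_url"
git config --file .gitmodules --replace-all submodule.sourcedata.datalad-url "$src"

git config --file .gitmodules --list
```

A --use-sibling-url option would do the lookup and both writes at clone time, so the user would not have to touch .gitmodules at all.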
We also have ways to extend candidate urls via datalad-specific configuration in .datalad/config (e.g. as described in the handbook), but I think that is not directly relevant, since I think the question is one of fixing the submodule.{}.url entries in .gitmodules, which aren't usable anywhere but locally in such local reckless clone cases.
Maybe there are better approaches, which might not even require extending our API?