Skip to content

Conversation

rafvasq
Copy link
Contributor

@rafvasq rafvasq commented Apr 26, 2024

Changes

  • Adds an option --workspace-path to the init command
    • If passed, config.yaml is created there (the workspace directory is created if it doesn't exist)
  • Adds support to pass a workspace-path as an env var ILAB_WORKSPACE_PATH

Which issue is resolved by this Pull Request:
Resolves #729

Description of your changes:
Allows for a user to pass a path to ilab init for config initialization in that given directory.

Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Copy link
Member

@russellb russellb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one minor suggestion inline.

I also wonder about the most desirable UX here. It seems kind of weird to do:

ilab init --workspace-path myworkspace
cd myworkspace
ilab download
ilab ...

A slight improvement would be to allow --workspace-path to every command.

ilab init --workspace-path myworkspace
ilab download --workspace-path myworkspace
...

This next part could be a separate PR, because the implementation would apply to more than just this use case, but I want to call it out as it helps the UX further once the workspace option is available for all commands.

On top of that, it would be great if instead of only allowing the --workspace-path option if it also supported an environment variable. More on that is in "Values from Environment variables" here https://click.palletsprojects.com/en/8.1.x/options/

Now we have ...

export ILAB_WORKSPACE_PATH=/path/to/workspace
ilab init
ilab download
ilab serve

What is super nice here is it no longer matters what directory you run ilab from. Once you have a workspace set and you point to it in the env, you can use the ilab commands anywhere.

I think that'd be a pretty nice UX.

Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@rafvasq
Copy link
Contributor Author

rafvasq commented Apr 29, 2024

Thanks @russellb, I made some updates and opened an issue #1034 to capture the changes to the other commands. I could take those on too.

Comment on lines 147 to 150
if workspace_path:
if not exists(workspace_path):
os.mkdir(workspace_path)
os.chdir(workspace_path)
Copy link
Member

@russellb russellb Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another thought about the UX for a workspaces concept. I think the use of os.chdir() results in some surprising behavior. Take this for example:

ilab init --taxonomy-path relative/path/to/taxonomy --workspace-path path/to/workspace

Any time I specify a relative path, I would expect it to be interpreted relative to the working directory where I'm running the command.

For my example, I would expect ilab to take my relative path and write an absolute path into the config file in the workspace that points it to the right place. I think other surprises could happen when the tool is switching its working directory out by surprise.

In other words, while it's more code, I think less surprising behavior would be to not change working directories, but instead, do more careful handling of paths throughout using that workspace path value.

What do you think?

Copy link
Contributor Author

@rafvasq rafvasq May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a different route today, hopefully I didn't veer too far from what you described.

I assumed that it'd be cleaner to use a workspace directory by default and that the config.yaml and taxonomy should go in there unless directed otherwise. To that end there's an added prompt during initialization that confirms the workspace directory. I sum up a few cases and where things would go unless directed otherwise during the prompts:

ilab init

config.yaml -> ./workspace
taxonomy -> ./workspace
ilab init --workspace-path path/to/workspace

config.yaml -> path/to/workspace
taxonomy -> ./workspace/taxonomy
ilab init --taxonomy-path relative/path/to/taxonomy --workspace-path path/to/workspace

config.yaml -> path/to/workspace
taxonomy -> relative/path/to/taxonomy

I had to update tests to check for the config.yaml in the workspace dir and the extra initializing prompt for the workspace directory, now there's something failing. I'll hold off on figuring out what's going wrong there until you have a look at this different implementation, in case I'm completely off the mark and redo it all anyway :)

Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@mergify mergify bot added the testing Relates to testing label May 2, 2024
@rafvasq rafvasq requested a review from russellb May 2, 2024 20:35
@nathan-weinberg
Copy link
Member

@Mergifyio rebase

rafvasq added 3 commits May 3, 2024 18:44
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Copy link
Contributor

mergify bot commented May 3, 2024

rebase

✅ Branch has been successfully rebased

@mergify mergify bot added the needs-rebase This Pull Request needs to be rebased label May 8, 2024
Copy link
Contributor

mergify bot commented May 8, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @rafvasq please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@russellb
Copy link
Member

russellb commented May 9, 2024

just want to apologize for the long delay in coming back to this. I have been traveling the last 1.5 weeks or so.

Copy link
Member

@russellb russellb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. One comment inline is that I think I'd prefer it disabled by default and to retain the current behavior.

I also wonder if we should try to incorporate this general idea into some work that @cdoern started in #990. He's looking to do an overhaul of the CLI command layout. It would be good to map out how this feature would work in his new design.

DEFAULT_API_KEY = "no_api_key"
DEFAULT_CONFIG = "config.yaml"
DEFAULT_MODEL = "merlinite-7b-lab-Q4_K_M"
DEFAULT_MODEL_PATH = f"models/{DEFAULT_MODEL}.gguf"
DEFAULT_TAXONOMY_REPO = "https://github.com/instructlab/taxonomy.git"
DEFAULT_TAXONOMY_PATH = "taxonomy"
DEFAULT_TAXONOMY_PATH = f"{DEFAULT_WORKSPACE_PATH}/taxonomy"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer not to use the workspace feature by default. Most instructions tell people to do a similar thing manually, I believe.

@cdoern
Copy link
Contributor

cdoern commented May 10, 2024

my work in #990 intends to also create a centralized dir in ~/.config for the ilab config and convert to a .conf file type.

I am unsure of how this would interact with this PR. However, I am totally open to taking whichever approach people are more comfortable with

@russellb
Copy link
Member

my work in #990 intends to also create a centralized dir in ~/.config for the ilab config and convert to a .conf file type.

I am unsure of how this would interact with this PR. However, I am totally open to taking whichever approach people are more comfortable with

I think the ideas are complementary.

The idea in this PR would be to configure a workspace directory so that it didn't matter what directory you were in to run ilab, it would operate on the files in that workspace dir. I was just thinking we should make sure we don't go really far in this PR in any way that might conflict with what you've got planned.

I think it may be mostly a new config file option (and option to init) plus an optional arg to (almost?) all other commands. I think that probably carries over OK.

@nathan-weinberg
Copy link
Member

@rafvasq we have this as part of our 5/30 milestone, do you think it's something we could get done before then or should we push it out to later?

@rafvasq
Copy link
Contributor Author

rafvasq commented May 16, 2024

@nathan-weinberg I'll be travelling until mid-next week but based on @russellb and @cdoern's comments, most of the work is done and I should be able to complete it as soon as I'm back. I'm pretty confident it can get into the 5/30 milestone!

rafvasq added 4 commits May 24, 2024 13:33
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@mergify mergify bot added ci-failure PR has at least one CI failure and removed needs-rebase This Pull Request needs to be rebased labels May 24, 2024
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@mergify mergify bot removed the ci-failure PR has at least one CI failure label May 27, 2024
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@rafvasq rafvasq requested a review from russellb May 27, 2024 20:34
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
@rafvasq rafvasq changed the title feat: Create a workspace with lab init feat: Support creating a workspace with lab init May 27, 2024
@russellb
Copy link
Member

Apologies again for the delay ...

The current implementation is a good start. I see that ilab init takes a workspace path and when specified, it will ensure that the directory exists. I would like to get a more complete UX for the feature before we merge anything, though. I think it would be easiest to write out a design for the spec of the changes to CLI commands before implementing it. We can try doing that just in comments here, or if it gets too difficult, it can be a design doc posted to https://github.com/instructlab/dev-docs.

A starting point would be my original comment here: #1022 (review)

To summarize - once a workspace is created and configured, I'd like to see everything done relative to that workspace instead of the current working directory. Otherwise, it feels incomplete and not enough of an improvement over mkdir workspace ; cd workspace.

@nathan-weinberg nathan-weinberg removed this from the Release - 6/13 milestone Jun 14, 2024
@mergify mergify bot added the needs-rebase This Pull Request needs to be rebased label Jun 14, 2024
Copy link
Contributor

mergify bot commented Jun 14, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @rafvasq please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@leseb
Copy link
Contributor

leseb commented Jun 19, 2024

@rafvasq did you get a chance to propose a small design doc like highlighted here. Thanks!

@rafvasq
Copy link
Contributor Author

rafvasq commented Jun 19, 2024

@leseb I've now created an issue instructlab/dev-docs#100 but sadly I don't have time to commit to seeing this work past what was originally scoped and done in this PR so far.

@leseb
Copy link
Contributor

leseb commented Jun 21, 2024

@leseb I've now created an issue instructlab/dev-docs#100 but sadly I don't have time to commit to seeing this work past what was originally scoped and done in this PR so far.

Thanks for your effort on this and sorry for the delay. I think we should probably close this and wait for someone else to pick up. Do you mind? Thanks

@rafvasq rafvasq closed this Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase This Pull Request needs to be rebased testing Relates to testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support creating a workspace folder directly with lab init
6 participants