Skip to content

Conversation

nathan-weinberg
Copy link
Member

@nathan-weinberg nathan-weinberg commented Mar 14, 2024

Which issue is resolved by this Pull Request:
Resolves #515

Description of your changes:
edited from original: for context originally created lab lint command that linted CLI Python code via tox

  • Adds yaml linting to `lab check'
  • Adds tests for `lab check'
  • create-untracked() func can now write byte strings

Also fixes incorrect version pin for Labeler action

@nathan-weinberg
Copy link
Member Author

cc @deanwampler

@deanwampler
Copy link

Hi, is this linting the CLI's Python code? #515 was intended to implement linting of the qna.yaml YAML files, specifically any user-specified files. (I suppose it could do all of them if you don't specify a file or directory.) Make sense?

@nathan-weinberg
Copy link
Member Author

Hi, is this linting the CLI's Python code? #515 was intended to implement linting of the qna.yaml YAML files, specifically any user-specified files. (I suppose it could do all of them if you don't specify a file or directory.) Make sense?

Yup this is for the Python code but I see your point - let me see about extending this to cover linting those YAML files as well 👍

@nathan-weinberg
Copy link
Member Author

@ckadner I wonder if we might be able to turn the raw Shell code in that lint action in taxonomy to a shell script within that repo? The CI there could replace that snippet with calling the script and the CLI could use that for YAML linting, which would ensure any changes made to the YAML scheme in that repo would be seemlessly enforced by the CLI

@danmcp
Copy link
Member

danmcp commented Mar 15, 2024

Some potential concerns about linting the python code through lab lint:

  • This would blur the line between user and developer roles and would clutter up the non developer experience.
  • This would require tox be installed to use. Currently it's in the dev requirements.
  • There is an effort to move away from shelling out from lab

@nathan-weinberg
Copy link
Member Author

nathan-weinberg commented Mar 15, 2024

Some potential concerns about linting the python code through lab lint:

* This would blur the line between user and developer roles and would clutter up the non developer experience.

* This would require tox be installed to use.  Currently it's in the dev requirements.

* There is an effort to move away from shelling out from lab

I tried to address some of these concerns in my followup commit:

  1. Made the default behavior just to do basic YAML linting (for now) - I've kept in the Python linting because I think it's useful for CLI development but it's now behind a --dev flag so average users will not hit it accidentally
  2. As for tox/shelling out, deciding to use tox is a decision that predates me - if there is a way to do the linting by bypassing that I'm open to suggestions, but otherwise I'm not sure how we get around shelling it out (edit: One possible way around this could be calling pytest directly from the CLI code and bypassing tox entirely)

Other changes I made here:

  • Got green/red coloring working for both Python/YAML linting
  • Added yaml_rules.yaml based off current rules in taxonomy GitHub Action (can move this to that repo if it's preferred)
  • Added yamllint to requirements.txt

@deanwampler
Copy link

I recommend the following changes:

  1. Remove the python linting. I agree with the comments above that it confuses internal developer and end-user concerns. Add a make file for useful developer automation instead.
  2. Running yamllint helps, but it doesn't cover all the other yq commands that are also executed. The goal of Add a lab lint command #515 is to reproduce all linting steps. Hence, I recommend you refactor taxonomy/.github/workflows/lint.yml to move all the shell commands to a standalone shell script (e.g., like the existing testrun.sh) and call it from the PR workflow and from lab lint. Of course, the problem with this approach is it only works in *nix compatible shells. That's fine with me 😁.

@nathan-weinberg
Copy link
Member Author

I recommend the following changes:

1. Remove the python linting. I agree with the comments above that it confuses internal developer and end-user concerns. Add a `make` file for useful developer automation instead.

2. Running `yamllint` helps, but it doesn't cover all the other `yq` commands that are also executed. The goal of [Add a `lab lint` command #515](https://github.com/instruct-lab/cli/issues/515) is to reproduce _all_ linting steps. Hence, I recommend you refactor [taxonomy/.github/workflows/lint.yml](https://github.com/instruct-lab/taxonomy/blob/main/.github/workflows/lint.yml) to move all the shell commands to a standalone shell script (e.g., like the existing `testrun.sh`) and call it from the PR workflow and from `lab lint`. Of course, the problem with this approach is it only works in *nix compatible shells. That's fine with me 😁.

I'm fine with this approach, but I'm going to see if we can get some weigh-in from the project maintainers to ensure this is the direction we're all fine going with since it'll require removal of some functionality from this PR (we should probably then open a separate issue for creating a Makefile which I'm happy to address in a seperate PR) as well as a PR to taxonomy to make the changes to the lint action in that repo

@markstur
Copy link
Member

First impression is that it's very confusing because "lab lint" would be expected to be only for "users" not at all like tox lint/fmt for devs.

Looking deeper, I see you could hide some of that with --dev flag, but let's go back to "it's confusing to users" and just not do that. Devs can use tox.

Nice work (thank you), but feels a little misplaced for the user experience when mixing in dev stuff.

@markstur
Copy link
Member

markstur commented Mar 15, 2024

Also curious why we need a Makefile? I don't mind them, but I've seen them removed from projects and we have tox list for devs and lab for users. Need to get some consensus around the right tools for the project (folks keep adding good ideas :) but we are getting a lot of directions (pipx, ruff, make, ...)

There's some benefit in keeping it minimum defacto standards (and that could include make but ? )

note: I'm all for consensus building here, if I didn't convey that ^

@deanwampler
Copy link

That was only a suggestion. It would let a CLI developer type make lint to do the Python linting.

@hickeyma
Copy link
Member

@nathan-weinberg Should this be part of the lab check command that checks the knowledge/skills changes? I am confused what this has to do with the CLI python code and adding GH workflows?

@nathan-weinberg
Copy link
Member Author

@nathan-weinberg Should this be part of the lab check command that checks the knowledge/skills changes? I am confused what this has to do with the CLI python code and adding GH workflows?

CLI linting Python code came from me, but seems the general consensus here is to move away from that.

Not looking to add any workflows here, but I think what @deanwampler was saying in his original issue (feel free to step in Dean if I'm mistaken) was that he wanted a way to do the equivalent linting being done by the Taxonomy repo Lint action locally from the CLI.

@nathan-weinberg nathan-weinberg changed the title feat: implemented lab lint to make local linting easier feat: enhanced lab check to make local linting easier Mar 15, 2024
@nathan-weinberg
Copy link
Member Author

Updated the PR to remove the Python linting and move the basic YAML checking into the existing lab check command rather than introducing the new lab lint command

Copy link
Member

@markstur markstur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this more backwards compatible. Not sure this would break anyone/anywhere but I've been burned by that lately. It's not a protected API.

Along with that there's an unnecessary change in the KeyError exception that makes that error useful (I think) and isn't necessary to remove that.

@xukai92 xukai92 added this to the March 21 Musts milestone Mar 19, 2024
@nathan-weinberg nathan-weinberg requested a review from anik120 March 20, 2024 01:06
@nathan-weinberg
Copy link
Member Author

Squashed commits to ease rebasing

@aesteve-rh
Copy link
Contributor

I am no maintainer, but code LGTM, except for the linting issues that point to duplicated imports. Please fix those first.

Also, I'd suggest splitting the update of the labeler action to avoid mixing stuff. If the label is incorrect everyone may benefit of having it updated asap, and otherwise it gets mixed up and delayed by the other changes in this PR.

@nathan-weinberg
Copy link
Member Author

I am no maintainer, but code LGTM, except for the linting issues that point to duplicated imports. Please fix those first.

Also, I'd suggest splitting the update of the labeler action to avoid mixing stuff. If the label is incorrect everyone may benefit of having it updated asap, and otherwise it gets mixed up and delayed by the other changes in this PR.

Those are fixed now - this should be going in today so updating the labeler here should be fine. Thanks for reviewing!

@nathan-weinberg
Copy link
Member Author

nathan-weinberg commented Mar 21, 2024

cc @xukai92

Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, thanks for adding the tests @nathan-weinberg.

I think it'll help to keep only related changes in this PR (ie changes related to lab check). It'll help review the PR if the extra comments are removed, thank you!

@anik120
Copy link
Contributor

anik120 commented Mar 21, 2024

@nathan-weinberg could you un-resolve the comments you didn't address please. It's helpful to have the either have the conversation and let the commenter resolve, or ignore it too but let other reviewers say "this comment doesn't matter much so resolving" instead of the author resolving it themselves.

Copying the highest priority comment here that's been resolved without addressing:

anik120
I'm a little confused by this. I see the file added to the top level project directory. How does it work when I don't have this project cloned locally?
I was expecting the design for the yaml-rules files to be something like the design for prompt.txt file. When you lab generate, you'll see the message No prompt file found, using default prompts. The default prompt template is a hardcoded contanst. ie if you add a prompt file to your workspace (that you initialized with lab init), it overrides the default prompt template.
So I'm not exactly sure how the yaml-rules.yaml you added will be used if I do a fresh installation of the cli in my machine?

nathan-weinberg
I think you're misunderstanding what's happening here - where you've commented has nothing to do with that, this is just fixing the version set in #591

Yea my bad, I placed that comment in the wrong place (I wanna say it's because there were so many other unrelated changes like the code comments that I had to go through, that the review process got derailed, but for the sake of moving things along, let's go with "it's my bad")

But I hope my comment is clear. Let me phrase it in another way:

Quality testing this PR Failed.

$ git fetch downstream pull/621/head:nathan-test 
From https://github.com/instruct-lab/cli
 * [new ref]         refs/pull/621/head -> nathan-test
 $ git checkout nathan-test 
Switched to branch 'nathan-test'

$ pip install .
Processing /Users/anbhatta/DeepLearning/labrador/cli
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: click<9.0.0,>=8.1.7 in /Users/anbhatta/DeepLearning/labrador/venv/lib/python3.11/site-packages (from cli==0.0.1) (8.1.7)
Requirement already satisfied: click-didyoumean<0.4.0,>=0.3.0 in /Users/anbhatta/DeepLearning/labrador/venv/lib/python3.11/site-packages (from cli==0.0.1) (0.3.0)
Requirement already satisfied: llama-cpp-python[server]<0.3.0,>=0.2.55 in /Users/anbhatta/DeepLearning/labrador/venv/lib/python3.11/site-packages (from cli==0.0.1) (0.2.55)
Requirement already satisfied: rouge-score<0.2.0,>=0.1.2 in /Users/anbhatta/DeepLearning/labrador/venv/lib/python3.11/site-packages (from cli==0.0.1) (0.1.2)
.
.
.

$ cd ../test-workspace
$ lab list 
compositional_skills/writing/freeform/jokes/puns-copy/qna.yaml

$ lab check 
INFO 2024-03-21 13:36:19,745 generate_data.py:612 Found new taxonomy files:
INFO 2024-03-21 13:36:19,745 generate_data.py:614 * compositional_skills/writing/freeform/jokes/puns-copy/qna.yaml
ERROR 2024-03-21 13:36:19,745 generate_data.py:585 Exception [Errno 2] No such file or directory: 'yaml_rules.yaml' raised in taxonomy/compositional_skills/writing/freeform/jokes/puns-copy/qna.yaml
1 taxonomy files with errors! Exiting.

@anik120 anik120 added the hold In-progress PR. Tag should be removed before merge. label Mar 21, 2024
@anik120
Copy link
Contributor

anik120 commented Mar 21, 2024

Holding this PR for now @xukai92, might help to have another look, maybe I'm missing something in the setup that Nathans is suggesting here.

@nathan-weinberg
Copy link
Member Author

@anik120 All comments were addressed, not sure what you're referring to but seems you had some issues with your review - let's sync over this in-person

@anik120 anik120 removed the hold In-progress PR. Tag should be removed before merge. label Mar 21, 2024
@xukai92
Copy link
Member

xukai92 commented Mar 22, 2024

not sure why functional tests are failing. can you try rebasing with main?

@xukai92
Copy link
Member

xukai92 commented Mar 22, 2024

I did a rebase with main for you to see if this fixes tests.

Copy link
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the function test is failing coz seed_instructions_data is None here

I'm guessing it's coming from read_taxonomy here

@nathan-weinberg
Copy link
Member Author

Looks like the function test is failing coz seed_instructions_data is None here

I'm guessing it's coming from read_taxonomy here

I have found and fixed this issue (needed an additional with open(file_path, "r", encoding="utf-8") as file as seen on L553 otherwise the YAML load on L554 returns None despite the file being present and valid) - the current failure in GitHub Actions I do not know, seems completely unrelated to my change, but the formerly failing test_generate can be seen passing the functional tests locally:

+2024-03-22 12:21:29 ./scripts/functional-tests.sh:222: test_generate
+2024-03-22 12:21:29 ./scripts/functional-tests.sh:189: test_generate(): cat -
+2024-03-22 12:21:29 ./scripts/functional-tests.sh:199: test_generate(): sed -i -e 's/num_instructions:.*/num_instructions: 1/g' config.yaml
+2024-03-22 12:21:29 ./scripts/functional-tests.sh:202: test_generate(): timeout 5m lab generate --taxonomy-path simple_math.yaml
INFO 2024-03-22 12:21:32,938 lab.py:423 Generating model 'merlinite-7b-Q4_K_M' using 10 cpus, taxonomy: 'simple_math.yaml' and seed 'seed_tasks.json' against http://localhost:28711/v1 server
WARNING 2024-03-22 12:21:32,952 generate_data.py:544 Cannot find yaml_rules.yaml. Using default rules.
Cannot find prompt.txt. Using default prompt.
Synthesizing new instructions. If you aren't satisfied with the generated instructions, interrupt training (Ctrl-C) and try adjusting your YAML files. Adding more examples may help.
  0%|                                                                                                                                                                         | 0/1 [00:00<?, ?it/s]INFO 2024-03-22 12:21:32,958 generate_data.py:378 Selected taxonomy path 
Q> calculate the total cost when each book is $10.
I> <noinput>
3 items
A> The total cost is $30

Q> poll
find out the average age in a group consisting of three people aged 21,24,28 respectively.
I> 
A> The average age is 24.33 years old.

2it [01:06, 33.19s/it]                                                                                                                                                                              INFO 2024-03-22 12:22:39,335 generate_data.py:463 Generation took 66.40s
2it [01:06, 33.19s/it]
+2024-03-22 12:22:39 ./scripts/functional-tests.sh:205: test_generate(): ls -l generated/generated_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.json generated/test_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.jsonl generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.jsonl generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.jsonl generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.jsonl generated/train_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.jsonl generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.jsonl generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.jsonl generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.jsonl
-rw-r--r--. 1 nathan nathan 229 Mar 22 11:47 generated/generated_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.json
-rw-r--r--. 1 nathan nathan 806 Mar 22 12:09 generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.json
-rw-r--r--. 1 nathan nathan 596 Mar 22 12:18 generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.json
-rw-r--r--. 1 nathan nathan 581 Mar 22 12:22 generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.json
-rw-r--r--. 1 nathan nathan 552 Mar 22 11:47 generated/test_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.jsonl
-rw-r--r--. 1 nathan nathan 552 Mar 22 12:09 generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.jsonl
-rw-r--r--. 1 nathan nathan 552 Mar 22 12:18 generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.jsonl
-rw-r--r--. 1 nathan nathan 552 Mar 22 12:22 generated/test_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.jsonl
-rw-r--r--. 1 nathan nathan 304 Mar 22 11:47 generated/train_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.jsonl
-rw-r--r--. 1 nathan nathan 881 Mar 22 12:09 generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.jsonl
-rw-r--r--. 1 nathan nathan 750 Mar 22 12:18 generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.jsonl
-rw-r--r--. 1 nathan nathan 735 Mar 22 12:22 generated/train_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.jsonl
++2024-03-22 12:22:39 ./scripts/functional-tests.sh:206: test_generate(): jq '. | length'
+++2024-03-22 12:22:39 ./scripts/functional-tests.sh:206: test_generate(): tail -n 1
+++2024-03-22 12:22:39 ./scripts/functional-tests.sh:206: test_generate(): ls -tr generated/generated_merlinite-7b-Q4_K_M_2024-03-22T11_46_07.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_07_37.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_16_53.json generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.json
++2024-03-22 12:22:39 ./scripts/functional-tests.sh:206: test_generate(): cat generated/generated_merlinite-7b-Q4_K_M_2024-03-22T12_21_32.json
+2024-03-22 12:22:39 ./scripts/functional-tests.sh:206: test_generate(): '[' 2 -lt 1 ']'

Adds yaml linting to `lab check'
Adds tests for `lab check'
'create-untracked' func can now write byte strings

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
@nathan-weinberg nathan-weinberg requested a review from anik120 March 22, 2024 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a lab lint command
9 participants