Taxonomy reorg per dewey decimal classifications #1215
Beep, boop 🤖, Hi, I'm @instructlab-bot and I'm going to help you with your pull request. Thanks for your contribution! 🎉 I support the following commands:
Note: Results or errors of these commands will be posted as a pull request check in the Checks section below. Note: Currently only maintainers who belong to the [taxonomy-triagers, taxonomy-approvers, taxonomy-maintainers, labrador-org-maintainers, instruct-lab-bot-maintainers] teams are allowed to run these commands.
Nice!
If we could, I would prefer to keep the compositional_skills and knowledge folders free of readme files. That is, any change in these trees should be a contribution to the taxonomy rather than a doc improvement. The current readme in knowledge is annoying in this way :-)
Do you mean just the main parent folders (knowledge, compositional_skills, and foundational_skills)? Or do you not want readme files in the domain/subdomain folders either? We do have a docs folder, where we could put most of the info for the repo to keep the taxonomy tree folders clear. I do still need to "document" the taxonomy tree, which I was going to do in readme.txt files for the domains/subdomains.
My request would be to not have readme files anywhere under the taxonomy folders (knowledge, compositional_skills, and foundational_skills).
Agreed, but I would rather see that all together in a single readme file, since that would give the reader a broad view of the taxonomy organization rather than having them walk around the tree and encounter readme files occasionally.
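The request above (no readme files anywhere under the three taxonomy folders) could be checked mechanically. The following is only a minimal sketch under stated assumptions: the function name `find_readme_files` is hypothetical, it is not part of the repository's actual lint checks, and it treats any file whose base name is "readme" (case-insensitive, any extension) as a readme.

```python
from pathlib import Path

# The three taxonomy folders named in the discussion above.
TAXONOMY_ROOTS = ("knowledge", "compositional_skills", "foundational_skills")


def find_readme_files(repo_root):
    """Return sorted repo-relative paths of readme files under the taxonomy folders.

    Hypothetical helper: a file counts as a readme when its name without
    extension equals "readme", ignoring case (readme.md, README.txt, ...).
    """
    repo = Path(repo_root)
    found = []
    for root_name in TAXONOMY_ROOTS:
        root = repo / root_name
        if not root.is_dir():
            continue  # folder missing in this checkout; nothing to check
        for path in root.rglob("*"):
            if path.is_file() and path.stem.lower() == "readme":
                found.append(path.relative_to(repo).as_posix())
    return sorted(found)


if __name__ == "__main__":
    # Report offending files for the current working directory.
    for offender in find_readme_files("."):
        print(offender)
```

Readme files outside those three folders (for example in docs, or the repo's main readme) are deliberately ignored, which matches the suggestion to keep documentation in one place outside the tree.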
@jjasghar I have removed the readme files and updated the main repo's readme file for these changes. There might be some additional changes needed to verify that each qna.yaml file is either "grounded" or "ungrounded" and to put it in the matching subfolder. And, when we do start merging other knowledge contributions, we need to remember to add the document_type as the final node in the tree. I couldn't quickly find/identify any qna.yaml files to do that with. Please review my latest change here, and I'll let you do the honors of removing DRAFT! :)
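The grounded/ungrounded placement mentioned above can be sketched as a simple path check. This is an illustration only: `misplaced_qna_files` is a hypothetical name, and the rule that the folder directly containing qna.yaml must be named "grounded" or "ungrounded" is an assumption drawn from this comment, not a documented repository rule. The comment does not say which parts of the tree the rule covers, so the caller decides which paths to pass in.

```python
from pathlib import PurePosixPath

# Assumed leaf folder names, taken from the comment above.
ALLOWED_LEAVES = {"grounded", "ungrounded"}


def misplaced_qna_files(paths):
    """Return the qna.yaml paths whose parent folder is not grounded/ungrounded.

    Paths are plain strings with forward slashes; other file names are skipped.
    """
    misplaced = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.name != "qna.yaml":
            continue  # only qna.yaml files are subject to the check
        if path.parent.name not in ALLOWED_LEAVES:
            misplaced.append(raw)
    return misplaced
```

For example, a path such as compositional_skills/linguistics/writing/prose/debate/ungrounded/qna.yaml passes, while the same path without the ungrounded folder would be reported.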
Review thread (outdated, resolved): compositional_skills/linguistics/writing/prose/debate/ungrounded/qna.yaml
Branch updated from 082cfb2 to 62a19a4.
@bjhargrave can you confirm that https://github.com/instructlab/taxonomy/actions/runs/9751722026/job/26913897281?pr=1215 is the same thing as the /files directory?
Branch updated from a8baeec to 67b13c4.
Review thread (outdated, resolved): foundational_skills/linguistics/logical_sequence_of_words/qna.yaml
Review thread (outdated, resolved): compositional_skills/linguistics/grammar/basic_grammer_tests/ungrounded/qna.yaml
I recognize that unambiguous classification is an extremely complex task. Here are some of my thoughts:
Yes, you can always classify topics into different categories, and I think much will depend upon the specific knowledge being submitted. If the knowledge is about how a battery functions, then classifying it under physics might be best, but if the knowledge is about where batteries are used, then maybe it belongs in technology/electronics. I'm not sure that there is a way around this; we will simply have to make a judgement call as to where a piece of knowledge belongs.
The DDC has been around for nearly 150 years and is in its 23rd edition. It has 10 main classes, each subdivided into 10 divisions (100 in total), which are subdivided again into 1,000 sections. Please see this summaries doc: https://www.oclc.org/content/dam/oclc/dewey/resources/summaries/deweysummaries.pdf. It is meant to be a standard classification of all forms of knowledge. We will certainly run into some cases where we have to find a "best fit" for a piece of knowledge, but starting from the top 10 categories and their 10 subcategories each seemed to be the best starting point. As a secondary source to help us classify knowledge, we can look to Wikipedia to see how/where they placed things or to get additional ideas: https://en.wikipedia.org/wiki/Wikipedia:Contents. The InstructLab taxonomy will not be a direct 1:1 mapping of the DDC; the DDC is the starting point for finding a best fit for the topics of knowledge.
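To make the 10 / 100 / 1000 layering concrete, here is a small sketch that splits a three-digit DDC number into its three levels. The function name `ddc_levels` is hypothetical and the code shows only the numeric structure; it says nothing about how InstructLab names its folders, since the taxonomy is not a 1:1 mapping of the DDC.

```python
def ddc_levels(number):
    """Split a DDC number (0-999) into (main class, division, section) codes.

    The main class is the hundreds bucket, the division is the tens bucket,
    and the section is the full three-digit number, all zero-padded.
    """
    if not 0 <= number <= 999:
        raise ValueError("DDC summary numbers range from 000 to 999")
    main_class = number // 100 * 100  # e.g. 641 -> 600
    division = number // 10 * 10      # e.g. 641 -> 640
    return (f"{main_class:03d}", f"{division:03d}", f"{number:03d}")
```

Calling `ddc_levels(641)` yields `("600", "640", "641")`, so a topic can be placed first by main class, then narrowed by division and section until a best fit is found.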
I would put cooking and "culinary arts" in technology/food_and_drink.
This pull request has been automatically marked as stale because it has not had activity within 15 days. It will be automatically closed if no further activity occurs within 31 days.
Branch updated from 77b376f to 83e2575.
Branch updated from 83e2575 to 1de7328.
Branch updated from 94c2c09 to 4eca8c1.
Branch updated from d96158c to ca86951.
I have force-pushed changes, which include making the .gitignore files empty. Please don't push any merge commits to the PR.
I think it's ready to merge!
Branch updated from ca86951 to 4005d56.
- Reorganized the taxonomy domains and subdomains to align with the Dewey Decimal Classifications
- Update readme.md
- fixed lint error
- changed readme.md to readme.txt in two knowledge domains
- removed readme files; edited repo readme file
- Edited the main readme file to represent the bulk of the taxonomy restructuring. More work needs to happen on the main readme file and updating the docs, but this should do for now.
Signed-off-by: Michelle Corbin <corbinm@us.ibm.com>
Signed-off-by: BJ Hargrave <hargrave@us.ibm.com>
Co-Authored-By: JJ Asghar <awesome@ibm.com>
Co-Authored-By: Julia Denham <jdenham@redhat.com>
Co-Authored-By: Luke Inglis <luke.inglis@ibm.com>
Co-Authored-By: Kelly Brown <kelbrown@redhat.com>
Co-Authored-By: Olivia <ombuzek@us.ibm.com>
Branch updated from 4005d56 to eb0f0ce.
Updated the diagram because of #1215. /cc @mcorbin-ibm @juliadenham
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Co-authored-by: Costa Shulyupin <costa.shul@redhat.com>
Reorganized the taxonomy domains and subdomains to align with the Dewey Decimal Classifications