-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Add CRediT roles to JATS #10153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CRediT roles to JATS #10153
Conversation
You should add a "command test" to test the proposed feature, and to illustrate how it works. See |
Hi @jgm , I think I figured out making a test file. I have a few questions I bet you could answer very quickly but would take me a long time to figure out:
|
Usually we try to name them after the associated issue. So, if there's an open issue that this fixes you could use that number; otherwise you could use the PR's number, 10153.md. That makes it easy to figure out later why the test was added.
Instead of creating a separate module, I'd prefer if you just included this map in the JATS writer module itself. |
@jgm thanks for that feedback, I addressed both of your points. Now, the only thing left is how to make this dictionary (renamed to My first try was |
what happens when you try that? By the way, our usual convention for variable names would favor |
It seems the issue was the definition of
Switching the signature to use
|
I don't understand why you'd be getting this behavior. Nested Maps seem to work fine in ToContext: here I explore using
|
Btw, you can see what instances are defined for ToContext here: As you can see, it's only defined for Map with Text keys. ToContext typeclass is relevant because of the type of defField :: ToContext a b => Text -> b -> Context a -> Context a |
Are you saying that |
Yes! This is what I'm experiencing |
I'm a bit confused because originally you had a nested map and now it seems not to be nested. With a nested map, in which the value of |
I thought that the nested map was a bit over-complicated and unnecessary so I simplified it to a regular map. This means that the
ref: https://github.com/jgm/pandoc/actions/runs/11021333152/job/30608170051?pr=10153#step:9:1926 |
Ah, I see what you are trying to do. I don't think The way to do this is to resolve these names in the writer. Probably the simplest approach would be to use your Map to add a |
I believe that it may be useful to re-iterate this comment regarding this issue. |
Yes, overriding is actually important for people who don't want it to be in English! |
Thanks @jgm and @estedeahora for the feedback. I've opted to make the input YAML format a bit more explicit, so you have to say exactly what the CRediT name is. You also have the option to override, to support the multi-language things. I also improved the docs to give all of the different possibilities and added to the tests one for each. The tests (with narrative documentation) can be found in https://github.com/cthoyt/pandoc/blob/patch-1/test/command/10152.md. |
cc @sneakers-the-rat - your input is also greatly appreciated! |
It seems to me that this project is a significant and good modification for Pandoc. However, I don't seem to modify Pandoc without modifying the code in JATS.hs to generate Personally I think that if this change is not possible, it is better not to generate the change. If someone needs to implement it, they can read these discussions with alternative solutions with Lua filters and template modifications. |
Apologies @estedeahora, but I'm not quite sure what you're asking for. |
I see that in the last commits you opted to generate the changes by incorporating everything manually in the yaml header and then these in the template (leaving aside the modification of JATS.hs). This is a “correct” solution. However, as you pointed out in #10152 it is an inefficient solution because writing three times variations of the same information can generate errors. I think it would be a contribution to generate this in Haskell, but I don't know how to do it yet. On the other hand, I don't think it is advisable to add this functionality to Pandoc if this point is not solved, because it generates too many fields making it not very concise. As an alternative and provisional solution I think that users can use a lua filter and “inject” these fields through a dictionary. Then this can be placed in a custom template. I think our discussion provides elements for others to do this if they need to. |
It should be completely feasible to add the |
I am feeling like figuring out haskell and the specifics of this codebase might be too much of a lift for me at the moment. @jgm if you are willing to give a hand, then I would be very grateful. I had squashed the commits earlier, but just added back the name lookup dictionary in |
Here are some tips. In Text.Pandoc.Writers.Shared you'll find -- | Retrieve a field value from a template context.
getField :: FromContext a b => Text -> Context a -> Maybe b and -- | Reset a field of a template context. If the field already has a
-- value, the new value replaces it.
-- This is a utility function to be used in preparing template contexts.
resetField :: ToContext a b => Text -> b -> Context a -> Context a So the basic template for adding these fields to the addCreditNames :: Context a -> Context a
addCreditNames context =
case getField "roles" context of
Nothing -> context
Just roles -> resetField "roles" (map addCreditName roles) context
addCreditName :: M.Map Text Text -> M.Map Text Text
addCreditName rolemap =
case M.lookup "credit-name" rolemap of
Just _ -> rolemap
Nothing -> maybe id (M.insert "credit-name")
(M.lookup "credit-id" rolemap >>= flip M.lookup creditNames) Then you'd put |
I'm happy to give this a look, since I'm 100% in favour of getting more useful metadata into the journal ecosystem and getting CRediT roles into more JATS seems like a good thing. Probably won't have time to get properly into it until the weekend though! |
Do you mean this section? pandoc/src/Text/Pandoc/Writers/JATS.hs Lines 160 to 166 in 56fc7d5
I tried sticking it in the middle like let context = defField "body" main
$ defField "back" back
$ addCreditNames
$ resetField "title" title'
$ resetField "date" date
$ defField "mathml" (case writerHTMLMathMethod opts of
MathML -> True
_ -> False) metadata but this doesn't appear to work correctly - what's the right way to apply it? |
The thing you said doesn't appear to work should work. Each line in this pipeline has type |
Don't know if it's useful but I've now pushed up another commit with a small refactor and some doc comments |
@jezcope it looks like the recent commit you pushed broke CI, can you check that please? |
@cthoyt Oh yeah, I missed that warning, fixed now (and it makes the code clearer too, good compiler). I should have remembered to build with |
@tarleb this might also be interesting for you to take a look at |
changelog.md
Outdated
## UNRELEASED | ||
|
||
* JATS writer: | ||
|
||
+ Allow specification of author roles using the CRediT Taxonomy | ||
and export this information in conformant JATS (#10153, Charles | ||
Tapley Hoyt and Jez Cope). | ||
|
||
## pandoc 3.6.2 (2025-01-12) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't modify the changelog.md until right before a release. Ideally all the information that
would go in the changelog should be in your commit message. I'd also suggest squashing
your commits into a single one.
@@ -109,6 +109,22 @@ $elseif(author.name)$ | |||
$else$ | |||
<string-name>$author$</string-name> | |||
$endif$ | |||
$if(author.roles)$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this outer if
. If author.roles
is not defined, $for(author.roles)$...$endfor$
just won't do anything (iirc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I addressed this!
test/command/10152.md
Outdated
> | ||
Software | ||
</role> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checking: is it okay to have this whitespace around "Software"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the answer is yes, since I ran the output through a JATS validator, but just to be safe, I got rid of this whitespace
test/command/10152.md
Outdated
vocab-identifier="https://credit.niso.org/" | ||
vocab-term-identifier="https://credit.niso.org/contributor-roles/software/" | ||
vocab-term="Software" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if this blank line weren't produced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I had some difficulty figuring out how to make this work nicely with leading whitespace and newlines inside the if/else, so I reorganized the order to make sure we don't get ugly new lines
It looks good to me now. @tarleb do you want to have a look? You know more about JATS than I. |
@jgm I am working through getting the whitespace fixed, but overall the idea is done! |
2e71bcb
to
187d171
Compare
Whitespace issues addressed, and commits have all been squashed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I only wonder whether the role descriptions should get translation keys to make it easier to switch languages. JATS documents in languages other than English are probably rare enough, plus there exists a work-around, so I think this is fine for now.
It might also be good to report a warning if the credit role isn't known, but the current behavior seems ok, too.
Cool, thanks @tarleb. I've been wanting to contribute more to |
One more request: can you amend your commit message so that it includes a brief description of the change (wrapped to 80 columns) and |
Enable annotating author roles using the Contribution Role Taxonomy (CRediT) and export this information in conformant JATS Closes jgm#10152. Co-Authored-By: Jez Cope <457628+jezcope@users.noreply.github.com>
Thanks! |
I'm so excited for this to be merged! Thank you again @jgm and @jezcope for all of the excellent feedback and assistance and @estedeahora for the helpful discussion |
Closes #10152
Enable annotating author roles using the Contribution Role Taxonomy (CRediT) and export this information in conformant JATS (#10153, Charles Tapley Hoyt and Jez Cope).
Motivation
I'm motivated to add this to Pandoc since I want the Journal of Open Source Software (JOSS), which is built on top of Pandoc, to be able to create compliant JATS. We're already adding support for encoding this information in article metadata in parallel in openjournals/inara#87.
Demo
Now, you can specify an authors' role(s) in the metadata like this:
where the only required key inside each roles dictionary is
credit
. If that's there, then the primary English label can be automatically looked up and put intocredit-name
. If you want to override for internationalization purposes, you can usename
.The
degree
field is optional, and can be one ofLead
,Supporting
, orEqual
as defined in the JATS standard.Testing
The tests (with narrative documentation) can be found in https://github.com/cthoyt/pandoc/blob/patch-1/test/command/10152.md. Run these tests specifically with
cabal test --test-options="-p 10152.md"
I also put the 7 outputs in
10152.md
as XML files inside examples.zip. I tested all of them are valid/conformant JATS using the web-based tool at https://jats4r-validator.niso.org. The zip archive can be downloaded and you can run these tests yourself, if desired.Acknowledgements
huge shout-out to John for many rounds of helping me get comfortable with Haskell and the Pandoc codebase, and @jezcope for helping me get across the finish line. This PR would have only been a half-finished idea without them.