poc: typst output properties #9623
This is a proof of concept of output properties for Typst, allowing Lua filters to translate CSS attributes/properties to Typst properties.
The Typst writer searches for attributes of the form typst:target:attr, where target is "element" if the attribute should go to the element or "text" if the content should be wrapped in a text element with this attribute.
It is assumed that the value is raw Typst code suitable for insertion as a property value, e.g. strings should be quoted and markup should be bracketed.
The following cases are implemented:
- cell element
- table text
- block element
- span text
To be complete, each element which receives attributes would need to process both kinds of attribute, and generate the text element if needed.
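To make the naming scheme concrete, here is a minimal sketch of how attribute keys of the form typst:target:attr might be recognized and sorted by target. It is not the code in this PR: the names Target, parseTypstKey and propsFor are hypothetical, and plain String pairs stand in for Pandoc's attribute list.

```haskell
import Data.List (stripPrefix)

-- | Where a Typst property should be emitted (illustrative names, not from the PR).
data Target = OnElement | OnText
  deriving (Eq, Show)

-- | Parse a key of the form typst:target:attr into a target and a property name.
-- Keys that do not follow the scheme yield Nothing and are left alone.
parseTypstKey :: String -> Maybe (Target, String)
parseTypstKey key = do
  rest <- stripPrefix "typst:" key
  case break (== ':') rest of
    ("element", ':' : attr) | not (null attr) -> Just (OnElement, attr)
    ("text",    ':' : attr) | not (null attr) -> Just (OnText, attr)
    _                                         -> Nothing

-- | Collect the properties for one target, formatted as "name: value".
-- The value is passed through untouched because it is assumed to be raw Typst.
propsFor :: Target -> [(String, String)] -> [String]
propsFor target kvs =
  [ attr ++ ": " ++ val
  | (key, val) <- kvs
  , Just (t, attr) <- [parseTypstKey key]
  , t == target
  ]
```

Under these assumptions, propsFor OnText applied to the pairs ("typst:text:fill", "red") and ("id", "x") keeps only the first pair and renders it as "fill: red"; ordinary attributes such as id are ignored.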
Just to add a bit more context: In Quarto, we use the HTML reader to parse table elements and convert them to native Pandoc nodes. Now that Pandoc has support for "fancy" table attributes like rowspans and colspans, we've found that HTML is an attractive format for specifying tables in general, not only for HTML input/output. This is particularly true for tables emitted by code. As a result, we can leverage Pandoc's reader to allow libraries to emit rich HTML table input, and produce output in Docx, PDF, HTML, etc.
The feature in the PR would provide a path for table styling to be retargeted to Typst from HTML input in Quarto, but we think it would work equally well in pure Pandoc.
We've given a fair amount of thought to how the styles should be specified, and our decision to use attributes and special names was based on the way that the HTML writer treats HTML5 attributes differently from regular attributes:
pandoc/src/Text/Pandoc/Writers/HTML.hs Lines 684 to 690 in 758ff05
We felt that a simple prefix like |
I think the implementation can be cleaned up with helper functions to conditionally generate I counted 15 elements with attributes in the Pandoc documentation. Glad to iterate on this. |
I'm not really understanding the motivation for this: can you explain more fully? Is the idea to allow non-structural features of tables (e.g. cell coloring) to be transmitted from HTML to typst? Can you give an example or two? |
Yes, that is exactly the purpose. I specifically targeted this "test pattern" example of https://gt.rstudio.com/reference/data_color.html?q=color#foreground-text-and-background-fill Here is the Typst output, pretty close except it chose a different font from the list: |
So let me see if I understand correctly. The pandoc reader will parse an HTML table and include attributes, including |
Your understanding looks accurate to me. Just to add a bit: it's not only Lua filters that would benefit. This would work well with the following scenarios in addition to Lua filters:
Anyone targeting Pandoc's AST can emit attributes in a way that the Typst writer knows how to emit accurately, in the same way that the HTML writer knows how to differentiate HTML5 attributes from other attributes when it emits |
I see the utility of this, and it doesn't seem too harmful. It wouldn't really affect anyone who didn't explicitly add |
Great! I will work up a PR for the whole feature, with tests. The only side effect currently is that if you add these attributes and then render to HTML, they are emitted. I checked and these double-colon attributes are valid XML/HTML, and of course the browser ignores them. But you wouldn't add them unless you meant to output to Typst. Typst output should be identical if no |
I don't think it's bad that they are emitted in HTML/XML, as long as pandoc doesn't itself add them. (They will be irrelevant to anyone who doesn't go to the trouble of adding them explicitly with a filter.) I'm envisioning for now that this is just affecting the typst writer, not the reader. I'd have more qualms about it on the reader (partly because these weird attributes would be transmitted to HTML). |
let (textstart, textend) =
      (case formatAttrs $ pickTypstTextAttrs tabkvs of
         [] -> ("", "")
         tkvs -> ("#text" <> parens (literal (T.intercalate ", " tkvs)) <> "[", "]"))
More idiomatic Haskell would define a function, rather than textstart and textend.
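One way to read this suggestion is a single function that takes the formatted text properties and the rendered content, instead of a pair of opening and closing fragments. The sketch below is only an illustration: wrapInText is a hypothetical name, and it works on plain String, whereas the writer itself builds doclayout documents.

```haskell
import Data.List (intercalate)

-- | Wrap already-rendered Typst content in #text(...)[...] when there are
-- text properties, and return the content unchanged when there are none.
-- Hypothetical helper; String stands in for the writer's Doc type.
wrapInText :: [String] -> String -> String
wrapInText []    body = body
wrapInText props body =
  "#text(" ++ intercalate ", " props ++ ")[" ++ body ++ "]"
```

With this shape, the "no properties" case needs no special handling at the call site: wrapInText [] contents is just contents.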
also I don't really understand this part: why is a new #text being added? Shouldn't any typst attributes attached to the table go in table()?
  contents <- blocksToTypst blocks
- return $ "#block[" $$ contents $$ ("]" <+> lab)
+ return $ "#block" <> props <> "[" $$ contents $$ ("]" <+> lab)
here it's more idiomatic to use the doclayout function that puts things in brackets.
I'm not sure I understand the motivation for the |
Thanks for the feedback! Yes, I would also prefer not to have the For example, Each kind of element has its own namespace of properties? Vs CSS having a global namespace? I’m not entirely sure, tbh - wish they had a white paper which explained how it all works. |
I see. You could probably handle this by adding a
Might be worth considering something like
instead of
That makes more sense to me (particularly if it's implemented as suggested above, with a |
I like the set-rule and less nesting. I agree that is more idiomatic. There are a couple of places where this isn't possible:
I would always naturally prefer the terse and less repetitive option, even at the cost of inconsistency. But I'm open to ideas and glad to iterate. |
I'm very flexible about the naming of attributes. I do want to clarify that the current scheme is always 3-part and doesn't include the specific element name. E.g. So But using the specific element name is a possibility. I think I'll know better whether there are other naming considerations when I've worked through a few more cases. The separate namespace for text properties was a surprise and I wonder if there are other surprises. |
OK, thanks for the clarification. In that case what about using |
Sure! |
Current draft in #9648 |
We at Quarto would like to contribute a mechanism to Pandoc for Typst property output, allowing translation of CSS attributes/properties to Typst properties using Lua filters.
This is a proof of concept, partial implementation.
The Typst writer searches for attributes with names of the form typst:target:attr, where target is "element" if the attribute should go to the current element, or "text" if the content should be wrapped in a text element with this attribute.
It is assumed that the value is raw Typst code suitable for insertion as a property value, e.g. strings should be quoted and markup should be bracketed.
The following cases are implemented:
- cell element
- table text
- block element
- span text
To be complete, each element which receives attributes would need to process both kinds of attribute, and generate the text element if needed. So a complete implementation might include a couple dozen more such cases.
An example Lua filter offering partial translation of <table> font-family, font-size, and <td>/<th> color, background-color, and padding-*, is available here:
https://gist.github.com/gordonwoodhull/f6ea7d4b8a462da83ad90504a65bf3fe
There are various design possibilities, but this seems the simplest and most general, as it would allow round-tripping of Typst properties, and translation from Typst properties to CSS (if anyone wants those features in the future).
All suggestions are welcome!
cc @cscheid @tarleb