commonmark/gfm: -raw_html turns tables with <h3> tag into [TABLE]. #10407

@Atemu

Explain the problem.
Include the exact command line you used and all inputs necessary to reproduce the issue. Please create as minimal an example as possible, to help the maintainers isolate the problem. Explain the output you received and how it differs from what you expected.

I tried to convert an HTML blog post to Markdown, as I usually do for Lemmy posts, but this time the table turned into [TABLE].

I narrowed it down to a rather minimal reproducer:

https://pandoc.org/try/?params=%7B%22text%22%3A%22%3Ctable%3E%5Cn%3Ctbody%3E%5Cn%3Ctr%3E%5Cn%3Ctd%3Efoo%3C%2Ftd%3E%5Cn%3Ctd%3E%5Cn%3Ch3%3Eanything%3C%2Fh3%3E%5Cn%3C%2Ftd%3E%5Cn%3Ctd%3Ebar%3C%2Ftd%3E%5Cn%3Ctd%3Ebaz%3C%2Ftd%3E%5Cn%3C%2Ftr%3E%5Cn%3C%2Ftbody%3E%5Cn%3C%2Ftable%3E%5Cn%22%2C%22to%22%3A%22gfm-raw_html%22%2C%22from%22%3A%22html%22%2C%22standalone%22%3Afalse%2C%22embed-resources%22%3Afalse%2C%22table-of-contents%22%3Afalse%2C%22number-sections%22%3Afalse%2C%22citeproc%22%3Afalse%2C%22html-math-method%22%3A%22plain%22%2C%22wrap%22%3A%22auto%22%2C%22highlight-style%22%3Anull%2C%22files%22%3A%7B%7D%2C%22template%22%3Anull%7D
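
For anyone who would rather not open the link, the reproducer can be recovered by decoding the `params` query value, which is URL-encoded JSON. This is a minimal sketch using only the Python standard library; the helper name `decode_try_params` is made up for illustration, and the script only decodes the link, it does not run pandoc.

```python
import json
from urllib.parse import parse_qs, urlsplit

# The try-URL from this report, split across lines for readability.
TRY_URL = (
    "https://pandoc.org/try/?params="
    "%7B%22text%22%3A%22"
    "%3Ctable%3E%5Cn"
    "%3Ctbody%3E%5Cn"
    "%3Ctr%3E%5Cn"
    "%3Ctd%3Efoo%3C%2Ftd%3E%5Cn"
    "%3Ctd%3E%5Cn"
    "%3Ch3%3Eanything%3C%2Fh3%3E%5Cn"
    "%3C%2Ftd%3E%5Cn"
    "%3Ctd%3Ebar%3C%2Ftd%3E%5Cn"
    "%3Ctd%3Ebaz%3C%2Ftd%3E%5Cn"
    "%3C%2Ftr%3E%5Cn"
    "%3C%2Ftbody%3E%5Cn"
    "%3C%2Ftable%3E%5Cn"
    "%22%2C%22to%22%3A%22gfm-raw_html%22%2C%22from%22%3A%22html%22"
    "%2C%22standalone%22%3Afalse%2C%22embed-resources%22%3Afalse"
    "%2C%22table-of-contents%22%3Afalse%2C%22number-sections%22%3Afalse"
    "%2C%22citeproc%22%3Afalse%2C%22html-math-method%22%3A%22plain%22"
    "%2C%22wrap%22%3A%22auto%22%2C%22highlight-style%22%3Anull"
    "%2C%22files%22%3A%7B%7D%2C%22template%22%3Anull%7D"
)


def decode_try_params(url: str) -> dict:
    """Return the JSON object stored in the URL's `params` query value."""
    query = urlsplit(url).query
    # parse_qs percent-decodes the value; the result is a JSON string.
    raw_json = parse_qs(query)["params"][0]
    return json.loads(raw_json)


if __name__ == "__main__":
    params = decode_try_params(TRY_URL)
    print("from:", params["from"])
    print("to:  ", params["to"])
    print("input HTML:")
    print(params["text"], end="")
```

The decoded `from` and `to` values correspond to running `pandoc -f html -t gfm-raw_html` on the decoded HTML input.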

The cause is the <h3> tag. If you remove it, it works as expected.

markdown-raw_html also converts it fine. markdown_strict-raw_html does not.

That shouldn't happen, but what I found even more confusing is that pandoc didn't even print a warning while discarding a large amount of textual content. Loss of layout information is expected when converting between different formats, of course, but the content should never be changed or removed without a warning.

Pandoc version?
What version of pandoc are you using, on what OS? (If it's not the latest release, please try with the latest release before reporting the issue. Note that many linux distributions have old versions of pandoc in their repositories.)

It's pandoc 3.1.11.1 via Nix (via Stackage LTS), but it also reproduces on https://pandoc.org/try/ (3.5) if you hack in -raw_html.
