Description
(Checked with Pandoc 2.9.1, Windows version, to reproduce this problem.)
In org-mode, characters that match `org-emphasis-regexp-components' are treated as space characters for the purposes of emphasis markup (cf. the docstring of this variable).
For example, in the markup text " *emph* ", "emph" is emphasized because it is surrounded by "*" and the space character " ". Among these space characters, U+200B (zero width space) is important because it lets org-mode emphasize part of a word (e.g. " *emph*\u200basize"). However, the pandoc org-mode reader does not recognize it as a space character, so in "\u200b*emph*\u200b", "emph" is not emphasized.
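The rule described above can be sketched with a small regular expression. This is only a simplified illustration, not org-mode's actual regexp and not pandoc's parser: the names `STRONG` and `strong_spans` are hypothetical, and the character classes are an assumed approximation of the pre/post components, with U+200B listed explicitly rather than relying on `\s`.

```python
import re

# Assumed, simplified stand-ins for the "pre" and "post" character sets.
# U+200B is listed explicitly so that it counts as a valid neighbour.
PRE = r"(?:^|(?<=[\s\u200b\-('\"{]))"
POST = r"(?=$|[\s\u200b\-.,:!?;'\")}\[])"

# A "*", a body that starts and ends with a non-space character, a "*".
STRONG = re.compile(PRE + r"\*(\S(?:.*?\S)?)\*" + POST)


def strong_spans(text):
    """Return the bodies of all *strong* spans found in text."""
    return STRONG.findall(text)


if __name__ == "__main__":
    # Plain spaces around the markers.
    print(strong_spans(" *emph* "))
    # U+200B on both sides of the markers.
    print(strong_spans("\u200b*emph*\u200b"))
    # U+200B only after the closing marker, so part of a word is emphasized.
    print(strong_spans(" *emph*\u200basize"))
    # No valid neighbour characters, so nothing is emphasized.
    print(strong_spans("x*emph*asize"))
```

Dropping `\u200b` from the two character classes in this sketch makes the second and third cases stop matching, which mirrors the behaviour reported here.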
I hope this can be fixed, as it would greatly extend the versatility of pandoc together with org-mode.
Regards,
Reproduction.
For text "*abc*" (first character is U+200B),
% pandoc -f org -t json -o test.json --standalone test.org
will produce:
{"blocks":[{"t":"Para",
"c":[{"t":"Str",
"c":"\u200b*abc*"}]}],
"pandoc-api-version":[1,20],"meta":{}}
It is desired to produce something like
{"blocks":[{"t":"Para",
"c":[{"t":"Str",
"c":"\u200b"},
{"t":"Strong",
"c":[{"t":"Str",
"c":"abc"}]}]}],
"pandoc-api-version":[1,20],"meta":{}}
instead.
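Because U+200B is invisible and hard to type, the reproduction can be scripted. The following is a minimal sketch, assuming Python is available: `write_repro` and `contains_strong` are hypothetical helper names, and the two JSON documents are transcribed from the outputs quoted above, with U+200B written as the `\u200b` escape.

```python
import json


def write_repro(path):
    """Write the one-line org file whose first character is U+200B."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\u200b*abc*\n")


def contains_strong(node):
    """Return True if a pandoc JSON AST fragment contains a Strong inline."""
    if isinstance(node, dict):
        if node.get("t") == "Strong":
            return True
        return any(contains_strong(value) for value in node.values())
    if isinstance(node, list):
        return any(contains_strong(item) for item in node)
    return False


# Output reported in this issue: one Str that still holds the raw markup.
CURRENT = json.loads(
    r'{"blocks":[{"t":"Para","c":[{"t":"Str","c":"\u200b*abc*"}]}],'
    r'"pandoc-api-version":[1,20],"meta":{}}'
)

# Desired output: the U+200B as a Str, followed by a Strong inline.
DESIRED = json.loads(
    r'{"blocks":[{"t":"Para","c":[{"t":"Str","c":"\u200b"},'
    r'{"t":"Strong","c":[{"t":"Str","c":"abc"}]}]}],'
    r'"pandoc-api-version":[1,20],"meta":{}}'
)

if __name__ == "__main__":
    print("current has Strong:", contains_strong(CURRENT))
    print("desired has Strong:", contains_strong(DESIRED))
```

To check a real pandoc build, one could call `write_repro("test.org")`, run the command shown above, and pass the parsed contents of test.json to `contains_strong`.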