fix: MathML combines multidigit numbers with sup/subscript, comma separators, and multicharacter text when outputting to DOM #3999
The code written above will split a number like "4,200.30" into two elements, split at the comma. It will also mangle an expression like In Temml, I struggled a bit and ended up with a more complex solution. My approach modified only the
Elsewhere in the same file, I added this code:
It's quite possible that parts of both solutions could be combined to form a solution that is less verbose than the Temml version, but still is correct.
@ronkok Wondering if your solution caters for European number style, with periods and commas reversed, so 1.234,5, or with spaces, like 10 000,45?
I haven't grokked your code yet, but I'm a little confused about the goal. In LaTeX you need to write These heuristics feel a little messy, but it seems they're necessary for going from presentation to semantics. I also don't think they're especially related to this PR. Do you agree that the change in this PR is useful, independent of how the nodes get combined?
@mbourne The European number style was a major motivation for how I wrote this code. @edemaine I take your point about TeX requiring a
I agree. I'd like to find code that is less verbose. I haven't found it yet.
It is certainly better than the status quo. I would suggest at least examining the following atom to see if it a
@ronkok I've taken a stab at handling
Good stuff! I haven't looked at the code yet, but the preview shows me that you have:
I hate to move the goal posts at this late date, but I have just now realized there is another problem. In the expression
Thanks for reviewing!
Yep, exactly.
I believe that that is actually correct: here's how it renders in HTML. So the MathML correctly reflects the same.
Ah yes, you're correct. Which means that all the behavior I see from this code is good. I don't plan to do a detailed review of code style, but I think it looks good in general.
🎉 This PR is included in version 0.16.17 🎉
What is the previous behavior before this PR?
When rendering MathML directly to the DOM (as opposed to a string), such as on the katex.org front-page demo, consecutive digits and text characters are rendered as multiple text children of an <mn> or <mtext> element.
What is the new behavior after this PR?
When rendering MathML directly to the DOM, we combine consecutive text children into one.
Fixes #3995 which reports that this causes screen-reading issues.
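To illustrate the combining step described above, here is a minimal sketch in plain JavaScript. It is not the code from this PR: `textNode`, `elementNode`, and `combineTextChildren` are hypothetical names, and the plain objects stand in for whatever node classes the real tree uses. It only shows the general idea of folding each run of adjacent text children into a single text child.

```javascript
// Hypothetical stand-ins for MathML tree nodes; NOT KaTeX's actual classes.
function textNode(text) {
  return {kind: "text", text};
}
function elementNode(tag, children) {
  return {kind: "element", tag, children};
}

// Fold every run of adjacent text children into one text child.
// Non-text children are kept as-is and break the current run.
function combineTextChildren(children) {
  const out = [];
  for (const child of children) {
    const last = out[out.length - 1];
    if (child.kind === "text" && last && last.kind === "text") {
      out[out.length - 1] = textNode(last.text + child.text);
    } else {
      out.push(child);
    }
  }
  return out;
}

// An <mn> whose digits arrived as three separate text children.
const mn = elementNode("mn", [textNode("4"), textNode("2"), textNode("0")]);
const combined = elementNode(mn.tag, combineTextChildren(mn.children));
console.log(combined.children.length); // 1
console.log(combined.children[0].text); // "420"
```

With a single text child, a DOM consumer such as a screen reader would see one string for the number rather than one string per digit.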
Merging of consecutive <mtext> and <mn> elements is already done here: KaTeX/src/buildMathML.js, lines 157 to 167 in 2d1fec9.
When rendering to a string, that was enough. We don't have a great way of testing the DOM case without a virtual DOM setup, but the inspector screenshots above confirm that this PR fixes the issue.
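One way to see why string output hid the problem is the sketch below. It assumes a hypothetical minimal node model (`text`, `element`, `toMarkup`), not KaTeX's real classes, and it skips escaping. Serializing to markup concatenates text children, so a split tree and a merged tree produce identical strings even though their child counts differ.

```javascript
// Hypothetical minimal node model; NOT KaTeX's actual node classes.
const text = (value) => ({kind: "text", value});
const element = (tag, children) => ({kind: "element", tag, children});

// Serialize a node to markup. Text children are simply concatenated,
// so their boundaries disappear in the output. (No escaping in this sketch.)
function toMarkup(node) {
  if (node.kind === "text") {
    return node.value;
  }
  const inner = node.children.map(toMarkup).join("");
  return `<${node.tag}>${inner}</${node.tag}>`;
}

const split = element("mn", [text("1"), text("2")]);
const merged = element("mn", [text("12")]);

console.log(toMarkup(split)); // <mn>12</mn>
console.log(toMarkup(split) === toMarkup(merged)); // true
console.log(split.children.length, merged.children.length); // 2 1
```

A string-based test would therefore pass for both trees, which is consistent with the remark that checking the DOM case needs a DOM (or virtual DOM) to inspect.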