I'm submitting this issue after a short discussion on Twitter with @zcorpan today.
I think we should change the rules of escaping a string in attribute mode, and also escape < and > to &lt; and &gt; respectively.
The fact that these characters are not escaped led to some security issues in HTML parsers and sanitizers.
As an example, see this DOMPurify bypass. The bug was that the following markup
<svg></p>
was parsed into the following DOM tree in Chromium and Safari:
┗ svg svg
┗ html p
The typical usage of sanitizers is as follows:
elem.innerHTML = Sanitizer.sanitize(markup)
This means that the sanitized DOM tree is serialized to markup and then reparsed.
Now consider the following markup:
<svg></p><style><a title="</style><img src onerror=alert(1)>">
which is parsed into the following DOM tree:
┗ svg svg
┣ html p
┗ svg style
┗ svg a title="</style><img src onerror=alert(1)>"
It doesn't contain any harmful markup, so it is serialized to:
<svg><p></p><style><a title="</style><img src onerror=alert(1)>"></style></svg>
However, after reparsing a different DOM tree is created:
┣ svg svg
┣ html p
┣ html style
┃ ┗ #text: <a title="
┣ html img src="" onerror="alert(1)"
┗ #text: ">
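This leads to cross-site scripting. The reason is that p breaks out of foreign content: on reparsing, the style element that follows it is created as an HTML style element, whose content is raw text that ends at the first </style>, here the one inside the title attribute value.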
Please note that if < and > were escaped, the markup would be serialized to:
<svg><p></p><style><a title="&lt;/style&gt;&lt;img src onerror=alert(1)&gt;"></style></svg>
This would make this particular bypass (and many similar ones) impossible.
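To make the proposed rule concrete, here is a minimal sketch of attribute-mode escaping with and without the extra replacements. The function name `escapeAttributeValue` and the `escapeAngleBrackets` option are hypothetical and exist only for illustration. The first three replacements follow the "escaping a string" rules for attribute mode as they stood when this issue was filed, and the last two are the change proposed here.

```javascript
// Illustrative sketch only: escapeAttributeValue and escapeAngleBrackets are
// hypothetical names, not part of any specification or library.
function escapeAttributeValue(value, { escapeAngleBrackets = true } = {}) {
  let escaped = value
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/"/g, "&quot;");
  if (escapeAngleBrackets) {
    // The change proposed in this issue.
    escaped = escaped.replace(/</g, "&lt;").replace(/>/g, "&gt;");
  }
  return escaped;
}

// Attribute value from the bypass described above.
const payload = "</style><img src onerror=alert(1)>";

const current = escapeAttributeValue(payload, { escapeAngleBrackets: false });
const proposed = escapeAttributeValue(payload);

console.log(`<a title="${current}">`);  // still contains a literal </style>
console.log(`<a title="${proposed}">`); // angle brackets are now entities
```

With the proposed rules, the serialized attribute value no longer contains a literal `</style>`, so the reparsed style element cannot be closed from inside the attribute.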