Description
What is the issue with the URL Standard?
The URL Standard, UTS 46, and RFC 3492 don’t specify interoperable behavior for Punycode encode and decode failures when a label is longer than what actually makes sense for DNS purposes.
If the input is too long, at some point an integer internal to the Punycode algorithm overflows. See https://datatracker.ietf.org/doc/html/rfc3492.html#section-6.4
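To make the overflow case concrete, here is a minimal sketch of a Punycode decoder with the overflow checks from RFC 3492 section 6.4 written out. It assumes a 32-bit unsigned limit (`MAXINT`), which is only one of the possible choices and is not something the specs currently mandate. The function name `decode_with_overflow_check` is made up for illustration, and the `xn--` prefix is assumed to have been stripped already.

```python
# Punycode parameters from RFC 3492 section 5.
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128
MAXINT = 0xFFFFFFFF  # assumption: 32-bit unsigned internal integers


def adapt(delta: int, numpoints: int, firsttime: bool) -> int:
    """Bias adaptation function from RFC 3492 section 6.1."""
    delta = delta // DAMP if firsttime else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + (BASE - TMIN + 1) * delta // (delta + SKEW)


def digit_value(ch: str) -> int:
    """Map a basic code point to its digit value (a-z = 0..25, 0-9 = 26..35)."""
    if "a" <= ch <= "z":
        return ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A")
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 26
    raise ValueError(f"not a Punycode digit: {ch!r}")


def decode_with_overflow_check(payload: str) -> str:
    """Decode a Punycode payload, raising OverflowError past MAXINT."""
    n, i, bias = INITIAL_N, 0, INITIAL_BIAS
    # Everything before the last delimiter is copied literally. A delimiter at
    # index 0 is not consumed and is later rejected as a non-digit.
    last = payload.rfind("-")
    basic, rest = (payload[:last], payload[last + 1:]) if last > 0 else ("", payload)
    if not basic.isascii():
        raise ValueError("non-basic code point before the delimiter")
    output = list(basic)
    pos = 0
    while pos < len(rest):
        oldi, w, k = i, 1, BASE
        while True:
            if pos >= len(rest):
                raise ValueError("truncated variable-length integer")
            digit = digit_value(rest[pos])
            pos += 1
            if digit > (MAXINT - i) // w:
                raise OverflowError("i would exceed MAXINT")
            i += digit * w
            t = TMIN if k <= bias else (TMAX if k >= bias + TMAX else k - bias)
            if digit < t:
                break
            if w > MAXINT // (BASE - t):
                raise OverflowError("w would exceed MAXINT")
            w *= BASE - t
            k += BASE
        bias = adapt(i - oldi, len(output) + 1, oldi == 0)
        if i // (len(output) + 1) > MAXINT - n:
            raise OverflowError("n would exceed MAXINT")
        n += i // (len(output) + 1)
        i %= len(output) + 1
        if n > 0x10FFFF:
            raise ValueError("decoded value is not a Unicode code point")
        output.insert(i, chr(n))
        i += 1
    return "".join(output)
```

With these checks, whether a given long input fails depends on the chosen `MAXINT`, which is exactly the part that is not interoperable today.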
One way to specify this would be to fix the internal integer size at 32 bits, but that can lead to denial-of-service attacks with unreasonably long inputs. (Apparently Chrome's fuzzers managed to time out when fuzzing Punycode.) For this reason, ICU4C has somewhat arbitrary length limits for the inputs to Punycode decode and encode. See https://unicode-org.atlassian.net/browse/ICU-13727 and https://searchfox.org/mozilla-central/rev/6bc0f370cc459bf79e1330ef74b21009a9848c91/intl/icu/source/common/punycode.cpp#173-176
The rationale from the issue is:
A well-formed label is limited to 63 bytes, which means at most 59 bytes after "xn--". However, we don't have any limit so far, and people sometimes use libraries and protocols with non-standard inputs.
Something 1000-ish seems like a reasonable compromise to keep n^2 tame and users happy even with somewhat unusual inputs.
The non-arbitrary tight bound would be to fail before decoding Punycode if the decoder input (not counting the xn-- prefix) would exceed 59 (ASCII) characters and to fail during encoding if the encoder is (not counting the xn-- prefix) about to output a 60th (ASCII) character.
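As a rough illustration of the tight bound, the sketch below wraps Python's built-in `punycode` codec. The names `decode_label_tight`, `encode_label_tight`, and `LabelTooLong` are hypothetical. The decoder checks the length before doing any work. The encoder here checks the output length after the fact, which gives the same accept/reject result as aborting at the 60th character but is not the same as an early abort inside the encoder. To keep the work bounded anyway, it first rejects inputs with more than 59 code points, since every input code point contributes at least one output character.

```python
ACE_PREFIX = "xn--"
MAX_PUNYCODE_PAYLOAD = 59  # 63-byte DNS label minus the 4-byte "xn--" prefix


class LabelTooLong(ValueError):
    """Raised when a Punycode payload would exceed the tight bound."""


def decode_label_tight(label: str) -> str:
    """Decode one label, refusing oversized Punycode input before decoding."""
    if not label.lower().startswith(ACE_PREFIX):
        return label  # not Punycode; no limit is applied here
    payload = label[len(ACE_PREFIX):]
    if len(payload) > MAX_PUNYCODE_PAYLOAD:
        raise LabelTooLong("decoder input exceeds 59 characters")
    return payload.encode("ascii").decode("punycode")


def encode_label_tight(label: str) -> str:
    """Encode one label, refusing output longer than 59 characters."""
    if label.isascii():
        return label  # ASCII-only labels are passed through without a limit
    # Each code point yields at least one output character, so more than
    # 59 code points can never fit. This bounds the encoder's work.
    if len(label) > MAX_PUNYCODE_PAYLOAD:
        raise LabelTooLong("encoder input cannot fit in 59 characters")
    payload = label.encode("punycode").decode("ascii")
    if len(payload) > MAX_PUNYCODE_PAYLOAD:
        raise LabelTooLong("encoder output exceeds 59 characters")
    return ACE_PREFIX + payload
```

Note that the ASCII-only pass-through means this still does not enforce the 63-byte label limit in general.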
Using the tight bound would come pretty close to setting VerifyDNSLength to true (close, but not exactly: It would still not place a limit for ASCII-only labels and the domain name as a whole). Yet, the URL Standard sets VerifyDNSLength to false. This comes from 3bec3b8, which does not state motivation.
Without knowing the motivation for setting VerifyDNSLength to false, it’s hard to assess if placing the tight bounds on Punycode would work.
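To show where the two checks differ, here is a small sketch comparing the DNS length verification from UTS 46 (each label 1 to 63 long, the whole name 1 to 253 long, excluding a trailing root dot) with the tight Punycode bound applied on its own. Both function names are hypothetical, and the input is assumed to be a domain that has already been converted to ASCII.

```python
def verify_dns_length(ascii_domain: str) -> bool:
    """DNS length check: name 1..253 and each label 1..63, ignoring a root dot."""
    name = ascii_domain[:-1] if ascii_domain.endswith(".") else ascii_domain
    if not 1 <= len(name) <= 253:
        return False
    return all(1 <= len(label) <= 63 for label in name.split("."))


def tight_punycode_bound_ok(ascii_domain: str) -> bool:
    """Only limits xn-- labels: at most 59 characters after the prefix."""
    for label in ascii_domain.split("."):
        if label.lower().startswith("xn--") and len(label) - 4 > 59:
            return False
    return True


# A 64-character ASCII-only label fails the DNS check but passes the tight bound.
long_ascii = "a" * 64 + ".example"
# Six 50-character labels exceed 253 characters overall; again only the DNS check fails.
long_name = ".".join(["a" * 50] * 6)
# An oversized xn-- label fails both.
long_puny = "xn--" + "a" * 60 + ".example"

for domain in (long_ascii, long_name, long_puny):
    print(len(domain), verify_dns_length(domain), tight_punycode_bound_ok(domain))
```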
I think the specs should make the behavior here well defined even if it’s not a particularly pressing issue, since it only concerns labels that are too long for DNS anyway. (This probably belongs in UTS 46, but I’m filing this here for discussion before sending UTS 46 feedback.)