propose: default to UTF-8 instead of ISO-8859-1 in Content-Disposition header #813

@Jimmy-Z

I'm having an issue similar to #425 but slightly different. The server sends a header like this:
Content-Disposition: attachment;filename="中文文件名.rar"
Of course this still violates the RFC, but it seems common in China. Chrome/IE/Edge/curl handle it; Firefox does not.

You said in #425 that you're ready to accept a patch, so I looked into the code a bit. It looks like only a few lines need to change around here. There is already a UTF-8 validation routine there called utf8dfa, but I don't know how to access options yet.

I'm proposing changing the default encoding from ISO-8859-1 to UTF-8 without an option. It shouldn't break anything; UTF-8 is designed that way.

Update: sorry, I was thinking about ASCII. I should change the issue title to "an option to default to UTF-8 instead of ISO-8859-1".
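To make the proposal above concrete, here is a minimal sketch of a "try UTF-8 first, fall back to ISO-8859-1" decoder. It is not aria2 code: `isValidUtf8`, `latin1ToUtf8`, `filenameToUtf8` and `demo` are hypothetical names, and the hand-written validator merely stands in for utf8dfa.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: strict UTF-8 check (rejects overlong forms,
// surrogates and code points above U+10FFFF).
inline bool isValidUtf8(const std::string& s)
{
  static const uint32_t minCp[] = {0, 0, 0x80, 0x800, 0x10000};
  size_t i = 0;
  const size_t n = s.size();
  while (i < n) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    size_t len;
    uint32_t cp;
    if (b < 0x80) { ++i; continue; }
    else if ((b & 0xe0) == 0xc0) { len = 2; cp = b & 0x1f; }
    else if ((b & 0xf0) == 0xe0) { len = 3; cp = b & 0x0f; }
    else if ((b & 0xf8) == 0xf0) { len = 4; cp = b & 0x07; }
    else { return false; } // stray continuation byte or invalid lead byte
    if (i + len > n) { return false; } // truncated sequence
    for (size_t k = 1; k < len; ++k) {
      unsigned char t = static_cast<unsigned char>(s[i + k]);
      if ((t & 0xc0) != 0x80) { return false; }
      cp = (cp << 6) | (t & 0x3f);
    }
    if (cp < minCp[len] || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
      return false;
    }
    i += len;
  }
  return true;
}

// Hypothetical helper: every ISO-8859-1 byte maps to the code point with
// the same value, so bytes >= 0x80 become a two-byte UTF-8 sequence.
inline std::string latin1ToUtf8(const std::string& s)
{
  std::string out;
  for (char ch : s) {
    unsigned char b = static_cast<unsigned char>(ch);
    if (b < 0x80) {
      out += static_cast<char>(b);
    }
    else {
      out += static_cast<char>(0xc0 | (b >> 6));
      out += static_cast<char>(0x80 | (b & 0x3f));
    }
  }
  return out;
}

// Keep the raw bytes if they already form valid UTF-8; otherwise treat
// them as ISO-8859-1.
inline std::string filenameToUtf8(const std::string& raw)
{
  return isValidUtf8(raw) ? raw : latin1ToUtf8(raw);
}

// Usage example.
inline void demo()
{
  assert(filenameToUtf8("\xe4\xb8\xad.rar") == "\xe4\xb8\xad.rar"); // UTF-8 kept
  assert(filenameToUtf8("\xe9.rar") == "\xc3\xa9.rar"); // Latin-1 converted
}
```

This is only a heuristic. Some ISO-8859-1 byte sequences are also valid UTF-8 (the bytes 0xC3 0xA9 mean "Ã©" in ISO-8859-1 and "é" in UTF-8), which is why an option seems safer than silently changing the default.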

BTW, utf8dfa is actually not used correctly there:

aria2/src/util.cc, line 1115 in df19921

    if (utf8dfa(&dfa_state, &dfa_code, *p) == UTF8_REJECT) {
*p is a const char. It should be explicitly converted to unsigned char before it is implicitly converted to uint32_t. Otherwise, where char is signed, -1 is wrongly converted to 0xffffffff instead of 0x000000ff, which results in a disastrous out-of-bounds array index in utf8dfa.
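A small sketch of the conversion problem, independent of aria2. The helper names `widenImplicit`, `widenViaUnsigned` and `demo` are made up for illustration, and `signed char` is used so the behavior does not depend on whether plain `char` is signed on a given platform.

```cpp
#include <cassert>
#include <cstdint>

// Implicit conversion: a negative value wraps modulo 2^32 on the way to
// uint32_t, so the upper 24 bits end up set.
inline uint32_t widenImplicit(signed char c) { return c; }

// Going through unsigned char first keeps the result in 0x00..0xff.
inline uint32_t widenViaUnsigned(signed char c)
{
  return static_cast<unsigned char>(c);
}

// Usage example: the byte 0xff stored in a signed char has the value -1.
inline void demo()
{
  signed char c = -1;
  assert(widenImplicit(c) == 0xffffffffu);    // far outside a 256-entry table
  assert(widenViaUnsigned(c) == 0x000000ffu); // the intended byte value
}
```

At the call site quoted above, the corresponding fix would presumably be to pass `static_cast<unsigned char>(*p)` as the third argument.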

Luckily, two lines up, the utf8dfa call is enclosed in an if (inRFC5987AttrChar(*p)) { block, so the bug can never trigger. That also renders the check entirely pointless: printable ASCII is always valid UTF-8.
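The claim can be checked with a short sketch. `isAttrChar` below is a hypothetical re-implementation of the attr-char set from RFC 5987 (ALPHA, DIGIT and a handful of punctuation characters); aria2's own inRFC5987AttrChar may be written differently.

```cpp
#include <cassert>

// Hypothetical stand-in for inRFC5987AttrChar, following RFC 5987:
//   attr-char = ALPHA / DIGIT / "!" / "#" / "$" / "&" / "+" / "-" / "."
//             / "^" / "_" / "`" / "|" / "~"
inline bool isAttrChar(unsigned char c)
{
  if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
      (c >= '0' && c <= '9')) {
    return true;
  }
  switch (c) {
  case '!': case '#': case '$': case '&': case '+': case '-': case '.':
  case '^': case '_': case '`': case '|': case '~':
    return true;
  default:
    return false;
  }
}

// Returns true if every attr-char is a 7-bit byte, i.e. a complete
// one-byte UTF-8 sequence on its own.
inline bool attrCharsAreAllAscii()
{
  for (int b = 0; b < 256; ++b) {
    if (isAttrChar(static_cast<unsigned char>(b)) && b >= 0x80) {
      return false;
    }
  }
  return true;
}

// Usage example.
inline void demo()
{
  assert(attrCharsAreAllAscii());
  assert(!isAttrChar(0xe4)); // first byte of a multi-byte UTF-8 sequence
}
```

Since every byte that passes the attr-char test is below 0x80, a UTF-8 validator that only sees those bytes can never reach its reject state. For the check to be meaningful, the validator would have to see the bytes that are not attr-chars.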
