Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form charset #3223

@bsittler

Specify HTML numeric character reference fallback encoding for multipart upload filename characters not representable in form acceptCharset/form charset.

Rationale:

  • Consistency: this will make filename fallback character replacement consistent with encoding of form element names and values in multipart uploads when a source character is not representable in the acceptCharset/form charset. @annevk points out that this is exactly the "html" error handling of the Encoding Standard. https://encoding.spec.whatwg.org/#concept-encoding-process
  • Predictability: this is consistent with existing behavior in at least two browsers (Firefox and Edge). I have also started an intent-to-implement-and-ship thread for this behavior in Chrome. Edit: this proposal was accepted, and I'm now working to implement it in Chrome.
  • Reduced data loss: this change reduces the risk of user confusion and website malfunction when <input type=file multiple> is used to upload several files whose local filenames are distinct but become identical after user agent-specific fallback character replacement. With this behavior standardized, web pages may even be able to portably recover useful user-visible representations of the original filenames. Some ambiguity remains with that approach, since a local filename could itself contain text matching a numeric character reference. Moving to UTF-8 for the form submission of course resolves the ambiguity and should be the only recommended solution for newly built web pages.
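
To make the rationale above concrete, here is a minimal sketch in Python. It is not browser code: it assumes Python's built-in "xmlcharrefreplace" error handler as a stand-in for the Encoding Standard's "html" error mode (both emit decimal numeric character references), and it uses Python's windows-1252 codec, which may differ from the WHATWG encoding of the same name in edge cases. The helper name encode_filename and the sample filenames are hypothetical.

```python
def encode_filename(name: str, charset: str) -> bytes:
    """Encode a filename for a form charset, replacing characters that the
    charset cannot represent with decimal numeric character references.

    Assumption: Python's "xmlcharrefreplace" handler is used here as an
    approximation of the Encoding Standard's "html" error mode.
    """
    return name.encode(charset, errors="xmlcharrefreplace")


# Two distinct local filenames that windows-1252 cannot represent.
snowman = encode_filename("\u2603.txt", "windows-1252")   # U+2603 SNOWMAN
umbrella = encode_filename("\u2602.txt", "windows-1252")  # U+2602 UMBRELLA

# With numeric character references the two names stay distinct.
print(snowman)   # b'&#9731;.txt'
print(umbrella)  # b'&#9730;.txt'

# A lossy fallback such as "?" replacement collapses both names into one.
lossy_snowman = "\u2603.txt".encode("windows-1252", errors="replace")
lossy_umbrella = "\u2602.txt".encode("windows-1252", errors="replace")
print(lossy_snowman == lossy_umbrella)  # True, both are b'?.txt'

# Remaining ambiguity: a file literally named "&#9731;.txt" encodes to the
# same bytes as the snowman file, so the server cannot tell them apart.
literal = encode_filename("&#9731;.txt", "windows-1252")
print(literal == snowman)  # True

# Submitting the form as UTF-8 avoids the fallback entirely.
print(encode_filename("\u2603.txt", "utf-8"))  # b'\xe2\x98\x83.txt'
```

The last two cases illustrate why recovering the original names on the server can only be best effort, and why UTF-8 form submission is the cleaner fix.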

Accidentally filed here too: w3c/html#1077
