mk_wcwidth will return outdated widths when glibc 2.26 (Unicode 9.0) is out #720

Unicode 9.0 changes the width of characters with emoji presentation to 2. The transition is going to suck in general, but it's not too bad for us. glibc 2.26 implements it and will be out in August or so.

mk_wcwidth implements Unicode 5.0 but returns a width of 1 for unknown characters, which is a great guess and an important improvement over glibc's wcwidth, which returns -1 for them. Since there were no new wide (width 2) characters in the recent versions (AFAIK, haven't checked everything), this works fine up to Unicode 8.0.
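To illustrate why the "1 for unknown" fallback goes stale with Unicode 9.0, here is a minimal sketch of a table-driven width function in the style of mk_wcwidth. The name `sketch_wcwidth` and the helper `in_ranges` are made up for this example, the tables are heavily abbreviated (the real ones are much longer), and control-character handling is left out.

```c
#include <stddef.h>

struct range { unsigned int first, last; };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Abbreviated for illustration only; not the full mk_wcwidth tables. */
static const struct range zero_width[] = {
    { 0x0300, 0x036F },   /* combining diacritical marks */
    { 0x200B, 0x200F },   /* zero-width space, joiners, direction marks */
};

static const struct range wide[] = {
    { 0x1100, 0x115F },   /* Hangul Jamo initial consonants */
    { 0x2E80, 0x303E },   /* CJK radicals and punctuation */
    { 0x3040, 0xA4CF },   /* kana, CJK ideographs, Yi */
    { 0xAC00, 0xD7A3 },   /* Hangul syllables */
    { 0xFF00, 0xFF60 },   /* fullwidth forms */
    { 0x20000, 0x2FFFD }, /* supplementary ideographic plane */
};

/* Linear scan; fine for tables this small. */
static int in_ranges(unsigned int ucs, const struct range *t, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (ucs >= t[i].first && ucs <= t[i].last)
            return 1;
    }
    return 0;
}

/* Hypothetical stand-in for mk_wcwidth: anything not listed gets width 1. */
static int sketch_wcwidth(unsigned int ucs)
{
    if (in_ranges(ucs, zero_width, ARRAY_LEN(zero_width)))
        return 0;
    if (in_ranges(ucs, wide, ARRAY_LEN(wide)))
        return 2;
    return 1; /* unknown or narrow: the guess that breaks for 9.0 emoji */
}
```

With tables like these, an emoji such as U+1F600 falls through to the final `return 1`, while a terminal following Unicode 9.0 draws it two cells wide.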

The few things that depend on width calculation will be wrong if those characters are present. What I've seen so far is misaligned /names lists when using bitlbee-discord with utf8_nicks on (on a big enough Discord server you'll get a handful of nicks with emoji, every time). Not a big deal. I haven't checked whether this affects sideways splits.

We could:

* Make this a setting to let people pick between both implementations.
* Do a test call of the libc wcwidth() with a character that should return 2 in Unicode 9.0 and 1 in 8.0 and lower. If it returns 2, use the libc wcwidth(), wrapped to turn -1 (unknown character) into 1 (to be like mk_wcwidth).
* Both, with "auto" as the default setting.
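A rough sketch of the test-call idea, to make it concrete. Everything named here (`PROBE_CHAR`, `system_wcwidth`, `pick_impl`, `detect_impl`, the enum) is hypothetical and not existing code. The probe character U+1F600 is my assumption of a code point that is width 2 in Unicode 9.0 and width 1 before. It also assumes a 32-bit `wchar_t` holding Unicode code points, as on glibc.

```c
#define _XOPEN_SOURCE 700 /* exposes wcwidth() in <wchar.h> on glibc */
#include <wchar.h>

/* Assumption: U+1F600 GRINNING FACE is width 2 from Unicode 9.0 on,
 * and width 1 (or unknown) on older libc data. */
#define PROBE_CHAR 0x1F600

enum wcwidth_impl {
    WCWIDTH_IMPL_OLD,    /* keep using the bundled mk_wcwidth */
    WCWIDTH_IMPL_SYSTEM  /* use the libc wcwidth(), wrapped */
};

/* libc wcwidth() with -1 (unknown character) turned into 1,
 * to behave like mk_wcwidth for characters the libc doesn't know. */
static int system_wcwidth(unsigned int ucs)
{
    int w = wcwidth((wchar_t) ucs);

    return w < 0 ? 1 : w;
}

/* Decide from the width the libc reported for the probe character. */
static enum wcwidth_impl pick_impl(int probe_width)
{
    return probe_width == 2 ? WCWIDTH_IMPL_SYSTEM : WCWIDTH_IMPL_OLD;
}

/* The "auto" behaviour: probe the libc once at startup.
 * Call setlocale(LC_CTYPE, "") first, otherwise wcwidth() may
 * report -1 for everything outside ASCII. */
static enum wcwidth_impl detect_impl(void)
{
    return pick_impl(wcwidth((wchar_t) PROBE_CHAR));
}
```

A setting could then override the result of `detect_impl()` for people who want to force one implementation. Falling back to the old implementation whenever the probe is anything other than 2 keeps behaviour unchanged on older systems and in non-UTF-8 locales.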
