Allow UTF-8 with BOM for features.fea #3495

NightFurySL2001 · 2024-05-01T14:29:57Z

anthrotype · 2024-05-24T15:32:58Z

Lib/fontTools/feaLib/lexer.py

@@ -269,7 +269,7 @@ def make_lexer_(file_or_path):
            fileobj, closing = file_or_path, False
        else:
            filename, closing = file_or_path, True
-            fileobj = open(filename, "r", encoding="utf-8")
+            fileobj = open(filename, "r", encoding="utf-8-sig")


What if the file is regular UTF-8 that does not start with a BOM? Will this still work?

Yes (as evidenced by the tests passing): this encoding just lets Python skip the BOM if one is present in UTF-8 text; otherwise it is functionally the same as normal utf-8.

https://docs.python.org/3/library/codecs.html#encodings-and-unicode

On decoding utf-8-sig will skip those three bytes if they appear as the first three bytes in the file.
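A quick way to check this behaviour is to decode the same text with and without a leading BOM. This is only a sketch: the sample feature text is made up for illustration, and nothing here is taken from the fontTools test suite.

```python
import codecs

# Hypothetical feature-file content, used only for illustration.
text = "languagesystem DFLT dflt;\n"
plain = text.encode("utf-8")
with_bom = codecs.BOM_UTF8 + plain  # b"\xef\xbb\xbf" followed by the payload

# utf-8-sig drops a leading BOM and otherwise decodes like utf-8.
decoded_plain = plain.decode("utf-8-sig")
decoded_bom = with_bom.decode("utf-8-sig")

# Plain utf-8 keeps the BOM as a U+FEFF character at the start of the string.
decoded_bom_strict = with_bom.decode("utf-8")
```

With plain utf-8 the stray U+FEFF ends up as the first character handed to the lexer, which is presumably what tripped it up before this change.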

Do note that this is useful for opening text files (especially those made by Windows Notepad). Saving as utf-8-sig is not recommended, as it adds a BOM, which can break compatibility with tools that do not expect one.

I would personally suggest reading everything as utf-8-sig for greatest compatibility and saving as utf-8 for standardisation.
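The read-tolerant, write-plain policy could look roughly like the sketch below. `read_text` and `write_text` are hypothetical helper names, not fontTools functions, and the temporary file only simulates one saved by an editor that prepends a BOM.

```python
import os
import tempfile


def read_text(path):
    # Hypothetical helper: tolerate an optional BOM when reading.
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()


def write_text(path, text):
    # Hypothetical helper: write plain UTF-8, which never adds a BOM.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.fea")
    # Simulate a file saved with a leading UTF-8 BOM.
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbflanguagesystem DFLT dflt;\n")

    content = read_text(path)   # BOM is skipped on the way in
    write_text(path, content)   # and is not written back out

    with open(path, "rb") as f:
        raw = f.read()
```

After the round trip the file on disk no longer starts with the three BOM bytes, so readers that open it as plain utf-8 see the same text.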

SGTM.

Is this the only place where fontTools reads in human-written (potentially MS Notepad-edited) text files? Probably not. But sure, let's merge this if it helps.

So far in the UFO building process, only features.fea has caused this problem. Other components are loaded through plistlib, which probably strips out the BOM by default.

NightFurySL2001 added 2 commits May 1, 2024 22:29

Allow UTF-8 with BOM for features.fea

cc02ada

Allow UTF-8 with BOM for features.fea

80db8cd

anthrotype reviewed May 24, 2024

anthrotype merged commit 4193aea into fonttools:main May 30, 2024

NightFurySL2001 deleted the patch-2 branch June 2, 2024 03:04

NightFurySL2001 mentioned this pull request Mar 5, 2025

Preparing for Google Fonts MoonlitOwen/ThenKhung#6

Merged

anthrotype mentioned this pull request May 12, 2025

Reading UFO failed if feature.fea saved with BOM mark #3822

Closed

NightFurySL2001 mentioned this pull request May 12, 2025

Update text file read to use UTF-8 with BOM #3824

Merged

NightFurySL2001 May 25, 2024

behdad May 25, 2024

anthrotype May 30, 2024

NightFurySL2001 May 30, 2024
