Skip to content

Conversation

valadaptive
Copy link
Contributor

This was discussed on Zulip. Essentially, the current fontconfig backend only roughly approximates fontconfig's actual font matching behavior, which is very intricate and difficult to implement. This has resulted in many fonts failing to match.

This PR reimplements it entirely using FFI calls to the fontconfig library, under the assumption that any platform with fontconfig configuration files probably also has the fontconfig library installed. For bindings, I'm using yeslogic-fontconfig-sys with my own binding layer on top of it. Yeslogic also provides a higher-level fontconfig crate but it's very barebones and not useful here.

I've enabled the dlopen feature of yeslogic-fontconfig-sys; this should mean that Parley-using applications should work even if fontconfig is not installed. We could probably enable the fontconfig backend on other Unixes (various BSDs, illumos, etc). The bindings are not generated at build time, so I don't think there are any issues. I am, however, unsure if dlopening libraries runs into any sandboxing concerns with Flatpak or something.

I've implemented family and fallback matching to the best of my ability (I've done some manual testing, and ("Hani", None) returns a Japanese font whereas ("Hani", Some("zh")) returns a Chinese font). However, I'm still not sure I fully understand fallback matching. If, for instance, you pass a nonexistent script or locale, it'll just return whatever font ends up as the global super-duper-fallback (Bitstream Vera on my machine).

In fact, I believe Fontconfig matching will always return at least some font. For family matches, I've filtered out all families whose names don't match, but for fallback matches, I believe it will always return something. I'm not sure if there's logic somewhere else that expects nonexistent scripts/locales to return nothing and breaks if we return a bogus font instead.

All that being said, I think this is a good improvement over the current implementation.

@valadaptive valadaptive requested a review from dfrg June 8, 2025 17:58
@dhardy
Copy link
Contributor

dhardy commented Jun 9, 2025

I'm less keen on requiring non-Rust deps, but using the standard fontconfig impl does have merit. Testing:

examples/generic_families.rs

GenericFamily::Serif:
Noto Serif
Noto Naskh Arabic
Nimbus Roman
Standard Symbols PS
Caladea
Symbola
STIX
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Tamil
Lohit Kannada
Lohit Telugu
Adwaita Mono
Adwaita Sans
DejaVu Sans
Carlito
Noto Sans Mono
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Mono
Liberation Serif
Lohit Devanagari
RIT Rachana
Mingzat
Nuosu SIL
Waree
D050000L
Droid Sans Ethiopic
Droid Sans Fallback
Noto Sans Gothic
Noto Sans Math
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Padauk
Noto Color Emoji

GenericFamily::SansSerif:
Noto Sans
Noto Sans Arabic
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Nimbus Sans
Carlito
Noto Sans Math
Mingzat
Padauk
Nuosu SIL
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Droid Sans Thai
Waree
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Source Code Pro
Adwaita Mono
DejaVu Sans Mono
Symbola
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
STIX
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Noto Color Emoji

GenericFamily::Monospace:
Noto Sans Mono
DejaVu Sans Mono
Adwaita Mono
Nimbus Mono PS
Source Code Pro
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Tamil
Lohit Kannada
Lohit Telugu
Adwaita Sans
DejaVu Sans
Symbola
Carlito
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Mono
Liberation Serif
Lohit Devanagari
PakType Naskh Basic
RIT Rachana
Mingzat
Nuosu SIL
STIX
Waree
D050000L
Vazirmatn
Droid Arabic Kufi
Droid Sans Ethiopic
Droid Sans Fallback
Noto Sans Gothic
Noto Sans Math
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Naskh Arabic
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Padauk
Noto Color Emoji

GenericFamily::Cursive:
Z003
Noto Sans Arabic
Noto Sans
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Carlito
Noto Sans Math
Mingzat
Padauk
Nuosu SIL
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Droid Sans Thai
Waree
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Source Code Pro
Adwaita Mono
DejaVu Sans Mono
Symbola
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
STIX
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Noto Color Emoji

GenericFamily::Fantasy:
Noto Sans
Noto Sans Symbols 2
Noto Sans Symbols
Noto Sans Arabic
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Nimbus Sans
Carlito
Noto Sans Math
Mingzat
Padauk
Nuosu SIL
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Droid Sans Thai
Waree
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Source Code Pro
Adwaita Mono
Symbola
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
STIX
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Noto Color Emoji

GenericFamily::SystemUi:
Cantarell
Noto Sans Arabic
Noto Sans
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Nimbus Sans
Carlito
Noto Sans Math
Mingzat
Padauk
Nuosu SIL
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Droid Sans Thai
Waree
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Source Code Pro
Adwaita Mono
DejaVu Sans Mono
Symbola
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
STIX
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Noto Color Emoji

GenericFamily::UiSerif:

GenericFamily::UiSansSerif:

GenericFamily::UiMonospace:

GenericFamily::UiRounded:

GenericFamily::Emoji:
Noto Color Emoji
Noto Emoji
Symbola
Noto Sans Arabic
Noto Sans
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Nimbus Sans
Carlito
Noto Sans Math
Mingzat
Padauk
Nuosu SIL
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Droid Sans Thai
Waree
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Adwaita Mono
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
STIX
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Sans Symbols 2
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol

GenericFamily::Math:
STIX
Symbola
Noto Sans Arabic
Noto Sans
DejaVu Sans
Noto Sans Meetei Mayek
Noto Sans Ol Chiki
Adwaita Sans
Nimbus Sans
Carlito
Noto Sans Math
Noto Sans Thai
Mingzat
Padauk
Nuosu SIL
Waree
Droid Sans Devanagari
Droid Sans Ethiopic
Droid Sans Fallback
Droid Sans Hebrew
Droid Sans Tamil
Lohit Bengali
Lohit Gujarati
Lohit Marathi
Lohit Kannada
Lohit Telugu
Adwaita Mono
Noto Sans CJK JP
Noto Serif CJK JP
Jomolhari
Khmer OS System
Liberation Serif
Lohit Devanagari
RIT Rachana
D050000L
Noto Sans Gothic
Noto Sans NKo
Noto Sans Symbols 2
Noto Emoji
Noto Sans Bengali
Noto Sans Canadian Aboriginal
Noto Sans Cherokee
Noto Sans Devanagari
Noto Sans Ethiopic
Noto Sans Georgian
Noto Sans Gujarati
Noto Sans Gurmukhi
Noto Sans Hebrew
Noto Sans Kannada
Noto Sans Lao
Noto Sans Oriya
Noto Sans Sinhala
Noto Sans Syriac
Noto Sans Tamil
Noto Sans Telugu
Noto Sans Thaana
OpenSymbol
Noto Color Emoji

GenericFamily::FangSong:

That's a lot more matches than before, which means more possible fallbacks. I'm not convinced that's a good thing though; it depends on whether there are performance implications for fallback. I suspect (provided that there is some level of filtering) most of these additional fonts won't normally do anything (neither matching glyphs nor causing overhead) anyway, but the latter is contigent on having some other level of filtering (like #377).

Additional note: GenericFamily::UiSerif and other Ui* generic families still don't match anything. As I said before, these should likely be removed.

Testing shows that #377 merges cleanly with this, yielding more possible matches but in a manner that's okay. E.g. here's the matches for arrows (shorter than the matches for Latin):

[2025-06-09T08:10:55Z DEBUG kas_text::fonts::resolver] select: Script::Zyyy, Some(Arrows), GenericFamily::Serif, FontWeight(400), FontWidth(256), Normal
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Symbola
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: STIX
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Adwaita Mono
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Adwaita Sans
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: DejaVu Sans
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Carlito
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Noto Sans Mono
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Liberation Mono
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Liberation Serif
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Droid Sans Fallback
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Noto Sans Math
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: Noto Sans Symbols 2
[2025-06-09T08:10:55Z DEBUG kas_text::fonts::library] match: OpenSymbol

This does solve matching Char 'A'=0xff21 : FULLWIDTH LATIN CAPITAL LETTER A. This was problematic previously because it is Latin, yet isn't included in common fonts. I think we could also solve this using #377 however (using the as-yet-unimplemented part of falling back to any font supporting the given UnicodeRange) since this has its own range: 68 = Halfwidth And Fullwidth Forms.

@dhardy dhardy mentioned this pull request Jun 9, 2025
5 tasks
@nicoburns
Copy link
Contributor

Wanted to note that this doesn't necessarily have to be instead of the existing linux backend. We could have both and let people choose. Not saying we necessarily want to to do that in this case, but just raising it as a possibility. We could probably even abstract the backends out behind a trait if we wanted to (making it easier for users to bring their own custom backends).

@nicoburns
Copy link
Contributor

Additional note: GenericFamily::UiSerif and other Ui* generic families still don't match anything. As I said before, these should likely be removed.

ui-* generic families seem to be a thing on macOS (and in a draft web standard: https://drafts.csswg.org/css-fonts-4/#ui-serif-def). Them currently not matching anything on Linux seems to be expected behaviour. If they are part of a font-stack then presumably that's no different to attempting to find a named font that isn't installed on the system. Perhaps we ought to just hardcode the non-macOS backends to return a non-match for these families for the time being?

@valadaptive
Copy link
Contributor Author

I'm less keen on requiring non-Rust deps

The fontconfig bindings themselves are entirely Rust; they don't require integrating C into the build process. You also don't need to ship fontconfig with whatever application uses Parley; it's a platform library that we can expect to be available on all graphical Linux platforms. It's similar to DirectWrite or CoreText in that regard.

most of these additional fonts won't normally do anything (neither matching glyphs nor causing overhead)

Fontconfig has an option when sorting fonts by Unicode coverage to remove any fonts later in the list that don't provide any more coverage than earlier fonts. That option is already enabled in this PR, so I think all the matched fonts do genuinely provide at least a bit more Unicode coverage.

Wanted to note that this doesn't necessarily have to be instead of the existing linux backend. We could have both and let people choose. Not saying we necessarily want to to do that in this case, but just raising it as a possibility. We could probably even abstract the backends out behind a trait if we wanted to (making it easier for users to bring their own custom backends).

I don't want to do that in this particular case; the previous backend was very half-baked. One possibility is using the existing folder-scanning implementation (used for the Android and macOS backends) to implement a much simpler Linux backend. I don't think there is any benefit to an implementation that parses fontconfig files but deliberately avoids calling into fontconfig; AFAIK, its ABI has been stable since its introduction in the 90s and there aren't really going to be any systems that have fontconfig files but no fontconfig library.

I do have a 20%-finished Rust reimplementation of fontconfig on the backburner, but even then I think dynamically linking to the platform's fontconfig is better, because there's no risk of fontconfig adding a new config option that the application's bundled implementation can't parse, or updating their cache file format. Of particular note is that the version of fontconfig-cache-parser we're currently using supports cache file versions 7 and 8, and upstream is on version 9.

@xorgy
Copy link
Member

xorgy commented Jun 10, 2025

I think that this makes sense, as lame as it is aesthetically (in the abstract I mean; your code is fine!). Thank you for providing the patch. I'll give it my own review but since it is significant I'd like to see more than one reviewer. :+ )

@dfrg
Copy link
Collaborator

dfrg commented Jun 10, 2025

I don’t have time at the moment to do a deep review but the integration looks nice and clean at a glance. I have no objections to landing this once others, particularly those that actively use *nix platforms, are satisfied with the code.

Copy link
Member

@xorgy xorgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I've run this on a few systems and it seems to improve font selection generally, especially for scripts that I have specific preferences for.

The code all looks about right, and I appreciate the attention to detail in implementing real iterators and such.

Comment on lines 664 to 665
// TODO(valadaptive): this seems incomplete. check fcname.c for their
// constants
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this, I think they are also matched case-insensitive in fcname.c

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it was the weights, slants, and widths, then it would be something like

    const SUFFIXES: &[&str] = &[
        // Weights.
        " Thin",
        " ExtraLight",
        " UltraLight",
        " Light",
        " Book",
        " Regular",
        " Medium",
        " DemiBold",
        " SemiBold",
        " Bold",
        " ExtraBold",
        " UltraBold",
        " Black",
        " Heavy",
        // Slants.
        " Roman",
        " Italic",
        " Oblique",
        // Widths.
        " UltraCondensed",
        " ExtraCondensed",
        " Condensed",
        " SemiCondensed",
        " Normal",
        " SemiExpanded",
        " Expanded",
        " ExtraExpanded",
        " UltraExpanded",
    ];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And name.strip_suffix would probably end up being name.trim_end_matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm starting to question if this is really the right approach. For instance, Monaspace Krypton's extra-light weight is named "ExLight", which isn't anywhere in fontconfig's list. Sinkin Sans includes the numeric weight in the RBIZ suffix. Lato has a "Hairline" weight.

Fontconfig provides the FC_STYLE property on fonts which should provide the RBIZ suffix, but for e.g. Lato Hairline Italic, the family names returned from fontconfig are "Lato" and "Lato Hairline", but the FC_STYLEs are "Hairline Italic" and "Italic"--there is nothing there that lets us strip just the "Hairline" suffix.

Looking at Lato Hairline Italic in otexplorer, its name table seems to have multiple FAMILY_NAMEs that differ by platform/encoding/language ID:

│   platform_id: 1                                    1       │ 00 01       │
│   encoding_id: 0                                            │ 00 00       │
│   language_id: 0                                            │ 00 00       │
│   name_id: FAMILY_NAME
│   length: 4                                                 │ 00 04       │
│   string_offset: +195                                       │ 00 C3       │
│     Lato                                                    │             │
│   platform_id: 3                                    18      │ 00 03       │
│   encoding_id: 1                                            │ 00 01       │
│   language_id: 1033                                         │ 04 09       │
│   name_id: FAMILY_NAME
│   length: 26                                                │ 00 1A       │
│   string_offset: +2300                                      │ 08 FC       │
│     Lato Hairline                                           │             │

That level of detail doesn't help us with fontconfig, which just returns both names with no other information, but does explain why it does so. I am not well-versed in OpenType but I think what we really want is the "typographic family name", which fontconfig doesn't give us.

Any ideas or preferences for what we should do here? I'm leaning towards just writing up this state of affairs in the TODO comment and punting on it for now.

Copy link
Member

@xorgy xorgy Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think writing up is the correct move, especially if there's no regression. FontConfig is slowly being rewritten in Rust with Fontations, so this roll cake is being eaten two slices at a time. Eventually we'll meet in the middle I guess.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FontConfig is slowly being rewritten in Rust with Fontations

Well that's interesting, surprising and good news! Is there an announcement/plan of this anywhere, or is it just something that's quietly happening?

This isn’t a complete replacement. We’re simply providing an alternative to the FreeType component that is used to build the font database. The motivation is that we’d like to completely remove the FreeType dependency from Chromium.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing out CSS font matching in Chrome and Firefox, font-family: "Lato Heavy" works but font-family: "Lato Bold" doesn't match any fonts. Maybe we should just strip "Bold" and "Italic"? I can't find a list of which suffixes are actually stripped or which code does so.

I don’t think it’s necessary to strip Bold and Italic since those already fit into the RBIZ model.

Since browsers seem to use whatever fontconfig gives them perhaps we should do the same and just remove this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed strip_rbiz, but noticed that FamilyNameMap::add_alias doesn't actually alias the given family name to the given ID--it creates a new FamilyId for the map entry. Is this intentional?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. That looks like a bug to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put up a separate PR for that.

github-merge-queue bot pushed a commit that referenced this pull request Jun 13, 2025
Discovered this as part of #378; splitting it off because it affects all
the fontique backends.

When adding font aliases, the code was not actually using the family ID
we passed in, but creating a fresh family ID each time. This meant that
font aliases were all mapped to nonexistent family IDs, and all queries
for a font by one of its aliases would fail.

I'm a bit concerned that this has the potential to cause regressions or
other general weirdness, since all font aliases went unused due to this
bug. Not sure how to test that other than to just give it a spin.

/// Fontconfig seems to force RBIZ (regular, bold, italic, bold italic) when
/// categorizing fonts. This removes those suffixes from family names so that
/// we can match on all attributes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I don’t think this comment is true. The first family name in both font files below is the typographic family name (name ID 16) and second one is the RBIZ family name (name ID 1), and similarly the first style is the typographic style name and the second one is the RBIZ style name:

$ fc-query Lato-Hairline.ttf
Pattern has 28 elts (size 32)
	family: "Lato"(s) "Lato Hairline"(s)
	style: "Hairline"(s) "Regular"(s)
	fullname: "Lato Hairline"(s)

$ fc-query Lato-HairlineItalic.ttf
Pattern has 28 elts (size 32)
	family: "Lato"(s) "Lato Hairline"(s)
	style: "Hairline Italic"(s) "Italic"(s)
	fullname: "Lato Hairline Italic"(s)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense actually. I guess I'd assumed fontconfig added all the family names with name ID 1 (see the comment thread above). But looking into fontconfig's source, it seems it adds the typographic family first and then the RBIZ family. I think in the case of multiple names under a given ID, it adds them all and deduplicates them?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the case of multiple names under a given ID, it adds them all and deduplicates them?

This usually happens with localized names, but I haven’t seen duplicated names with the same language.

Copy link
Member

@DJMcNab DJMcNab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think waiting for a second thorough review here is viable.

I've given this a test with Xilem, and it doesn't appear to have caused any regressions. Thanks!

@xorgy
Copy link
Member

xorgy commented Jul 2, 2025

Yup, this should be merged.

I added this abstraction later and forgot to clean this up
Web browsers let you use named fonts with various suffixes. For
instance, `font-family: "Lato Hairline"` works just fine. We should
match that behavior, which lets us get rid of this hack.
Track duplicate families by ID instead of name.

Also use filter_map to add all the generic families in one go. This also
lets us use the question mark operator in the iterator body, leading to
much cleaner code.
@valadaptive valadaptive added this pull request to the merge queue Jul 3, 2025
Merged via the queue into linebender:main with commit b871142 Jul 3, 2025
24 checks passed
@valadaptive valadaptive deleted the fontconfig branch July 3, 2025 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

On Linux, GenericFamily::SystemUi should fall back to sans-serif
7 participants