Skip to content

Conversation

valadaptive
Copy link
Contributor

@valadaptive valadaptive commented Mar 25, 2025

Requires upstream support in Swash (dfrg/swash#89).

This implements the CSS word-break and overflow-wrap style properties (including the latter's influence on the layout's min-content size).

This replaces the "emergency break" logic with something that should be simpler.

Different browsers seem to disagree on exactly which ranges of text the properties apply to. The behavior implemented here should be reasonable.

In the table below, red is the default wrapping settings, blue is overflow-wrap: anywhere, and green is word-break: break-all:

Gecko Blink WebKit Parley
image image image image
image image image image
image image image image
image image image image
image image image image
image image image image *
image image image word_break_break_all_second_half-0

* This doesn't match browsers, but the behavior is officially unspecified per w3c/csswg-drafts#3897

@DJMcNab
Copy link
Member

DJMcNab commented Mar 25, 2025

Wow, this is fantastic. I'm not really best placed to review this, but I do have a few initial thoughts.

That table is excellent! One thing which immediately jumps out at me is that using red/green for significant information is potentially challenging for colourblind users. I don't know of anyone on the team which this would impact, and I'm not suggesting that you should re-make the table (it looks like a lot of work...), but I'd recommend it as something to bear in mind in the future. (My understanding is that the general advice for having "side-channel" important information of this kind is to use e.g. patterns for differentiation; I'm not sure how practical that would be in this case...)

I'm not sure what to do about the new Korean tests. I guess showing that we lay things out correctly is quite cool, even without the actual glyphs being visible. If we can find a "sufficiently" small font with a suitable license, adding it to the repo is probably reasonable.

Are the tests standard? I see you link to wpt.fyi for one of them. Should the rest have a similar source provided/linked?

@valadaptive
Copy link
Contributor Author

valadaptive commented Mar 25, 2025

I've added links to the WPT page for the word-break: keep-all tests. Perhaps I should try implementing the other tests here as well? I think it would involve a lot of tedious manual porting.

As far as I can tell, the other tests (regarding which spans of text the CSS property "attaches" to) are original. I did just glance over at some of the WPT tests, and it looks like there's one (http://wpt.live/css/css-text/word-break/word-break-break-all-inline-007.tentative.html) that fails in all the browsers, and I think corresponds to this case:
image

I had to implement some weird behavior to replicate the browsers in that case (see the comment in layout/context.rs about applying each character's line breaking property to the next character), but actually just reverted it to the behavior that seems more sensible since WPT says it's the browsers' fault.

Regarding the Korean/Japanese tests, I decided not to include any CJK fonts since they're huge (Noto Sans JP is 4.5MB per weight). I added a comment about that.

Regarding accessibility, I changed the tests so that the bit of text with the relevant line-wrapping style property is underlined.

@valadaptive valadaptive reopened this Mar 25, 2025
@valadaptive valadaptive force-pushed the wrapping-styles branch 3 times, most recently from 3f54c4a to a636509 Compare March 25, 2025 19:13
@xorgy
Copy link
Member

xorgy commented Mar 25, 2025

Thank you! I'm wondering how this will work with hyphen insertion (particularly with explicit hyphenation/break opportunities through U+00AD SOFT HYPHEN). Seems like it'll work well once we have that, and these properties (somewhat) affect that.

Copy link
Member

@xorgy xorgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to consolidate the definitions of WordBreakStrength and just reuse/reexport that here.
Thank you!

@nicoburns
Copy link
Contributor

nicoburns commented Mar 26, 2025

Regarding testing, it's worth noting that we can run some WPT tests via Blitz (which is using Parley for text layout). We can't run everything though, because some tests require JavaScript (and some are designed for manual verification), so it may well be worth porting some of them.


For running word-break WPT tests with Blitz (basically anything that's screenshot based):

  • Clone https://github.com/DioxusLabs/blitz
  • Run cargo run --release --package wpt -- css/css-text/word-break
  • (if you have just installed then you can use just wpt css/css-text/word-break)
  • You can patch in a local version of Parley using the top-level (workspace) Cargo.toml

Currently we can run 5799/103 css/css-text/word-break (of which 2325 pass)
And 1079/1774 of all css/css-text tests (of which 314 pass)

} else {
self.state.append_cluster_to_line(next_x);
self.state.cluster_idx += 1;
else if let Some(prev_emergency) =
Copy link
Contributor

@nicoburns nicoburns Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it always correct to try "prev_boundary" first and then fallback to "prev_emergency"? Are there cases (OverflowWrap::Anywhere?) where we ought to use the later "emergency" breakpoint even if an earlier "regular" breakpoint is available?

I wonder whether we ought to just have one "previous break point" state that just gets set in different conditions depending on the line-breaking styles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there cases (OverflowWrap::Anywhere?) where we ought to use the later "emergency" breakpoint even if an earlier "regular" breakpoint is available?

Text with OverflowWrap::Anywhere should only use an "emergency" breakpoint if there are no other opportunities to break the line.

The name "anywhere" is a bit confusing since it may give the impression that it behaves like word-break: break-all. but it's just break-word with the difference that it affects the min content size.

I don't think there's any way to get around having two different previous break points since we need that fallback behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name "anywhere" is a bit confusing since it may give the impression that it behaves like word-break: break-all

I see. Part of me feels like we ought not to stick to the CSS property names here and come up with something more sensible (but using the CSS properties is also very convenient for Blitz which can translate them 1:1 from Stylo)

Comment on lines +105 to 130
let mut word_break = Default::default();
let mut style_idx = 0;

let mut char_indices = text.char_indices();
loop {
let Some((char_idx, _)) = char_indices.next() else {
break;
};

// Find the style for this character. If the text is empty, we may not have any styles. Otherwise,
// self.styles should span the entire range of the text.
while let Some(style) = self.styles.get(style_idx) {
if style.range.end > char_idx {
word_break = style.style.word_break;
break;
}
style_idx += 1;
}
a.set_break_strength(word_break);

let Some((properties, boundary)) = a.next() else {
break;
};

self.info.push((CharInfo::new(properties, boundary), 0));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty awkward, but I guess fine...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit unfortunate. I thought about other ways to do it in Swash, e.g. passing in the word break strength with each call to next, but that would be a breaking change there and make it no longer an iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could make the analysis take an iterator over (char, WordBreakStrength) rather than just char? And you could have a different constructor for backwards compat.

@nicoburns
Copy link
Contributor

Blitz branch pointed at this PR: https://github.com/DioxusLabs/blitz/tree/word-break-overflow-wrap

This PR is currently only making 1 extra test pass. But I wouldn't read too much into that. I can see some tests failing like https://wpt.live/css/css-text/word-break/word-break-break-all-010.html. But that test uses ch units which Blitz doesn't currently support (need to get font metrics out of fontique for that!).


Notes on debugging WPT tests using Blitz:

To open a specific test in Blitz: cargo run --release --package readme -- https://wpt.live/css/css-text/word-break/word-break-break-all-010.html
To inspect:

  • Press alt-h to enable inspector mode
  • Click on a node to have some debug info logged to console.

For inline formatting contexts you'll get something like the following:

Text content: " XXXXX"
Inline Boxes:

Lines:
Line 0:
  RUN (x: 0, w: 80)
Line 1:
  RUN (x: 0, w: 80)
Line 2:
  RUN (x: 0, w: 80)

@nicoburns nicoburns mentioned this pull request Mar 7, 2025
24 tasks
@valadaptive
Copy link
Contributor Author

Now that swash 0.2.2 has been released, I don't think anything is blocking this other than another review.

@@ -19,7 +19,15 @@ use tiny_skia::{Color, FillRule, Paint, PathBuilder, Pixmap, PixmapMut, Rect, Tr

#[derive(Clone, Copy, Debug, PartialEq)]
pub(crate) struct ColorBrush {
pub(crate) color: Color,
pub(super) color: Color,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this super (and why as part of this PR?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a change that I forgot to undo

}

impl ColorBrush {
pub(crate) fn new(r: u8, g: u8, b: u8, a: u8) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather just see this take a Color argument and then you can pass in things from color::palette::css instead which also gives them better names than a series of constants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't know those were available--done

Copy link
Member

@xorgy xorgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your diligence!

@valadaptive valadaptive added this pull request to the merge queue Apr 3, 2025
Merged via the queue into linebender:main with commit 73f3828 Apr 3, 2025
21 checks passed
@valadaptive valadaptive deleted the wrapping-styles branch April 3, 2025 01:49
@xStrom xStrom added this to the 0.4 Release milestone May 7, 2025
@nilsmartel
Copy link

Thank you for making this more accessible, specifically to germans <3
(/s)

@xorgy
Copy link
Member

xorgy commented May 17, 2025

Thank you for making this more accessible, specifically to germans <3 (/s)

Patches welcome for real German word breaking rules! (e.g. that thing with the eszett) :+ )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants