-
Notifications
You must be signed in to change notification settings - Fork 29.2k
Fix the breaking of multi-code-unit characters in obscure mode #123366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It makes sense to me that we should remove the entire emoji when pressing delete since I'm not sure what a user would expect if only a portion of it is deleted in a password field. I do wonder why
Though looking at this test, it looks like we already expect that |
This is a bit concerning that the whole emoji gets deleted, other people would easily be able to tell there are zwjs in the password. I think the problem isn't with "complex" characters, the crash is likely caused by surrogate pairs being broken up and the crash happens because it isn't valid encoding. Maybe try turning |
This pull request executed golden file tests, but it has not been updated in a while (20+ days). Test results from Gold expire after as many days, so this pull request will need to be updated with a fresh commit in order to get results from Gold. For more guidance, visit Writing a golden file test for Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
(triage) @justinmc Do you still have plans for this PR? |
Taking a look to see if I can turn this PR around, I'll make a decision on it by EOD today. |
I forgot about this PR, but I played around with it today and I think I have a solution. Will put this in draft mode and open it for review soon. |
if (position < 0) { | ||
return null; | ||
} | ||
if (position == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like when position == 0
it's always going to return 0 even when the text is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I guess we should be returning null when the text is empty? I've pushed a commit to do that for both leading/trailing.
return position; | ||
} | ||
|
||
return _breaksSurrogatePair(position) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If position
points to a high surrogate the method should return position
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. If position
points to a high surrogate then _breaksSurrogatePair will return false, and position
will be returned. This is correct because position
is always a code point boundary when it points to a high surrogate.
Let me know if I have anything wrong there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see _breaksSurrogatePair also checks for position - 1
. It's probably not necessary since the program would have crashed before getting here if the string isn't valid UTF16. That would simplify the logic in this method too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems possible to get here with a malformed string, at least in some cases. I added a malformed string test that seems to pass on linux, let's see if it passes on other platforms. Also, I can paste a malformed string (👨👩�
) into the app on Linux without a crash.
Maybe we should be checking for this kind of thing somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh interesting. Do you mind adding a test case for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my test ("when inserting a malformed string"): https://github.com/flutter/flutter/pull/123366/files#diff-37dc3ce295078607fe561c83f2ae43fd77eb7b6aead39e5ed57d7fa9cf7525d7R16315
/// * https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF | ||
/// * [isLowSurrogate], which checks the same thing for low (second) | ||
/// surrogates. | ||
static bool isHighSurrogate(int value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe assert value
is in the correct range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried this on your suggestion, but then realized that there were a million test failures. I think these methods check lots of code units that are not necessarily surrogates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I meant valid UTF16 code range 0x0 – 0xFFFFF
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh agreed, I'll do that!
return position; | ||
} | ||
|
||
return _breaksSurrogatePair(position) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see _breaksSurrogatePair also checks for position - 1
. It's probably not necessary since the program would have crashed before getting here if the string isn't valid UTF16. That would simplify the logic in this method too.
if (position < 0) { | ||
return 0; | ||
} | ||
if (position == _text.length - 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By defninition, if the last code unit is a low surrogate then getTrailingTextBoundaryAt(_text.length - 1)
should return null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was right as-is, but let me know if I'm misunderstanding. An example would be the string '👨'
, a valid surrogate pair, where the last code unit is a low surrogate. If position
is in the center of that surrogate pair, then the trailing boundary is at the end of the string, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should return the next start index of a code point AFTER position
. There's nothing after the last code unit so null
should be returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked offline and decided that it is possible to return _text.length
here. I've updated the docs to explicitly state this in 754a3b0.
); | ||
|
||
await tester.tap(find.byType(EditableText)); | ||
await tester.enterText(find.byType(EditableText), '👨👩�'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a comment to explain what code unit this is? I can't really copy that since it's rendered as 0xFFFD
and it's copied as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also what happens if there's a sequence of high surrogates or low surrogates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. In cacb4fc I've replaced these crazy characters with their \u
equivalents and left comments about where they come from. I also added two tests, one with a series of high surrogates and one with a series of low surrogates.
flutter/flutter@f86c529...216605b 2023-05-24 engine-flutter-autoroll@skia.org Roll Flutter Engine from aba9dc4640c2 to eebcf36cd38f (3 revisions) (flutter/flutter#127458) 2023-05-24 engine-flutter-autoroll@skia.org Roll Flutter Engine from 76afb431a740 to aba9dc4640c2 (1 revision) (flutter/flutter#127453) 2023-05-24 engine-flutter-autoroll@skia.org Roll Flutter Engine from fc6df9512aa2 to 76afb431a740 (3 revisions) (flutter/flutter#127445) 2023-05-23 49699333+dependabot[bot]@users.noreply.github.com Bump actions/labeler from 6b107e7a7ee5e054e0bcce60de5181d21e2f00fb to 2713f7303c96cb1e69627957ec16eea0fd7f94a4 (flutter/flutter#127438) 2023-05-23 engine-flutter-autoroll@skia.org Roll Flutter Engine from 3535e0d301dd to fc6df9512aa2 (1 revision) (flutter/flutter#127440) 2023-05-23 47866232+chunhtai@users.noreply.github.com Migrates android semanitcs integration to integration test (flutter/flutter#127128) 2023-05-23 engine-flutter-autoroll@skia.org Roll Flutter Engine from 23b04314d5d2 to 3535e0d301dd (3 revisions) (flutter/flutter#127428) 2023-05-23 engine-flutter-autoroll@skia.org Roll Flutter Engine from ccf50f344d5d to 23b04314d5d2 (2 revisions) (flutter/flutter#127421) 2023-05-23 ian@hixie.ch Give channel descriptions in `flutter channel`, use branch instead of upstream for channel name (flutter/flutter#126936) 2023-05-23 72562119+tgucio@users.noreply.github.com Avoid catching errors in TextPainter tests (flutter/flutter#127307) 2023-05-23 christopherfujino@gmail.com [flutter_tools] delete entitlements files after copying to macos build dir (flutter/flutter#127417) 2023-05-23 engine-flutter-autoroll@skia.org Roll Flutter Engine from a6e37bde65fd to ccf50f344d5d (7 revisions) (flutter/flutter#127416) 2023-05-23 113669907+NikolajHarderNota@users.noreply.github.com modal bottom sheet barrier label (flutter/flutter#126431) 2023-05-23 jmccandless@google.com Fix the breaking of multi-code-unit characters in obscure mode (flutter/flutter#123366) 2023-05-23 engine-flutter-autoroll@skia.org Roll Flutter Engine from 168b0bf3f70d to a6e37bde65fd (1 revision) (flutter/flutter#127397) 2023-05-23 vashworth@google.com Revert "Log all output of ios-deploy" (flutter/flutter#127405) 2023-05-23 engine-flutter-autoroll@skia.org Roll Packages from 83959fb to d449a17 (6 revisions) (flutter/flutter#127394) 2023-05-23 vashworth@google.com Log all output of ios-deploy (flutter/flutter#127222) If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-packages Please CC bmparr@google.com,rmistry@google.com,stuartmorgan@google.com on the revert to ensure that a human is aware of the problem. To file a bug in Packages: https://github.com/flutter/flutter/issues/new/choose To report a problem with the AutoRoller itself, please file a bug: https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
…er#123366) This PR changes the character boundary behavior of obscured fields to be based on code points instead of code units. So it used to be possible to traverse and delete obscured text inside of code points (and breaking a code point like that would cause a crash):  But now moving the cursor and deleting is based on code points:  ### Native behavior Native iOS deletes part of the emoji, native Mac deletes the whole emoji. See flutter#122381 (comment). So it's unclear what the desired behavior should actually be. ### Resources Fixes flutter#122381 I thought this might not fix the case where a broken emoji is directly pasted into the field, but it seems to work by trying this: �â��ð��©â��ð��¦â��ð��¦ CC @LongCatIsLooong
This PR changes the character boundary behavior of obscured fields to be based on code points instead of code units.
So it used to be possible to traverse and delete obscured text inside of code points (and breaking a code point like that would cause a crash):
But now moving the cursor and deleting is based on code points:
Native behavior
Native iOS deletes part of the emoji, native Mac deletes the whole emoji. See #122381 (comment). So it's unclear what the desired behavior should actually be.
Resources
Fixes #122381
I thought this might not fix the case where a broken emoji is directly pasted into the field, but it seems to work by trying this: �👩👦👦
CC @LongCatIsLooong