compile: make Regex::new(r"(?-u:\B)") fail again #1007

BurntSushi · 2023-06-05T12:39:30Z

This regex failed to compile in `regex <1.8`, but the migration to regex-automata subtly changed the rules so that it now compiles, even though the old (status quo) matching engines can't handle it correctly. Specifically, they may let the `\B` match between the code units of a single UTF-8 encoded codepoint. That in turn causes a panic when slicing a `&str` at the reported offset.
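To illustrate the failure mode, here is a minimal standard-library sketch. It does not use the regex crate; `is_ascii_word_byte` and `ascii_not_word_boundary_offsets` are hypothetical helpers that approximate where an ASCII-only `\B` holds when it is evaluated on raw bytes. The point is only that one such offset can land inside a multi-byte codepoint, where slicing a `&str` would panic.

```rust
/// ASCII-only notion of a word byte (assumption: mirrors `(?-u:\w)`).
fn is_ascii_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Hypothetical helper: byte offsets where an ASCII-only `\B` holds, i.e.
/// the bytes on both sides are word bytes or both are not. Positions past
/// either end of the haystack count as non-word.
fn ascii_not_word_boundary_offsets(haystack: &[u8]) -> Vec<usize> {
    (0..=haystack.len())
        .filter(|&i| {
            let before = i > 0 && is_ascii_word_byte(haystack[i - 1]);
            let after = i < haystack.len() && is_ascii_word_byte(haystack[i]);
            before == after
        })
        .collect()
}

fn main() {
    // 'a' is one byte; 'δ' is two bytes (0xCE 0xB4), so the string is 3 bytes.
    let haystack = "aδ";
    for at in ascii_not_word_boundary_offsets(haystack.as_bytes()) {
        // `str::get` returns None where `&haystack[..at]` would panic.
        match haystack.get(..at) {
            Some(prefix) => println!("offset {at}: ok, prefix = {prefix:?}"),
            None => println!("offset {at}: not a char boundary"),
        }
    }
}
```

Offset 2 sits between the two bytes of `δ`. Neither byte is an ASCII word byte, so the byte-level check accepts the position even though it is not a valid `&str` slice point.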

In `regex 1.9`, this regex will compile, and the matching engines will correctly and robustly never report matches that split a UTF-8 encoded codepoint. For now, we just add code that makes `regex 1.8` behave the same as previous releases.
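The guarantee described for `regex 1.9` can be pictured as discarding any empty match whose offset is not a char boundary. The sketch below is an assumption-laden illustration using only the standard library, not the crate's actual implementation; `keep_char_boundary_offsets` is a hypothetical name.

```rust
/// Hypothetical helper: keeps only the candidate empty-match offsets that
/// fall on a char boundary of `haystack`.
fn keep_char_boundary_offsets(haystack: &str, candidates: &[usize]) -> Vec<usize> {
    candidates
        .iter()
        .copied()
        .filter(|&at| haystack.is_char_boundary(at))
        .collect()
}

fn main() {
    // 'δ' occupies bytes 1..3, so offset 2 would split it.
    let haystack = "aδ";
    let candidates = [2, 3];
    for at in keep_char_boundary_offsets(haystack, &candidates) {
        // Every surviving offset is safe to slice at.
        println!("{:?} | {:?}", &haystack[..at], &haystack[at..]);
    }
}
```

With the offending offset filtered out, slicing at every reported position is safe, which is the property callers such as `Regex::replace` depend on.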

Fixes #1006

BurntSushi mentioned this pull request Jun 5, 2023

panicked at 'not a char boundary' when using Regex::replace #1006

Closed

BurntSushi merged commit b2ca9c1 into master Jun 5, 2023

BurntSushi deleted the ag/fix-1006 branch July 5, 2023 12:54
