Closed
Description
What version of regex are you using?
v1.8.3
Describe the bug at a high level.
I fuzzed the regex crate with afl.rs, and AFL reported several inputs that panic with a "not a char boundary" error:
thread 'main' panicked at 'byte index 1 is not a char boundary; it is inside 'ο' (bytes 0..2) of `ο00000000000`', <my_project_path>/regex/src/re_unicode.rs:574:31
It seems the replace function has a problem with multi-byte UTF-8 characters. I have hit this panic in several fuzz drivers.
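For readers unfamiliar with the message, the following standard-library-only sketch shows what "not a char boundary" means for the string in the panic. It does not use the regex crate, and the helper name `non_boundary_offsets` is made up for illustration.

```rust
/// Hypothetical helper: lists the byte offsets of `text` that fall
/// inside a multi-byte UTF-8 sequence (i.e. are not char boundaries).
fn non_boundary_offsets(text: &str) -> Vec<usize> {
    (0..=text.len())
        .filter(|&i| !text.is_char_boundary(i))
        .collect()
}

fn main() {
    // 'ο' (Greek omicron) occupies bytes 0..2; the zeros are one byte each.
    let text = "ο00000000000";

    // Byte 1 is the second byte of 'ο', so it is not a char boundary.
    assert!(!text.is_char_boundary(1));
    assert!(text.is_char_boundary(2));

    // `str::get` returns None for such an offset instead of panicking.
    assert!(text.get(..1).is_none());
    assert_eq!(text.get(..2), Some("ο"));

    println!("offsets inside a character: {:?}", non_boundary_offsets(text));

    // Indexing with the same offset is what produces the reported panic:
    // let _ = &text[..1];
}
```

Any code that slices a `&str` at a match offset will panic this way if the offset lands inside a character.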
What are the steps to reproduce the behavior?
fn main() {
    let re = regex::Regex::new(r"\B|00(?-u)\B").unwrap();
    let text = r"𐾁00000000";
    let rep = r"0𐾁Ű000ο";
    let _ = re.replace(text, rep);
}

fn main() {
    let re = regex::Regex::new(r"\B|00(?-u)0\B").unwrap();
    let text = "ο";
    let rep = "000";
    let _ = re.replace(text, rep);
}

fn main() {
    let re = regex::Regex::new(r"(()(?-u)\B)0|\B").unwrap();
    let _ = re.replace("Ԑ0000000000000", "000000000000000");
}

fn main() {
    let re = regex::Regex::new(r"0()|\B|(?-u)\B0").unwrap();
    let _ = re.replace("ⳅ000000000000", "000000000000000");
}

fn main() {
    let re = regex::Regex::new(r"\B|(?-u)\B0").unwrap();
    let _ = re.replace_all("00000^睓00", "00000000000");
}
I also found that the last case does not panic if I change "replace_all" to "replace". The other cases still panic after making the corresponding swap, which seems odd.
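To compare the two methods across all cases in one run, each call can be wrapped in `std::panic::catch_unwind` so that one panic does not end the process. The sketch below is standard-library only. The helper name `panics` is hypothetical, and the closure slices a string as a stand-in, since the regex crate is not available in this snippet. In a real harness the closure body would build the `Regex` and call `replace` or `replace_all`.

```rust
use std::panic;

/// Hypothetical helper: runs `f` and reports whether it panicked.
/// Requires the default `panic = "unwind"` strategy.
fn panics<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_err()
}

fn main() {
    // Stand-in for a failing case: slicing inside the 2-byte 'ο'.
    let bad = panics(|| {
        let s = "ο";
        let _ = &s[..1];
    });

    // Stand-in for a passing case: slicing on a char boundary.
    let good = panics(|| {
        let s = "ο";
        let _ = &s[..2];
    });

    println!("bad slice panicked: {bad}, good slice panicked: {good}");
}
```

The default panic hook still prints each panic message to stderr, so the output also shows the byte index reported for every failing case.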
What is the actual behavior?
Here is an error message sample; the error messages in these cases are similar.
thread 'main' panicked at 'byte index 1 is not a char boundary; it is inside 'ο' (bytes 0..2) of `ο00000000000`', <my_project_path>/regex/src/re_unicode.rs:574:31
What is the expected behavior?
Regex::replace should handle multi-byte UTF-8 characters correctly and never panic on valid input.
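Put differently, every match offset used for a replacement should fall on a char boundary. The sketch below illustrates that invariant using only the standard library. It is not how the regex crate implements replacement, and `splice_checked` is a hypothetical name. It returns `None` instead of panicking when an offset is invalid.

```rust
/// Hypothetical helper: replaces the byte range `start..end` of `text`
/// with `rep`. Returns None if the range is reversed, out of bounds, or
/// not aligned to char boundaries.
fn splice_checked(text: &str, start: usize, end: usize, rep: &str) -> Option<String> {
    if start > end {
        return None;
    }
    // `str::get` yields None for out-of-range or mid-character offsets.
    let head = text.get(..start)?;
    let tail = text.get(end..)?;

    let mut out = String::with_capacity(head.len() + rep.len() + tail.len());
    out.push_str(head);
    out.push_str(rep);
    out.push_str(tail);
    Some(out)
}

fn main() {
    let text = "ο"; // 2 bytes in UTF-8

    // An empty match at offset 0 or 2 is on a boundary and splices cleanly.
    assert_eq!(splice_checked(text, 0, 0, "000").as_deref(), Some("000ο"));
    assert_eq!(splice_checked(text, 2, 2, "000").as_deref(), Some("ο000"));

    // An empty match at offset 1 is inside 'ο'; unchecked slicing would panic here.
    assert_eq!(splice_checked(text, 1, 1, "000"), None);
}
```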