Open
Labels: question (Further information is requested)
Description
I'm getting a strange error when a regex could match the prefix of another regex. Maybe. I just don't know what the problem is. Here's a simplified case:
    use logos::Logos;

    #[derive(Logos, Debug, PartialEq)]
    #[logos(skip r".|[\r\n]")] // skip everything not recognized
    pub enum LogosToken {
        // any letter except capital Z
        #[regex(r"[a-zA-Y]+", priority = 3)]
        WordExceptZ,

        // any number
        #[regex(r"[0-9]+", priority = 3)]
        Number,

        /*
        This expression is:
            (letter or number)* [Z] (letter or number)*
        In other words, a token with any number of letters or numbers,
        including at least one capital Z.
        */
        #[regex(r"[a-zA-Z0-9]*[Z][a-zA-Z0-9]*", priority = 3)]
        TermWithZ,
    }

    #[pg_extern]
    fn test_logos() {
        let mut lex = LogosToken::lexer("hello 42world fooZfoo");
        // Print each matched slice together with the token (or error) it produced.
        while let Some(result) = lex.next() {
            let slice = lex.slice();
            println!("{:?} {:?}", slice, result);
        }
    }
This generates:
    "hello" Ok(WordExceptZ)
    "42world" Err(())
    "fooZfoo" Ok(TermWithZ)
If I replace the regex on TermWithZ with #[regex(r"Z", priority = 3)], I get:
    "hello" Ok(WordExceptZ)
    "42" Ok(Number)
    "world" Ok(WordExceptZ)
    "foo" Ok(WordExceptZ)
    "Z" Ok(TermWithZ)
    "foo" Ok(WordExceptZ)
Here "42world" is recognized correctly as a number followed by a word.
What I don't understand is, why does the first TermWithZ regex mess up the recognition of "42world"? It doesn't contain a Z, so TermWithZ should ignore it completely and let the first two variants do their job.
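To make the expectation concrete, here is a minimal sketch in plain Rust (no Logos involved) of the longest-match behaviour I assumed I would get from the three original patterns. It is hand-written for illustration only: `tokenize`, `run_len`, `is_word_except_z` and the `Expected` enum are hypothetical names, not part of the Logos API, and the code says nothing about how Logos works internally.

```rust
/// Token kinds mirroring the three variants in the issue (illustrative only).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Expected {
    WordExceptZ,
    Number,
    TermWithZ,
}

/// Length of the longest prefix of `s` whose bytes all satisfy `pred`.
fn run_len(s: &[u8], pred: impl Fn(u8) -> bool) -> usize {
    s.iter().take_while(|&&b| pred(b)).count()
}

/// Matches the character class [a-zA-Y].
fn is_word_except_z(b: u8) -> bool {
    b.is_ascii_alphabetic() && b != b'Z'
}

/// Hand-written longest-match tokenizer (hypothetical helper, not Logos).
/// Anything that no pattern matches is skipped one character at a time.
fn tokenize(input: &str) -> Vec<(&str, Expected)> {
    let bytes = input.as_bytes();
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < bytes.len() {
        let rest = &bytes[pos..];

        // [a-zA-Y]+
        let word = run_len(rest, is_word_except_z);
        // [0-9]+
        let number = run_len(rest, |b| b.is_ascii_digit());
        // [a-zA-Z0-9]*[Z][a-zA-Z0-9]*: the whole alphanumeric run,
        // but only if that run contains at least one 'Z'.
        let alnum = run_len(rest, |b| b.is_ascii_alphanumeric());
        let term = if rest[..alnum].contains(&b'Z') { alnum } else { 0 };

        // Pick the pattern with the longest match at this position.
        let candidates = [
            (term, Expected::TermWithZ),
            (word, Expected::WordExceptZ),
            (number, Expected::Number),
        ];
        let (len, kind) = candidates
            .iter()
            .copied()
            .max_by_key(|&(len, _)| len)
            .unwrap();

        if len == 0 {
            // Nothing matched: skip one full character, like the skip rule.
            pos += input[pos..].chars().next().map_or(1, |c| c.len_utf8());
        } else {
            out.push((&input[pos..pos + len], kind));
            pos += len;
        }
    }
    out
}
```

Calling `tokenize("hello 42world fooZfoo")` on this sketch splits "42world" into a Number and a WordExceptZ while still returning "fooZfoo" as a single TermWithZ, which is the result I expected from the Logos lexer above.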