Backtrack trouble in Context #481

@martin-juhlin

Hi,

I have run into trouble writing a lexer with the tokens below. Given the source input "<?a <?p <?ph <?php <?=a", when the lexer reaches "<?p" it does not return PhpStartShort as expected, but returns StartTag instead.

use logos::Logos;

#[derive(Logos, Debug, PartialEq)]
pub enum Token {
    // Match anything until the start of PHP code
    #[regex(r#"[^<]+"#, |lex| lex.slice().to_string())]
    Text(String),

    // Match PHP open tag
    #[token("<?php")]
    PhpStartLong,

    #[token("<?")]
    PhpStartShort,

    // Match PHP echo tag
    #[token("<?=")]
    PhpEcho,

    #[token("<")]
    StartTag,
}
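
To make the expected result concrete, here is a minimal hand-written reference lexer that applies plain longest-match over the same token set. It is only a sketch for comparison and does not use Logos at all: the standalone `Token` enum and the `reference_lex` function are hypothetical names introduced for illustration.

```rust
// Standalone copy of the token set, without the Logos derive.
#[derive(Debug, PartialEq, Clone)]
enum Token {
    Text(String),
    PhpStartLong,
    PhpStartShort,
    PhpEcho,
    StartTag,
}

/// Hypothetical reference lexer: tries the literal tokens longest-first,
/// otherwise consumes text up to the next '<'.
fn reference_lex(src: &str) -> Vec<Token> {
    // Ordered from longest to shortest, so the first hit is the longest match.
    let literals = [
        ("<?php", Token::PhpStartLong),
        ("<?=", Token::PhpEcho),
        ("<?", Token::PhpStartShort),
        ("<", Token::StartTag),
    ];
    let mut out = Vec::new();
    let mut rest = src;
    while !rest.is_empty() {
        if let Some((lit, tok)) = literals.iter().find(|(lit, _)| rest.starts_with(*lit)) {
            out.push(tok.clone());
            rest = &rest[lit.len()..];
        } else {
            // No literal matched, so `rest` does not start with '<' and
            // this slice is never empty.
            let end = rest.find('<').unwrap_or(rest.len());
            out.push(Token::Text(rest[..end].to_string()));
            rest = &rest[end..];
        }
    }
    out
}
```

Calling `reference_lex("<?a <?p <?ph <?php <?=a")` yields PhpStartShort for each of the first three "<?" occurrences, which is the behaviour I expected from the Logos lexer as well.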

I have done some debugging, and from what I can tell, Context::backtrace is not updated correctly when reaching "<?"; it is still left at the original match "<".

Here is the debug output (slightly modified to show more detail). Node 23 has a miss of node 12 (correctly), but Generator::goto never updates the backtrack for node 23 to 12; instead it stays at the old value 14 that was previously set in node 25.

Generating code from graph (start 26):
{
    1: leaf: ::Text (<inline>),
    2: fork (miss 1) {
        [00-;] ⇒ 2,
        [=-FF] ⇒ 2,
        _ ⇒ 1,
    },
    3: rope: (miss: none) [80-BF] ⇒ 2,
    4: rope: (miss: none) [A0-BF][80-BF] ⇒ 2,
    5: rope: (miss: none) [80-BF][80-BF] ⇒ 2,
    6: rope: (miss: none) [80-9F][80-BF] ⇒ 2,
    7: rope: (miss: none) [90-BF][80-BF][80-BF] ⇒ 2,
    8: rope: (miss: none) [80-BF][80-BF][80-BF] ⇒ 2,
    9: rope: (miss: none) [80-8F][80-BF][80-BF] ⇒ 2,
    11: leaf: ::PhpStartLong,
    12: leaf: ::PhpStartShort,
    13: leaf: ::PhpEcho,
    14: leaf: ::StartTag,
    23: fork (miss 12) {
        = ⇒ 13,
        p ⇒ 24,
        _ ⇒ 12,
    },
    24: rope: (miss: none) hp ⇒ 11,
    25: rope: (miss: first 14) [
        ? ⇒ 23,
        _ ⇒ 14,
    ],
    26: fork (no miss) {
        [00-;] ⇒ 2,
        < ⇒ 25,
        [=-7F] ⇒ 2,
        [C2-DF] ⇒ 3,
        [E0] ⇒ 4,
        [E1-EC] ⇒ 5,
        [ED] ⇒ 6,
        [EE-EF] ⇒ 5,
        [F0] ⇒ 7,
        [F1-F3] ⇒ 8,
        [F4] ⇒ 9,
    },
}
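
The suspected problem can be modelled by hand for the "<" branch of this graph (nodes 25, 23 and 24). The sketch below is not the code Logos generates; `Leaf`, `walk` and the `refresh_at_23` flag are hypothetical names, and the flag only toggles whether the fallback is moved to node 12 on entering node 23.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Leaf {
    PhpStartLong,  // node 11
    PhpStartShort, // node 12
    PhpEcho,       // node 13
    StartTag,      // node 14
}

/// Walks the `<` branch of the graph and returns the matched leaf together
/// with the number of bytes consumed, or None if the input does not start
/// with `<` (the other branches of node 26 are not modelled).
fn walk(input: &[u8], refresh_at_23: bool) -> Option<(Leaf, usize)> {
    if input.first() != Some(&b'<') {
        return None;
    }
    // Node 25 (miss: first 14): StartTag after one byte is the first fallback.
    let mut fallback = (Leaf::StartTag, 1);
    if input.get(1) != Some(&b'?') {
        return Some(fallback);
    }
    // Node 23 (miss 12): "<?" is a complete token on its own, so the
    // fallback should move here before longer matches are attempted.
    if refresh_at_23 {
        fallback = (Leaf::PhpStartShort, 2);
    }
    match input.get(2) {
        Some(&b'=') => Some((Leaf::PhpEcho, 3)),
        Some(&b'p') => {
            // Node 24 (miss: none): needs "hp"; on failure it can only
            // return whatever fallback was recorded earlier.
            if input[3..].starts_with(b"hp") {
                Some((Leaf::PhpStartLong, 5))
            } else {
                Some(fallback)
            }
        }
        _ => Some((Leaf::PhpStartShort, 2)),
    }
}
```

In this model, `walk(b"<?p ", false)` falls back to StartTag after one byte, which matches the symptom I see, while `walk(b"<?p ", true)` returns PhpStartShort after two bytes. The "<?a" case is unaffected either way because node 23 resolves it directly through its `_` arm.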
