Lexer.getCharIndex() return value not behaving as expected #3606

@tianshuang

Description

  • I have reproduced my issue using the latest version of ANTLR
  • I have asked on Stack Overflow
  • Responses from the above seem to indicate that my issue could be an ANTLR bug
  • I have done a search of the existing issues to make sure I'm not sending in a duplicate

Language: Java
ANTLR Version: 4.9.3

parser grammar TestParser;

options { tokenVocab=TestLexer; }

root
    : LINE+ EOF
    ;

lexer grammar TestLexer;

@lexer::members {
    private int startIndex = 0;

    private void updateStartIndex() {
        startIndex = getCharIndex();
    }

    private void printNumber() {
        String number = _input.getText(Interval.of(startIndex, getCharIndex() - 1));
        System.out.println(number);
    }
}

LINE:                          {getCharPositionInLine() == 0}? ANSWER SPACE {updateStartIndex();} NUMBER {printNumber();} DOT .+? NEWLINE;
OTHER:                         . -> skip;

fragment NUMBER:               [0-9]+;
fragment ANSWER:               '( ' [A-D] ' )';
fragment SPACE:                ' ';
fragment NEWLINE:              '\n';
fragment DOT:                  '.';

Execute the following code:

import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.tree.ParseTree;

public class TestParseTest {

    public static void main(String[] args) {
        CharStream charStream = CharStreams.fromString("( B ) 12. hahaha\n"+
                "( B ) 123. hahaha\n");
        Lexer lexer = new TestLexer(charStream);

        CommonTokenStream tokens = new CommonTokenStream(lexer);
        TestParser parser = new TestParser(tokens);
        ParseTree parseTree = parser.root();

        System.out.println(parseTree.toStringTree(parser));
    }

}

Actual output:

12
12
(root ( B ) 12. hahaha\n ( B ) 123. hahaha\n <EOF>)

Expected output:

12
123
(root ( B ) 12. hahaha\n ( B ) 123. hahaha\n <EOF>)
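
For comparison, here is a minimal sketch of one possible workaround. It is not a confirmed fix and does not explain the behavior above. Instead of relying on getCharIndex() in actions placed mid-rule, the number could be recovered from the complete text of the matched LINE token. The class LineNumberExtractor, the method extractNumber, and the regular expression are all hypothetical; the pattern simply mirrors the ANSWER, SPACE, NUMBER and DOT fragments from the grammar. Calling it from a single action at the end of the LINE rule with getText() is an assumption that has not been verified against ANTLR 4.9.3.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: recovers the NUMBER portion
// from the full text of a LINE token such as "( B ) 123. hahaha\n".
final class LineNumberExtractor {

    // Mirrors the grammar fragments: ANSWER ('( ' [A-D] ' )'), SPACE, NUMBER, DOT.
    private static final Pattern LINE_PREFIX =
            Pattern.compile("^\\( [A-D] \\) ([0-9]+)\\.");

    private LineNumberExtractor() {
    }

    // Returns the digits between the answer marker and the dot,
    // or an empty Optional if the text does not start with that prefix.
    static Optional<String> extractNumber(String lineText) {
        Matcher matcher = LINE_PREFIX.matcher(lineText);
        if (matcher.find()) {
            return Optional.of(matcher.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The two input lines from the reproduction above.
        String[] lines = {"( B ) 12. hahaha\n", "( B ) 123. hahaha\n"};
        for (String line : lines) {
            System.out.println(extractNumber(line).orElse("<no number>"));
        }
    }
}
```

Run on its own, the main method prints 12 and then 123 for the two input lines. Under the assumption above, the lexer's printNumber() could delegate to extractNumber(getText()) from one action after NEWLINE, so that no character index has to be captured while the rule is still being matched.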
