Newline characters in grammars can produce illegal escape codes when targeting Java

A grammar that refers to newline characters by Unicode escape (`\u000D` for carriage return, `\u000A` for line feed) can cause ANTLR to generate Java code that does not compile.

Sample grammar:
```
grammar Demo;
linebreak: LF
         | CR
         ;
CR : '\u000D';
LF : '\u000A';
```

Snippet from resulting DemoLexer.java:
```
	private static final String[] _LITERAL_NAMES = {
		null, "'\u000D'", "'\u000A'"
	};
	private static final String[] _SYMBOLIC_NAMES = {
		null, "CR", "LF"
	};
```

This fails to compile. The Java compiler translates Unicode escapes *before* it tokenizes the source, so `"'\u000D'"` is equivalent to a string literal with a raw carriage return in the middle, and a line terminator is not allowed inside a string literal. The same applies to `\u000A`. Instead of `\u000D` and `\u000A`, ANTLR should emit `\r` and `\n`.
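For illustration, here is a minimal sketch of the kind of escaping that would avoid the problem. `LiteralEscaper` and `escapeForJava` are hypothetical names chosen for this example; this is not ANTLR's actual code generator, only one way the literal names could be escaped before being written into a `.java` file. The sketch deliberately builds its test characters with `(char) 0x0D` rather than a Unicode escape, since a Unicode escape for CR or LF in this source file would trigger the same compile error.

```java
public class LiteralEscaper {

    // Hypothetical helper (not part of ANTLR): returns text that is safe to
    // place between double quotes in generated Java source.
    static String escapeForJava(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                // CR and LF must use the short escapes: their Unicode-escape
                // forms are translated into real line terminators before the
                // compiler tokenizes the file.
                case '\r': sb.append('\\').append('r'); break;
                case '\n': sb.append('\\').append('n'); break;
                case '\t': sb.append('\\').append('t'); break;
                case '"':  sb.append('\\').append('"'); break;
                case '\\': sb.append('\\').append('\\'); break;
                default:
                    if (c < 0x20 || c > 0x7E) {
                        // Other control and non-ASCII characters can safely
                        // be written as a backslash, 'u', and four hex digits.
                        sb.append('\\').append('u')
                          .append(String.format("%04X", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The literal names from the Demo grammar: a quote, the character, a quote.
        String cr = "'" + (char) 0x0D + "'";
        String lf = "'" + (char) 0x0A + "'";
        System.out.println("\"" + escapeForJava(cr) + "\""); // prints "'\r'"
        System.out.println("\"" + escapeForJava(lf) + "\""); // prints "'\n'"
    }
}
```

With escaping along these lines, the generated array would read `null, "'\r'", "'\n'"`, which compiles.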

- [x] I am not submitting a question on how to use ANTLR; instead, go to [antlr4-discussion google group](https://groups.google.com/forum/#!forum/antlr-discussion) or ask at [stackoverflow](http://stackoverflow.com/questions/tagged/antlr4)
- [x] I have done a search of the existing issues to make sure I'm not sending in a duplicate


Newline characters in grammars can produce illegal escape codes when targeting Java #2281
