C-family backends generate incorrect parser #235

@justinmeiners

Overview

This grammar is used to generate parsers in several languages. The generated Java parser works fine, but the C and C++ parsers do not.
Here is the error when building the C++ output:

[Screenshot of the C++ build error: screenshot_from_2018-09-21_17-48-19]

Observations

I have tracked the issue down to a mix-up between SimpleTypeBool and GroundBool (and similarly for a few other rules).

A Bool is defined to be either true or false and is part of the Ground nonterminal:

BoolTrue.   Bool ::= "true" ;
BoolFalse.  Bool ::= "false" ;
GroundBool. Ground ::= Bool ;

SimpleTypeBool, by contrast, is defined to be the token "Bool":

SimpleTypeBool. SimpleType ::= "Bool" ;

Clearly these are not the same. The tokens true and false should be parsed as a GroundBool, and the token Bool should be parsed as a SimpleTypeBool. However, these cases are mixed up in the generated lex and yacc files.

In the lex file, I can see the token for SimpleTypeBool:

<YYINITIAL>"Bool"        return _SYMB_38;

But _SYMB_38 is also referenced in the Ground rule in the bison file:

 Ground : _SYMB_38 {  $$ = new GroundBool($1); $$->line_number = yy_mylinenumber; YY_RESULT_Ground_= $$; 

This is incorrect: _SYMB_38 has nothing to do with Ground. The rule should reference the Bool nonterminal instead, which is defined properly:

Bool : _SYMB_60 {  $$ = new BoolTrue(); $$->line_number = yy_mylinenumber; YY_RESULT_Bool_= $$; }
| _SYMB_50 {  $$ = new BoolFalse(); $$->line_number = yy_mylinenumber; YY_RESULT_Bool_= $$; }

Where the problem might be

I am not a Haskeller, but I am working on tracking down the problem. My guess is that there is a Map somewhere that uses the same key for a token literal and for a nonterminal name.

I found something like that here:
https://github.com/BNFC/bnfc/blob/master/source/src/BNFC/Backend/CPP/NoSTL/CFtoFlex.hs#L58

If I trace env', I can see that there are duplicate keys. Notice "Uri", which appears as both _SYMB_44 and _SYMB_62:

[("{","_SYMB_0"),("}","_SYMB_1"),("~","_SYMB_2"),("/\\","_SYMB_3"),("\\/","_SYMB_4"),("*","_SYMB_5"),(".","_SYMB_6"),("(","_SYMB_7"),(")","_SYMB_8"),("-","_SYMB_9"),("/","_SYMB_10"),("%%","_SYMB_11"),("+","_SYMB_12"),("++","_SYMB_13"),("--","_SYMB_14"),("<","_SYMB_15"),("<=","_SYMB_16"),(">","_SYMB_17"),(">=","_SYMB_18"),("==","_SYMB_19"),("!=","_SYMB_20"),("=","_SYMB_21"),("|","_SYMB_22"),(",","_SYMB_23"),("_","_SYMB_24"),("@","_SYMB_25"),("bundle+","_SYMB_26"),("bundle-","_SYMB_27"),("<-","_SYMB_28"),(";","_SYMB_29"),("!","_SYMB_30"),("!!","_SYMB_31"),("=>","_SYMB_32"),("[","_SYMB_33"),("]","_SYMB_34"),(":","_SYMB_35"),(",)","_SYMB_36"),("...","_SYMB_37"),("Bool","_SYMB_38"),("ByteArray","_SYMB_39"),("Int","_SYMB_40"),("Nil","_SYMB_41"),("Set","_SYMB_42"),("String","_SYMB_43"),("Uri","_SYMB_44"),("and","_SYMB_45"),("bundle","_SYMB_46"),("bundle0","_SYMB_47"),("contract","_SYMB_48"),("else","_SYMB_49"),("false","_SYMB_50"),("for","_SYMB_51"),("if","_SYMB_52"),("in","_SYMB_53"),("match","_SYMB_54"),("matches","_SYMB_55"),("new","_SYMB_56"),("not","_SYMB_57"),("or","_SYMB_58"),("select","_SYMB_59"),("true","_SYMB_60"),("Long","_SYMB_61"),("Uri","_SYMB_62"),("Var","_SYMB_63")]
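The duplicate can be confirmed mechanically rather than by eye. The following is only a minimal sketch, not BNFC code: `envSample` is an abbreviated copy of the trace above, and `duplicateKeys` is a hypothetical helper. It also shows why a duplicate matters: the Prelude `lookup` returns the first match in an association list, while `Map.fromList` keeps the last one, so two consumers of the same list can disagree about which symbol "Uri" maps to.

```haskell
import qualified Data.Map.Strict as Map

-- Abbreviated copy of the traced env' list; only a few entries are kept.
envSample :: [(String, String)]
envSample =
  [ ("Bool", "_SYMB_38")
  , ("Uri",  "_SYMB_44")
  , ("true", "_SYMB_60")
  , ("Long", "_SYMB_61")
  , ("Uri",  "_SYMB_62")
  , ("Var",  "_SYMB_63")
  ]

-- Hypothetical helper: every key that occurs more than once,
-- paired with all of its symbols in their original order.
duplicateKeys :: [(String, String)] -> [(String, [String])]
duplicateKeys env =
  Map.toList (Map.filter ((> 1) . length) grouped)
  where
    -- fromListWith passes (new, old); flip (++) appends new after old.
    grouped = Map.fromListWith (flip (++)) [(k, [v]) | (k, v) <- env]

main :: IO ()
main = do
  print (duplicateKeys envSample)
  -- Association-list lookup returns the first match.
  print (lookup "Uri" envSample)
  -- Map.fromList keeps the last value for a repeated key.
  print (Map.lookup "Uri" (Map.fromList envSample))
```

Running this on the full env' list instead of the abbreviated sample would show whether "Uri" is the only repeated key.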

I imagine something similar is happening in the bison generation code. I will keep tracking this down, but if you know what is wrong or have any suggestions that would be very helpful. I can also clarify if you have questions.
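To make the suspected failure mode concrete, here is a minimal sketch of how resolving names through a string-keyed table could produce exactly the Ground rule shown earlier. None of this is taken from the BNFC source: `Item`, `keywords`, `resolveUntagged` and `resolveTagged` are hypothetical names, and the assumption is only that the generator decides "keyword or nonterminal" by looking the bare string up in the keyword table.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical tag recording whether a right-hand-side item is a
-- keyword literal (such as "Bool") or a nonterminal (such as Bool).
data Item
  = Terminal String
  | NonTerminal String
  deriving (Eq, Show)

-- A few keyword entries copied from the traced env' list.
keywords :: [(String, String)]
keywords =
  [ ("Bool",  "_SYMB_38")
  , ("false", "_SYMB_50")
  , ("true",  "_SYMB_60")
  ]

-- Untagged resolution: any name found in the table is treated as a
-- keyword, so the nonterminal Bool is mistaken for the literal "Bool".
resolveUntagged :: String -> String
resolveUntagged name = fromMaybe name (lookup name keywords)

-- Tagged resolution: only terminals are looked up in the table.
resolveTagged :: Item -> String
resolveTagged (NonTerminal c) = c
resolveTagged (Terminal t)    = fromMaybe t (lookup t keywords)

main :: IO ()
main = do
  putStrLn ("Ground : " ++ resolveUntagged "Bool")
  putStrLn ("Ground : " ++ resolveTagged (NonTerminal "Bool"))
  putStrLn ("SimpleType : " ++ resolveTagged (Terminal "Bool"))
```

Under this assumption the untagged version emits `Ground : _SYMB_38`, matching the incorrect bison output, while the tagged version keeps `Ground : Bool` and still maps the literal in SimpleType to `_SYMB_38`.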
