Overview
This grammar is used to generate parsers in several languages. The generated Java parser works, but the generated C and C++ parsers do not.
Here is the error when building the C++ output:
Observations
I have tracked the issue down to a mix-up between SimpleTypeBool and GroundBool (and similarly for a few other names).
A Bool is defined to be either "true" or "false" and is part of the Ground nonterminal:
BoolTrue. Bool ::= "true" ;
BoolFalse. Bool ::= "false" ;
GroundBool. Ground ::= Bool ;
SimpleTypeBool, on the other hand, is defined to be the token "Bool":
SimpleTypeBool. SimpleType ::= "Bool" ;
Clearly these are not the same. The token "true" or "false" should be parsed as a GroundBool, and the token "Bool" should be parsed as a SimpleTypeBool. However, these cases are mixed up in the lex and yacc files.
In the lex file, I can see the token for SimpleTypeBool:
<YYINITIAL>"Bool" return _SYMB_38;
But _SYMB_38 is referenced in the Ground rule in the bison file:
Ground : _SYMB_38 { $$ = new GroundBool($1); $$->line_number = yy_mylinenumber; YY_RESULT_Ground_= $$;
This is incorrect. _SYMB_38 has nothing to do with Ground. The rule should instead reference the nonterminal Bool, which is defined properly:
Bool : _SYMB_60 { $$ = new BoolTrue(); $$->line_number = yy_mylinenumber; YY_RESULT_Bool_= $$; }
| _SYMB_50 { $$ = new BoolFalse(); $$->line_number = yy_mylinenumber; YY_RESULT_Bool_= $$; }
Where the Problem might be
I am not a Haskeller, but I am working on tracking down the problem. My guess is that there is a Map somewhere which uses the same key for a token literal and for a token name.
I found something like that here:
https://github.com/BNFC/bnfc/blob/master/source/src/BNFC/Backend/CPP/NoSTL/CFtoFlex.hs#L58
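To make the guess concrete, here is a minimal sketch in Python (not BNFC's actual code) of how a symbol table keyed only by the item's text could turn the nonterminal Bool into the keyword symbol. The names symbols_by_string, render_item_untagged and render_item_tagged are hypothetical and exist only for this illustration. The symbol numbers are the ones from the generated files above.

```python
# Hypothetical illustration; none of these names come from BNFC itself.

# Symbol table keyed by raw string, the way a lexer generator might build it
# from the keyword literals in the grammar.
symbols_by_string = {
    "Bool": "_SYMB_38",
    "true": "_SYMB_60",
    "false": "_SYMB_50",
}


def render_item_untagged(item_text):
    """Render a right-hand-side item, looking it up by its text alone."""
    # The nonterminal Bool and the literal "Bool" both arrive here as "Bool",
    # so the nonterminal is silently replaced by the keyword symbol.
    return symbols_by_string.get(item_text, item_text)


def render_item_tagged(kind, item_text):
    """Render an item, consulting the symbol table only for terminals."""
    if kind == "terminal":
        return symbols_by_string[item_text]
    return item_text  # nonterminals keep their own name


# GroundBool. Ground ::= Bool ;            (Bool is a nonterminal here)
buggy = render_item_untagged("Bool")
fixed = render_item_tagged("nonterminal", "Bool")

# SimpleTypeBool. SimpleType ::= "Bool" ;  (a keyword literal)
literal = render_item_tagged("terminal", "Bool")

print(buggy, fixed, literal)
```

In this sketch the untagged lookup produces _SYMB_38 for the Ground rule, which matches the faulty bison output, while the tagged lookup keeps Bool for the nonterminal and still maps the literal to _SYMB_38.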
If I trace env' I can see that there are duplicate keys. Notice that Uri appears twice, once as _SYMB_44 and once as _SYMB_62:
[("{","_SYMB_0"),("}","_SYMB_1"),("~","_SYMB_2"),("/\\","_SYMB_3"),("\\/","_SYMB_4"),("*","_SYMB_5"),(".","_SYMB_6"),("(","_SYMB_7"),(")","_SYMB_8"),("-","_SYMB_9"),("/","_SYMB_10"),("%%","_SYMB_11"),("+","_SYMB_12"),("++","_SYMB_13"),("--","_SYMB_14"),("<","_SYMB_15"),("<=","_SYMB_16"),(">","_SYMB_17"),(">=","_SYMB_18"),("==","_SYMB_19"),("!=","_SYMB_20"),("=","_SYMB_21"),("|","_SYMB_22"),(",","_SYMB_23"),("_","_SYMB_24"),("@","_SYMB_25"),("bundle+","_SYMB_26"),("bundle-","_SYMB_27"),("<-","_SYMB_28"),(";","_SYMB_29"),("!","_SYMB_30"),("!!","_SYMB_31"),("=>","_SYMB_32"),("[","_SYMB_33"),("]","_SYMB_34"),(":","_SYMB_35"),(",)","_SYMB_36"),("...","_SYMB_37"),("Bool","_SYMB_38"),("ByteArray","_SYMB_39"),("Int","_SYMB_40"),("Nil","_SYMB_41"),("Set","_SYMB_42"),("String","_SYMB_43"),("Uri","_SYMB_44"),("and","_SYMB_45"),("bundle","_SYMB_46"),("bundle0","_SYMB_47"),("contract","_SYMB_48"),("else","_SYMB_49"),("false","_SYMB_50"),("for","_SYMB_51"),("if","_SYMB_52"),("in","_SYMB_53"),("match","_SYMB_54"),("matches","_SYMB_55"),("new","_SYMB_56"),("not","_SYMB_57"),("or","_SYMB_58"),("select","_SYMB_59"),("true","_SYMB_60"),("Long","_SYMB_61"),("Uri","_SYMB_62"),("Var","_SYMB_63")]
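A quick way to check a trace like this for collisions is sketched below in Python, using a subset of the pairs copied from the output above. The helper names duplicate_keys and first_match are my own, and whether the real code resolves a duplicate by first match or by last write is an assumption I have not verified. The sketch only shows that the two strategies disagree for Uri.

```python
from collections import Counter

# Subset of the traced env' pairs, copied from the trace.
env = [
    ("Bool", "_SYMB_38"),
    ("String", "_SYMB_43"),
    ("Uri", "_SYMB_44"),
    ("true", "_SYMB_60"),
    ("Long", "_SYMB_61"),
    ("Uri", "_SYMB_62"),
    ("Var", "_SYMB_63"),
]


def duplicate_keys(pairs):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(key for key, _ in pairs)
    return [key for key, n in counts.items() if n > 1]


def first_match(pairs, key):
    """Association-list lookup: the first pair with a matching key wins."""
    for k, v in pairs:
        if k == key:
            return v
    return None


dupes = duplicate_keys(env)
first = first_match(env, "Uri")  # first occurrence in the list
last = dict(env)["Uri"]          # building a dict keeps the last occurrence

print(dupes, first, last)
```

With these pairs, Uri is the only duplicated key. A first-match lookup yields _SYMB_44 and a last-write-wins map yields _SYMB_62, so any code that mixes the two strategies would disagree about which symbol Uri refers to.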
I imagine something similar is happening in the bison generation code. I will keep tracking this down, but if you know what is wrong or have any suggestions, that would be very helpful. I can also clarify anything if you have questions.