Does Java.g [version 1.0.6] handle unicode characters?
Roberto Mannai
robermann at gmail.com
Sat Aug 28 00:19:20 PDT 2010
Hello
[I sent the following message to the antlr-interest mailing list,
sorry for the cross-posting]
I'm trying to understand whether the Java grammar from
http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
correctly processes Unicode characters.
In the file's header I read:
<<
* Know problems:
* Won't pass input containing unicode sequence like this
* char c = '\uffff'
* String s = "\uffff";
* Because Antlr does not treat '\uffff' as an valid char. This will be
* fixed in the next Antlr release. [Fixed in Antlr-3.1.1]
>>
So it seems that ANTLR 3.2 should handle Unicode characters. However,
when I try to parse the following class:
public class TestUnicode {
        public static void test (String[] args){
                char c = '\uffff';
        }
}
I get the following error:
line 3:27 no viable alternative at character 'u'
line 3:34 mismatched character '\r' expecting '''
line 1:7 mismatched input 'class' expecting MONKEYS_AT
line 2:22 mismatched input 'void' expecting MONKEYS_AT
line 3:21 mismatched input 'c' expecting DOT
line 3:23 no viable alternative at input '='
line 4:8 no viable alternative at input '}'
line 4:8 no viable alternative at input '}'
If I replace the Unicode escape with an ordinary character, the class
parses fine, of course. Am I missing anything? Please note that version
1.0.5 didn't have this problem.
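
For reference, the input the lexer stumbles on at 3:27 is the 'u' of a Unicode escape inside a character literal. Below is a minimal sketch, independent of Java.g and of ANTLR, of the shape the Java Language Specification gives such an escape: a backslash, one or more 'u', then exactly four hex digits. The class name UnicodeEscapeCheck and both method names are made up for illustration, and the regular expression is only my approximation of what a lexer rule would have to accept; it is not taken from the grammar.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Java.g or ANTLR: it only illustrates
// the lexical shape of a char literal written as a Unicode escape.
public class UnicodeEscapeCheck {

    // Quote, backslash, one or more 'u', exactly four hex digits, quote.
    private static final Pattern UNICODE_CHAR_LITERAL =
            Pattern.compile("'\\\\u+[0-9a-fA-F]{4}'");

    // True if the text is a char literal consisting of one Unicode escape.
    public static boolean isUnicodeCharLiteral(String text) {
        return UNICODE_CHAR_LITERAL.matcher(text).matches();
    }

    // Decodes the four hex digits that sit just before the closing quote.
    public static char decode(String literal) {
        if (!isUnicodeCharLiteral(literal)) {
            throw new IllegalArgumentException("not a Unicode char literal: " + literal);
        }
        String hex = literal.substring(literal.length() - 5, literal.length() - 1);
        return (char) Integer.parseInt(hex, 16);
    }

    public static void main(String[] args) {
        // The doubled backslash keeps javac from translating the escape itself,
        // so the string holds the same eight characters as the failing input.
        String literal = "'\\uffff'";
        System.out.println(isUnicodeCharLiteral(literal));
        System.out.println(Integer.toHexString(decode(literal)));
    }
}
```

A lexer that follows the specification should accept this literal as a single character token, which is what I expected from the grammar.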
Thanks for your help.
Roberto