Does Java.g [version 1.0.6] handle unicode characters?

Roberto Mannai robermann at gmail.com
Sat Aug 28 00:19:20 PDT 2010


Hello

[I also sent this message to the antlr-interest mailing list;
sorry for the cross-posting.]

I'm trying to understand whether the Java grammar from
http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
correctly handles Unicode escape sequences.

In the file's header I read:
<<
 *  Know problems:
 *    Won't pass input containing unicode sequence like this
 *      char c = '\uffff'
 *      String s = "\uffff";
 *    Because Antlr does not treat '\uffff' as an valid char. This will be
 *    fixed in the next Antlr release. [Fixed in Antlr-3.1.1]
>>

So it seems that ANTLR 3.2 should handle Unicode escapes. However,
when I try to parse the following class:

public class TestUnicode {
        public static void test (String[] args){
                char c = '\uffff';
        }
}

I get the following errors:
     line 3:27 no viable alternative at character 'u'
     line 3:34 mismatched character '\r' expecting '''
     line 1:7 mismatched input 'class' expecting MONKEYS_AT
     line 2:22 mismatched input 'void' expecting MONKEYS_AT
     line 3:21 mismatched input 'c' expecting DOT
     line 3:23 no viable alternative at input '='
     line 4:8 no viable alternative at input '}'
     line 4:8 no viable alternative at input '}'

If I replace the Unicode escape with an ordinary character, the class
parses without errors, as expected. Am I missing anything? Please note
that version 1.0.5 of the grammar didn't have this problem.
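
For reference, the Java Language Specification (section 3.3) translates
Unicode escapes into the characters they denote before any tokenizing
takes place. A minimal sketch of that translation step is below. The
class and method names are hypothetical and are not part of Java.g or
ANTLR, and whether running input through such a step avoids the lexer
errors above is an assumption that has not been checked against
Java.g 1.0.6 or ANTLR 3.2.

```java
// Hypothetical helper for illustration only; not part of Java.g or ANTLR.
final class UnicodeEscapeTranslator {

    private UnicodeEscapeTranslator() {}

    // Returns the value of an ASCII hex digit, or -1 if c is not one.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Replaces each eligible backslash-u escape in src with the UTF-16
    // code unit it denotes, following the eligibility rule of JLS 3.3.
    static String translate(String src) {
        StringBuilder out = new StringBuilder(src.length());
        int n = src.length();
        int i = 0;
        while (i < n) {
            char ch = src.charAt(i);
            if (ch != '\\') {
                out.append(ch);
                i++;
                continue;
            }
            // Measure the run of consecutive backslashes starting at i.
            int next = i;
            while (next < n && src.charAt(next) == '\\') next++;
            int run = next - i;
            // The last backslash of the run may start an escape only if an
            // even number of backslashes precede it, i.e. the run is odd.
            boolean eligible = run % 2 == 1 && next < n && src.charAt(next) == 'u';
            if (!eligible) {
                out.append(src, i, next);
                i = next;
                continue;
            }
            // Copy the backslashes that come before the escape itself.
            out.append(src, i, next - 1);
            // One or more 'u' characters are allowed after the backslash.
            while (next < n && src.charAt(next) == 'u') next++;
            if (next + 4 > n) {
                throw new IllegalArgumentException("truncated Unicode escape");
            }
            int value = 0;
            for (int k = 0; k < 4; k++) {
                int digit = hexValue(src.charAt(next + k));
                if (digit < 0) {
                    throw new IllegalArgumentException("malformed Unicode escape");
                }
                value = value * 16 + digit;
            }
            out.append((char) value);
            i = next + 4;
        }
        return out.toString();
    }
}
```

Applied to a source line containing the escape for U+0041, for example,
translate returns the same line with the letter A in its place, while a
backslash-u sequence that is itself preceded by a backslash is left
unchanged.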

Thanks for your help.

Roberto

