Does Java.g [version 1.0.6] handle unicode characters?

Yang Jiang yang.jiang.z at gmail.com
Sat Aug 28 00:42:19 PDT 2010


You can change '\uffff' to some other valid character, such as '\u0096'.
If that works, then it looks like the problem has come back in Antlr 3.2.
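
A minimal sketch of how the two inputs for that comparison could be
generated, so that only the escape differs between runs. The class name
EscapeVariants and the method testSource are hypothetical, made up for
this illustration; feeding each variant to the lexer/parser generated
from Java.g is left out, since that depends on the local Antlr setup.

```java
public class EscapeVariants {
    // Builds the test class from the original message, with the given
    // escape text placed inside the char literal.
    static String testSource(String escape) {
        return "public class TestUnicode {\n"
             + "    public static void test (String[] args){\n"
             + "        char c = '" + escape + "';\n"
             + "    }\n"
             + "}\n";
    }

    public static void main(String[] args) {
        // The backslashes are doubled so that javac does not translate the
        // escapes here; the generated text then contains the six-character
        // escape sequence itself, which is what the grammar has to lex.
        String[] escapes = { "\\uffff", "\\u0096" };
        for (String e : escapes) {
            System.out.println("// ---- variant with " + e + " ----");
            System.out.print(testSource(e));
        }
    }
}
```

If the second variant parses and the first does not, the failure is
specific to '\uffff' rather than to unicode escapes in general.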


yang

On 08/28/2010 03:19 PM, Roberto Mannai wrote:
> Hello
>
> [I sent the following message to the antlr-interest mailing list,
> sorry for the cross posting]
>
> I'm trying to understand whether the Java grammar from
> http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
> correctly processes Unicode chars or not.
>
> In the file's header I read:
> <<
>   *  Know problems:
>   *    Won't pass input containing unicode sequence like this
>   *      char c = '\uffff'
>   *      String s = "\uffff";
>   *    Because Antlr does not treat '\uffff' as an valid char. This
>   *    will be fixed in the next Antlr release. [Fixed in Antlr-3.1.1]
> >>
> So, it seems that antlr 3.2 should handle the Unicode charset. Anyway,
> when I try to parse the following class:
>
> public class TestUnicode {
>          public static void test (String[] args){
>                  char c = '\uffff';
>          }
> }
>
> I get the following error:
>       line 3:27 no viable alternative at character 'u'
>       line 3:34 mismatched character '\r' expecting '''
>       line 1:7 mismatched input 'class' expecting MONKEYS_AT
>       line 2:22 mismatched input 'void' expecting MONKEYS_AT
>       line 3:21 mismatched input 'c' expecting DOT
>       line 3:23 no viable alternative at input '='
>       line 4:8 no viable alternative at input '}'
>       line 4:8 no viable alternative at input '}'
>
> If I replace the Unicode character, it of course works. Am I missing
> anything? Please note that version 1.0.5 didn't have this problem.
>
> Thanks for your help.
>
> Roberto
>    


