Does Java.g [version 1.0.6] handle unicode characters?
Roberto Mannai
robermann at gmail.com
Sat Aug 28 03:04:32 PDT 2010
Yes, it does not work with '\u0096'. So am I supposed to (re)open a bug? Where?
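
For reference, both '\uffff' and '\u0096' fit the Java Language Specification's UnicodeEscape form (a backslash, one or more 'u', then four hex digits), so a Java lexer should accept either one. Below is a minimal sketch of that rule, useful for checking what a test input ought to decode to. The class UnicodeEscapeCheck and its decode method are hypothetical helpers of mine, not part of Java.g or ANTLR, and the sketch deliberately ignores the JLS rule about a backslash that is itself escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Java.g or ANTLR.
class UnicodeEscapeCheck {
    // A backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\(u+)([0-9a-fA-F]{4})");

    // Replaces every escape with the UTF-16 code unit it denotes.
    // Simplification (assumption): a backslash preceded by another
    // backslash is not treated specially here.
    static String decode(String source) {
        Matcher m = ESCAPE.matcher(source);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(source, last, m.start());
            out.append((char) Integer.parseInt(m.group(2), 16));
            last = m.end();
        }
        out.append(source, last, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as raw text in the string.
        String first = decode("char c = '\\uffff';");
        String second = decode("'\\u0096'");
        System.out.println((int) first.charAt(10)); // prints 65535
        System.out.println((int) second.charAt(1)); // prints 150
    }
}
```

If both values come out as expected, the escapes themselves are well formed, which would point at the grammar or the ANTLR version rather than at the test input.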
On Sat, Aug 28, 2010 at 9:42 AM, Yang Jiang <yang.jiang.z at gmail.com> wrote:
> You can change '\uffff' to some other valid char, like '\u0096'.
> If that works, then it looks like the problem is back in Antlr 3.2.
>
>
> yang
>
> On 08/28/2010 03:19 PM, Roberto Mannai wrote:
>>
>> Hello
>>
>> [I sent the following message to the antlr-interest mailing list,
>> sorry for the cross posting]
>>
>> I'm trying to understand whether the Java grammar from
>> http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g
>> processes Unicode characters correctly or not.
>>
>> In the file's header I read:
>> <<
>> * Know problems:
>> * Won't pass input containing unicode sequence like this
>> * char c = '\uffff'
>> * String s = "\uffff";
>> * Because Antlr does not treat '\uffff' as an valid char. This will be
>> * fixed in the next Antlr release. [Fixed in Antlr-3.1.1]
>>
>> >>
>>
>>
>> So, it seems that Antlr 3.2 should handle the Unicode charset. Anyway,
>> when I try to parse the following class:
>>
>> public class TestUnicode {
>>         public static void test (String[] args){
>>                 char c = '\uffff';
>>         }
>> }
>>
>> I get the following errors:
>> line 3:27 no viable alternative at character 'u'
>> line 3:34 mismatched character '\r' expecting '''
>> line 1:7 mismatched input 'class' expecting MONKEYS_AT
>> line 2:22 mismatched input 'void' expecting MONKEYS_AT
>> line 3:21 mismatched input 'c' expecting DOT
>> line 3:23 no viable alternative at input '='
>> line 4:8 no viable alternative at input '}'
>> line 4:8 no viable alternative at input '}'
>>
>> If I replace the Unicode character, it of course works. Am I missing
>> anything? Please note that version 1.0.5 didn't have this problem.
>>
>> Thanks for your help.
>>
>> Roberto
>>
>
>