RFR: 8245153 Unicode encoded double-quoted empty string does not compile

Thu May 28 13:29:03 UTC 2020

+1

> On May 28, 2020, at 10:22 AM, Adam Sotona <adam.sotona at oracle.com> wrote:
> 
> Right, that is even better :)
> Here is a new webrev with the new patch and an extended test:
> http://cr.openjdk.java.net/~asotona/8245153/webrev.01/
> 
> Thanks,
> Adam 
> 
> 
>> On 28 May 2020, at 14:04, Jim Laskey <james.laskey at oracle.com> wrote:
>> 
>> Test should probably also include
>> 
>>  String s0 = "";
>> 
>> :-)
>> 
>>> On May 28, 2020, at 9:02 AM, Jim Laskey <james.laskey at oracle.com> wrote:
>>> 
>>> I've since rewritten this code (targeting 16) to not use reset at all, for this very reason. Your solution may work, but the safer solution is:
>>> 
>>>     case 2: // Starting an empty string literal.
>>>         tk = Tokens.TokenKind.STRINGLITERAL;
>>>         return;
>>> 
>>> 
>>> Your test should include:
>>> 
>>>  String s1 = \u0022\u0022;
>>>  String s2 = "\u0022;
>>>  String s3 = \u0022";
>>>  String s4 = \u0022\\u0022\u0022;
>>> 
>>> Cheers,
>>> 
>>> -- Jim
>>> 
>>> 
>>>> On May 28, 2020, at 5:38 AM, Adam Sotona <adam.sotona at oracle.com> wrote:
>>>> 
>>>> Hi,
>>>> Please help me review the fix for compiling a Unicode-encoded, double-quoted empty string.
>>>> I found that the root cause is in com.sun.tools.javac.parser.JavaTokenizer::scanString(int pos). It tries to un-read the Unicode quotes by calling com.sun.tools.javac.parser.UnicodeReader::reset(int pos); however, that approach works only by luck, when one source character corresponds to exactly one String character (standard quotes in this case).
>>>> If the quotes are written in Unicode notation (\u0022\u0022), the reset call moves the reader.bp cursor to the original pos-1 position and reads one character.
>>>> Because the initial pos parameter points AFTER the last parsed character, the position of the first backslash of \u0022\u0022 is already lost, and the next character parsed is the digit 2 instead of the Unicode quote.
>>>> The fix simply repositions the reader to the right place, regardless of whether the quotes are standard or Unicode-encoded.
>>>> There is also a new test for this case.
>>>> 
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245153
>>>> webrev: http://cr.openjdk.java.net/~asotona/8245153/
>>>> 
>>>> All Tier 1, 2 and 3 tests are passing.
>>>> 
>>>> Thanks for the review,
>>>> Adam
>>> 
>> 
>
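
For anyone reading the archive later, here is a rough toy model of the position problem Adam describes above. It is only a sketch and not javac code: the Reader class, its fields, and both helper methods are hypothetical stand-ins, and the escape handling is deliberately simplified (exactly one 'u', no backslash-parity rule). It just shows why rewinding to pos-1 happens to work for plain quotes but lands on the digit 2 when the quote is spelled \u0022.

```java
public class UnicodeQuoteSketch {

    /** Hypothetical stand-in for an escape-translating reader; NOT javac's UnicodeReader. */
    static final class Reader {
        final char[] buf; // raw source characters
        int bp;           // index of the next RAW character to read
        char ch;          // last decoded (logical) character

        Reader(String raw) {
            buf = raw.toCharArray();
        }

        /** Decodes one logical character starting at bp; returns false at end of input. */
        boolean next() {
            if (bp >= buf.length) {
                return false;
            }
            // Simplified escape handling: backslash, 'u', then four hex digits.
            if (buf[bp] == '\\' && bp + 5 < buf.length && buf[bp + 1] == 'u') {
                ch = (char) Integer.parseInt(new String(buf, bp + 2, 4), 16);
                bp += 6; // one logical character consumed six raw characters
            } else {
                ch = buf[bp++];
            }
            return true;
        }
    }

    /** Reads two logical characters, rewinds by one RAW position, and re-reads. */
    public static char rereadAfterNaiveRewind(String raw) {
        Reader r = new Reader(raw);
        r.next();
        r.next();
        int pos = r.bp;  // points just past the last parsed character
        r.bp = pos - 1;  // assumes one raw character per logical character
        r.next();
        return r.ch;
    }

    /** Same idea, but remembers where the second character started before reading it. */
    public static char rereadAfterSavedStart(String raw) {
        Reader r = new Reader(raw);
        r.next();
        int start = r.bp; // raw start of the second logical character
        r.next();
        r.bp = start;
        r.next();
        return r.ch;
    }

    public static void main(String[] args) {
        String plain = "\"\"";              // raw text: two ordinary quote characters
        String escaped = "\\u0022\\u0022";  // raw text: two escaped quotes, 12 raw characters

        System.out.println(rereadAfterNaiveRewind(plain));   // prints "
        System.out.println(rereadAfterNaiveRewind(escaped)); // prints 2
        System.out.println(rereadAfterSavedStart(plain));    // prints "
        System.out.println(rereadAfterSavedStart(escaped));  // prints "
    }
}
```

Saving the raw start position is just one way to make the toy model behave; Jim's suggestion of not calling reset at all sidesteps the issue entirely.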