RFR: 8245153 Unicode encoded double-quoted empty string does not compile
Jim Laskey
james.laskey at oracle.com
Thu May 28 13:29:03 UTC 2020
+1
> On May 28, 2020, at 10:22 AM, Adam Sotona <adam.sotona at oracle.com> wrote:
>
> Right, that is even better :)
> Here is a new webrev with the new patch and an extended test:
> http://cr.openjdk.java.net/~asotona/8245153/webrev.01/
>
> Thanks,
> Adam
>
>
>> On 28 May 2020, at 14:04, Jim Laskey <james.laskey at oracle.com> wrote:
>>
>> Test should probably also include
>>
>> String s0 = "";
>>
>> :-)
>>
>>> On May 28, 2020, at 9:02 AM, Jim Laskey <james.laskey at oracle.com> wrote:
>>>
>>> I've since rewritten this code (targeting JDK 16) to not use reset at all, for this very reason. Your solution may work, but the safer solution is:
>>>
>>> case 2: // Starting an empty string literal.
>>> tk = Tokens.TokenKind.STRINGLITERAL;
>>> return;
>>>
>>>
>>> Your test should include:
>>>
>>> String s1 = \u0022\u0022;
>>> String s2 = "\u0022;
>>> String s3 = \u0022";
>>> String s4 = \u0022\\u0022\u0022;
>>>
>>> Cheers,
>>>
>>> -- Jim
>>>
>>>
>>>> On May 28, 2020, at 5:38 AM, Adam Sotona <adam.sotona at oracle.com> wrote:
>>>>
>>>> Hi,
>>>> please help me review the fix for compilation of a Unicode-encoded double-quoted empty string.
>>>> I found that the root cause is in com.sun.tools.javac.parser.JavaTokenizer::scanString(int pos). It tries to un-read the Unicode quotes by calling com.sun.tools.javac.parser.UnicodeReader::reset(int pos); however, that approach works only by luck, when one source character corresponds to one String character (standard quotes in this case).
>>>> If the quotes are written in Unicode notation, \u0022\u0022, the reset call moves the reader.bp cursor to the original pos-1 position and reads one character.
>>>> Because the initial pos parameter points AFTER the last parsed character, the position of the first backslash of \u0022\u0022 is already lost, and the next character parsed is the digit 2 instead of a Unicode quote.
>>>> The fix simply repositions the reader to the right place, regardless of whether the quotes are standard or Unicode-encoded.
>>>> A new test for this case is also added.
>>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245153
>>>> webrev: http://cr.openjdk.java.net/~asotona/8245153/
>>>>
>>>> All Tier 1, 2 and 3 tests are passing.
>>>>
>>>> Thanks for the review,
>>>> Adam
>>>
>>
>
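
For readers following the thread, the offset mismatch described in the original report can be illustrated with a small standalone sketch. This is not javac code: the class EscapeOffsetDemo and all of its methods are hypothetical names made up for illustration, and the escape handling is deliberately simplified (it ignores repeated 'u' characters, backslash parity, and hex validation). It only shows why "pos - 1" stops pointing at the start of the last quote once a quote occupies six source characters.

```java
public class EscapeOffsetDemo {
    /** Width, in source characters, of the logical character starting at index i (simplified). */
    static int logicalCharWidth(String src, int i) {
        // Assumption: a backslash followed by 'u' always starts a 6-character escape.
        boolean escape = src.startsWith("\\u", i) && i + 6 <= src.length();
        return escape ? 6 : 1;
    }

    /** Source index just past the first n logical characters. */
    static int offsetAfter(String src, int n) {
        int i = 0;
        for (int k = 0; k < n && i < src.length(); k++) {
            i += logicalCharWidth(src, i);
        }
        return i;
    }

    /** Source index at which the logical character ending at 'end' begins. */
    static int startOfLogicalCharEndingAt(String src, int end) {
        int i = 0;
        int start = 0;
        while (i < end) {
            start = i;
            i += logicalCharWidth(src, i);
        }
        return start;
    }

    public static void main(String[] args) {
        String plain = "\"\";";              // two ordinary quotes, then a semicolon
        String escaped = "\\u0022\\u0022;";  // the same two quotes written as Unicode escapes
        for (String src : new String[] {plain, escaped}) {
            int pos = offsetAfter(src, 2);   // position AFTER both quotes have been read
            int naive = pos - 1;             // where a one-char-per-char rewind would land
            int actual = startOfLogicalCharEndingAt(src, pos);
            System.out.println(src + "  pos=" + pos
                    + "  naive=" + naive + " ('" + src.charAt(naive) + "')"
                    + "  actual=" + actual);
        }
    }
}
```

With ordinary quotes both indices agree (pos=2, naive=1, actual=1). With the escaped form, pos is 12, the naive rewind lands on index 11, which holds the digit 2, while the second quote really starts at index 6. That matches the symptom in the report, where the next character parsed is 2 rather than a quote.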