RFR: 8245153 Unicode encoded double-quoted empty string does not compile
Jim Laskey
james.laskey at oracle.com
Thu May 28 12:04:33 UTC 2020
Test should probably also include
String s0 = "";
:-)
> On May 28, 2020, at 9:02 AM, Jim Laskey <james.laskey at oracle.com> wrote:
>
> I've since rewritten this code (targeted for 16) to not use reset at all, for this very reason. Your solution may work, but the safer solution is:
>
> case 2: // Starting an empty string literal.
>     tk = Tokens.TokenKind.STRINGLITERAL;
>     return;
>
>
> Your test should include:
>
> String s1 = \u0022\u0022;
> String s2 = "\u0022;
> String s3 = \u0022";
> String s4 = \u0022\\u0022\u0022;
>
> Cheers,
>
> -- Jim
>
>
>> On May 28, 2020, at 5:38 AM, Adam Sotona <adam.sotona at oracle.com> wrote:
>>
>> Hi,
>> please review my fix for compiling an empty string literal whose double quotes are Unicode-encoded.
>> The root cause is in com.sun.tools.javac.parser.JavaTokenizer::scanString(int pos). It tries to un-read the quotes by calling com.sun.tools.javac.parser.UnicodeReader::reset(int pos), but that approach only happens to work when each source character maps to exactly one String character (as with standard quotes).
>> If the quotes are written in Unicode notation (\u0022\u0022), the reset call moves the reader.bp cursor to the original pos-1 position and reads one character.
>> Because the initial pos parameter points AFTER the last parsed character, the position of the first backslash of \u0022\u0022 is already lost, and the next character parsed is the digit 2 instead of the Unicode-encoded quote.
>> The fix repositions the reader to the right place, whether the quotes are standard or Unicode-encoded.
>> A new test is added for this case.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8245153
>> webrev: http://cr.openjdk.java.net/~asotona/8245153/
>>
>> All Tier 1, 2 and 3 tests are passing.
>>
>> Thanks for the review,
>> Adam
>
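
For anyone following the thread, the position problem Adam describes can be reproduced with a toy model. This is only a sketch and not javac's actual code: ToyReader, naiveReread and trackedReread are hypothetical names, and the escape handling is simplified (it ignores the rule about preceding backslashes). It assumes a reader whose bp field indexes the raw source, so a Unicode-escaped quote consumes six raw characters while a plain quote consumes one.

```java
// Toy model only: this is NOT javac's UnicodeReader or JavaTokenizer.
final class UnicodeResetSketch {

    /** Hypothetical reader that decodes Unicode escapes while scanning raw source. */
    static final class ToyReader {
        final String src; // raw source text
        int bp = 0;       // index of the next unread raw character

        ToyReader(String src) { this.src = src; }

        /** Returns the next decoded character, consuming 1 or 6 raw characters. */
        char next() {
            if (src.startsWith("\\u", bp) && bp + 6 <= src.length()) {
                char c = (char) Integer.parseInt(src.substring(bp + 2, bp + 6), 16);
                bp += 6;
                return c;
            }
            return src.charAt(bp++);
        }
    }

    /** Un-reads by assuming the first quote occupied exactly one raw character. */
    static char naiveReread(String src) {
        ToyReader r = new ToyReader(src);
        r.next();          // first quote
        int pos = r.bp;    // points AFTER the first parsed character
        r.next();          // second quote
        r.bp = pos - 1;    // only correct if the first quote was one raw char
        return r.next();
    }

    /** Un-reads by remembering where the first quote actually started. */
    static char trackedReread(String src) {
        ToyReader r = new ToyReader(src);
        int start = r.bp;  // raw index where the first quote begins
        r.next();
        r.next();
        r.bp = start;      // works for plain and escaped quotes alike
        return r.next();
    }

    public static void main(String[] args) {
        String plain = "\"\"";             // two plain quotes
        String escaped = "\\u0022\\u0022"; // two escaped quotes, as raw source text
        System.out.println(naiveReread(plain));     // prints "
        System.out.println(naiveReread(escaped));   // prints 2
        System.out.println(trackedReread(escaped)); // prints "
    }
}
```

With plain quotes pos-1 happens to be the first quote, so the naive reset works. With escaped quotes pos is 6, so pos-1 lands on the final digit of the first escape, which is the "2" Adam mentions.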