RFR - JDK-8223780 String::translateEscapes (Preview)

Jim Laskey james.laskey at oracle.com
Tue May 28 11:43:32 UTC 2019


It is critical to replicate what the javac compiler is doing presently, otherwise we break existing code.

    char leadch = reader.ch;
    int oct = reader.digit(pos, 8);
    reader.scanChar();
    if ('0' <= reader.ch && reader.ch <= '7') {
        oct = oct * 8 + reader.digit(pos, 8);
        reader.scanChar();
        if (leadch <= '3' && '0' <= reader.ch && reader.ch <= '7') {
            oct = oct * 8 + reader.digit(pos, 8);
            reader.scanChar();
        }
    }

If you look at the JSL 3.10.6, this is also the pattern used.

OctalEscape:
    \ OctalDigit
    \ OctalDigit OctalDigit
    \ ZeroToThree OctalDigit OctalDigit

Cheers,

-- Jim



> On May 28, 2019, at 3:18 AM, Ivan Gerasimov <ivan.gerasimov at oracle.com> wrote:
> 
> Hi Jim!
> 
> I wonder why it was chosen to represent octal values as \[0-3][0-9][0-9] | \[0-9][0-9] | \[0-9] ?
> 
> First, it will allow multiple leading zeroes.
> 
> Second, it does not require a leading zero, while, I think, many users are used to octal numbers starting with a mandatory leading zero.
> 
> Was it considered to mirror the syntax from regex [1] ?
> 
> [1] https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/util/regex/Pattern.html#sum <https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/util/regex/Pattern.html#sum>
> With kind regards,
> 
> Ivan
> 
> 
> On 5/21/19 12:24 PM, Jim Laskey wrote:
>> Revised at http://cr.openjdk.java.net/~jlaskey/8223780/webrev-03/src/java.base/share/classes/java/lang/String.java.frames.html <http://cr.openjdk.java.net/%7Ejlaskey/8223780/webrev-03/src/java.base/share/classes/java/lang/String.java.frames.html>
>> 
>> 
>> 
>>> On May 21, 2019, at 3:23 PM, Jim Laskey <james.laskey at oracle.com <mailto:james.laskey at oracle.com>> wrote:
>>> 
>>> 
>>> 
>>>> On May 21, 2019, at 2:57 PM, Ivan Gerasimov <ivan.gerasimov at oracle.com <mailto:ivan.gerasimov at oracle.com>> wrote:
>>>> 
>>>> Hi Jim!
>>>> 
>>>> A few comments:
>>>> 
>>>> 1)
>>>> Probably, there's no need to update ch in these cases:
>>>>               case '\'':
>>>>                   ch = '\'';
>>>>                   break;
>>>>               case '\"':
>>>>                   ch = '\"';
>>>>                   break;
>>> 
>>> True
>>> 
>>>> 
>>>> 2)
>>>> Character.digit(ch, 8) will accept non-Latin1 digits.
>>>> So, a sequence \\3\uFF17\uFF17 will be parsed as \\377.
>>>> (Note, that the first digit still can only be from range '0'-'9').
>>> 
>>> Also true. Subtlety in UnicodeReader.
>>> 
>>>> 
>>>> 3)
>>>> It's not obvious how this condition can be triggered:
>>>>                   if (0377 < code) {
>>>>                       throw new MalformedEscapeException(from);
>>>>                   }
>>>> I might be missing something, but I cannot see how 'code' can become > 0377.
>>> 
>>> I removed this check in round 2.
>>> 
>>>> 
>>>> 4)
>>>> throw new MalformedEscapeException(from);
>>>> This will report the next index after the error.  Was it intentional?
>>>> 
>>> 
>>> I switched to using an IllegalArgumentError and displaying the actual character.
>>> 
>>> 
>>>> With kind regards,
>>>> Ivan
>>> 
>>> Much thanks.
>>> 
>>> 
>>>> 
>>>> 
>>>> On 5/21/19 7:56 AM, Jim Laskey wrote:
>>>>> Please do a code review of the new String:: translateEscapes instance method. This instance method is being introduced to support JEP-355: Text Blocks, by translating escape sequences in the text block content.
>>>>> 
>>>>> Thank you.
>>>>> 
>>>>> -- Jim
>>>>> 
>>>>> webrev: http://cr.openjdk.java.net/~jlaskey/8223780/webrev-01 <http://cr.openjdk.java.net/%7Ejlaskey/8223780/webrev-01> <http://cr.openjdk.java.net/~jlaskey/8223780/webrev-01 <http://cr.openjdk.java.net/%7Ejlaskey/8223780/webrev-01>>
>>>>> jbs: https://bugs.openjdk.java.net/browse/JDK-8223780 <https://bugs.openjdk.java.net/browse/JDK-8223780> <https://bugs.openjdk.java.net/browse/JDK-8223780 <https://bugs.openjdk.java.net/browse/JDK-8223780>>
>>>>> csr: https://bugs.openjdk.java.net/browse/JDK-8223781 <https://bugs.openjdk.java.net/browse/JDK-8223781> <https://bugs.openjdk.java.net/browse/JDK-8223781 <https://bugs.openjdk.java.net/browse/JDK-8223781>>
>>>>> jep: https://bugs.openjdk.java.net/browse/JDK-8222530 <https://bugs.openjdk.java.net/browse/JDK-8222530> <https://bugs.openjdk.java.net/browse/JDK-8222530 <https://bugs.openjdk.java.net/browse/JDK-8222530>>
>>>>> 
>>>>> 
>>>> 
>>>> -- 
>>>> With kind regards,
>>>> Ivan Gerasimov
>>>> 
>>> 
>> 
> 
> -- 
> With kind regards,
> Ivan Gerasimov



More information about the core-libs-dev mailing list