RFR - JDK-8202442 - String::unescape (Code Review)
Jan Lahoda
jan.lahoda at oracle.com
Wed Sep 19 14:10:44 UTC 2018
I guess Jon's comment was that (per JLS) the result of translating Unicode
escapes can then participate in the escape sequences in String
literals. So, this:
"\u005ct"
is (as far as I know) a string literal containing a single character (a tab), while it seems
that
`\u005ct`.unescape()
is two characters:
\t
Not sure whether that's intentional.
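
To make the difference concrete, here is a minimal sketch in plain Java. It is not the code from the webrev: the class and method names (UnescapeSketch, singlePass, twoPass) are hypothetical, and only \t, \n, \\ and four-digit Unicode escapes are handled. singlePass models the behavior described above, where the backslash produced by a Unicode escape is emitted as-is; twoPass models the JLS order, where Unicode escapes are translated first and the result is then scanned for escape sequences.

```java
public class UnescapeSketch {
    // Value of a backslash-u-XXXX escape starting at index i, or -1 if none.
    // Simplified: javac also accepts repeated 'u' and counts preceding backslashes.
    static int unicodeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return -1;
        }
        int v = 0;
        for (int k = i + 2; k < i + 6; k++) {
            int d = Character.digit(s.charAt(k), 16);
            if (d < 0) {
                return -1;
            }
            v = v * 16 + d;
        }
        return v;
    }

    // Character named by a simple escape letter, or -1 if not handled here.
    static int simpleEscape(char n) {
        switch (n) {
            case 't': return '\t';
            case 'n': return '\n';
            case '\\': return '\\';
            default: return -1;
        }
    }

    // Simple escape starting at index i, or -1 if there is none.
    static int escapeAt(String s, int i) {
        if (s.charAt(i) != '\\' || i + 1 >= s.length()) {
            return -1;
        }
        return simpleEscape(s.charAt(i + 1));
    }

    // One scan: output of a Unicode escape is never re-examined.
    static String singlePass(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int u = unicodeAt(s, i);
            int e = escapeAt(s, i);
            if (u >= 0) { out.append((char) u); i += 6; }
            else if (e >= 0) { out.append((char) e); i += 2; }
            else { out.append(s.charAt(i)); i++; }
        }
        return out.toString();
    }

    // JLS-style order: Unicode escapes first, then escape sequences.
    static String twoPass(String s) {
        StringBuilder mid = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int u = unicodeAt(s, i);
            if (u >= 0) { mid.append((char) u); i += 6; }
            else { mid.append(s.charAt(i)); i++; }
        }
        StringBuilder out = new StringBuilder();
        for (i = 0; i < mid.length(); i++) {
            int e = escapeAt(mid.toString(), i);
            if (e >= 0) { out.append((char) e); i++; }
            else { out.append(mid.charAt(i)); }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Seven characters: backslash, u, 0, 0, 5, c, t (the raw-literal content).
        String raw = "\\u005ct";
        System.out.println(singlePass(raw).length()); // 2: backslash, t
        System.out.println(twoPass(raw).length());    // 1: tab
    }
}
```

Both variants agree on Jim's example: given the six characters backslash, u, 0, 0, 5, c, each returns a single backslash. They only diverge when a translated backslash is followed by an escape letter.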
Jan
On 18.9.2018 20:55, Jim Laskey wrote:
> The intent, of course, is to offset the raw string literal's non-translation of Unicode escapes and escape sequences. That is, have the multi-line cake and eat the escapes too.
>
> So a developer could have
>
> String s = `
> \t\tTitle
> \t\t\tbody
> ...
>
> `.align().unescape();
>
> to have tabs inserted in the string.
>
> "\\", "\u005c\u005c" and `\` all translate to the same string. `\u005c` translates to "\\u005c". `\u005c`.unescape() thus translates to the same string as "\\", "\u005c\u005c" and `\`.
>
> Cheers,
>
> — Jim
>
>
>
>> On Sep 18, 2018, at 3:33 PM, Jonathan Gibbons <jonathan.gibbons at oracle.com> wrote:
>>
>> Jim,
>>
>> In JLS, and hence javac, Unicode escape handling happens early on at a low level, before string escape handling. This means that it is technically possible to write string escape sequences in terms of Unicode escapes.
>>
>> I'm not suggesting you should do the same here, but you should be aware of the difference, compared to javac behavior.
>>
>> -- Jon
>>
>>
>> On 9/18/18 10:51 AM, Jim Laskey wrote:
>>> Please review the code for String::unescape. It is used to translate escape sequences in a string, typically a raw string literal, into the characters represented by those escapes.
>>>
>>> webrev: http://cr.openjdk.java.net/~jlaskey/8202442/webrev/index.html
>>> jbs: https://bugs.openjdk.java.net/browse/JDK-8202442
>>> csr: https://bugs.openjdk.java.net/browse/JDK-8202443
>>>
>>> Cheers,
>>>
>>> — Jim
>>>
>>
>