RFR: JDK-8263261 Extend String::translateEscapes to support unicode escapes [v12]
Jim Laskey
jlaskey at openjdk.org
Fri Jan 26 17:36:56 UTC 2024
On Fri, 26 Jan 2024 16:54:14 GMT, Roger Riggs <rriggs at openjdk.org> wrote:
>> Jim Laskey has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 12 additional commits since the last revision:
>>
>> - Merge remote-tracking branch 'upstream/master' into 8263261
>> - Update unicode to Unicode
>> - Requested changes
>> - Update String.java
>> - Requested changes
>> - Update Copyright
>> - Update copyright year of test
>> - Add JLS Unicode Escapes reference
>> - Update comment
>> - Update copyright year
>> - ... and 2 more: https://git.openjdk.org/jdk/compare/af9bfd62...040bda82
>
> src/java.base/share/classes/java/lang/String.java line 4229:
>
>> 4227: * <th scope="row">{@code \u005Cu...uXXXX}</th>
>> 4228: * <td>Unicode escape</td>
>> 4229: * <td>single UTF-16 code unit equivalent</td>
>
> The `...` makes it less clear what is being shown. It might be clearer to include the XXXX in the resulting value and drop the multiple `u` case.
Changed
> src/java.base/share/classes/java/lang/String.java line 4245:
>
>> 4243: * escape sequences and Unicode escapes are translated as encountered in one pass and
>> 4244: * <strong>not</strong> done as a Unicode escapes pass followed by an escape sequences
>> 4245: * pass.
>
> I would move the description of the compiler behavior to the end and remove "also". For example,
> Suggestion:
>
> * @implNote As a convenience for use with constructed
> * strings, this method translates Unicode escapes. For example, this
> * method could be used when ASCII encoded text files need to maintain Unicode
> * content. The translation is done in a single pass and is non-recursive. That is,
> * escape sequences and Unicode escapes are translated as encountered in one pass and
> * <strong>not</strong> done as a Unicode escapes pass followed by an escape sequences
> * pass. By comparison, the compiler translates all Unicode escapes before string
> * literals are translated.
Changed
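
To make the single-pass, non-recursive wording concrete, here is a minimal sketch. It is not the code in this PR: `SinglePassEscapes` and `translate` are hypothetical names, and only a small subset of escape sequences is handled. The point is that the character produced by a Unicode escape is appended to the output and never re-scanned, so the input `\u005Cn` yields a backslash followed by `n`, not a newline. The compiler's two-pass treatment of the same source text would yield a newline.

```java
// Hypothetical illustration only; not the JDK implementation of String::translateEscapes.
final class SinglePassEscapes {
    private SinglePassEscapes() {}

    // Translates a subset of escape sequences plus Unicode escapes in one pass.
    static String translate(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i++);
            if (c != '\\') {
                out.append(c);               // ordinary character, copied as-is
                continue;
            }
            if (i >= s.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = s.charAt(i++);
            switch (e) {
                case 'n':
                    out.append('\n');
                    break;
                case 't':
                    out.append('\t');
                    break;
                case '\\':
                case '"':
                case '\'':
                    out.append(e);
                    break;
                case 'u': {
                    // Assumption: any number of 'u' characters may follow the backslash.
                    while (i < s.length() && s.charAt(i) == 'u') {
                        i++;
                    }
                    if (i + 4 > s.length()) {
                        throw new IllegalArgumentException("truncated Unicode escape");
                    }
                    int code = 0;
                    for (int k = 0; k < 4; k++) {
                        int d = Character.digit(s.charAt(i++), 16);
                        if (d < 0) {
                            throw new IllegalArgumentException("bad digit in Unicode escape");
                        }
                        code = (code << 4) | d;
                    }
                    // The decoded code unit goes straight to the output and is
                    // never examined again, which is what makes this non-recursive.
                    out.append((char) code);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported escape: \\" + e);
            }
        }
        return out.toString();
    }
}
```

With this sketch, `SinglePassEscapes.translate("\\u005Cn")` returns the two-character string consisting of a backslash and `n`, while `SinglePassEscapes.translate("\\uuuuu2022")` returns the single bullet character U+2022.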
> test/jdk/java/lang/String/TranslateEscapes.java line 97:
>
>> 95: verifyUnicodeEscape("\\u2022", "\u2022");
>> 96: verifyUnicodeEscape("\\ud83c\\udf09", "\ud83c\udf09");
>> 97: verifyUnicodeEscape("\\uuuuu2022", "\uuuuu2022");
>
> Include the code from the example as a test case too.
None present. Was a mis-paste.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467926349
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467926483
PR Review Comment: https://git.openjdk.org/jdk/pull/17491#discussion_r1467929023