RFR: 8321514: UTF16 string gets constructed incorrectly from codepoints if CompactStrings is not enabled [v2]

Roger Riggs rriggs at openjdk.org
Tue Dec 12 19:15:29 UTC 2023


On Tue, 12 Dec 2023 10:47:48 GMT, Aleksei Voitylov <avoitylov at openjdk.org> wrote:

>> Since JDK-8311906, if CompactStrings is not enabled, index is not taken into account when StringUTF16.toBytes() calls extractCodepoints. As a result, the last elements of the source codepoints array are dropped from the resulting UTF16 string, and the failure surfaces in other places (e.g. during RegEx processing).
>>
>> The fix replaces the len parameter of extractCodepoints with end, which is index + len.
>
> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review comments
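
For readers following along, the kind of mistake described in the PR summary can be illustrated with a minimal sketch. The names copyRange, buggy, and fixed are hypothetical and are not the JDK's internal methods, whose real signatures differ; the sketch only assumes a helper that expects an exclusive end index but receives a length.

```java
import java.util.Arrays;

public class EndVsLenSketch {
    // Hypothetical stand-in for a helper that copies val[index, end).
    static int[] copyRange(int[] val, int index, int end) {
        int[] out = new int[Math.max(0, end - index)];
        for (int i = index, j = 0; i < end; i++, j++) {
            out[j] = val[i];
        }
        return out;
    }

    // Mistake: a length is passed where an exclusive end index is expected,
    // so trailing elements are lost whenever index > 0.
    static int[] buggy(int[] val, int index, int len) {
        return copyRange(val, index, len);
    }

    // Correction: convert (index, len) to an exclusive end before the call.
    static int[] fixed(int[] val, int index, int len) {
        return copyRange(val, index, index + len);
    }

    public static void main(String[] args) {
        int[] cps = { 'a', 'b', 'c', 'd', 'e', 'f' };
        System.out.println(Arrays.toString(buggy(cps, 2, 3))); // [99]
        System.out.println(Arrays.toString(fixed(cps, 2, 3))); // [99, 100, 101]
    }
}
```

With index 0 both variants agree, which is presumably why only subrange tests with a non-zero offset expose the problem.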

test/jdk/java/lang/String/Chars.java line 50:

> 48:             testChars(cc, ccExp);
> 49:             testCharsSubrange(cc, ccExp);
> 50:             testIntsSubrange(ccExp);

Please also add the same call after line 74 to apply the test to the surrogates case.
(As suggested by @rgiulietti in https://github.com/openjdk/jdk/pull/17066)
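
To make the request concrete, here is a rough sketch of what a code point subrange check that also covers surrogates might look like. It is not the code in Chars.java; the class name, the helper subrangeMatches, and the sample data are made up for illustration. It relies only on the public String(int[] codePoints, int offset, int count) constructor and StringBuilder.appendCodePoint.

```java
public class IntsSubrangeSketch {
    // Builds a String from cps[offset, offset + count) and compares it with
    // the same range appended one code point at a time.
    static boolean subrangeMatches(int[] cps, int offset, int count) {
        String actual = new String(cps, offset, count);
        StringBuilder expected = new StringBuilder();
        for (int i = offset; i < offset + count; i++) {
            expected.appendCodePoint(cps[i]);
        }
        return actual.contentEquals(expected);
    }

    public static void main(String[] args) {
        // Mix of BMP and supplementary code points; the latter are encoded
        // as surrogate pairs in the resulting String.
        int[] cps = { 'a', 0x1F600, 'b', 0x10400, 'c', 'd' };
        for (int offset = 0; offset < cps.length; offset++) {
            for (int count = 0; offset + count <= cps.length; count++) {
                if (!subrangeMatches(cps, offset, count)) {
                    throw new RuntimeException("subrange mismatch at offset="
                            + offset + ", count=" + count);
                }
            }
        }
        System.out.println("all subranges match");
    }
}
```

Running it with -XX:-CompactStrings should exercise the path this PR is about; with the default settings the check is still valid but may not reach the affected code.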

test/jdk/java/lang/String/Chars.java line 126:

> 124:                 System.err.println("expected: " + Arrays.toString(expected));
> 125:                 System.err.println("actual: " + Arrays.toString(actual));
> 126:                 throw new RuntimeException("testCharsSubrange failed!");

The message should say "testIntsSubrange" rather than "testCharsSubrange", as already noted in the dependent PR #17066.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17057#discussion_r1424466407
PR Review Comment: https://git.openjdk.org/jdk/pull/17057#discussion_r1424451093