RFR: 8321514: UTF16 string gets constructed incorrectly from codepoints if CompactStrings is not enabled [v3]

Wed Dec 13 11:42:39 UTC 2023

On Wed, 13 Dec 2023 11:39:19 GMT, Aleksei Voitylov <avoitylov at openjdk.org> wrote:

>> Since JDK-8311906, when CompactStrings is not enabled, StringUTF16.toBytes() does not take index into account when calling extractCodepoints. As a result, the last elements of the source codepoints array are dropped from the resulting UTF16 string, and the failure surfaces in other places (e.g. during RegEx processing).
>>  
>> The fix replaces the len parameter of extractCodepoints with end, which is index + len.
>
> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review comments

1. Yes, I'm aware of -XX:-CompactStrings being used in production in server deployments (no, I didn't collect the reasons for that).
2. Yes, it was discovered as part of release testing. Also related to our support efforts for ARM32, but not just that.

If you think it's worth it, I can go through the java.lang.String tests and add a lightweight -XX:-CompactStrings run to a few of them. It's not the first time we've hit this issue.
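
For readers following along, here is a minimal sketch of the off-by-index pattern described in the quoted PR text. It is not the JDK source: the class and both helper methods are hypothetical and only model the difference between using len as the loop bound and using end = index + len.

```java
// Hypothetical model of the bug pattern; NOT the actual StringUTF16 code.
final class CodepointRangeSketch {

    // Buggy variant (assumed shape): 'len' is used as the exclusive upper
    // bound, so 'index' is ignored and trailing code points are dropped
    // whenever index > 0.
    static String encodeWithLenBound(int[] codePoints, int index, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = index; i < len; i++) {
            sb.appendCodePoint(codePoints[i]);
        }
        return sb.toString();
    }

    // Fixed variant: the exclusive upper bound is end = index + len.
    static String encodeWithEndBound(int[] codePoints, int index, int len) {
        int end = index + len;
        StringBuilder sb = new StringBuilder();
        for (int i = index; i < end; i++) {
            sb.appendCodePoint(codePoints[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is a supplementary code point, so the result needs UTF16.
        int[] cps = {'a', 'b', 'c', 0x1F600, 'e'};
        String buggy = encodeWithLenBound(cps, 2, 3);
        String fixed = encodeWithEndBound(cps, 2, 3);
        System.out.println("len bound: " + buggy.codePointCount(0, buggy.length()) + " code point(s)");
        System.out.println("end bound: " + fixed.codePointCount(0, fixed.length()) + " code point(s)");
    }
}
```

The public constructor String(int[] codePoints, int offset, int count) takes the same kind of offset and count arguments, so a regression test run with -XX:-CompactStrings could plausibly compare its result against the expected substring for a non-zero offset.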

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17057#issuecomment-1853758819