RFR: 8285255: refine StringLatin1.regionMatchesCI_UTF16 [v3]
XenoAmess
duke at openjdk.java.net
Mon Apr 25 20:16:26 UTC 2022
On Mon, 25 Apr 2022 15:10:01 GMT, Claes Redestad <redestad at openjdk.org> wrote:
> Unfortunately this leads to an error for case-insensitive `regionMatches` between a Latin-1 string that contains either `\u00b5` or `\u00ff` (the upper-case forms of these two code points lie outside the Latin-1 range) and a UTF-16 string:
>
> ```
> jshell> "\u00b5".regionMatches(true, 0, "\u0100", 0, 1)
> | Exception java.lang.ArrayIndexOutOfBoundsException: Index 924 out of bounds for length 256
> | at CharacterDataLatin1.getProperties (CharacterDataLatin1.java:74)
> | at CharacterDataLatin1.toLowerCase (CharacterDataLatin1.java:140)
> | at StringLatin1.regionMatchesCI_UTF16 (StringLatin1.java:420)
> | at String.regionMatches (String.java:2238)
> | at (#4:1)
> ```
@cl4es Yes, you are correct.
And if we change it to `CharacterDataLatin1.instance.toLowerCase(u1)`, it would significantly hurt performance.
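
For reference, a minimal sketch of why those two code points are special, using only the public `Character` and `String` APIs. The class name `Latin1CaseDemo` and the helper `upperLeavesLatin1` are made up for illustration and are not part of the patch. It scans the Latin-1 range for characters whose upper-case form is above `0xFF`, then runs the call from the jshell session above, which should return a plain boolean instead of throwing on a JDK without the regression:

```java
public class Latin1CaseDemo {
    // Hypothetical helper: true if c is a Latin-1 char whose upper-case
    // mapping (per Character.toUpperCase) falls outside the Latin-1 range.
    static boolean upperLeavesLatin1(char c) {
        return c <= 0xFF && Character.toUpperCase(c) > 0xFF;
    }

    public static void main(String[] args) {
        // Print every Latin-1 char whose upper-case form is not Latin-1.
        for (char c = 0; c <= 0xFF; c++) {
            if (upperLeavesLatin1(c)) {
                System.out.printf("U+%04X -> U+%04X%n",
                        (int) c, (int) Character.toUpperCase(c));
            }
        }
        // Same call as in the quoted jshell session.
        System.out.println("\u00b5".regionMatches(true, 0, "\u0100", 0, 1));
    }
}
```

Note that `Character.toUpperCase('\u00b5')` is U+039C, i.e. decimal 924, which is exactly the out-of-bounds index in the stack trace: the upper-cased value is being fed into a 256-entry Latin-1 table.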
-------------
PR: https://git.openjdk.java.net/jdk/pull/8308