RFR: 8248655: Support supplementary characters in String case insensitive operations
naoto.sato at oracle.com
Wed Jul 15 18:32:40 UTC 2020
Hi Joe,
Thank you for your review.
On 7/15/20 10:57 AM, Joe Wang wrote:
> Hi Naoto,
>
> In StringUTF16.java, if one is isHighSurrogate and the other not, you
> may quickly return without going through the rest of the process,
> probably not significant as cp1 and cp2 and/or u1 and u2 won't be equal
> anyways. But it could skip a couple of
> toCodePoint/toUpperCase/toLowerCase calls.
Yes, that is correct as of now, but it relies on the assumption that
case mappings never cross the boundary between the BMP and the
supplementary planes. I could not find any statement that guarantees
this one way or the other, so I kept the full comparison to be safe.
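
For anyone who wants to probe that assumption on a particular JDK, a
rough sketch follows. The class and method names are made up for
illustration and are not part of the webrev. It only looks at the simple
one-to-one mappings exposed by Character.toUpperCase(int) and
Character.toLowerCase(int), and the count it prints depends on the
Unicode data shipped with the JDK it runs on, so it shows what holds
today rather than what a future Unicode version guarantees.

```java
public class CaseMappingBoundaryScan {

    // Hypothetical helper: true if the simple upper- or lower-case mapping
    // of cp lands on the other side of the BMP/supplementary boundary.
    static boolean crossesBmpBoundary(int cp) {
        boolean bmp = Character.isBmpCodePoint(cp);
        return bmp != Character.isBmpCodePoint(Character.toUpperCase(cp))
            || bmp != Character.isBmpCodePoint(Character.toLowerCase(cp));
    }

    // Scans every code point and counts the ones whose mapping crosses.
    static int countCrossings() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (crossesBmpBoundary(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Case mappings crossing the BMP boundary: " + countCrossings());
    }
}
```

A count of zero would mean the early return is safe with the Unicode
data in that JDK. It would not show that the property is guaranteed.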
Naoto
>
> -Joe
>
> On 7/15/20 9:00 AM, naoto.sato at oracle.com wrote:
>> Hello,
>>
>> Please review the fix to the following issues:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8248655
>> https://bugs.openjdk.java.net/browse/JDK-8248434
>>
>> The proposed changeset and its CSR are located at:
>>
>> https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8248664
>>
>> A bug was filed against SimpleDateFormat (8248434) where
>> case-insensitive date format/parse failed in some of the new locales
>> in JDK15. The root cause was that case-insensitive
>> String.regionMatches() method did not work with supplementary
>> characters. The problem is that the method's spec does not account
>> for case mappings of supplementary characters, possibly because they
>> were overlooked in the original JSR 204, "Unicode Supplementary
>> Character support". Similar behavior is observed in the other two
>> case-insensitive methods, i.e., compareToIgnoreCase() and
>> equalsIgnoreCase().
>>
>> The fix is straightforward: compare strings on a code point basis
>> instead of a code unit (16-bit "char") basis. Technically this change
>> introduces a backward incompatibility, but I believe it is an
>> incompatibility with incorrect behavior that was never true to the
>> intent of those methods.
>>
>> Naoto
>
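
To illustrate the difference described in the original message quoted
above, here is a minimal sketch that contrasts a code-unit comparison
with a code-point comparison. It is not the code from the webrev. The
class and method names are hypothetical, and the loops only approximate
the upper-case-then-lower-case check that the String spec describes. The
sample pair is DESERET CAPITAL LETTER LONG I (U+10400) and DESERET SMALL
LETTER LONG I (U+10428).

```java
public class CodePointCompareSketch {

    // Compares one UTF-16 code unit at a time. Surrogates have no case
    // mapping of their own, so supplementary letters never match.
    static boolean equalsIgnoreCaseByCodeUnit(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            char c1 = a.charAt(i);
            char c2 = b.charAt(i);
            if (c1 == c2) continue;
            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) continue;
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) continue;
            return false;
        }
        return true;
    }

    // Compares one code point at a time, so a surrogate pair is decoded
    // before the case mapping is applied.
    static boolean equalsIgnoreCaseByCodePoint(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int cp1 = a.codePointAt(i);
            int cp2 = b.codePointAt(j);
            i += Character.charCount(cp1);
            j += Character.charCount(cp2);
            if (cp1 == cp2) continue;
            int u1 = Character.toUpperCase(cp1);
            int u2 = Character.toUpperCase(cp2);
            if (u1 == u2) continue;
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) continue;
            return false;
        }
        // Equal only if both strings were consumed completely.
        return i == a.length() && j == b.length();
    }

    public static void main(String[] args) {
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));
        System.out.println(equalsIgnoreCaseByCodeUnit(upper, lower));  // false
        System.out.println(equalsIgnoreCaseByCodePoint(upper, lower)); // true
    }
}
```

The code-unit version fails because the two strings differ only in
their low surrogates (U+DC00 and U+DC28), and Character.toUpperCase(char)
leaves a surrogate unchanged.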