Case insensitive regexes and collators
naoto.sato at oracle.com
Fri Sep 11 16:33:57 UTC 2020
Glad to hear that you are delighted with the recent fix (JDK-8248655).
The scope of the fix is limited to the String class, so it may or may
not affect the RegEx and/or Collator case-insensitive operations you
mention. I created two issues to track your observations, and I am
happy to take a look at them.
PS. "jdk-dev" is for technical discussion related to the "JDK
Project", so I'd recommend choosing the core-libs and/or i18n-dev
mailing list for further discussion.
On 9/10/20 3:52 PM, Dai Conrad wrote:
> I was delighted to hear the longstanding problem with
> case-insensitive comparisons of strings with astral
> characters (ones outside the basic multilingual plane)
> was fixed in JDK 16 build 8. Methods equalsIgnoreCase,
> regionMatches, and compareToIgnoreCase all work
> correctly now.
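For reference, here is a minimal sketch of how the three String methods named above can be checked against one astral pair. It assumes the Deseret letters U+10400 (capital LONG I) and U+10428 (its lowercase form). The class name AstralStringCheck and the helper cp are hypothetical and do not come from the report. The printed values depend on the JDK build in use, so none are asserted here.

```java
// Hypothetical demo class; not part of the JDK or of the original report.
class AstralStringCheck {
    // Builds a String from a single code point (a surrogate pair for astral ones).
    static String cp(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String upper = cp(0x10400); // DESERET CAPITAL LETTER LONG I
        String lower = cp(0x10428); // DESERET SMALL LETTER LONG I

        // Each string is two chars long (one surrogate pair), so the
        // region length passed below is upper.length(), not 1.
        System.out.println("equalsIgnoreCase:    " + lower.equalsIgnoreCase(upper));
        System.out.println("regionMatches:       "
                + lower.regionMatches(true, 0, upper, 0, upper.length()));
        System.out.println("compareToIgnoreCase: " + lower.compareToIgnoreCase(upper));
    }
}
```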
> I had assumed this would also fix case-insensitive regular
> expressions and java.text.Collator, since I guessed they
> boiled down to a call to regionMatches somewhere under
> the covers. But this appears not to be the... case.
> For scripts Deseret, Osage, Old Hungarian, Warang Citi,
> Medefaidrin, and Adlam, for strings with upper- and
> lowercase variants of the same letter, the following
> code fails:
> Pattern pattern = Pattern.compile(lower, Pattern.CASE_INSENSITIVE);
> Matcher matcher = pattern.matcher(upper);
> Collator collator = Collator.getInstance();
> assertThat(collator.compare(lower, upper)).isEqualTo(0);
> I'm not sure why the fix didn't fix these, but it would be
> a shame to overlook them while fixing it in other places.
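A hedged sketch of a self-contained reproduction, plus a possible stopgap, might look like the following. Two details are assumptions about intent, not statements about the report. First, the Pattern documentation says CASE_INSENSITIVE alone assumes US-ASCII, so the sketch adds UNICODE_CASE. Second, a Collator's default strength is TERTIARY, which distinguishes case differences such as "a" vs "A", so the sketch sets SECONDARY. The class AstralCaseRepro and the helper equalsIgnoreCaseByCodePoint are hypothetical names. The helper uses only the per-code-point mappings in Character, so it does not depend on how String, Pattern, or Collator behave in a given build.

```java
import java.text.Collator;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the JDK or of the original report.
class AstralCaseRepro {
    // Stopgap: case-insensitive equality, one code point at a time.
    // Two code points match if they are equal, equal after toUpperCase,
    // or equal after toUpperCase followed by toLowerCase.
    static boolean equalsIgnoreCaseByCodePoint(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                int ua = Character.toUpperCase(ca);
                int ub = Character.toUpperCase(cb);
                if (ua != ub && Character.toLowerCase(ua) != Character.toLowerCase(ub)) {
                    return false;
                }
            }
            // Advance by 1 or 2 chars so surrogate pairs stay intact.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        return i == a.length() && j == b.length();
    }

    public static void main(String[] args) {
        String upper = new String(Character.toChars(0x10400)); // Deseret capital LONG I
        String lower = new String(Character.toChars(0x10428)); // Deseret small LONG I

        // Assumption: UNICODE_CASE added, since CASE_INSENSITIVE alone is US-ASCII only.
        Pattern pattern = Pattern.compile(lower,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        System.out.println("regex matches:       " + pattern.matcher(upper).matches());

        // Assumption: SECONDARY strength, since the default TERTIARY distinguishes case.
        Collator collator = Collator.getInstance();
        collator.setStrength(Collator.SECONDARY);
        System.out.println("collator compare:    " + collator.compare(lower, upper));

        System.out.println("code point fallback: "
                + equalsIgnoreCaseByCodePoint(lower, upper));
    }
}
```

The first two printed values are left unstated on purpose, since they are exactly what the report questions and may differ between builds.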