Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

Jim Laskey james.laskey at oracle.com
Tue Sep 7 12:29:39 UTC 2021


Bug submitted on your behalf.

https://bugs.openjdk.java.net/browse/JDK-8273430

> On Sep 6, 2021, at 4:16 AM, Andrey Turbanov <turbanoff at gmail.com> wrote:
> 
> Hello.
> I found a suspicious condition in the method
> java.util.regex.Grapheme#isExcludedSpacingMark.
> It was detected by the IntelliJ IDEA inspection 'Condition is covered by
> further condition':
> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/regex/Grapheme.java#L157
> 
> ```
> private static boolean isExcludedSpacingMark(int cp) {
>   return  cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
>           cp >= 0x1062 && cp <= 0x1064 ||
>           cp >= 0x1062 && cp <= 0x106D ||  // <== here is the warning
>           cp == 0x1083 ||
>           cp >= 0x1087 && cp <= 0x108C ||
>           cp == 0x108F ||
>           cp >= 0x109A && cp <= 0x109C ||
>           cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
>           cp == 0xAA7B || cp == 0xAA7D;
> }
> ```
> There are 2 sub-conditions in this complex condition:
> cp >= 0x1062 && cp <= 0x1064 ||
> cp >= 0x1062 && cp <= 0x106D ||
> 
> The second condition is _wider_ than the first one, so the first one can
> never change the result. I believe it's a bug. The second condition
> (according to https://www.compart.com/en/unicode/category/Mc) should look
> like this:
> 
> cp >= 0x1067 && cp <= 0x106D ||
> 
> 0x1065 and 0x1066 are not in the Spacing_Mark category.
> 
> 
> Andrey Turbanov
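
For anyone who wants to check the proposed change locally, here is a minimal
sketch (not the JDK source). It copies the quoted condition into a
hypothetical class, ExcludedSpacingMarkCheck, next to the variant suggested
above, and lists the code points on which the two disagree. The class name
and the helpers original, proposed and differences are made up for
illustration. Character.getType is the standard API used to look up the
general category, and its answer depends on the Unicode version of the JDK
you run it on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of the JDK.
final class ExcludedSpacingMarkCheck {

    // The condition as quoted above: the range 0x1062..0x106D
    // already covers 0x1062..0x1064.
    static boolean original(int cp) {
        return cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
               (cp >= 0x1062 && cp <= 0x1064) ||
               (cp >= 0x1062 && cp <= 0x106D) ||
               cp == 0x1083 ||
               (cp >= 0x1087 && cp <= 0x108C) ||
               cp == 0x108F ||
               (cp >= 0x109A && cp <= 0x109C) ||
               cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
               cp == 0xAA7B || cp == 0xAA7D;
    }

    // The variant suggested in the report: the second range starts at 0x1067.
    static boolean proposed(int cp) {
        return cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
               (cp >= 0x1062 && cp <= 0x1064) ||
               (cp >= 0x1067 && cp <= 0x106D) ||
               cp == 0x1083 ||
               (cp >= 0x1087 && cp <= 0x108C) ||
               cp == 0x108F ||
               (cp >= 0x109A && cp <= 0x109C) ||
               cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
               cp == 0xAA7B || cp == 0xAA7D;
    }

    // Every code point on which the two predicates give different answers.
    static List<Integer> differences() {
        List<Integer> out = new ArrayList<>();
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            if (original(cp) != proposed(cp)) {
                out.add(cp);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (int cp : differences()) {
            // COMBINING_SPACING_MARK is the Java constant for category Mc.
            boolean isMc =
                Character.getType(cp) == Character.COMBINING_SPACING_MARK;
            System.out.printf("U+%04X  spacing mark (Mc): %b%n", cp, isMc);
        }
    }
}
```

The two predicates differ only on 0x1065 and 0x1066. If the report is right,
neither of them should be shown as a spacing mark.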



More information about the core-libs-dev mailing list