RFR: 8360192: C2: Make the type of count leading/trailing zero nodes more precise [v9]

Qizheng Xing qxing at openjdk.org
Fri Aug 8 08:24:13 UTC 2025


On Fri, 8 Aug 2025 08:21:42 GMT, Qizheng Xing <qxing at openjdk.org> wrote:

>> The result of a count leading/trailing zeros operation is always non-negative, and its maximum value is the integer type's size in bits. Previously, when C2 could not determine the operand value of a CLZ/CTZ node at compile time, it generated a full-width integer type for the result. This can significantly hurt the efficiency of the generated code in some cases.
>> 
>> This patch makes the type of CLZ/CTZ nodes more precise, to make C2 generate better code. For example, the following implementation runs ~115% faster on x86-64 with this patch:
>> 
>> 
>> public static int numberOfNibbles(int i) {
>>   int mag = Integer.SIZE - Integer.numberOfLeadingZeros(i);
>>   return Math.max((mag + 3) / 4, 1);
>> }
>> 
>> 
>> Testing: tier1, IR test
>
> Qizheng Xing has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add microbench
>  - Add missing test method declarations

Hi @jatin-bhateja, I've added a micro benchmark that includes the `numberOfNibbles` implementation from this PR description and your micro kernel.
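
For readers following along, the value-range reasoning behind the patch can be checked in plain Java. This is only a sketch: `numberOfNibbles` is copied from the PR description, while the class name `ClzRangeSketch` and the helper `boundsHold` are hypothetical names introduced for illustration and are not part of the patch or the benchmark. It shows that CLZ/CTZ results stay within [0, Integer.SIZE], so `mag + 3` is never negative and dividing it by 4 gives the same result as a plain right shift by 2. Which simplifications C2 actually applies with the narrower type is not stated here; the shift equivalence is just one plausible example.

```java
public class ClzRangeSketch {
    // Copied from the PR description.
    public static int numberOfNibbles(int i) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(i);
        return Math.max((mag + 3) / 4, 1);
    }

    // Hypothetical helper (not from the patch): checks the value ranges
    // that the more precise CLZ/CTZ node type is meant to capture.
    public static boolean boundsHold(int i) {
        int clz = Integer.numberOfLeadingZeros(i);
        int ctz = Integer.numberOfTrailingZeros(i);
        int mag = Integer.SIZE - clz;
        return clz >= 0 && clz <= Integer.SIZE
            && ctz >= 0 && ctz <= Integer.SIZE
            && mag >= 0 && mag <= Integer.SIZE
            // With a non-negative dividend, signed division by 4
            // equals an arithmetic right shift by 2.
            && (mag + 3) / 4 == (mag + 3) >> 2;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 0xF, 0x10, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            System.out.println(s + " -> nibbles=" + numberOfNibbles(s)
                + ", boundsHold=" + boundsHold(s));
        }
    }
}
```

Without range information on the CLZ result, a compiler has to assume `mag + 3` may be negative and keep the rounding fix-up that signed division requires.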

Here are my test results on an Intel(R) Xeon(R) Platinum:


# Baseline:
Benchmark                                  Mode  Cnt     Score   Error  Units
CountLeadingZeros.benchClzLongConstrained  avgt   15  1517.888 ± 5.691  ns/op
CountLeadingZeros.benchNumberOfNibbles     avgt   15  1094.422 ± 1.753  ns/op

# This patch:
Benchmark                                  Mode  Cnt    Score   Error  Units
CountLeadingZeros.benchClzLongConstrained  avgt   15    0.948 ± 0.002  ns/op
CountLeadingZeros.benchNumberOfNibbles     avgt   15  942.438 ± 1.742  ns/op
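
To put the two tables side by side, the ratios can be computed directly from the reported average times. This is a small sketch using only the scores listed above; the class name `SpeedupSketch` and the method `speedup` are hypothetical and exist only for this illustration. The ratios come out to roughly 1.16x for `benchNumberOfNibbles` and roughly 1600x for `benchClzLongConstrained`.

```java
public class SpeedupSketch {
    // Ratio of baseline time to patched time; values above 1.0 mean the patch is faster.
    public static double speedup(double baselineNsPerOp, double patchedNsPerOp) {
        return baselineNsPerOp / patchedNsPerOp;
    }

    public static void main(String[] args) {
        // Scores (ns/op) taken from the benchmark tables in this message.
        System.out.printf("benchClzLongConstrained: %.1fx%n", speedup(1517.888, 0.948));
        System.out.printf("benchNumberOfNibbles:    %.2fx%n", speedup(1094.422, 942.438));
    }
}
```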

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25928#issuecomment-3166981089

