RFR: 8360192: C2: Make the type of count leading/trailing zero nodes more precise [v9]
Qizheng Xing
qxing at openjdk.org
Fri Aug 8 08:24:13 UTC 2025
On Fri, 8 Aug 2025 08:21:42 GMT, Qizheng Xing <qxing at openjdk.org> wrote:
>> The result of count leading/trailing zeros is always non-negative, and its maximum value is the size of the integer type in bits. Previously, when C2 could not determine the operand value of a CLZ/CTZ node at compile time, it gave the result a full-width integer type. This can significantly hurt the efficiency of the generated code in some cases.
>>
>> This patch makes the type of CLZ/CTZ nodes more precise, to make C2 generate better code. For example, the following implementation runs ~115% faster on x86-64 with this patch:
>>
>>
>>     public static int numberOfNibbles(int i) {
>>         int mag = Integer.SIZE - Integer.numberOfLeadingZeros(i);
>>         return Math.max((mag + 3) / 4, 1);
>>     }
>>
>>
>> Testing: tier1, IR test
>
> Qizheng Xing has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add microbench
> - Add missing test method declarations
Hi @jatin-bhateja, I've added a micro benchmark that includes the `numberOfNibbles` implementation from this PR description and your micro kernel.
Here are my test results on an Intel(R) Xeon(R) Platinum:
# Baseline:
Benchmark                                  Mode  Cnt     Score   Error  Units
CountLeadingZeros.benchClzLongConstrained  avgt   15  1517.888 ± 5.691  ns/op
CountLeadingZeros.benchNumberOfNibbles     avgt   15  1094.422 ± 1.753  ns/op
# This patch:
Benchmark                                  Mode  Cnt     Score   Error  Units
CountLeadingZeros.benchClzLongConstrained  avgt   15     0.948 ± 0.002  ns/op
CountLeadingZeros.benchNumberOfNibbles     avgt   15   942.438 ± 1.742  ns/op
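
For readers following along, the value range behind `benchNumberOfNibbles` can be illustrated in plain Java. This is only a sketch, not the microbenchmark or the IR test from the PR: the class name `ClzRangeSketch` and the helper `clzInRange` are hypothetical, and `numberOfNibbles` is copied from the PR description. Since `Integer.numberOfLeadingZeros` always returns a value in [0, 32], `mag` is also in [0, 32] and `mag + 3` is never negative, which is presumably the kind of fact a tighter CLZ type lets C2 exploit when compiling the division.

```java
public class ClzRangeSketch {
    // Copied from the PR description.
    public static int numberOfNibbles(int i) {
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(i);
        return Math.max((mag + 3) / 4, 1);
    }

    // Hypothetical helper: checks that the CLZ result lies in [0, Integer.SIZE].
    public static boolean clzInRange(int i) {
        int clz = Integer.numberOfLeadingZeros(i);
        return clz >= 0 && clz <= Integer.SIZE;
    }

    public static void main(String[] args) {
        // A few boundary inputs; these are spot checks, not an exhaustive proof.
        int[] samples = {0, 1, 15, 16, 255, 256,
                         Integer.MAX_VALUE, Integer.MIN_VALUE, -1};
        for (int i : samples) {
            System.out.printf("i=%d clz=%d inRange=%b nibbles=%d%n",
                    i, Integer.numberOfLeadingZeros(i),
                    clzInRange(i), numberOfNibbles(i));
        }
    }
}
```

The samples only spot-check the bounds; the patch itself establishes them in C2's type system rather than by testing inputs.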
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25928#issuecomment-3166981089