RFR: 8360192: C2: Make the type of count leading/trailing zero nodes more precise [v10]

Tue Aug 19 14:05:45 UTC 2025

On Tue, 19 Aug 2025 01:34:03 GMT, Qizheng Xing <qxing at openjdk.org> wrote:

>> The result of count leading/trailing zeros is always non-negative, and its maximum value is the integer type's size in bits. Previously, when C2 could not determine the operand value of a CLZ/CTZ node at compile time, it gave the result a full-width integer type. This can significantly hurt the efficiency of the generated code in some cases.
>> 
>> This patch makes the type of CLZ/CTZ nodes more precise so that C2 generates better code. For example, the following implementation runs ~115% faster on x86-64 with this patch:
>> 
>> 
>> public static int numberOfNibbles(int i) {
>>   int mag = Integer.SIZE - Integer.numberOfLeadingZeros(i);
>>   return Math.max((mag + 3) / 4, 1);
>> }
>> 
>> 
>> Testing: tier1, IR test
>
> Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove redundant `@require` in IR test

Looks like a good patch, thanks for the work and patience with the review - it's been a bit slow over summer with vacation/travel.

src/hotspot/share/opto/countbitsnode.cpp line 47:

> 45:     if (x >> 30 == 0) { n +=  2; x <<=  2; }
> 46:     n -= x >> 31;
> 47:     return TypeInt::make(n);

Is there already a test that covers all the cases that constant-fold here? I just want to make sure we do not get regressions.
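
To illustrate the kind of coverage meant here, the following is a minimal plain-Java sketch, not a jtreg IR test. It assumes the C++ helper follows the usual binary-search CLZ; only the last two steps are visible in the quoted lines, so the earlier steps and the class and method names are hypothetical. It compares that port against `Integer.numberOfLeadingZeros` on the boundary values where each branch flips.

```java
final class ClzConstantFoldCheck {
    // Java port of a binary-search count-leading-zeros. Only the last two
    // steps appear in the quoted snippet; the earlier steps are an assumption.
    static int clzReference(int x) {
        if (x == 0) return 32;
        int n = 1;
        if (x >>> 16 == 0) { n += 16; x <<= 16; }
        if (x >>> 24 == 0) { n +=  8; x <<=  8; }
        if (x >>> 28 == 0) { n +=  4; x <<=  4; }
        if (x >>> 30 == 0) { n +=  2; x <<=  2; }
        n -= x >>> 31;
        return n;
    }

    // Returns true if the port agrees with the JDK method on 0, -1, every
    // single-bit value, and every run of low one bits.
    static boolean agreesOnBoundaries() {
        if (clzReference(0) != Integer.numberOfLeadingZeros(0)) return false;
        if (clzReference(-1) != Integer.numberOfLeadingZeros(-1)) return false;
        for (int bit = 0; bit < 32; bit++) {
            int single = 1 << bit;
            int lowOnes = single | (single - 1);
            if (clzReference(single) != Integer.numberOfLeadingZeros(single)) return false;
            if (clzReference(lowOnes) != Integer.numberOfLeadingZeros(lowOnes)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesOnBoundaries());
    }
}
```

A real regression test would pass these values as compile-time constants so that C2 actually folds them.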

src/hotspot/share/opto/countbitsnode.cpp line 57:

> 55:   const TypeInt* ti = t->is_int();
> 56:   return TypeInt::make(count_leading_zeros_int(~ti->_bits._zeros),
> 57:                        count_leading_zeros_int(ti->_bits._ones),

I think this is correct, but I would like to see a short comment explaining why it is correct.
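
One way to phrase the argument, assuming `_zeros` is the mask of bits known to be 0 and `_ones` the mask of bits known to be 1: the largest unsigned value consistent with the known bits is `~_zeros`, which gives the fewest leading zeros, and the smallest is `_ones`, which gives the most. The sketch below checks that reasoning by enumerating every value consistent with a pair of masks. The class and method names are hypothetical, and this is plain Java rather than the C2 code.

```java
final class ClzKnownBitsSketch {
    // Fewest leading zeros: every bit not known to be 0 is set.
    static int minClz(int zeros) { return Integer.numberOfLeadingZeros(~zeros); }

    // Most leading zeros: only the bits known to be 1 are set.
    static int maxClz(int ones) { return Integer.numberOfLeadingZeros(ones); }

    // Enumerates every value consistent with the masks and checks that its
    // leading-zero count lies in [minClz, maxClz] and that both ends occur.
    static boolean boundsHoldAndAreTight(int zeros, int ones) {
        if ((zeros & ones) != 0) {
            throw new IllegalArgumentException("a bit cannot be known 0 and known 1");
        }
        int unknown = ~(zeros | ones);
        if (Integer.bitCount(unknown) > 20) {
            throw new IllegalArgumentException("too many unknown bits to enumerate");
        }
        int lo = minClz(zeros);
        int hi = maxClz(ones);
        int seenMin = 32;
        int seenMax = 0;
        int sub = unknown;
        while (true) {
            // ones | sub is one concrete value allowed by the known bits.
            int clz = Integer.numberOfLeadingZeros(ones | sub);
            if (clz < lo || clz > hi) return false;
            seenMin = Math.min(seenMin, clz);
            seenMax = Math.max(seenMax, clz);
            if (sub == 0) break;
            sub = (sub - 1) & unknown;  // next subset of the unknown bits
        }
        return seenMin == lo && seenMax == hi;
    }

    public static void main(String[] args) {
        // Upper 16 bits known 0, bit 8 known 1: values 0x0100..0xFFFF.
        System.out.println(boundsHoldAndAreTight(0xFFFF0000, 0x00000100));
    }
}
```

For the masks in `main`, the bounds are 16 (from `0x0000FFFF`) and 23 (from `0x00000100`).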

test/hotspot/jtreg/compiler/c2/gvn/TestCountBitsRange.java line 164:

> 162:         return Long.numberOfTrailingZeros(l) / 8;
> 163:     }
> 164: }

Nice examples! Could you please add a short description to most of them, explaining what you are testing with each? It would help me as a reviewer to see if you cover enough cases.

I'm also missing some cases with non-trivial input ranges, together with verification that the output range is correct.

You could look at this example:
https://github.com/openjdk/jdk/pull/25254/files#diff-0e3d89ac8cf0548b69d9bdb0859380bc31de0a772fa7ff211f446a4a5abd4197R220-R248
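
As a rough sketch of what such cases could look like (plain Java with hypothetical names, not taken from the linked example and not using the IR framework): each method constrains its input by masking, clamping, or setting a bit, and the check verifies that the result stays inside the range implied by that constraint.

```java
final class CountBitsRangeSketch {
    // Input in [0, 0xFFFF], so the result must be in [16, 32].
    static int clzOfLow16(int i) {
        return Integer.numberOfLeadingZeros(i & 0xFFFF);
    }

    // Input clamped to [1000, 2000], so the result must be in [21, 22].
    static int clzOfClamped(int i) {
        return Integer.numberOfLeadingZeros(Math.max(1000, Math.min(i, 2000)));
    }

    // Bit 8 is always set, so the result must be in [0, 8].
    static int ctzWithBit8Set(int i) {
        return Integer.numberOfTrailingZeros(i | 0x100);
    }

    static boolean inRange(int v, int lo, int hi) {
        return lo <= v && v <= hi;
    }

    // Runs each method over a few boundary inputs and checks the output range.
    static boolean allInRange() {
        int[] samples = {0, 1, -1, 255, 256, 999, 1000, 1500, 2000, 2001,
                         0xFFFF, 0x10000, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int i : samples) {
            if (!inRange(clzOfLow16(i), 16, 32)) return false;
            if (!inRange(clzOfClamped(i), 21, 22)) return false;
            if (!inRange(ctzWithBit8Set(i), 0, 8)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allInRange());
    }
}
```

In an actual IR test, comparisons against these bounds would be expected to fold away once the node types are precise enough.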

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25928#pullrequestreview-3132415938
PR Review Comment: https://git.openjdk.org/jdk/pull/25928#discussion_r2285354628
PR Review Comment: https://git.openjdk.org/jdk/pull/25928#discussion_r2285342030
PR Review Comment: https://git.openjdk.org/jdk/pull/25928#discussion_r2285373835