RFR: 8365205: C2: Optimize popcount value computation using knownbits [v9]
Emanuel Peter
epeter at openjdk.org
Fri Sep 19 09:41:00 UTC 2025
On Fri, 19 Sep 2025 08:23:50 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> This patch optimizes PopCount value transforms using KnownBits information.
>> Following are the results of the micro-benchmark included with the patch
>>
>>
>>
>> System: 13th Gen Intel(R) Core(TM) i3-1315U
>>
>> Baseline:
>> Benchmark Mode Cnt Score Error Units
>> PopCountValueTransform.LogicFoldingKerenLong thrpt 2 215460.670 ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt thrpt 2 294014.826 ops/s
>>
>> Withopt:
>> Benchmark Mode Cnt Score Error Units
>> PopCountValueTransform.LogicFoldingKerenLong thrpt 2 389978.082 ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt thrpt 2 417261.583 ops/s
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> Review comments resolutions
As I'm about to board my plane for a 3-week vacation, I'm leaving a last little **note for the reviewers**.
I think this is a really nice addition, so thanks for doing it @jatin-bhateja 😊 . Though it will only reach its full potential once we implement more "basic" KnownBits optimizations such as [JDK-8367341](https://bugs.openjdk.org/browse/JDK-8367341).
Please make sure you **test** it, and make sure the random values generated with the Generators in `test/hotspot/jtreg/compiler/intrinsics/TestPopCountValueTransforms.java` make sense. Currently, there is for example a 32 bit range for a 64 bit long value, which is not correct, I think.
By default, my recommendation is to **not** constrain the Generators ranges, unless there is a really good reason. Generators are already built to produce values close to zero at an over-proportional rate. But by not restricting we may at some point also hit cases that we did not anticipate, and catch bugs that way.
test/hotspot/jtreg/compiler/intrinsics/TestPopCountValueTransforms.java line 54:
> 52: static final long rand_bndL2 = G.longs().next();
> 53: static final long rand_popcL1 = G.uniformLongs(0, 32).next();
> 54: static final long rand_popcL2 = G.uniformLongs(0, 32).next();
Why did you limit the range for longs to 32? Can it not go up to 64?
I asked for an explanation (in a code comment) of those that you restrict here, which you have not done, and just "resolved" it instead:
https://github.com/openjdk/jdk/pull/27075#discussion_r2351166568
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/27075#pullrequestreview-3244008016
PR Review Comment: https://git.openjdk.org/jdk/pull/27075#discussion_r2362301238
More information about the hotspot-compiler-dev
mailing list