RFR: 8365205: C2: Optimize popcount value computation using knownbits [v9]

Emanuel Peter epeter at openjdk.org
Fri Sep 19 09:41:00 UTC 2025


On Fri, 19 Sep 2025 08:23:50 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> This patch optimizes PopCount value transforms using KnownBits information.
>> Following are the results of the micro-benchmark included with the patch
>> 
>> 
>> 
>> System: 13th Gen Intel(R) Core(TM) i3-1315U
>> 
>> Baseline:
>> Benchmark                                      Mode  Cnt       Score   Error  Units
>> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  215460.670          ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  294014.826          ops/s
>> 
>> Withopt:
>> Benchmark                                      Mode  Cnt       Score   Error  Units
>> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  389978.082          ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  417261.583          ops/s
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments resolutions

As I'm about to board my plane for a 3-week vacation, I'm leaving a last little **note for the reviewers**.

I think this is a really nice addition, so thanks for doing it @jatin-bhateja 😊 . Though it will only reach its full potential once we implement more "basic" KnownBits optimizations such as [JDK-8367341](https://bugs.openjdk.org/browse/JDK-8367341).

Please make sure you **test** it, and make sure the random values generated with the Generators in `test/hotspot/jtreg/compiler/intrinsics/TestPopCountValueTransforms.java` make sense. Currently, there is for example a 32 bit range for a 64 bit long value, which is not correct, I think.

By default, my recommendation is to **not** constrain the Generators ranges, unless there is a really good reason. Generators are already built to produce values close to zero at an over-proportional rate. But by not restricting we may at some point also hit cases that we did not anticipate, and catch bugs that way.

test/hotspot/jtreg/compiler/intrinsics/TestPopCountValueTransforms.java line 54:

> 52:     static final long rand_bndL2 = G.longs().next();
> 53:     static final long rand_popcL1 = G.uniformLongs(0, 32).next();
> 54:     static final long rand_popcL2 = G.uniformLongs(0, 32).next();

Why did you limit the range for longs to 32? Can it not go up to 64?
I asked for an explanation (in a code comment) of those that you restrict here, which you have not done, and just "resolved" it instead:
https://github.com/openjdk/jdk/pull/27075#discussion_r2351166568

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27075#pullrequestreview-3244008016
PR Review Comment: https://git.openjdk.org/jdk/pull/27075#discussion_r2362301238


More information about the hotspot-compiler-dev mailing list