RFR: 8365205: C2: Optimize popcount value computation using knownbits

Hannes Greule hgreule at openjdk.org
Thu Sep 4 06:29:40 UTC 2025


On Wed, 3 Sep 2025 16:10:43 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> This patch optimizes PopCount value transforms using KnownBits information.
> Following are the results of the micro-benchmark included with the patch
> 
> System: INTEL(R) XEON(R) PLATINUM 8581C CPU @ 2.30GHz (Emerald Rapids)
> 
> 
> Baseline:-
> Benchmark                                      Mode  Cnt       Score   Error  Units
> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  151997.051          ops/s
> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  161261.825          ops/s
> PopCountValueTransform.StockKernelInt         thrpt    2  194680.419          ops/s
> PopCountValueTransform.StockKernelLong        thrpt    2  216580.319          ops/s
> 
> Withopt:-
> Benchmark                                      Mode  Cnt       Score   Error  Units
> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  216502.647          ops/s
> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  193400.575          ops/s
> PopCountValueTransform.StockKernelInt         thrpt    2  195595.989          ops/s
> PopCountValueTransform.StockKernelLong        thrpt    2  217776.426          ops/s 
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin

The change looks good, but I wonder:

- if it makes sense to have some kind of IR tests (e.g., checking that the node is folded away when unneeded, when the input is a constant, ...)?
- whether the explanation could be simplified: Assuming a correct implementation of the KnownBits canonicalization, we can argue:
	- `_zeroes` has the bits set that are known to be always 0. So `BitsPer<Type> - popCount(_zeroes)` gives you an upper bound on how many bits *might* be 1. And `BitsPer<Type> - popCount(_zeroes)` is equivalent to `popCount(~_zeroes)`.
	- `_ones` has the bits set that are known to be always 1. Trivially, `popCount(_ones)` is a valid lower bound.
	- The rest repeats how `adjust_bits_from_unsigned_bounds` works, but that's not specific to the popcount nodes.
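
To illustrate the two bounds argued above, here is a minimal Java sketch. It is not the HotSpot code: the class `PopCountBounds` and its methods are hypothetical names, and the masks simply follow the `_zeroes`/`_ones` convention described in the list (a set bit in `zeroes` means "known 0", a set bit in `ones` means "known 1"). It only uses `Integer.bitCount` from the standard library.

```java
// Sketch only: class and method names are hypothetical, not part of C2.
class PopCountBounds {
    // ones: mask with a 1 for every bit known to be 1.
    // Every known-one bit contributes to the count, so this is a lower bound.
    static int lowerBound(int ones) {
        return Integer.bitCount(ones);
    }

    // zeroes: mask with a 1 for every bit known to be 0.
    // Only bits outside the mask might be 1, so this is an upper bound.
    // Equivalent to Integer.SIZE - Integer.bitCount(zeroes).
    static int upperBound(int zeroes) {
        return Integer.bitCount(~zeroes);
    }

    public static void main(String[] args) {
        int zeroes = 0xFFFF0000; // assumed: upper 16 bits known to be 0
        int ones   = 0x00000003; // assumed: low 2 bits known to be 1

        int lo = lowerBound(ones);
        int hi = upperBound(zeroes);
        System.out.println("popcount range: [" + lo + ", " + hi + "]");

        // Exhaustively check every value consistent with the known bits.
        for (int x = 0; x <= 0xFFFF; x++) {
            if ((x & ones) != ones) {
                continue; // violates the known-one bits
            }
            int count = Integer.bitCount(x);
            if (count < lo || count > hi) {
                throw new AssertionError("bound violated for x = " + x);
            }
        }
    }
}
```

With these example masks the sketch prints `popcount range: [2, 16]`, and the loop confirms that no value consistent with the masks falls outside that range.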

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27075#issuecomment-3252114288


More information about the hotspot-compiler-dev mailing list