RFR: 8347645: C2: XOR bounded value handling blocks constant folding [v3]
Johannes Graham
duke at openjdk.org
Wed Jan 22 20:12:49 UTC 2025
On Wed, 22 Jan 2025 17:31:42 GMT, Johannes Graham <duke at openjdk.org> wrote:
>> C2 does not eliminate XOR nodes with constant arguments. This has a noticeable effect on `Long.expand` with a constant mask, on architectures that don't have instructions equivalent to `PDEP` to be used in an intrinsic.
>>
>> This patch demonstrates a potential fix to the problem, but there might well be better ways to do it.
>
> Johannes Graham has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
>
> - Merge branch 'openjdk:master' into xor_const
> - move tests, add correctness checks
> - Merge remote-tracking branch 'origin/xor_const' into xor_const
> - move logic from Value to add_node XorL
> - move logic from Value to add_node XorI
> - format, copyright date
> - xor const folding
> - test for xor const folding
> - move logic from Value to add_node XorL
> - move logic from Value to add_node XorI
> - ... and 3 more: https://git.openjdk.org/jdk/compare/b720517c...fdede511
Some rough performance numbers with existing tests:
Aarch64 M4 Pro
Baseline:
Benchmark       (size)  Mode  Cnt  Score   Error  Units
Longs.compress     500  avgt    5  0.512 ± 0.002  us/op
Longs.expand       500  avgt    5  0.503 ± 0.014  us/op
This PR:
Benchmark       (size)  Mode  Cnt  Score   Error  Units
Longs.compress     500  avgt    5  0.420 ± 0.008  us/op
Longs.expand       500  avgt    5  0.312 ± 0.002  us/op
The improvement is sensitive to the mask value. With the mask from https://github.com/openjdk/jdk/pull/22928, the numbers are:
Baseline:
`Longs.expand2 500 avgt 15 0.494 ± 0.002 us/op`
This PR:
`Longs.expand2 500 avgt 5 0.242 ± 0.002 us/op`
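For reference, here is a minimal sketch of the kind of call these benchmarks exercise: `Long.expand` and `Long.compress` with a mask that is a compile-time constant. The class name, method names, and mask value are assumptions chosen for illustration; this is neither the benchmark source nor the mask from the PR linked above. It needs JDK 19 or later.

```java
public class ExpandConstMask {
    // Hypothetical constant mask with every even bit set.
    // This is NOT the mask used in the Longs benchmarks.
    static final long MASK = 0x5555555555555555L;

    // Scatter the low bits of x into the bit positions set in MASK.
    static long expandConst(long x) {
        return Long.expand(x, MASK);
    }

    // Gather the bits of x selected by MASK into the low bits of the result.
    static long compressConst(long x) {
        return Long.compress(x, MASK);
    }

    public static void main(String[] args) {
        long x = 0b11L;
        long expanded = expandConst(x);
        // Bit 0 of x lands at position 0, bit 1 at position 2.
        System.out.println(Long.toBinaryString(expanded));  // prints 101
        // Compressing with the same mask recovers the original low bits.
        System.out.println(compressConst(expanded) == x);   // prints true
    }
}
```

Because the mask is a constant, the subexpressions that depend only on the mask are candidates for folding at compile time; how much actually folds depends on the compiler and the platform.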
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23089#issuecomment-2608168599