RFR: 8347645: C2: XOR bounded value handling blocks constant folding [v3]

Johannes Graham duke at openjdk.org
Wed Jan 22 20:12:49 UTC 2025


On Wed, 22 Jan 2025 17:31:42 GMT, Johannes Graham <duke at openjdk.org> wrote:

>> C2 does not eliminate XOR nodes with constant arguments. This has a noticeable effect on `Long.expand` with a constant mask, on architectures that don't have instructions equivalent to `PDEP` to be used in an intrinsic.
>> 
>> This patch demonstrates a potential fix to the problem, but there might well be better ways to do it.
>
> Johannes Graham has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
> 
>  - Merge branch 'openjdk:master' into xor_const
>  - move tests, add correctness checks
>  - Merge remote-tracking branch 'origin/xor_const' into xor_const
>  - move logic from Value to add_node XorL
>  - move logic from Value to add_node XorI
>  - format, copyright date
>  - xor const folding
>  - test for xor const folding
>  - move logic from Value to add_node XorL
>  - move logic from Value to add_node XorI
>  - ... and 3 more: https://git.openjdk.org/jdk/compare/b720517c...fdede511

Some rough performance numbers with existing tests:

AArch64 (M4 Pro)
Baseline:

Benchmark       (size)  Mode  Cnt  Score   Error  Units
Longs.compress     500  avgt    5  0.512 ± 0.002  us/op
Longs.expand       500  avgt    5  0.503 ± 0.014  us/op


This PR:

Benchmark       (size)  Mode  Cnt  Score   Error  Units
Longs.compress     500  avgt    5  0.420 ± 0.008  us/op
Longs.expand       500  avgt    5  0.312 ± 0.002  us/op


The result is sensitive to the mask value. With the mask from https://github.com/openjdk/jdk/pull/22928, the numbers are:
Baseline:
`Longs.expand2      500  avgt   15  0.494 ± 0.002  us/op`

This PR:
`Longs.expand2      500  avgt    5  0.242 ± 0.002  us/op`
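
For context, the pattern these benchmarks exercise is roughly a loop that calls `Long.expand` / `Long.compress` with a compile-time constant mask. The following is only a minimal sketch of that shape, not the actual benchmark source: the class name `ExpandConstMask`, the `MASK` value, and the helper method names are made up for illustration.

```java
public class ExpandConstMask {
    // Hypothetical constant mask, chosen only for illustration;
    // the benchmarks quoted above use their own mask values.
    static final long MASK = 0x00FF00FF00FF00FFL;

    // Deposits the low-order bits of each input at the bit positions set in MASK.
    static long[] expandAll(long[] in) {
        long[] out = new long[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Long.expand(in[i], MASK);
        }
        return out;
    }

    // Gathers the bits selected by MASK into the low-order bits of each result.
    static long[] compressAll(long[] in) {
        long[] out = new long[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Long.compress(in[i], MASK);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] in = {0x1FFL, -1L, 0L};
        long[] expanded = expandAll(in);
        long[] back = compressAll(expanded);
        for (int i = 0; i < in.length; i++) {
            System.out.println(Long.toHexString(in[i]) + " -> "
                    + Long.toHexString(expanded[i]) + " -> "
                    + Long.toHexString(back[i]));
        }
    }
}
```

Because the mask is a `static final` constant, any part of the computation that depends only on the mask is a candidate for constant folding, which is presumably why the choice of mask changes how much the patch helps.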

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23089#issuecomment-2608168599

