RFR: 8347405: MergeStores with reverse bytes order value [v5]

Emanuel Peter epeter at openjdk.org
Mon Jan 20 07:36:40 UTC 2025


On Mon, 20 Jan 2025 04:17:55 GMT, kuaiwei <duke at openjdk.org> wrote:

>> This patch enhance MergeStores optimization to support merge value with reverse byte order.
>> 
>> Below is benchmark result before and after the patch:
>> 
>> On aliyun g8y (aarch64)
>> |name   | before        | score2        | ratio |
>> |---|---|---|---|
>> |MergeStoreBench.setCharBS      |5669.655000    |5669.566000    | 0.00 %|
>> |MergeStoreBench.setCharBV      |5516.911000    |5516.273000    | 0.01 %|
>> |MergeStoreBench.setCharC       |5578.644000    |5552.809000    | 0.47 %|
>> |MergeStoreBench.setCharLS      |5782.140000    |5779.264000    | 0.05 %|
>> |MergeStoreBench.setCharLV      |5496.403000    |5499.195000    | -0.05 %|
>> |MergeStoreBench.setIntB        |6087.703000    |2768.385000    | 119.90 %|
>> |MergeStoreBench.setIntBU       |6733.813000    |2950.240000    | 128.25 %|
>> |MergeStoreBench.setIntBV       |1362.233000    |1361.821000    | 0.03 %|
>> |MergeStoreBench.setIntL        |2834.785000    |2833.042000    | 0.06 %|
>> |MergeStoreBench.setIntLU       |2947.145000    |2946.874000    | 0.01 %|
>> |MergeStoreBench.setIntLV       |5506.791000    |5506.229000    | 0.01 %|
>> |MergeStoreBench.setIntRB       |7634.279000    |5611.058000    | 36.06 %|
>> |MergeStoreBench.setIntRBU      |7766.737000    |5551.281000    | 39.91 %|
>> |MergeStoreBench.setIntRL       |5689.793000    |5689.385000    | 0.01 %|
>> |MergeStoreBench.setIntRLU      |5628.287000    |5628.789000    | -0.01 %|
>> |MergeStoreBench.setIntRU       |5536.039000    |5534.910000    | 0.02 %|
>> |MergeStoreBench.setIntU        |5595.363000    |5567.810000    | 0.49 %|
>> |MergeStoreBench.setLongB       |13722.671000   |6811.098000    | 101.48 %|
>> |MergeStoreBench.setLongBU      |13728.844000   |4280.240000    | 220.75 %|
>> |MergeStoreBench.setLongBV      |2785.255000    |2785.949000    | -0.02 %|
>> |MergeStoreBench.setLongL       |5714.615000    |5710.402000    | 0.07 %|
>> |MergeStoreBench.setLongLU      |4128.746000    |4129.324000    | -0.01 %|
>> |MergeStoreBench.setLongLV      |2793.125000    |2794.438000    | -0.05 %|
>> |MergeStoreBench.setLongRB      |14465.223000   |7015.050000    | 106.20 %|
>> |MergeStoreBench.setLongRBU     |14546.954000   |6173.210000    | 135.65 %|
>> |MergeStoreBench.setLongRL      |6816.145000    |6813.348000    | 0.04 %|
>> |MergeStoreBench.setLongRLU     |4289.445000    |4284.239000    | 0.12 %|
>> |MergeStoreBench.setLongRU      |3132.471000    |3133.093000    | -0.02 %|
>> |MergeStoreBench.setLongU       |3086.779000    |3087.298000    | -0.02 %|
>> 
>> AMD EPYC 9T24
>> ...
>
> kuaiwei has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
> 
>  - Fix build failure after merging
>  - Merge remote-tracking branch 'origin/master' into pr/merge_stores_reverse
>  - Exclude RISCV for Reverse nodes
>  - Fix ppc64le
>  - Update enum ValueOrder type
>  - Fix as review comments
>  - Remove unused test option
>  - 8347405: MergeStores with reverse bytes order value

Thanks for all the updates! I have some more comments / suggestions ;)

test/hotspot/jtreg/compiler/c2/TestMergeStores.java line 803:

> 801:                    IRNode.REVERSE_BYTES_L, "1"},
> 802:         applyIf = {"UseUnalignedAccesses", "true"},
> 803:         applyIfPlatformAnd = {"little-endian", "true", "riscv64", "false"})   // Exclude RISV64 because ReverseBytesI/L are not supported

Can you please leave in the checks for the other stores (B, C, I)? I would like to have an exact count of each. This is so that we are sure that the others have disappeared as expected, not that some of them linger around and worsen performance.

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23030#pullrequestreview-2561523184
PR Review Comment: https://git.openjdk.org/jdk/pull/23030#discussion_r1921913538


More information about the hotspot-compiler-dev mailing list