RFR: 8347405: MergeStores with reverse bytes order value [v11]
Emanuel Peter
epeter at openjdk.org
Wed Feb 26 06:31:58 UTC 2025
On Tue, 28 Jan 2025 03:09:28 GMT, kuaiwei <duke at openjdk.org> wrote:
>> This patch enhance MergeStores optimization to support merge value with reverse byte order.
>>
>> Below is benchmark result before and after the patch:
>>
>> On aliyun g8y (aarch64)
>> |name | before | score2 | ratio |
>> |---|---|---|---|
>> |MergeStoreBench.setCharBS |5669.655000 |5669.566000 | 0.00 %|
>> |MergeStoreBench.setCharBV |5516.911000 |5516.273000 | 0.01 %|
>> |MergeStoreBench.setCharC |5578.644000 |5552.809000 | 0.47 %|
>> |MergeStoreBench.setCharLS |5782.140000 |5779.264000 | 0.05 %|
>> |MergeStoreBench.setCharLV |5496.403000 |5499.195000 | -0.05 %|
>> |MergeStoreBench.setIntB |6087.703000 |2768.385000 | 119.90 %|
>> |MergeStoreBench.setIntBU |6733.813000 |2950.240000 | 128.25 %|
>> |MergeStoreBench.setIntBV |1362.233000 |1361.821000 | 0.03 %|
>> |MergeStoreBench.setIntL |2834.785000 |2833.042000 | 0.06 %|
>> |MergeStoreBench.setIntLU |2947.145000 |2946.874000 | 0.01 %|
>> |MergeStoreBench.setIntLV |5506.791000 |5506.229000 | 0.01 %|
>> |MergeStoreBench.setIntRB |7634.279000 |5611.058000 | 36.06 %|
>> |MergeStoreBench.setIntRBU |7766.737000 |5551.281000 | 39.91 %|
>> |MergeStoreBench.setIntRL |5689.793000 |5689.385000 | 0.01 %|
>> |MergeStoreBench.setIntRLU |5628.287000 |5628.789000 | -0.01 %|
>> |MergeStoreBench.setIntRU |5536.039000 |5534.910000 | 0.02 %|
>> |MergeStoreBench.setIntU |5595.363000 |5567.810000 | 0.49 %|
>> |MergeStoreBench.setLongB |13722.671000 |6811.098000 | 101.48 %|
>> |MergeStoreBench.setLongBU |13728.844000 |4280.240000 | 220.75 %|
>> |MergeStoreBench.setLongBV |2785.255000 |2785.949000 | -0.02 %|
>> |MergeStoreBench.setLongL |5714.615000 |5710.402000 | 0.07 %|
>> |MergeStoreBench.setLongLU |4128.746000 |4129.324000 | -0.01 %|
>> |MergeStoreBench.setLongLV |2793.125000 |2794.438000 | -0.05 %|
>> |MergeStoreBench.setLongRB |14465.223000 |7015.050000 | 106.20 %|
>> |MergeStoreBench.setLongRBU |14546.954000 |6173.210000 | 135.65 %|
>> |MergeStoreBench.setLongRL |6816.145000 |6813.348000 | 0.04 %|
>> |MergeStoreBench.setLongRLU |4289.445000 |4284.239000 | 0.12 %|
>> |MergeStoreBench.setLongRU |3132.471000 |3133.093000 | -0.02 %|
>> |MergeStoreBench.setLongU |3086.779000 |3087.298000 | -0.02 %|
>>
>> AMD EPYC 9T24
>> ...
>
> kuaiwei has updated the pull request incrementally with three additional commits since the last revision:
>
> - Allow ValueOrder::Reverse on big-endian platforms
> - Revert "Merge more stores"
>
> This reverts commit 1e1113ed02ec5a9fe181f215d5667e8de487fe47.
> - Revert "Fix test502aBE"
>
> This reverts commit f773fa368577c4f67957c4d40968c5c45e3ae205.
@kuaiwei sorry for the long delay. Thanks for your patience, and more importantly your good work here!
I have 2 nit-picks, but they are not very important. I'll run some internal testing now.
src/hotspot/share/opto/memnode.cpp line 3020:
> 3018: ValueOrder input_value_order = find_adjacent_input_value_order(n1, n2, memory_size);
> 3019:
> 3020: if (input_value_order == ValueOrder::NotAdjacent) {
You check `input_value_order` against various cases here. Maybe a `switch` could be a good alternative.
src/hotspot/share/opto/memnode.cpp line 3028:
> 3026: !Matcher::match_rule_supported(Op_ReverseBytesI) ||
> 3027: !Matcher::match_rule_supported(Op_ReverseBytesL) ||
> 3028: !Matcher::match_rule_supported(Op_ReverseBytesS)
Nit-pick: please order it from small to large: S -> I -> L
Suggestion:
!Matcher::match_rule_supported(Op_ReverseBytesS) ||
!Matcher::match_rule_supported(Op_ReverseBytesI) ||
!Matcher::match_rule_supported(Op_ReverseBytesL)
test/hotspot/jtreg/compiler/c2/TestMergeStores.java line 727:
> 725: applyIf = {"UseUnalignedAccesses", "true"},
> 726: applyIfPlatform = {"little-endian", "true"})
> 727: @IR(counts = {IRNode.STORE_B_OF_CLASS, "byte\\\\[int:>=0] \\\\(java/lang/Cloneable,java/io/Serializable\\\\)", "8",
You may want to add the bug number here at the top of the file.
-------------
PR Review: https://git.openjdk.org/jdk/pull/23030#pullrequestreview-2643204863
PR Review Comment: https://git.openjdk.org/jdk/pull/23030#discussion_r1970974802
PR Review Comment: https://git.openjdk.org/jdk/pull/23030#discussion_r1970973927
PR Review Comment: https://git.openjdk.org/jdk/pull/23030#discussion_r1970984644
More information about the hotspot-compiler-dev
mailing list