RFR: 8347405: MergeStores with reverse bytes order value [v10]
Emanuel Peter
epeter at openjdk.org
Mon Jan 27 07:25:50 UTC 2025
On Fri, 24 Jan 2025 12:51:27 GMT, kuaiwei <duke at openjdk.org> wrote:
>> This patch enhances the MergeStores optimization to support merging values stored in reverse byte order.
>>
>> Below are the benchmark results before and after the patch:
>>
>> On aliyun g8y (aarch64)
>> |name | before | after | ratio |
>> |---|---|---|---|
>> |MergeStoreBench.setCharBS |5669.655000 |5669.566000 | 0.00 %|
>> |MergeStoreBench.setCharBV |5516.911000 |5516.273000 | 0.01 %|
>> |MergeStoreBench.setCharC |5578.644000 |5552.809000 | 0.47 %|
>> |MergeStoreBench.setCharLS |5782.140000 |5779.264000 | 0.05 %|
>> |MergeStoreBench.setCharLV |5496.403000 |5499.195000 | -0.05 %|
>> |MergeStoreBench.setIntB |6087.703000 |2768.385000 | 119.90 %|
>> |MergeStoreBench.setIntBU |6733.813000 |2950.240000 | 128.25 %|
>> |MergeStoreBench.setIntBV |1362.233000 |1361.821000 | 0.03 %|
>> |MergeStoreBench.setIntL |2834.785000 |2833.042000 | 0.06 %|
>> |MergeStoreBench.setIntLU |2947.145000 |2946.874000 | 0.01 %|
>> |MergeStoreBench.setIntLV |5506.791000 |5506.229000 | 0.01 %|
>> |MergeStoreBench.setIntRB |7634.279000 |5611.058000 | 36.06 %|
>> |MergeStoreBench.setIntRBU |7766.737000 |5551.281000 | 39.91 %|
>> |MergeStoreBench.setIntRL |5689.793000 |5689.385000 | 0.01 %|
>> |MergeStoreBench.setIntRLU |5628.287000 |5628.789000 | -0.01 %|
>> |MergeStoreBench.setIntRU |5536.039000 |5534.910000 | 0.02 %|
>> |MergeStoreBench.setIntU |5595.363000 |5567.810000 | 0.49 %|
>> |MergeStoreBench.setLongB |13722.671000 |6811.098000 | 101.48 %|
>> |MergeStoreBench.setLongBU |13728.844000 |4280.240000 | 220.75 %|
>> |MergeStoreBench.setLongBV |2785.255000 |2785.949000 | -0.02 %|
>> |MergeStoreBench.setLongL |5714.615000 |5710.402000 | 0.07 %|
>> |MergeStoreBench.setLongLU |4128.746000 |4129.324000 | -0.01 %|
>> |MergeStoreBench.setLongLV |2793.125000 |2794.438000 | -0.05 %|
>> |MergeStoreBench.setLongRB |14465.223000 |7015.050000 | 106.20 %|
>> |MergeStoreBench.setLongRBU |14546.954000 |6173.210000 | 135.65 %|
>> |MergeStoreBench.setLongRL |6816.145000 |6813.348000 | 0.04 %|
>> |MergeStoreBench.setLongRLU |4289.445000 |4284.239000 | 0.12 %|
>> |MergeStoreBench.setLongRU |3132.471000 |3133.093000 | -0.02 %|
>> |MergeStoreBench.setLongU |3086.779000 |3087.298000 | -0.02 %|
>>
>> AMD EPYC 9T24
>> ...
>
> kuaiwei has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix test502aBE
src/hotspot/share/opto/memnode.cpp line 3013:
> 3011: #else
> 3012: return shift_n1 > shift_n2 ? ValueOrder::Platform // Pattern: [n1 = base >> (shift + memory_size), n2 = base >> shift]
> 3013: : ValueOrder::NotAdjacent; // TODO: Reverse order in BE machine not tested
Drive-by comment: are you going to address your TODO in this PR? We usually don't leave TODOs in the code; if we plan to do something in the future, we file an RFE instead. TODOs just get forgotten about.
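
For readers less familiar with the pattern in the quoted comment (adjacent stores of base >> (shift + memory_size) and base >> shift), here is a minimal Java sketch of what "reverse byte order" means at the source level. The class and method names are hypothetical and are not taken from MergeStoreBench or from the patch; the sketch only shows that four byte stores written most-significant-byte first produce the same memory contents as one little-endian 4-byte store of Integer.reverseBytes(v). It does not show what C2 actually emits.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration only; not code from the PR or the benchmark.
public class ReverseOrderStoreSketch {
    // Four adjacent byte stores, most significant byte at the lowest index
    // (a big-endian layout written one byte at a time).
    static void setIntBigEndianBytewise(byte[] a, int off, int v) {
        a[off]     = (byte) (v >> 24);
        a[off + 1] = (byte) (v >> 16);
        a[off + 2] = (byte) (v >> 8);
        a[off + 3] = (byte) v;
    }

    // The same memory contents, produced by a single 4-byte little-endian
    // store of the byte-reversed value.
    static void setIntReversedThenLittleEndian(byte[] a, int off, int v) {
        ByteBuffer.wrap(a)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .putInt(off, Integer.reverseBytes(v));
    }

    public static void main(String[] args) {
        byte[] x = new byte[8];
        byte[] y = new byte[8];
        setIntBigEndianBytewise(x, 2, 0x11223344);
        setIntReversedThenLittleEndian(y, 2, 0x11223344);
        System.out.println(Arrays.equals(x, y)); // prints "true"
    }
}
```

On a big-endian platform the roles flip: the byte-wise pattern above already matches platform order, and it is the little-endian byte-wise pattern that would need the reversal. That is the case the TODO refers to.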
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23030#discussion_r1930071318