RFR: 8256488: [aarch64] Use ldpq/stpq instead of ld4/st4 for small copies in StubGenerator::copy_memory
Evgeny Astigeevich
github.com+42899633+eastig at openjdk.java.net
Tue Nov 24 14:14:58 UTC 2020
On Tue, 24 Nov 2020 13:37:05 GMT, Evgeny Astigeevich <github.com+42899633+eastig at openjdk.org> wrote:
>>> I think we also need some non-Neoverse N1 numbers. We need to keep in mind that this software runs on many implementations.
>>
>> On all modern Cortex-A cores, ldpq is either faster than ld4 or as fast as it, e.g. see the calculation for Cortex-A72 above. I cannot find any optimization guides for Ampere eMAG, ThunderX/ThunderX2 or HiSilicon TSV110 to check what latency and throughput ld4/ldpq have on them. I would appreciate it if someone could help with this. I don't expect non-Cortex implementations to differ much from Cortex.
>> The main issue with ld4 is its low throughput. The intent of ld4, as I understand it, is to load data in a form that is ready to be processed afterwards.
>>
>>> I'll have a look at some others.
>>
>> Could you please share more information about which CPUs you will check?
>
>> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-compiler-dev](mailto:hotspot-compiler-dev at openjdk.java.net):_
>>
>> On 24/11/2020 10:19, Evgeny Astigeevich wrote:
>>
>> > The microbenchmarks are ArrayCopy* microbenchmarks which are a part of OpenJDK: https://github.com/openjdk/jdk/tree/master/test/micro/org/openjdk/bench/java/lang
>>
>> Sorry, my mistake. I'll try this now.
>>
>
> Not a problem. I am new to the GitHub review process and the OpenJDK project. I am still learning things.
> Let me know if I need to run any additional benchmarks.
> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [hotspot-compiler-dev](mailto:hotspot-compiler-dev at openjdk.java.net):_
>
> On 24/11/2020 13:39, Evgeny Astigeevich wrote:
>
> > I am new to the GitHub review process and the OpenJDK project. I am still learning things.
> > Let me know if I need to run any additional benchmarks.
>
> Test output in CSV form would be nice: it's very hard to read the test
> results you provided, and CSV can make nice graphs.
Thank you for the feedback. It helped me find out how files can be attached to a PR. Usually you look for a paper-clip icon on a toolbar. Here it is a little bit unusual. :)
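
To make the ld4 remark quoted above more concrete, here is a minimal sketch in Python that models only the data movement of the two instruction pairs on a 64-byte block. The helper names (`ld4_16b`, `st4_16b`, `ldp_q`, `stp_q`) are hypothetical and chosen for illustration; this is not the StubGenerator code, and it says nothing about latency or throughput. It only shows that ld4 de-interleaves the data into four registers (useful when the data is processed afterwards), whereas for a plain copy the de-interleaving is immediately undone by st4, so two contiguous ldpq/stpq pairs produce the same result.

```python
def ld4_16b(mem, off):
    """Model of LD4 {Va.16B-Vd.16B}: read 64 bytes and de-interleave them.

    Register i receives bytes i, i+4, i+8, ... of the 64-byte chunk.
    """
    chunk = mem[off:off + 64]
    return [chunk[i::4] for i in range(4)]


def st4_16b(regs, mem, off):
    """Model of ST4: re-interleave four 16-byte registers into 64 bytes."""
    for lane in range(16):
        for r in range(4):
            mem[off + 4 * lane + r] = regs[r][lane]


def ldp_q(mem, off):
    """Model of LDP Qa, Qb: two contiguous 16-byte loads, no shuffling."""
    return mem[off:off + 16], mem[off + 16:off + 32]


def stp_q(q0, q1, mem, off):
    """Model of STP Qa, Qb: two contiguous 16-byte stores."""
    mem[off:off + 16] = q0
    mem[off + 16:off + 32] = q1


def copy64_ld4(src, dst):
    """Copy 64 bytes with one modelled ld4/st4 pair."""
    st4_16b(ld4_16b(src, 0), dst, 0)


def copy64_ldpq(src, dst):
    """Copy 64 bytes with two modelled ldpq/stpq pairs."""
    for off in (0, 32):
        stp_q(*ldp_q(src, off), dst, off)
```

Calling `copy64_ld4` and `copy64_ldpq` on the same `bytearray` source fills both destinations with identical contents; the only difference in the model is the intermediate register layout.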
-------------
PR: https://git.openjdk.java.net/jdk/pull/1293