RFR: 8301852: RISC-V: Optimize class atomic when order is memory_order_relaxed

Dingli Zhang dzhang at openjdk.org
Fri Feb 10 01:41:42 UTC 2023


On Tue, 7 Feb 2023 17:26:57 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:

>> This PR optimizes some functions in class Atomic: FULL_MEM_BARRIER is no longer needed when the order is memory_order_relaxed. This is a first version; we may be able to improve this part for other memory orders in the future, as in [JDK-8274615](https://bugs.openjdk.org/browse/JDK-8274615) and [JDK-8261649](https://bugs.openjdk.org/browse/JDK-8261649).
>> 
>> The following call stack uses memory_order_relaxed:
>> 
>> #0 Atomic::PlatformAdd<8ul>::add_and_fetch<unsigned long, unsigned long> (this=0x40038851d0, dest=0x40035b0820 <MallocMemorySummary::_snapshot+576>, add_value=1, order=memory_order_relaxed)
>>     at /home/dingli/jdk/src/hotspot/os_cpu/linux_riscv/atomic_linux_riscv.hpp:40
>> #1 0x0000004001d200a2 in Atomic::AddImpl<unsigned long, unsigned long, void>::add_and_fetch (dest=0x40035b0820 <MallocMemorySummary::_snapshot+576>, add_value=1, order=memory_order_relaxed)
>>     at /home/dingli/jdk/src/hotspot/share/runtime/atomic.hpp:681
>> #2 0x0000004001d1ff9a in Atomic::add<unsigned long, unsigned long> (dest=0x40035b0820 <MallocMemorySummary::_snapshot+576>, add_value=1, order=memory_order_relaxed)
>>     at /home/dingli/jdk/src/hotspot/share/runtime/atomic.hpp:662
>> #3 0x0000004001d1f55c in MemoryCounter::allocate (this=0x40035b0820 <MallocMemorySummary::_snapshot+576>, sz=96) at /home/dingli/jdk/src/hotspot/share/services/mallocTracker.hpp:63
>> #4 0x0000004002613608 in MallocMemory::record_malloc (this=0x40035b0820 <MallocMemorySummary::_snapshot+576>, sz=96) at /home/dingli/jdk/src/hotspot/share/services/mallocTracker.hpp:113
>> #5 0x0000004002613676 in MallocMemorySummary::record_malloc (size=96, flag=MEMFLAGS::mtInternal) at /home/dingli/jdk/src/hotspot/share/services/mallocTracker.hpp:244
>> 
>> 
>> This PR also polishes the inline assembly code of PlatformCmpxchg.
>> 
>> ## Testing:
>> 
>> - tier1-3 on an Unmatched board, with no new failures
>
> Hello
> Have you made any measurements? Should it improve performance somewhere?

Hi @VladimirKempik, sorry for the late reply.
I ran five rounds of SPECjbb2015 in composite mode with `-Xmx8g` on the Unmatched board. The data is for reference only.

| round  | before max-jOPS | before critical-jOPS | after max-jOPS | after critical-jOPS |
| ------ | --------------- | -------------------- | -------------- | ------------------- |
| round1 | 509      | 74            | 545      | 79            |
| round2 | 490      | 62            | 503      | 83            |
| round3 | 570      | 75            | 498      | 65            |
| round4 | 521      | 81            | 590      | 80            |
| round5 | 583      | 73            | 596      | 79            |
| avg    | 534.6    | 73            | 546.4    | 77.2          |
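
For readers following along, the idea behind the change can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patch itself: `sketch_memory_order`, `sketch_full_barrier` and `sketch_add_and_fetch` are hypothetical names, the fence is a stand-in for FULL_MEM_BARRIER, and `__atomic_add_fetch` is the GCC/Clang builtin. The only point it shows is that the two full barriers are skipped when the caller asks for relaxed ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HotSpot's atomic_memory_order values.
enum sketch_memory_order { sketch_relaxed, sketch_conservative };

// Hypothetical helper standing in for FULL_MEM_BARRIER: a full two-way fence.
inline void sketch_full_barrier() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Atomically add add_value to *dest and return the new value.
// The barriers before and after are emitted only for non-relaxed orders.
template <typename T>
inline T sketch_add_and_fetch(T volatile* dest, T add_value,
                              sketch_memory_order order) {
  if (order != sketch_relaxed) {
    sketch_full_barrier();
  }
  // The read-modify-write itself is always atomic; only ordering differs.
  T res = __atomic_add_fetch(dest, add_value, __ATOMIC_RELAXED);
  if (order != sketch_relaxed) {
    sketch_full_barrier();
  }
  return res;
}
```

A counter update such as the one in the MemoryCounter::allocate call stack would then call something like `sketch_add_and_fetch<uint64_t>(&counter, 1, sketch_relaxed)` and pay for the atomic add only, with no surrounding fences.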

-------------

PR: https://git.openjdk.org/jdk/pull/12434

