RFR: 8291550: RISC-V: jdk uses misaligned memory access when AvoidUnalignedAccess enabled [v14]

Fei Yang fyang at openjdk.org
Fri May 12 07:10:51 UTC 2023


On Thu, 11 May 2023 16:41:19 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:

>> Please review this attempt to remove misaligned loads and stores in the RISC-V-specific part of the JDK.
>> 
>> The patch has two main parts:
>>  - opcode loads/stores now use put_native_uX/get_native_uX
>>  - some code in the template interpreter was changed to prevent misaligned loads
>>  
>> perf stat numbers for trp_lam (misaligned loads) and trp_sam (misaligned stores) before the patch:
>> 
>>  169598      trp_lam                                          
>>   13562      trp_sam  
>> 
>> 
>> After the patch, both numbers are zero.
>> I can see the template interpreter being ~40% faster on the HiFive Unmatched (one repetition of Renaissance philosophers in -Xint mode), with the same performance before and after the patch on the T-Head RVB-ICE (which supports misaligned loads/stores in hardware).
>> 
>> Tier testing on hardware is in progress.
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Create load_long_misaligned and start using it

Thanks for the update. Would you mind a few more tweaks?

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1163:

> 1161:   } else {
> 1162:     add(tmp1, cnt1, wordSize);
> 1163:     beqz(tmp1, SAME);

I think this change resolves my previous concern. I noticed some uses of registers `t0` and `t1` in this function. I think we should replace them with their aliases `tmp1` and `tmp2` respectively. Could you please do that cleanup while you are at it?

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1714:

> 1712: void MacroAssembler::load_long_misaligned(Register dst, Address src, Register tmp, int granularity) {
> 1713:   if (AvoidUnalignedAccesses && (granularity != 8)) {
> 1714:     assert_different_registers(dst, tmp);

Suggestion: s/assert_different_registers(dst, tmp)/assert_different_registers(dst, tmp, src.base())/
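
For readers following along, here is a minimal standalone sketch of the idea behind a granularity-based misaligned load. It is not the HotSpot code: the function name `load_long_by_granularity` is hypothetical, and it assumes a little-endian layout (as on RISC-V) and a granularity of 1, 2 or 4 bytes. The 64-bit value is assembled from several narrower reads that all go through the same base pointer, which is presumably why the destination register must not alias `src.base()`: the first partial load would otherwise overwrite the address that the later loads still need.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only (not part of HotSpot).
// Reads a little-endian 64-bit value starting at p using chunks of
// `granularity` bytes. If p is aligned to `granularity`, a real
// implementation could issue each chunk as one naturally aligned load.
static uint64_t load_long_by_granularity(const unsigned char* p, int granularity) {
  assert(granularity == 1 || granularity == 2 || granularity == 4);
  uint64_t result = 0;
  for (int offset = 0; offset < 8; offset += granularity) {
    // Assemble one chunk byte by byte so the sketch is host-endian neutral.
    uint64_t chunk = 0;
    for (int i = 0; i < granularity; i++) {
      chunk |= static_cast<uint64_t>(p[offset + i]) << (8 * i);
    }
    // Every chunk is addressed relative to the unchanged base pointer p.
    result |= chunk << (8 * offset);
  }
  return result;
}
```

With a buffer holding the bytes 1, 2, 3, ... in order, reading at byte offset 2 with granularity 2 gives 0x0A09080706050403, the same value an 8-byte load from that 2-byte-aligned address would produce on hardware that tolerates misaligned accesses.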

src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 1102:

> 1100:     __ mv(t1, unsatisfied);
> 1101:     if (AvoidUnalignedAccesses) {
> 1102:       __ mv(t, t1);

It seems that this `mv` instruction could be saved by putting the address `unsatisfied` in `t` instead of `t1` at line #1100.

src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 1103:

> 1101:     if (AvoidUnalignedAccesses) {
> 1102:       __ mv(t, t1);
> 1103:       __ MacroAssembler::load_long_misaligned(t1, Address(t,0), t0, 2); // 2 bytes aligned, but not 4 or 8

Suggestion: s/Address(t,0)/Address(t, 0)/
And do we need the `MacroAssembler::` qualifier here?

-------------

Changes requested by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13645#pullrequestreview-1423880627
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1191988224
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1191979950
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1191977935
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1191978891

