RFR: 8291550: RISC-V: jdk uses misaligned memory access when AvoidUnalignedAccess enabled [v5]

Fei Yang fyang at openjdk.org
Thu May 4 03:39:17 UTC 2023


On Tue, 2 May 2023 08:28:14 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:

>> Please review this attempt to remove misaligned loads and stores in the RISC-V specific part of the JDK.
>> 
>> The patch has two main parts:
>>  - opcode loads/stores now use put_native_uX/get_native_uX
>>  - some code in the template interpreter was changed to prevent misaligned loads
>>  
>> perf stat numbers for trp_lam (misaligned loads) and trp_sam (misaligned stores) before the patch:
>> 
>>  169598      trp_lam                                          
>>   13562      trp_sam  
>> 
>> 
>> After the patch, both numbers are zero.
>> I see the template interpreter running ~40% faster on HiFive Unmatched (1 repetition of Renaissance philosophers in -Xint mode), and unchanged performance (before and after the patch) on T-Head RVB-ICE (which supports misaligned stores/loads in hardware).
>> 
>> Tier testing on hardware is in progress.
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
> 
>   simplify branching in branch opcodes

Thanks for the update. Would you mind making a few more tweaks? Otherwise, LGTM.

src/hotspot/cpu/riscv/nativeInst_riscv.hpp line 222:

> 220:   void set_uint_at(int offset, jint  i)       { Bytes::put_native_u4(addr_at(offset), i); }
> 221:   void set_ptr_at (int offset, address  ptr)  { Bytes::put_native_u8(addr_at(offset), (u8)ptr); }
> 222:   void set_oop_at (int offset, oop  o)        { Bytes::put_native_u8(addr_at(offset), cast_from_oop<u8>(o)); }

I see there are two spaces between the type and the name of the second parameter in each of these declarations. We should remove one.

src/hotspot/cpu/riscv/templateTable_riscv.cpp line 292:

> 290:   }
> 291:   __ revb_w_w(x10, x10);
> 292:   __ sraiw(x10, x10, 16);

I think we can further simplify this sequence into something like:

if (AvoidUnalignedAccesses) {
  __ load_signed_byte(x10, at_bcp(1));
  __ load_unsigned_byte(t1, at_bcp(2));
  __ slli(x10, x10, 8);
  __ add(x10, x10, t1);
} else {
  __ load_unsigned_short(x10, at_bcp(1));
  __ revb_w_w(x10, x10); // reverse bytes in word and sign-extend
  __ sraiw(x10, x10, 16);
}
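
To show why the two branches above should produce the same value, here is a minimal standalone C++ sketch. It is not HotSpot code: the helper names read_s2_bytewise and read_s2_halfword are hypothetical, and the halfword variant assumes a little-endian host, as on RISC-V. Both read a signed, big-endian 16-bit bytecode operand; the first touches only single bytes and so can never be misaligned.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper mirroring the AvoidUnalignedAccesses branch:
// a sign-extending load of the high byte, a zero-extending load of
// the low byte, then shift-and-add.
static int32_t read_s2_bytewise(const uint8_t* p) {
  int32_t hi = static_cast<int8_t>(p[0]);  // like lb: sign-extended
  int32_t lo = p[1];                       // like lbu: zero-extended
  // Multiplying by 256 has the same effect as slli by 8, and avoids
  // left-shifting a negative value in C++.
  return hi * 256 + lo;
}

// Hypothetical helper mirroring the else branch: one 16-bit load in
// native byte order, a byte swap, then sign extension.
// Assumption: the host is little-endian.
static int32_t read_s2_halfword(const uint8_t* p) {
  uint16_t raw;
  std::memcpy(&raw, p, sizeof raw);  // stands in for the halfword load
  uint16_t swapped = static_cast<uint16_t>((raw >> 8) | (raw << 8));
  return static_cast<int16_t>(swapped);  // sign-extend from 16 bits
}
```

For the operand bytes 0xFF 0x80, both helpers should return -128, and for 0x12 0x34 both should return 0x1234.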

src/hotspot/cpu/riscv/templateTable_riscv.cpp line 1627:

> 1625:       __ lhu(x12, at_bcp(1));
> 1626:     }
> 1627:     __ revb_h_h(x12, x12); // reverse bytes in half-word and sign-extend

Similar here. Consider further optimizing this sequence into something like:

if (AvoidUnalignedAccesses) {
  __ lb(x12, at_bcp(1));
  __ lbu(t1, at_bcp(2));
  __ slli(x12, x12, 8);
  __ add(x12, x12, t1);
} else {
  __ lhu(x12, at_bcp(1));
  __ revb_h_h(x12, x12); // reverse bytes in half-word and sign-extend
}

-------------

Changes requested by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13645#pullrequestreview-1412228938
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184508318
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184508821
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184509126

