RFR: 8291550: RISC-V: jdk uses misaligned memory access when AvoidUnalignedAccess enabled [v5]
Fei Yang
fyang at openjdk.org
Thu May 4 03:39:17 UTC 2023
On Tue, 2 May 2023 08:28:14 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:
>> Please review this attempt to remove misaligned loads and stores in risc-v specific part of jdk.
>>
>> The patch has two main parts:
>> - opcodes loads/stores is now using put_native_uX/get_native_uX
>> - some code in template interp got changed to prevent misaligned loads
>>
>> perf stat numbers for trp_lam ( misaligned loads) and trp_sam ( misaligned stores) before the patch:
>>
>> 169598 trp_lam
>> 13562 trp_sam
>>
>>
>> after the patch both numbers are zeroes.
>> I can see template interpreter to be ~40 % faster on hifive unmatched ( 1 repetition of renaissance philosophers in -Xint mode), and the same performance ( before and after the patch) on thead rvb-ice ( which supports misaligned stores/loads in hw)
>>
>> tier testing on hw is in progress
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
> simpify branching in branch opcodes
Thanks for the update. Would you mind making a few more tweaks? Otherwise, LGTM.
src/hotspot/cpu/riscv/nativeInst_riscv.hpp line 222:
> 220: void set_uint_at(int offset, jint i) { Bytes::put_native_u4(addr_at(offset), i); }
> 221: void set_ptr_at (int offset, address ptr) { Bytes::put_native_u8(addr_at(offset), (u8)ptr); }
> 222: void set_oop_at (int offset, oop o) { Bytes::put_native_u8(addr_at(offset), cast_from_oop<u8>(o)); }
I see there are two spaces between type and name for the second parameter. We should remove one.
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 292:
> 290: }
> 291: __ revb_w_w(x10, x10);
> 292: __ sraiw(x10, x10, 16);
I think we can further simplify this sequence into something like:
  if (AvoidUnalignedAccesses) {
    __ load_signed_byte(x10, at_bcp(1));
    __ load_unsigned_byte(t1, at_bcp(2));
    __ slli(x10, x10, 8);
    __ add(x10, x10, t1);
  } else {
    __ load_unsigned_short(x10, at_bcp(1));
    __ revb_w_w(x10, x10); // reverse bytes in word and sign-extend
    __ sraiw(x10, x10, 16);
  }
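To illustrate why the two branches should produce the same value, here is a minimal standalone C++ sketch (not HotSpot code) that models both paths on a plain byte array and compares them for every possible byte pair. The function names load_s16_bytewise, load_s16_swapped and check_all_inputs are hypothetical and exist only for this illustration. The sketch assumes the operand is a big-endian signed 16-bit value, as bytecode operands are, and that the halfword load in the second branch behaves like a zero-extending little-endian load.

```cpp
#include <cassert>
#include <cstdint>

// Models the AvoidUnalignedAccesses branch: two byte loads, shift, add.
// p points at the high byte of a big-endian signed 16-bit operand.
inline int32_t load_s16_bytewise(const uint8_t* p) {
  // Signed byte load: sign-extend the high byte by hand.
  int32_t hi = (p[0] >= 0x80) ? static_cast<int32_t>(p[0]) - 256
                              : static_cast<int32_t>(p[0]);
  // Unsigned byte load of the low byte.
  int32_t lo = p[1];
  // Multiplying by 256 stands in for the shift left by 8; it avoids
  // left-shifting a negative value, which older C++ standards leave undefined.
  return hi * 256 + lo;
}

// Models the other branch: halfword load, 32-bit byte reverse,
// then an arithmetic shift right by 16.
inline int32_t load_s16_swapped(const uint8_t* p) {
  // What a zero-extending little-endian halfword load would yield.
  uint32_t w = static_cast<uint32_t>(p[0]) |
               (static_cast<uint32_t>(p[1]) << 8);
  // Reverse the four bytes of the word.
  uint32_t r = ((w & 0x000000FFu) << 24) | ((w & 0x0000FF00u) << 8) |
               ((w >> 8) & 0x0000FF00u)  | (w >> 24);
  // Take the upper halfword and sign-extend it from 16 bits.
  uint32_t v = r >> 16;
  return (v >= 0x8000u) ? static_cast<int32_t>(v) - 0x10000
                        : static_cast<int32_t>(v);
}

// Compares the two models for all 65536 byte pairs.
inline void check_all_inputs() {
  for (int hi = 0; hi < 256; ++hi) {
    for (int lo = 0; lo < 256; ++lo) {
      const uint8_t bytes[2] = { static_cast<uint8_t>(hi),
                                 static_cast<uint8_t>(lo) };
      assert(load_s16_bytewise(bytes) == load_s16_swapped(bytes));
    }
  }
}
```

Calling check_all_inputs() from a small test driver exercises every input. This is only a model of the arithmetic, so it says nothing about the actual instruction encodings or about performance on real hardware.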
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 1627:
> 1625: __ lhu(x12, at_bcp(1));
> 1626: }
> 1627: __ revb_h_h(x12, x12); // reverse bytes in half-word and sign-extend
Similar here. Consider further optimizing this sequence into something like:
  if (AvoidUnalignedAccesses) {
    __ lb(x12, at_bcp(1));
    __ lbu(t1, at_bcp(2));
    __ slli(x12, x12, 8);
    __ add(x12, x12, t1);
  } else {
    __ lhu(x12, at_bcp(1));
    __ revb_h_h(x12, x12); // reverse bytes in half-word and sign-extend
  }
-------------
Changes requested by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13645#pullrequestreview-1412228938
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184508318
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184508821
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1184509126