RFR: 8291550: RISC-V: jdk uses misaligned memory access when AvoidUnalignedAccess enabled
Feilong Jiang
fjiang at openjdk.org
Wed Apr 26 01:56:52 UTC 2023
On Tue, 25 Apr 2023 15:37:30 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:
> Please review this attempt to remove misaligned loads and stores in risc-v specific part of jdk.
>
> The patch has two main parts:
> - opcodes loads/stores is now using put_native_uX/get_native_uX
> - some code in template interp got changed to prevent misaligned loads
>
> perf stat numbers for trp_lam ( misaligned loads) and trp_sam ( misaligned stores) before the patch:
>
> 169598 trp_lam
> 13562 trp_sam
>
>
> after the patch both numbers are zeroes.
> I can see template interpreter to be ~40 % faster on hifive unmatched ( 1 repetition of renaissance philosophers in -Xint mode), and the same performance ( before and after the patch) on thead rvb-ice ( which supports misaligned stores/loads in hw)
>
> tier testing on hw is in progress
Looks good, with some nits:
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 207:
> 205: if (index_size == sizeof(u2)) {
> 206: if (AvoidUnalignedAccesses)
> 207: {
Suggestion:
if (AvoidUnalignedAccesses) {
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 208:
> 206: if (AvoidUnalignedAccesses)
> 207: {
> 208: assert(index != tmp, "must use different register");
Suggestion:
assert_different_registers(index, tmp);
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 210:
> 208: assert(index != tmp, "must use different register");
> 209: load_unsigned_byte(index, Address(xbcp, bcp_offset));
> 210: load_unsigned_byte(tmp, Address(xbcp, bcp_offset+1));
Suggestion:
load_unsigned_byte(tmp, Address(xbcp, bcp_offset + 1));
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 218:
> 216: } else if (index_size == sizeof(u4)) {
> 217: if (AvoidUnalignedAccesses)
> 218: {
Suggestion:
if (AvoidUnalignedAccesses) {
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 221:
> 219: assert(index != tmp, "must use different register");
> 220: load_unsigned_byte(index, Address(xbcp, bcp_offset));
> 221: load_unsigned_byte(tmp, Address(xbcp, bcp_offset+1));
Suggestion:
load_unsigned_byte(tmp, Address(xbcp, bcp_offset + 1));
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 224:
> 222: slli(tmp, tmp, 8);
> 223: add(index, index, tmp);
> 224: load_unsigned_byte(tmp, Address(xbcp, bcp_offset+2));
Suggestion:
load_unsigned_byte(tmp, Address(xbcp, bcp_offset + 2));
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 227:
> 225: slli(tmp, tmp, 16);
> 226: add(index, index, tmp);
> 227: load_unsigned_byte(tmp, Address(xbcp, bcp_offset+3));
Suggestion:
load_unsigned_byte(tmp, Address(xbcp, bcp_offset + 3));
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 259:
> 257: assert_different_registers(cache, index);
> 258: assert_different_registers(cache, xcpool);
> 259: //register "cache" is trashed in next shadd, so lets use it as a temporary register
Suggestion:
// register "cache" is trashed in next shadd, so lets use it as a temporary register
src/hotspot/cpu/riscv/interp_masm_riscv.cpp line 297:
> 295: size_t index_size) {
> 296: assert_different_registers(cache, tmp);
> 297: //register "cache" is trashed in next ld, so lets use it as a temporary register
Suggestion:
// register "cache" is trashed in next ld, so lets use it as a temporary register
src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 1102:
> 1100: __ mv(t1, unsatisfied);
> 1101: if (AvoidUnalignedAccesses)
> 1102: {
Suggestion:
if (AvoidUnalignedAccesses) {
src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 1115:
> 1113: __ add(t1, t1, t0);
> 1114: } else {
> 1115: __ ld(t1, Address(t1, 0)); //2 bytes alligned, but not 4 or 8
Suggestion:
__ ld(t1, Address(t1, 0)); // 2 bytes alligned, but not 4 or 8
Should we put this comment into `if` statement?
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 378:
> 376: // non-null object (String, MethodType, etc.)
> 377: assert_different_registers(result, tmp);
> 378: //register result is trashed by next load, let's use it as temporary register
Suggestion:
// register result is trashed by next load, let's use it as temporary register
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 1626:
> 1624: } else {
> 1625: __ lhu(x12, at_bcp(1));
> 1626: }
Suggestion:
if (AvoidUnalignedAccesses) {
__ lbu(t1, at_bcp(1));
__ lbu(x12, at_bcp(2));
__ slli(x12, x12, 8);
__ add(x12, t1, x12);
} else {
__ lhu(x12, at_bcp(1));
}
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2027:
> 2025: __ shadd(temp, h, array, temp, 3);
> 2026: if (AvoidUnalignedAccesses)
> 2027: {
Suggestion:
if (AvoidUnalignedAccesses) {
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2028:
> 2026: if (AvoidUnalignedAccesses)
> 2027: {
> 2028: //array is BytesPerInt (aka 4) alligned
Suggestion:
// array is BytesPerInt (aka 4) alligned
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2034:
> 2032: __ add(temp, temp, t1);
> 2033: } else
> 2034: {
Suggestion:
} else {
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2060:
> 2058: __ shadd(temp, i, array, temp, 3);
> 2059: if (AvoidUnalignedAccesses)
> 2060: {
Suggestion:
if (AvoidUnalignedAccesses) {
src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2061:
> 2059: if (AvoidUnalignedAccesses)
> 2060: {
> 2061: //array is BytesPerInt (aka 4) alligned
Suggestion:
// array is BytesPerInt (aka 4) alligned
-------------
Changes requested by fjiang (Author).
PR Review: https://git.openjdk.org/jdk/pull/13645#pullrequestreview-1400997892
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177214715
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177213849
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177215730
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177216198
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177215848
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177215934
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177216005
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177224810
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177225034
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177220415
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177231442
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177226383
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177221375
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177221673
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177226617
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177221897
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177222132
PR Review Comment: https://git.openjdk.org/jdk/pull/13645#discussion_r1177222582
More information about the hotspot-dev
mailing list