RFR: 8287418: riscv: Fix correctness issue of MacroAssembler::movptr

Fri May 27 04:46:46 UTC 2022

Hi team,

`MacroAssembler::movptr()` is designed to load a 47-bit (unsigned) address constant, ranging `[0x0, 0x7FFF_FFFF_FFFF]`, and a special case -1 (`the Universe::non_oop_word()` as we know, which is `0xFFFF_FFFF_FFFF_FFFF`). The former ones are inside a sv48 address space range[1]. Please note that under sv48 a valid address has the bit 47 equal to 0 in user space, so that `MacroAssembler::movptr()` could cover all cases under sv48. However, when loading an immediate value ranging `[0x7FFF_8000_0000, 0x7FFF_FFFF_FFFF]` using it, the results would wrongly become `[0xFFFF_7FFF_8000_0000, 0xFFFF_7FFF_FFFF_FFFF]`, which indicates the MSB has polluted high bits in rare cases.

`MacroAssembler::movptr()` is a composition of `lui+addi+slli+addi+slli+addi`, and all of them are signed operations, MIPS alike.
Precisely, the first `lui+addi` aims to load the first `32-bit`; then the `slli+addi` would load the `11-bit`; finally the last `slli+addi` is going to load the remaining `5-bit`.

To deal with this, there are two approaches:

(a) Use an `addiw` to replace the first `addi`. `addiw` has nearly the same semantics as `addi`, but after the operation the result would be sign-extended according to the bit 31. Due to this feature, we could use this to clean up the dirty high bits at all times. This could also handle the (-1) case. However, `Assembler::li32()`, which is composed of `lui+addiw`, will conflict with the new implementation, needing further adaptations. (Personally I a bit dislike of that)

(b) Alike V8's implementation [2], the trick here is it loads only the first 31-bit using `lui+addi`, with a leading 0 as the bit 31. So this one could prevent this issue at the beginning. As a trade-off, we need to shift one another bit because the leading 0 occupies one bit. Also this one could also handle the (-1) case as well after minor adaptations. (I like this one)

This problem could be reproduced using `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` with fastdebug build, and on Qemu only, for currently I have no access to hardware that supports sv48, and the kernel Ubuntu[3] relies on is Linux 5.15. The kernel (TIP) would first check if hardware sponsors sv57, if not then fall back to sv48, and so on. It is not until Linux 5.17 that sv48 is supported[4]. So this issue could never be reproduced on my boards. But fortunately Qemu could sponsor this, because one could mmap an address in 48-bit address space even in a user-level Qemu.

Tested with `-XX:CompressedClassSpaceBaseAddress=0x7FFFF8000000 -XX:CompressedClassSpaceSize=40M -Xshare:off` (reproducible) on Qemu with hotspot tier1 (we should ignore OOM caused the compressed class space), and other tiers are on the way.
Testing sanity hotspot tier1~tier4 (could not reproduce). Tier1 is finished without new failures.

Thanks,
Xiaolin

[1] https://github.com/riscv/riscv-isa-manual/blob/9ec8c0105dbf1492b57f6cafdb90a268628f476a/src/supervisor.tex#L1999-L2006
[2] https://github.com/v8/v8/blob/main/src/codegen/riscv64/assembler-riscv64.cc#L3479-L3495
[3] https://cdimage.ubuntu.com/releases/22.04/release/
[4] https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.17-RISC-V-sv48

-------------

Commit messages:
 - Fix MSB overflow in signed lui operations

Changes: https://git.openjdk.java.net/jdk/pull/8913/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8913&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8287418
  Stats: 19 lines in 5 files changed: 1 ins; 0 del; 18 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8913.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8913/head:pull/8913

PR: https://git.openjdk.java.net/jdk/pull/8913