RFR: 8296448: RISC-V: Fix temp usages of heapbase register killed by MacroAssembler::en/decode_klass_not_null [v2]
Gui Cao
gcao at openjdk.org
Mon Oct 21 01:44:48 UTC 2024
On Tue, 8 Nov 2022 09:00:38 GMT, Xiaolin Zheng <xlinzheng at openjdk.org> wrote:
>> Please see the JBS issue for more crash details.
>>
>> To reproduce using a cross-compiled build:
>>
>> # dump one cds-nocoops.jsa
>> <java> -XX:-UseCompressedOops -XX:+UseCompressedClassPointers -Xshare:dump -Xlog:cds* -version
>>
>> # reproduce
>> <java> -XX:-UseCompressedOops -XX:+UseCompressedClassPointers -Xshare:on -XX:-TieredCompilation \
>> -Xlog:cds* -Xlog:gc+metaspace=info -jar renaissance-gpl-0.14.1.jar -r 1 movie-lens
>>
>>
>> `MacroAssembler::en/decode_klass_not_null` uses the heapbase register as a temp register in the interpreter, which may kill the in-use value when enabling C2 compilation and `UseCompressedClassPointers` meanwhile disabling `UseCompressedOops`. C1 won't have this issue for the xheapbase is not its allocation candidate. When CDS is enabled, the narrow klass base is mapped to some address like `0x0000000800000000`, so `MacroAssembler::decode_klass_not_null`, which lacks registers, will use `xheapbase` as a temp to load the klass base and kill the register in the interpreter. So adding a `-XX:+DeoptimizeALot` can speedily reproduce the issue.
>>
>> To solve this, we shall decouple the xheapbase used as a temp register in `MacroAssembler::en/decode_klass_not_null`. AArch64 has advanced instructions so one register is enough to handle the logic. But in RISC-V we require at least two.
>>
>> This patch introduces another argument `tmp` to related functions to decouple and eliminate such heapbase usages. One thing that deserves noticing is the `cmp_klass` case, which usually gets used at the beginning of a method entry when `t1` is used as ic holder klass and `t0` is occupied there. These positions are special since nearly all registers are usable except ones used for arguments and special purposes (thread register, etc.). I propose to use a call-clobbered `t2` register here, to keep aligning the `i2c2i_adapter` logic[1].
>>
>> Tested hotspot tier1~4 on QEMU; jdk tier1~tier2 and hotspot tier1~tier2 on my Hifive unmatched board, and the reproducible movie-lens benchmark.
>>
>> Thanks,
>> Xiaolin
>>
>> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp#L629
>
> Xiaolin Zheng has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix as to comments
> > There is a potential conflict with the use of t0 in decode_klass_not_null:
> > ```
> > assert_different_registers(t0, xbase);
> > ```
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > since tmp in load_klass defaults to t0.
>
> Thank you for pointing this out. Yes, code here in `MacroAssembler::decode_klass_not_null` seems to have a potential issue. But in reality I assume it is dead code.
>
> ```
> // In MacroAssembler::decode_klass_not_null()
> // -> CompressedKlassPointers::base() is non-null also when reaching here
> if (CompressedKlassPointers::shift() != 0) {
> assert(LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong");
> assert_different_registers(t0, xbase);
> shadd(dst, src, xbase, t0, LogKlassAlignmentInBytes);
> }
> ```
>
> As per Thomas' clear comments here [1], `base > 0 && shift > 0` should not happen. Perhaps we can perform some cleanup here to remove unreachable code that causes confusion. Maybe something like [zhengxiaolinX at 973a071](https://github.com/zhengxiaolinX/jdk/commit/973a071e51d116909b9713e52850b4990c025183).
>
> But it is untested due to a lack of resources, so waiting for someone kind to take it over.
>
> BTW the current PR (11010) is still needed because the heapbase register should not be used as a temp register here.
>
> Hope this can explain well.
>
> Best, Xiaolin
>
> [1]
>
> https://github.com/openjdk/jdk/blob/4dcc7f3f2629e857b20f72e99189db8781aa65ff/src/hotspot/share/oops/compressedKlass.cpp#L103-L104
@zhengxiaolinX Hi, I have tested tier1(fastdebug) with this patch [zhengxiaolinX at 973a071](https://github.com/zhengxiaolinX/jdk/commit/973a071e51d116909b9713e52850b4990c025183) and everything works fine.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/11010#issuecomment-2425386081
More information about the hotspot-dev
mailing list