RFR: 8332265: RISC-V: Materialize pointers faster by using a temp register [v2]

Tue May 21 06:00:07 UTC 2024

On Mon, 20 May 2024 13:15:15 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> Hi, please consider!
>> 
>> Using an additional register, we can materialize a 48-bit pointer with:
>> lui + lui + slli + add + addi
>> This is 15% faster, both on VF2 and in CPU models, compared to movptr().
>> 
>> As we often materialize during calls, there are free registers.
>> 
>> I have chosen just a few spots to use it; many more could use it.
>> E.g. la() with tmp register can use li48 instead of movptr.
>> 
>> Running tests now (so far so good); if I screwed up IC calls, it should show up fast.
>> Benchmarks will follow when hardware is free.
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - li48 -> movptr
>  - Merge branch 'master' into 8332265
>  - li48

Thanks for the update. Taking a closer look.

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1426:

> 1424: }
> 1425: 
> 1426: static int patch_addr_in_movptr2(address instruction_address, address target) {

Can we have a common entry of `patch_addr_in_movptr` which delegates work to `patch_addr_in_movptr1` and `patch_addr_in_movptr2`?

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1526:

> 1524: }
> 1525: 
> 1526: static address get_target_of_movptr2(address insn_addr) {

Similar here. Maybe we can have a common entry of `get_target_of_movptr` which delegates work to `get_target_of_movptr1` and `get_target_of_movptr2`?
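
To make the two suggestions above concrete, here is a rough sketch of what a common entry could look like. It is only an illustration, not the code in the PR: `insn_at`, `is_lui` and `is_movptr2_at` are hypothetical helpers, the `patch_addr_in_movptr1`/`patch_addr_in_movptr2` bodies are stand-ins that only return a sequence length, and both the detection rule (second instruction is a `lui`) and the 6-instruction length of the old sequence are assumptions on my side.

```cpp
#include <cstdint>
#include <cstring>

typedef unsigned char* address;

// Hypothetical helper: read the 32-bit instruction word at a byte offset.
static uint32_t insn_at(address base, int byte_offset) {
  uint32_t insn;
  std::memcpy(&insn, base + byte_offset, sizeof(insn));
  return insn;
}

// The RISC-V major opcode sits in bits [6:0]; LUI is 0b0110111 (0x37).
static bool is_lui(uint32_t insn) {
  return (insn & 0x7f) == 0x37;
}

// Assumption: the new sequence (lui + lui + slli + add + addi) can be
// recognized by its first two instructions both being lui.
static bool is_movptr2_at(address insn_addr) {
  return is_lui(insn_at(insn_addr, 0)) && is_lui(insn_at(insn_addr, 4));
}

// Stand-ins for the real patching routines. They only report the length
// in bytes of the sequence they would patch (6 instructions is assumed
// for the old form, 5 for the new one).
static int patch_addr_in_movptr1(address, address) { return 6 * 4; }
static int patch_addr_in_movptr2(address, address) { return 5 * 4; }

// Common entry: callers no longer need to know which form was emitted.
static int patch_addr_in_movptr(address instruction_address, address target) {
  if (is_movptr2_at(instruction_address)) {
    return patch_addr_in_movptr2(instruction_address, target);
  }
  return patch_addr_in_movptr1(instruction_address, target);
}
```

A `get_target_of_movptr` entry could follow the same shape, choosing between `get_target_of_movptr1` and `get_target_of_movptr2` with the same check.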

-------------

PR Review: https://git.openjdk.org/jdk/pull/19246#pullrequestreview-2067597362
PR Review Comment: https://git.openjdk.org/jdk/pull/19246#discussion_r1607681377
PR Review Comment: https://git.openjdk.org/jdk/pull/19246#discussion_r1607681874