RFR: 8332265: RISC-V: Materialize pointers faster by using a temp register [v4]

Fei Yang fyang at openjdk.org
Wed May 22 14:58:10 UTC 2024


On Wed, 22 May 2024 08:35:33 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> Hi, please consider!
>> 
>> Using an additional register, we can materialize a 48-bit pointer with:
>> lui + lui + slli + add + addi
>> This is 15% faster than movptr(), both on VF2 and in CPU models.
>> 
>> As we often materialize during calls, there are free registers.
>> 
>> I have chosen just a few spots to use it; many more could use it.
>> E.g. la() with a tmp register can use li48 instead of movptr.
>> 
>> Running tests now (so far so good); if I screwed up IC calls, it should show up quickly.
>> Benchmarks will follow when hardware is free.
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
> 
>  - Review changes
>  - Merge branch 'master' into 8332265
>  - Merge branch 'master' into 8332265
>  - Small review update
>  - li48 -> movptr
>  - Merge branch 'master' into 8332265
>  - li48

Three more minor comments; looks good otherwise. Thanks.

src/hotspot/cpu/riscv/nativeInst_riscv.hpp line 141:

> 139:   //     add
> 140:   //     addi/jalr/load
> 141:   static bool check_movptr2_data_dependency(address instr) {

Better to rename the existing `check_movptr_data_dependency` as `check_movptr1_data_dependency` at the same time.

src/hotspot/cpu/riscv/nativeInst_riscv.hpp line 421:

> 419:   void flush() {
> 420:     if (!maybe_cpool_ref(instruction_address())) {
> 421:       ICache::invalidate_range(instruction_address(), movptr1_instruction_size /* > movptr2_instruction_size */);

Maybe we can simply remove this `flush()` member function, which is not used anywhere.

src/hotspot/cpu/riscv/riscv.ad line 1289:

> 1287: {
> 1288:   // skip the movptr2 in MacroAssembler::ic_call():
> 1289:   // lui + addi + slli + addi + slli + addi

You might also want to update this instruction sequence in the code comment to reflect `movptr2()`.
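
For reference, the arithmetic behind the `lui + lui + slli + add + addi` sequence can be sketched as below. This is a minimal model, not the code in the PR: the 18/18/12 split of the address and the names `Movptr2Parts`, `lui_value`, `split48` and `materialize` are assumptions made for illustration, and the real `movptr2()` may split the bits differently.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only. The names and the 18/18/12 split are assumptions,
// not taken from MacroAssembler::movptr2().
struct Movptr2Parts {
  uint32_t upper18;  // immediate of the first lui (destination register)
  uint32_t mid18;    // immediate of the second lui (temp register)
  int32_t  lower12;  // signed immediate of the final addi/jalr/load
};

// RV64 lui: place a 20-bit immediate in bits 31:12, sign-extend from bit 31.
inline uint64_t lui_value(uint32_t imm20) {
  int32_t word = static_cast<int32_t>((imm20 & 0xfffffu) << 12);
  return static_cast<uint64_t>(static_cast<int64_t>(word));
}

// Split a 48-bit address into the three immediates.
inline Movptr2Parts split48(uint64_t addr) {
  assert(addr < (1ULL << 48));
  int64_t a = static_cast<int64_t>(addr);
  // Sign-extend the low 12 bits, since addi takes a signed immediate.
  int64_t lower12 = ((a & 0xfff) ^ 0x800) - 0x800;
  // Compensate for a negative lower12 before splitting the remaining bits.
  uint64_t rest = static_cast<uint64_t>(a - lower12) >> 12;
  return { static_cast<uint32_t>(rest >> 18),
           static_cast<uint32_t>(rest & 0x3ffff),
           static_cast<int32_t>(lower12) };
}

// Replay the five instructions on plain 64-bit values.
inline uint64_t materialize(const Movptr2Parts& p) {
  uint64_t rd  = lui_value(p.upper18);  // lui  rd,  upper18
  uint64_t tmp = lui_value(p.mid18);    // lui  tmp, mid18
  rd <<= 18;                            // slli rd, rd, 18
  rd += tmp;                            // add  rd, rd, tmp
  rd += static_cast<uint64_t>(static_cast<int64_t>(p.lower12));  // addi rd, rd, lower12
  return rd;
}
```

In this model both `lui` immediates stay below bit 19, so neither `lui` sign-extends, and the two `lui` instructions do not depend on each other. The final step is a plain signed 12-bit offset, which is why it could be folded into a following `jalr` or load, as the `addi/jalr/load` line in the comment above suggests.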

-------------

PR Review: https://git.openjdk.org/jdk/pull/19246#pullrequestreview-2071081979
PR Review Comment: https://git.openjdk.org/jdk/pull/19246#discussion_r1609882747
PR Review Comment: https://git.openjdk.org/jdk/pull/19246#discussion_r1610113183
PR Review Comment: https://git.openjdk.org/jdk/pull/19246#discussion_r1609960292

