RFR: 8332265: RISC-V: Materialize pointers faster by using a temp register [v2]

Mon May 20 13:27:07 UTC 2024

On Mon, 20 May 2024 13:15:15 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> Hi, please consider!
>> 
>> Using an additional register, we can materialize a 48-bit pointer with:
>> lui + lui + slli + add + addi
>> This is 15% faster, both on VF2 and in CPU models, compared to movptr().
>> 
>> As we often materialize during calls, there are free registers.
>> 
>> I have chosen just a few spots to use it; many more could use it.
>> E.g. la() with tmp register can use li48 instead of movptr.
>> 
>> Running tests now (so far so good); if I screwed up IC calls, it should show up fast.
>> Benchmarks will follow when hardware is free.
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - li48 -> movptr
>  - Merge branch 'master' into 8332265
>  - li48

Hey, I did an update, though it is not fully what you are suggesting.
We are missing a lot of 'abstraction'; e.g., take a look at CodeInstaller::pd_next_offset in jvmciCodeInstaller.
I think this code should look like:

jint CodeInstaller::pd_next_offset(NativeInstruction* inst, jint pc_offset, JVMCI_TRAPS) {
  if (inst->is_call() || inst->is_jump() || inst->is_movptr()) {
    return pc_offset + inst->size();
  }
  JVMCI_ERROR_0("unsupported type of instruction for call site");
}

But that requires adding a bunch of stuff to unrelated NativeInst, so I think it is better suited to another PR.
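
For readers following the quoted description, here is a rough model in plain C++ of how a lui + lui + slli + add + addi sequence can rebuild a 48-bit address. It is only a sketch, not the code in the PR: the helper names lui() and materialize48(), and the 18/18/12 bit split, are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 `lui`: put a 20-bit immediate in bits 31:12 and
// sign-extend the 32-bit result to 64 bits.
static uint64_t lui(uint32_t imm20) {
  uint64_t v = (uint64_t)(imm20 & 0xfffffu) << 12;
  return (v & 0x80000000ull) ? (v | 0xffffffff00000000ull) : v;
}

// Hypothetical helper (not the HotSpot implementation): rebuild a 48-bit
// address with the lui + lui + slli + add + addi pattern, using `tmp` as
// the extra register.
static uint64_t materialize48(uint64_t addr) {
  assert(addr < (1ull << 48));  // assumption: address fits in 48 bits

  // addi takes a signed 12-bit immediate, so sign-extend the low 12 bits...
  int64_t low12 = (int64_t)(addr & 0xfffu);
  if (low12 >= 0x800) low12 -= 0x1000;
  // ...and compensate in the remaining upper part (a multiple of 4096).
  uint64_t rest = addr - (uint64_t)low12;
  uint32_t mid18 = (uint32_t)((rest >> 12) & 0x3ffffu);  // bits 29:12
  uint32_t upper = (uint32_t)(rest >> 30);               // bits 30 and up

  uint64_t tmp = lui(upper);  // lui  tmp, upper
  uint64_t rd  = lui(mid18);  // lui  rd, mid18
  tmp <<= 18;                 // slli tmp, tmp, 18
  rd += tmp;                  // add  rd, rd, tmp
  rd += (uint64_t)low12;      // addi rd, rd, low12
  return rd;
}
```

In this model the two lui instructions do not depend on each other, which is presumably where the extra register pays off. Calling materialize48() on any address below 2^48 should return that same address.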

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19246#issuecomment-2120457463