Integrated: 8313406: nep_invoker_blob can be simplified more

Yasumasa Suenaga ysuenaga at openjdk.org
Mon Aug 14 23:17:17 UTC 2023


On Mon, 31 Jul 2023 12:22:00 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

> In FFM, a native function is called via `nep_invoker_blob`. For a function with two arguments, the stub looks like this:
> 
> 
> Decoding RuntimeStub - nep_invoker_blob 0x00007fcae394cd10
> --------------------------------------------------------------------------------
>   0x00007fcae394cd80: pushq %rbp
>   0x00007fcae394cd81: movq %rsp, %rbp
>   0x00007fcae394cd84: subq $0, %rsp
>  ;; { argument shuffle
>   0x00007fcae394cd88: movq %r8, %rax
>   0x00007fcae394cd8b: movq %rsi, %r10
>   0x00007fcae394cd8e: movq %rcx, %rsi
>   0x00007fcae394cd91: movq %rdx, %rdi
>  ;; } argument shuffle
>   0x00007fcae394cd94: callq *%r10
>   0x00007fcae394cd97: leave
>   0x00007fcae394cd98: retq
> 
> 
> `subq $0, %rsp` reserves shadow space on the stack, and `movq %r8, %rax` passes in `%rax` the vector-argument count that the System V ABI requires for variadic functions. Neither is necessary in some cases, so they should be removed when they are not needed, leaving the following:
> 
> 
> Decoding RuntimeStub - nep_invoker_blob 0x00007fd8778e2810
> --------------------------------------------------------------------------------
>   0x00007fd8778e2880: pushq %rbp
>   0x00007fd8778e2881: movq %rsp, %rbp
>  ;; { argument shuffle
>   0x00007fd8778e2884: movq %rsi, %r10
>   0x00007fd8778e2887: movq %rcx, %rsi
>   0x00007fd8778e288a: movq %rdx, %rdi
>  ;; } argument shuffle
>   0x00007fd8778e288d: callq *%r10
>   0x00007fd8778e2890: leave
>   0x00007fd8778e2891: retq
> 
> 
> All java/foreign jtreg tests passed.
> 
> This stub code can be seen by running the [ffmasm testcase](https://github.com/YaSuenag/ffmasm/tree/ef7a466ca9607164dbe7be7e68ea509d4bdac998/examples/cpumodel) with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintStubCode` and the hsdis library. This testcase links the code with `Linker.Option.isTrivial()`.
> 
> After this change, FFM performance on [another ffmasm testcase](https://github.com/YaSuenag/ffmasm/tree/ef7a466ca9607164dbe7be7e68ea509d4bdac998/benchmarks/funccall) improved slightly (the difference is within the reported error):
> 
> before:
> 
> Benchmark                           Mode  Cnt          Score          Error  Units
> FuncCallComparison.invokeFFMRDTSC  thrpt    3  106664071.816 ± 14396524.718  ops/s
> FuncCallComparison.rdtsc           thrpt    3  108024079.738 ± 13223921.011  ops/s
> 
> 
> after:
> 
> Benchmark                           Mode  Cnt          Score          Error  Units
> FuncCallComparison.invokeFFMRDTSC  thrpt    3  107622971.525 ± 12249767.134  ops/s
> FuncCallComparison.rdtsc           thrpt    3  107695741.608 ± 23983281.346  ops/s
> 
> 
> Environment:
> * CPU: AMD Ryzen 3 3300X
> * OS: Fedora 38 x86_64 (Kernel 6.3.8-200.fc38.x86_64)
> * Hyper-V 4vCPU, 8GB mem
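
The simplification described in the quoted text can be illustrated with a toy model in Java. This is only a sketch, not HotSpot's actual C++ stub generator: the class `NepInvokerSketch`, the method `emitStub`, and its two parameters are hypothetical names introduced here, and the instruction strings are copied from the disassembly in the description. It assumes the two-argument shuffle shown there and simply skips the two instructions when they would have no effect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the stub layout; not the HotSpot implementation.
class NepInvokerSketch {

    // Returns the instruction sequence as text.
    // stackBytes: stack space the stub must reserve (0 in the example above).
    // variadic:   whether the callee is variadic and needs a value in %rax.
    static List<String> emitStub(int stackBytes, boolean variadic) {
        List<String> code = new ArrayList<>();
        code.add("pushq %rbp");
        code.add("movq %rsp, %rbp");
        if (stackBytes > 0) {
            // Emitted only when there is actually space to reserve.
            code.add("subq $" + stackBytes + ", %rsp");
        }
        if (variadic) {
            // Emitted only for variadic callees.
            code.add("movq %r8, %rax");
        }
        // Argument shuffle for the two-argument case from the description.
        code.add("movq %rsi, %r10");
        code.add("movq %rcx, %rsi");
        code.add("movq %rdx, %rdi");
        code.add("callq *%r10");
        code.add("leave");
        code.add("retq");
        return code;
    }

    public static void main(String[] args) {
        // Non-variadic callee with no stack space: the simplified stub.
        emitStub(0, false).forEach(System.out::println);
    }
}
```

Calling `emitStub(0, false)` yields the shorter sequence from the second listing, while `emitStub(32, true)` keeps both optional instructions.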

This pull request has now been integrated.

Changeset: 583cb754
Author:    Yasumasa Suenaga <ysuenaga at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/583cb754f38f5d32144e302ce5e82a3b36a2cb78
Stats:     41 lines in 3 files changed: 4 ins; 11 del; 26 mod

8313406: nep_invoker_blob can be simplified more

Reviewed-by: jvernee, vlivanov

-------------

PR: https://git.openjdk.org/jdk/pull/15089

