Integrated: 8287788: Implement a better allocator for downcalls
Matthias Ernst
duke at openjdk.org
Mon Jan 27 19:48:06 UTC 2025
On Wed, 15 Jan 2025 21:39:05 GMT, Matthias Ernst <duke at openjdk.org> wrote:
> Certain signatures for foreign function calls (e.g. HVA return by value) require allocating an intermediate buffer to adapt the FFM calling convention to that of the native stub. In the current implementation, this buffer is malloc'ed and freed on every FFM invocation, which adds non-negligible overhead.
>
> Sample stack trace:
>
> java.lang.Thread.State: RUNNABLE
> at jdk.internal.misc.Unsafe.allocateMemory0(java.base@25-ea/Native Method)
> ...
> at jdk.internal.foreign.abi.SharedUtils.newBoundedArena(java.base@25-ea/SharedUtils.java:386)
> at jdk.internal.foreign.abi.DowncallStub/0x000001f001084c00.invoke(java.base@25-ea/Unknown Source)
> ...
> at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@25-ea/Invokers$Holder)
>
>
> To alleviate this, this PR implements a stacked allocator that is kept per carrier thread.
>
> Performance (MBA M3):
>
>
> Before:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   10   3.333 ± 0.152  ns/op
> CallOverheadByValue.byValue  avgt   10  33.892 ± 0.034  ns/op
>
> After:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   30   3.311 ± 0.034  ns/op
> CallOverheadByValue.byValue  avgt   30   6.143 ± 0.053  ns/op
>
>
> `-prof gc` also shows that the new call path is fully scalar-replaced, versus 160 bytes/call allocated before.
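
For readers unfamiliar with the idea, the following is a minimal sketch of a per-thread stacked allocator. It is not the code from this changeset. The class `StackedBufferPool`, its `Frame` type, the 256-byte capacity and the 8-byte rounding are all hypothetical choices made for illustration. It uses a plain `ThreadLocal` and a direct `ByteBuffer` as stand-ins, whereas the PR describes a per-carrier-thread allocator inside the JDK.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: one reusable native buffer per thread, handed out in stack order. */
final class StackedBufferPool {
    // Assumed capacity, chosen only for illustration.
    private static final int CAPACITY = 256;

    // Stand-in for per-carrier-thread state: one pool per Java thread.
    private static final ThreadLocal<StackedBufferPool> POOL =
            ThreadLocal.withInitial(StackedBufferPool::new);

    private final ByteBuffer backing = ByteBuffer.allocateDirect(CAPACITY);
    private int top = 0; // offset of the first free byte

    private StackedBufferPool() {}

    static StackedBufferPool current() {
        return POOL.get();
    }

    /** Bytes currently reserved from the pooled buffer. */
    int used() {
        return top;
    }

    /** Reserves size bytes; falls back to a fresh allocation if the pool is exhausted. */
    Frame push(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("negative size");
        }
        int rounded = (size + 7) & ~7; // keep the next frame 8-byte aligned
        if (rounded < size || rounded > CAPACITY - top) {
            // Slow path: does not touch the pooled buffer at all.
            return new Frame(ByteBuffer.allocateDirect(size), top);
        }
        ByteBuffer view = backing.duplicate();
        view.position(top);
        view.limit(top + size);
        int savedTop = top;
        top += rounded;
        return new Frame(view.slice(), savedTop);
    }

    /** A reservation; closing it pops the stack back to where it was before the push. */
    final class Frame implements AutoCloseable {
        private final ByteBuffer buffer;
        private final int savedTop;

        private Frame(ByteBuffer buffer, int savedTop) {
            this.buffer = buffer;
            this.savedTop = savedTop;
        }

        ByteBuffer buffer() {
            return buffer;
        }

        @Override
        public void close() {
            top = savedTop;
        }
    }
}
```

Frames must be closed in the reverse order of their creation, which try-with-resources guarantees for nested calls. On the fast path, reserving and releasing a frame only moves an offset, so no malloc/free pair is needed per call.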
This pull request has now been integrated.
Changeset: 8cc13045
Author: Matthias Ernst <mernst-github at mernst.org>
Committer: Jorn Vernee <jvernee at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/8cc13045428eebb8933df865f9a87f0f91909ba5
Stats: 488 lines in 7 files changed: 468 ins; 14 del; 6 mod
8287788: Implement a better allocator for downcalls
Reviewed-by: jvernee
-------------
PR: https://git.openjdk.org/jdk/pull/23142