RFR: 8287788: reuse intermediate segments allocated during FFM stub invocations [v2]
Matthias Ernst
duke at openjdk.org
Mon Jan 20 07:30:17 UTC 2025
> Certain signatures for foreign function calls require allocation of an intermediate buffer to adapt the FFM calling convention to that of the native stub ("needsReturnBuffer"). In the current implementation, this buffer is malloc'ed and freed on every FFM invocation, which is a non-negligible overhead.
>
> Sample stack trace:
>
> java.lang.Thread.State: RUNNABLE
> at jdk.internal.misc.Unsafe.allocateMemory0(java.base@25-ea/Native Method)
> at jdk.internal.misc.Unsafe.allocateMemory(java.base@25-ea/Unsafe.java:636)
> at jdk.internal.foreign.SegmentFactories.allocateMemoryWrapper(java.base@25-ea/SegmentFactories.java:215)
> at jdk.internal.foreign.SegmentFactories.allocateSegment(java.base@25-ea/SegmentFactories.java:193)
> at jdk.internal.foreign.ArenaImpl.allocateNoInit(java.base@25-ea/ArenaImpl.java:55)
> at jdk.internal.foreign.ArenaImpl.allocate(java.base@25-ea/ArenaImpl.java:60)
> at jdk.internal.foreign.ArenaImpl.allocate(java.base@25-ea/ArenaImpl.java:34)
> at java.lang.foreign.SegmentAllocator.allocate(java.base@25-ea/SegmentAllocator.java:645)
> at jdk.internal.foreign.abi.SharedUtils$2.<init>(java.base@25-ea/SharedUtils.java:388)
> at jdk.internal.foreign.abi.SharedUtils.newBoundedArena(java.base@25-ea/SharedUtils.java:386)
> at jdk.internal.foreign.abi.DowncallStub/0x000001f001084c00.invoke(java.base@25-ea/Unknown Source)
> at java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(java.base@25-ea/DirectMethodHandle$Holder)
> at java.lang.invoke.LambdaForm$MH/0x000001f00109a400.invoke(java.base@25-ea/LambdaForm$MH)
> at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@25-ea/Invokers$Holder)
>
>
> When does this happen? A fairly easy way to trigger this is through returning a small aggregate like the following:
>
> struct Vector2D {
>     double x, y;
> };
> Vector2D Origin() {
>     return {0, 0};
> }
>
>
> On AArch64, such a struct is returned in the two 128-bit registers v0/v1.
> The VM's calling convention for the native stub consequently expects a 32-byte output segment argument.
> The FFM downcall method handle instead expects to create a 16-byte result segment through the application-provided SegmentAllocator, and needs to perform an appropriate adaptation, roughly like so:
>
> MemorySegment downcallMH(SegmentAllocator a) {
>     MemorySegment tmp = SharedUtils.allocate(32);
>     try {
>         nativeStub.invoke(tmp); // leaves v0, v1 in tmp
>         MemorySegment result = a.allocate(16);
>         result.setDouble(0, tmp.getDouble(0));
>         result.setDouble(8, tmp.getDouble(16));
>         return result;
> ...
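
The offset arithmetic in the quoted adaptation can be tried out in isolation. The following is only a minimal sketch, not the JDK's code: it uses java.nio.ByteBuffer as a stand-in for MemorySegment so that it runs without the FFM API (with the FFM API the reads would presumably be MemorySegment.get(ValueLayout.JAVA_DOUBLE, offset)). The class name ReturnBufferSketch and the method fakeNativeStub are hypothetical, and the stub merely imitates a native call that spills v0 and v1 into 16-byte slots.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ReturnBufferSketch {
    // Each 128-bit vector register occupies one 16-byte slot in the buffer.
    static final int REGISTER_BYTES = 16;

    // Hypothetical stand-in for the native stub: writes the value held in
    // "v0" at offset 0 and the value held in "v1" at offset 16. Only the
    // low 8 bytes of each slot carry the double.
    static void fakeNativeStub(ByteBuffer tmp, double x, double y) {
        tmp.putDouble(0, x);
        tmp.putDouble(REGISTER_BYTES, y);
    }

    // Mirrors the quoted adaptation: 32-byte intermediate buffer in,
    // densely packed 16-byte struct {double x, y} out.
    static double[] unpack(double x, double y) {
        ByteBuffer tmp = ByteBuffer.allocate(2 * REGISTER_BYTES)
                                   .order(ByteOrder.nativeOrder());
        fakeNativeStub(tmp, x, y);

        ByteBuffer result = ByteBuffer.allocate(16)
                                      .order(ByteOrder.nativeOrder());
        result.putDouble(0, tmp.getDouble(0));              // x from slot 0
        result.putDouble(8, tmp.getDouble(REGISTER_BYTES)); // y from slot 1
        return new double[] { result.getDouble(0), result.getDouble(8) };
    }

    public static void main(String[] args) {
        double[] v = unpack(1.5, -2.5);
        System.out.println(v[0] + ", " + v[1]);
    }
}
```

The intermediate buffer is twice the size of the result because each register gets a full 16-byte slot; this is the buffer whose per-call allocation the stack trace above shows.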
Matthias Ernst has updated the pull request incrementally with one additional commit since the last revision:
Implementation notes.
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/23142/files
- new: https://git.openjdk.org/jdk/pull/23142/files/4a2210df..35a3a156
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=23142&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=23142&range=00-01
Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/23142.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/23142/head:pull/23142
PR: https://git.openjdk.org/jdk/pull/23142