RFR: 8347408: Create an internal method handle adapter for system calls with errno

Chen Liang liach at openjdk.org
Wed Feb 26 12:58:24 UTC 2025


On Tue, 25 Feb 2025 08:27:26 GMT, Per Minborg <pminborg at openjdk.org> wrote:

> As we advance, converting older JDK code to use the relatively new FFM API requires system calls that can provide `errno` and the like to explicitly allocate a `MemorySegment` to capture potential error states. This can lead to negative performance implications if not designed carefully and also introduces unnecessary code complexity.
> 
> Hence, this PR proposes adding a JDK internal method handle adapter that can be used to handle system calls with `errno`, `GetLastError`, and `WSAGetLastError`.
> 
> It relies on an efficient carrier-thread-local cache of memory regions to elide allocations.
> 
> 
> Here are some benchmarks that ran on a platform thread and virtual threads respectively (M1 Mac):
> 
> 
> Benchmark                                                  Mode  Cnt   Score   Error  Units
> CaptureStateUtilBench.OfVirtual.adaptedSysCallFail         avgt   30  24.330 ± 0.820  ns/op
> CaptureStateUtilBench.OfVirtual.adaptedSysCallSuccess      avgt   30   8.257 ± 0.117  ns/op
> CaptureStateUtilBench.OfVirtual.explicitAllocationFail     avgt   30  41.415 ± 1.013  ns/op
> CaptureStateUtilBench.OfVirtual.explicitAllocationSuccess  avgt   30  21.720 ± 0.463  ns/op
> CaptureStateUtilBench.OfVirtual.tlAllocationFail           avgt   30  23.636 ± 0.182  ns/op
> CaptureStateUtilBench.OfVirtual.tlAllocationSuccess        avgt   30   8.234 ± 0.156  ns/op
> CaptureStateUtilBench.adaptedSysCallFail                   avgt   30  23.918 ± 0.487  ns/op
> CaptureStateUtilBench.adaptedSysCallSuccess                avgt   30   4.946 ± 0.089  ns/op
> CaptureStateUtilBench.explicitAllocationFail               avgt   30  42.280 ± 1.128  ns/op
> CaptureStateUtilBench.explicitAllocationSuccess            avgt   30  21.809 ± 0.413  ns/op
> CaptureStateUtilBench.tlAllocationFail                     avgt   30  24.422 ± 0.673  ns/op
> CaptureStateUtilBench.tlAllocationSuccess                  avgt   30   5.182 ± 0.152  ns/op
> 
> 
> Adapted system call:
> 
>         return (int) ADAPTED_HANDLE.invoke(0, 0); // Uses a MH-internal pool
> 
> Explicit allocation:
> 
>         try (var arena = Arena.ofConfined()) {
>             return (int) HANDLE.invoke(arena.allocate(4), 0, 0);
>         }
> 
> Thread Local allocation:
> 
>         try (var arena = POOLS.take()) {
>             return (int) HANDLE.invoke(arena.allocate(4), 0, 0); // Uses a manually specified pool
>         }
> 
> The adapted system call exhibits a ~4x performance improvement over the existing "explicit allocation" scheme for the happy path on platform threads. ...

src/java.base/share/classes/jdk/internal/foreign/CaptureStateUtil.java line 56:

> 54:     // The method handles below are bound to static methods residing in this class
> 55: 
> 56:     private static final MethodHandle NON_NEGATIVE_INT_MH =

Looking up method handles eagerly in static initializers adds overhead to class initialization. I think a better way to store them is something like the cache mechanism of `MethodHandleImpl.getConstantHandle`.
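
To make the suggestion concrete, here is a rough sketch of an index-based cache that resolves each handle on first use instead of in the static initializer. It is only an illustration: the class name `LazyHandles`, the index constants, and the two helper methods are hypothetical and are not taken from `CaptureStateUtil` or `MethodHandleImpl`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch; names and helpers are assumptions, not the PR's code.
final class LazyHandles {
    static final int MH_NON_NEGATIVE_INT = 0;
    static final int MH_NEGATE_INT = 1;
    private static final int MH_LIMIT = 2;

    // Slots start out null and are filled in on first request.
    private static final AtomicReferenceArray<MethodHandle> HANDLES =
            new AtomicReferenceArray<>(MH_LIMIT);

    private LazyHandles() {}

    static MethodHandle handle(int index) {
        MethodHandle mh = HANDLES.get(index);
        if (mh != null) {
            return mh;
        }
        MethodHandle resolved = resolve(index);
        // If another thread won the race, use its handle so callers see one instance.
        return HANDLES.compareAndSet(index, null, resolved) ? resolved : HANDLES.get(index);
    }

    private static MethodHandle resolve(int index) {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        try {
            switch (index) {
                case MH_NON_NEGATIVE_INT:
                    return lookup.findStatic(LazyHandles.class, "nonNegative",
                            MethodType.methodType(boolean.class, int.class));
                case MH_NEGATE_INT:
                    return lookup.findStatic(LazyHandles.class, "negate",
                            MethodType.methodType(int.class, int.class));
                default:
                    throw new IllegalArgumentException("Unknown handle index: " + index);
            }
        } catch (ReflectiveOperationException e) {
            throw new InternalError(e);
        }
    }

    // Hypothetical targets for the handles above.
    private static boolean nonNegative(int value) {
        return value >= 0;
    }

    private static int negate(int value) {
        return -value;
    }
}
```

With this shape, class initialization only allocates the empty array, and the cost of each `findStatic` call is paid by the first caller that actually needs that handle.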

src/java.base/share/classes/jdk/internal/foreign/CaptureStateUtil.java line 102:

> 100:     // A key that holds both the `returnType` and the `stateName` needed to look up a
> 101:     // specific "basic handle" in the `BASIC_HANDLE_CACHE`.
> 102:     //   returnType E {int.class | long.class}

I think using `∈` or `\in` instead of `E` would be clearer.

src/java.base/share/classes/jdk/internal/foreign/CaptureStateUtil.java line 211:

> 209:                 // This is equivalent to:
> 210:                 //   computeIfAbsent(basicKey, CaptureStateUtil::basicHandleFor);
> 211:                 .computeIfAbsent(basicKey, new Function<>() {

I recommend declaring a local record and capturing the record instance in a static final field, since this code creates a new function object on every call. It might also be worth considering whether we should use get + putIfAbsent instead of computeIfAbsent, as ConcurrentHashMap has a bug that makes computeIfAbsent slower than get for certain access patterns.
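
Roughly what I have in mind, as a sketch only: the mapping function is a record whose single instance lives in a static final field, and the lookup tries `get` before falling back to `putIfAbsent`. The names `BasicHandleCache`, `BasicKey`, and `Factory` are hypothetical, and the cached value is a `String` standing in for the real method handle.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch; the value type is a String placeholder for a MethodHandle.
final class BasicHandleCache {
    // Assumed shape of the key: return type plus captured state name.
    record BasicKey(Class<?> returnType, String stateName) {}

    // A stateless record acting as the mapping function, so no lambda is needed.
    private record Factory() implements Function<BasicKey, String> {
        @Override
        public String apply(BasicKey key) {
            return key.returnType().getName() + "/" + key.stateName();
        }
    }

    // Created once and reused on every lookup.
    private static final Function<BasicKey, String> FACTORY = new Factory();

    private static final ConcurrentHashMap<BasicKey, String> CACHE = new ConcurrentHashMap<>();

    private BasicHandleCache() {}

    // Variant 1: computeIfAbsent with the shared function instance.
    static String viaComputeIfAbsent(BasicKey key) {
        return CACHE.computeIfAbsent(key, FACTORY);
    }

    // Variant 2: plain get on the hot path, putIfAbsent only on a miss.
    static String viaGetThenPutIfAbsent(BasicKey key) {
        String value = CACHE.get(key);
        if (value != null) {
            return value;
        }
        String created = FACTORY.apply(key);
        String raced = CACHE.putIfAbsent(key, created);
        return raced != null ? raced : created;
    }
}
```

Note that the second variant may run the factory more than once under contention and discard the loser, which is only acceptable if creating the value has no side effects.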

src/java.base/share/classes/jdk/internal/foreign/CarrierLocalArenaPools.java line 123:

> 121:          * Thread safe implementation.
> 122:          */
> 123:         public static final class OfCarrier

A public member class in a private nested class... is just weird.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23765#discussion_r1970905900
PR Review Comment: https://git.openjdk.org/jdk/pull/23765#discussion_r1970909347
PR Review Comment: https://git.openjdk.org/jdk/pull/23765#discussion_r1970911090
PR Review Comment: https://git.openjdk.org/jdk/pull/23765#discussion_r1970912512

