RFR: 8347408: Create an internal method handle adapter for system calls with errno

Fri Feb 7 15:30:39 UTC 2025

Converting older JDK code to the relatively new FFM API means that every call site of a system call reporting `errno` or the like must explicitly allocate a `MemorySegment` to capture the potential error state. If not designed carefully, this can hurt performance, and it also introduces unnecessary code complexity.

Hence, this PR proposes to add a JDK internal method handle adapter that can be used to handle system calls with `errno`, `GetLastError`, and `WSAGetLastError`.

It relies on an efficient carrier-thread-local cache of memory regions to elide allocations.
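To illustrate the general idea of eliding allocations with a per-thread cache, here is a minimal sketch. It is not the PR's implementation: it uses a plain `ThreadLocal` rather than a carrier-thread-local, it assumes JDK 22 or later, and the class and method names (`ThreadLocalSegmentSketch`, `acquire`) are hypothetical. It also does not address sharing among virtual threads mounted on the same carrier.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical sketch: one reusable 4-byte scratch segment per thread.
final class ThreadLocalSegmentSketch {

    // Allocated lazily on first use by each thread, then reused on every call.
    // Arena.ofAuto() leaves deallocation to the garbage collector.
    private static final ThreadLocal<MemorySegment> CACHE =
            ThreadLocal.withInitial(() -> Arena.ofAuto().allocate(ValueLayout.JAVA_INT));

    // Returns the calling thread's scratch segment, zeroed so that no stale
    // error state from a previous call is visible.
    static MemorySegment acquire() {
        MemorySegment segment = CACHE.get();
        segment.set(ValueLayout.JAVA_INT, 0, 0);
        return segment;
    }
}
```

After the first call on a given thread, `acquire()` performs no allocation, which is the effect the cache in this PR aims for on the happy path.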

Here are some benchmarks, run on virtual threads (`OfVirtual`) and on a platform thread respectively:

Benchmark                                                  Mode  Cnt   Score   Error  Units
CaptureStateUtilBench.OfVirtual.adaptedSysCallFail         avgt   30  24.193 ± 0.268  ns/op
CaptureStateUtilBench.OfVirtual.adaptedSysCallSuccess      avgt   30   8.268 ± 0.080  ns/op
CaptureStateUtilBench.OfVirtual.explicitAllocationFail     avgt   30  42.076 ± 1.003  ns/op
CaptureStateUtilBench.OfVirtual.explicitAllocationSuccess  avgt   30  21.801 ± 0.138  ns/op
CaptureStateUtilBench.OfVirtual.tlAllocationFail           avgt   30  23.265 ± 0.087  ns/op
CaptureStateUtilBench.OfVirtual.tlAllocationSuccess        avgt   30   8.285 ± 0.155  ns/op

CaptureStateUtilBench.adaptedSysCallFail                   avgt   30  23.033 ± 0.423  ns/op
CaptureStateUtilBench.adaptedSysCallSuccess                avgt   30   3.676 ± 0.104  ns/op  // <- Happy path using an internal pool

CaptureStateUtilBench.explicitAllocationFail               avgt   30  42.023 ± 0.736  ns/op
CaptureStateUtilBench.explicitAllocationSuccess            avgt   30  22.013 ± 0.648  ns/op  // <- Allocating memory upon each invocation

CaptureStateUtilBench.tlAllocationFail                     avgt   30  22.050 ± 0.233  ns/op
CaptureStateUtilBench.tlAllocationSuccess                  avgt   30   3.756 ± 0.056  ns/op  // <- Using the pool explicitly from Java code

Adapted system call:

        return (int) ADAPTED_HANDLE.invoke(0, 0); // Uses a MH-internal pool

Explicit allocation:

        try (var arena = Arena.ofConfined()) {
            return (int) HANDLE.invoke(arena.allocate(4), 0, 0);
        }

Thread-local allocation:

        try (var arena = POOLS.take()) {
            return (int) HANDLE.invoke(arena.allocate(4), 0, 0); // Uses a manually specified pool
        }
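For readers less familiar with how the FFM API captures `errno`, the explicit-allocation variant can be sketched end to end as follows. This is a minimal sketch, assuming JDK 22 or later and a POSIX libc; the class name `ErrnoCaptureSketch` and the choice of `close(-1)` as the failing call are illustrative assumptions and are not taken from the benchmark.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical sketch: capture errno from a failing POSIX close(2) call.
final class ErrnoCaptureSketch {

    private static final Linker LINKER = Linker.nativeLinker();

    // Layout of the capture-state struct and an accessor for its "errno" field.
    private static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();
    private static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    // int close(int fd); captureCallState adds a leading MemorySegment parameter.
    private static final MethodHandle CLOSE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("close").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
            Linker.Option.captureCallState("errno"));

    // Closes an invalid descriptor and returns the captured errno (0 on success).
    static int closeInvalidFd() {
        // A fresh arena and segment per call: the cost the adapter avoids.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment state = arena.allocate(CAPTURE_LAYOUT);
            int rc = (int) CLOSE.invokeExact(state, -1);
            return rc == -1 ? (int) ERRNO.get(state, 0L) : 0;
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The per-call `Arena.ofConfined()` and `allocate` in `closeInvalidFd()` correspond to the "explicit allocation" rows in the benchmark table; the adapter proposed here is meant to hide that step behind the method handle.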

The adapted system call exhibits a ~6x performance improvement over the existing "explicit allocation" scheme for the happy path on platform threads. Because the pool must be shared across threads on virtual-thread-capable carrier threads, the virtual-thread case is a bit slower ("only" ~2.5x faster).

Tested and passed tiers 1-3.

-------------

Commit messages:
 - Bump copyright year
 - Add benchmarks
 - Add method handle adapter for system calls

Changes: https://git.openjdk.org/jdk/pull/23517/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23517&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8347408
  Stats: 1381 lines in 11 files changed: 1370 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/23517.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23517/head:pull/23517

PR: https://git.openjdk.org/jdk/pull/23517