RFR: 8254693: Add Panama feature to pass heap segments to native code

Jorn Vernee jvernee at openjdk.org
Wed Oct 18 04:21:33 UTC 2023


Add the ability to pass heap segments to native code. This requires using `Linker.Option.critical(true)` as a linker option. It has the same limitations as normal critical calls, namely: upcalls into Java are not allowed, and the native function should return relatively quickly. Heap segments are exposed to native code through temporary native addresses that are valid for the duration of the native call.
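
As a rough illustration of the API described above, the following sketch passes a Java `byte[]` straight to the C `strlen` function. It assumes a JDK where `Linker.Option.critical(boolean)` is available without preview flags (JDK 22 or later), a 64-bit platform where `size_t` maps to `JAVA_LONG`, and that `strlen` is visible through the default lookup. The class and method names are made up for this example and are not part of the patch.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of the patch.
final class HeapSegmentCall {
    private static final Linker LINKER = Linker.nativeLinker();

    // size_t strlen(const char *s), linked as a critical call that accepts heap segments.
    // Assumption: size_t is 64 bits wide on this platform.
    private static final MethodHandle STRLEN = LINKER.downcallHandle(
            LINKER.defaultLookup().find("strlen").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS),
            Linker.Option.critical(true));

    // Passes the array to native code without copying it to off-heap memory first.
    static long strlenOnHeap(byte[] nulTerminated) {
        MemorySegment heap = MemorySegment.ofArray(nulTerminated);
        try {
            return (long) STRLEN.invokeExact(heap);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        byte[] text = "hello\0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(strlenOnHeap(text));
    }
}
```

`strlen` fits the restrictions of a critical call: it makes no upcalls and returns quickly for short inputs.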

The motivation for this is supporting existing Java array-based APIs that might have to pass multi-megabyte arrays to native code and that currently rely on Get-/ReleasePrimitiveArrayCritical from JNI, where making a copy of the array would be prohibitively expensive.

Components of this patch:

- New binding operator `SegmentBase`, which gets the base object of a `MemorySegment`.
- Rename `UnboxAddress` to `SegmentOffset`. Add flag to specify whether processing heap segments should be allowed.
- `CallArranger` impls use new binding operators when `Linker.Option.critical(/* allowHeap= */ true)` is specified.
- `NativeMethodHandle`/`NativeEntryPoint` allow `Object` in their signatures.
- The object/oop + offset is exposed as a temporary address to native code.
- Since we stay in the `_thread_in_Java` state, we can safely expose the oops passed to the downcall stub to native code, without needing GCLocker. These oops are valid until we poll for a safepoint, which we never do (we are invoking pure native code).
- Only x64 and AArch64 for now.
- I've refactored `ArgumentShuffle` in the C++ code to no longer rely on callbacks to get the set of source and destination registers (using `CallingConventionClosure`), but instead just rely on 2 equal size arrays with source and destination registers. This allows filtering the input java registers before passing them to `ArgumentShuffle`, which is required to filter out registers holding segment offsets. Replacing placeholder registers is also done as a separate pre-processing step now. See changes in: https://github.com/openjdk/jdk/pull/16201/commits/d2b40f1117d63cc6d74e377bf88cdcf6d15ff866
- I've factored out `DowncallStubGenerator` in the x64 and AArch64 code to use a common `DowncallLinker::StubGenerator`.
- The fallback linker is also supported, using JNI's `GetPrimitiveArrayCritical`/`ReleasePrimitiveArrayCritical`.
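
From the user's side, the base object plus offset split mentioned in the list above is visible when a slice of a heap segment is passed to native code: the slice shares the array as its base and only differs in offset. The sketch below shows that Java-visible behaviour only and says nothing about how `SegmentBase` or `SegmentOffset` are implemented. It assumes JDK 22 or later and a 64-bit `size_t`; the class and method names are hypothetical.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical example class; not part of the patch.
final class HeapSliceCall {
    private static final Linker LINKER = Linker.nativeLinker();

    // size_t strlen(const char *s); assumption: size_t is 64 bits wide.
    private static final MethodHandle STRLEN = LINKER.downcallHandle(
            LINKER.defaultLookup().find("strlen").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS),
            Linker.Option.critical(true));

    // True if a slice of the array segment still reports the array as its base object.
    static boolean sliceSharesBase(byte[] data, long offset) {
        MemorySegment slice = MemorySegment.ofArray(data).asSlice(offset);
        return slice.heapBase().orElseThrow() == data;
    }

    // Calls strlen on the part of the array that starts at the given byte offset.
    static long strlenFrom(byte[] data, long offset) {
        MemorySegment slice = MemorySegment.ofArray(data).asSlice(offset);
        try {
            return (long) STRLEN.invokeExact(slice);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For example, `HeapSliceCall.strlenFrom(bytes, 2)` on the NUL-terminated ASCII bytes of `xxhello` measures only the `hello` part, because native code sees an address that already includes the offset.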

Aside: fixed existing issue with `DowncallLinker` not properly acquiring segments in interpreted mode.

Numbers for the included benchmark on my machine are:


Benchmark                     (size)  Mode  Cnt        Score       Error  Units
CriticalCalls.callNotPinned      100  avgt   30      123.060 ±     5.674  ns/op
CriticalCalls.callNotPinned    10000  avgt   30     3136.032 ±    46.175  ns/op
CriticalCalls.callNotPinned  1000000  avgt   30  1190692.161 ± 36254.502  ns/op
CriticalCalls.callPinned         100  avgt   30       30.722 ±     0.298  ns/op
CriticalCalls.callPinned       10000  avgt   30     2233.453 ±    23.568  ns/op
CriticalCalls.callPinned     1000000  avgt   30   220870.350 ±  1576.958  ns/op
CriticalCalls.callRecycled       100  avgt   30       38.753 ±     0.269  ns/op
CriticalCalls.callRecycled     10000  avgt   30     2683.381 ±    56.335  ns/op
CriticalCalls.callRecycled   1000000  avgt   30   314389.106 ±  5275.236  ns/op


The important comparison is between `callNotPinned`, which allocates a native segment and copies the heap segment into it, and `callPinned`, which is zero-allocation and zero-copy: at size 1000000, `callPinned` is roughly 5x faster. While the allocation can sometimes be avoided (`callRecycled`), sometimes the API's structure prevents allocations from being amortized.
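
The three strategies compared above can be sketched roughly as follows. This is not the benchmark's code: it uses `strlen` as a stand-in native function, assumes JDK 22 or later and a 64-bit `size_t`, and all class, method and field names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical example class; not the benchmark from the patch.
final class CopyVersusPin {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final MemorySegment STRLEN_ADDR =
            LINKER.defaultLookup().find("strlen").orElseThrow();
    // Assumption: size_t is 64 bits wide.
    private static final FunctionDescriptor STRLEN_DESC =
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS);

    // Regular downcall: only native segments may be passed.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRLEN_ADDR, STRLEN_DESC);
    // Critical downcall that also accepts heap segments.
    private static final MethodHandle HEAP =
            LINKER.downcallHandle(STRLEN_ADDR, STRLEN_DESC, Linker.Option.critical(true));

    // Reusable off-heap buffer for the "recycled" variant (shared, so not thread-safe).
    private static final MemorySegment BUFFER = Arena.global().allocate(1024);

    // Allocate a fresh native segment and copy the array into it on every call.
    static long notPinned(byte[] data) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment copy = arena.allocate(data.length);
            copy.copyFrom(MemorySegment.ofArray(data));
            return (long) PLAIN.invokeExact(copy);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Copy into a pre-allocated buffer: no allocation, but still one copy.
    // Requires data.length <= 1024 in this sketch.
    static long recycled(byte[] data) {
        MemorySegment.copy(MemorySegment.ofArray(data), 0, BUFFER, 0, data.length);
        try {
            return (long) PLAIN.invokeExact(BUFFER);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Pass the heap segment directly: no allocation and no copy.
    static long pinned(byte[] data) {
        MemorySegment heap = MemorySegment.ofArray(data);
        try {
            return (long) HEAP.invokeExact(heap);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

All three variants return the same result; they differ only in how the array contents reach native code. The recycled variant depends on the caller being able to keep a buffer around between calls.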

Testing: `jdk_foreign`

-------------

Commit messages:
 - eyeball more fixes
 - ref other platforms + add back shuffle reg
 - fix failing x86 test
 - fix arm stubs
 - fix x86_32 stubs
 - Share DowncallStubGenerator impl between x64 and aarch64
 - remove GCLocker calls
 - fix zero compilation + disable stress test on non-debug because of missing CheckUnhandledOops flag
 - fix zero for real
 - fix zero + clang build
 - ... and 18 more: https://git.openjdk.org/jdk/compare/1d54e73f...90fdbec0

Changes: https://git.openjdk.org/jdk/pull/16201/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16201&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8254693
  Stats: 1969 lines in 60 files changed: 1169 ins; 545 del; 255 mod
  Patch: https://git.openjdk.org/jdk/pull/16201.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16201/head:pull/16201

PR: https://git.openjdk.org/jdk/pull/16201

