RFR: 8254693: Add Panama feature to pass heap segments to native code
Jorn Vernee
jvernee at openjdk.org
Wed Oct 18 04:21:33 UTC 2023
Add the ability to pass heap segments to native code. This requires using `Linker.Option.critical(true)` as a linker option. It has the same limitations as normal critical calls, namely: upcalls into Java are not allowed, and the native function should return relatively quickly. Heap segments are exposed to native code through temporary native addresses that are valid for the duration of the native call.
The motivation is to support existing Java array-based APIs that may have to pass multi-megabyte arrays to native code and currently rely on JNI's Get-/ReleasePrimitiveArrayCritical, because making a copy of the array would be prohibitively expensive.
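As a minimal sketch of what this enables on the Java side, the example below passes a `byte[]`-backed heap segment directly to C `strlen`. It assumes a JDK in which `Linker.Option.critical(boolean)` is available and a 64-bit platform where `size_t` maps to `JAVA_LONG`; the class and method names (`HeapSegmentCall`, `strlenOnHeap`) are made up for illustration and are not part of the patch.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class HeapSegmentCall {
    // Downcall handle for C strlen. critical(true) marks the call as critical
    // and additionally allows heap segments to be passed as arguments.
    private static final MethodHandle STRLEN;

    static {
        Linker linker = Linker.nativeLinker();
        STRLEN = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                // Assumption: size_t is 64 bits wide on this platform.
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS),
                Linker.Option.critical(true));
    }

    // Hypothetical helper: the array must contain a terminating 0 byte.
    static long strlenOnHeap(byte[] cString) {
        // No native allocation and no copy: the segment is backed by the array.
        MemorySegment heap = MemorySegment.ofArray(cString);
        try {
            return (long) STRLEN.invokeExact(heap);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        byte[] text = {'h', 'i', '!', 0};
        System.out.println(strlenOnHeap(text));
    }
}
```

`strlen` fits the restrictions of a critical call: it returns quickly and never calls back into Java.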
Components of this patch:
- New binding operator `SegmentBase`, which gets the base object of a `MemorySegment`.
- Rename `UnboxAddress` to `SegmentOffset`. Add flag to specify whether processing heap segments should be allowed.
- `CallArranger` impls use new binding operators when `Linker.Option.critical(/* allowHeap= */ true)` is specified.
- `NativeMethodHandle`/`NativeEntryPoint` allow `Object` in their signatures.
- The object/oop + offset is exposed as a temporary address to native code.
- Since we stay in the `_thread_in_Java` state, we can safely expose the oops passed to the downcall stub to native code, without needing GCLocker. These oops are valid until we poll for a safepoint, which we never do (we are invoking pure native code).
- Only x64 and AArch64 for now.
- I've refactored `ArgumentShuffle` in the C++ code to no longer rely on callbacks to get the set of source and destination registers (using `CallingConventionClosure`), but instead to rely on two equal-size arrays of source and destination registers. This allows filtering the input Java registers before passing them to `ArgumentShuffle`, which is required to filter out registers holding segment offsets. Replacing placeholder registers is also done as a separate pre-processing step now. See changes in: https://github.com/openjdk/jdk/pull/16201/commits/d2b40f1117d63cc6d74e377bf88cdcf6d15ff866
- I've factored out `DowncallStubGenerator` in the x64 and AArch64 code to use a common `DowncallLinker::StubGenerator`.
- The fallback linker is also supported, using JNI's `GetPrimitiveArrayCritical`/`ReleasePrimitiveArrayCritical`.
Aside: fixed existing issue with `DowncallLinker` not properly acquiring segments in interpreted mode.
Numbers for the included benchmark on my machine are:
Benchmark                     (size)  Mode  Cnt        Score       Error  Units
CriticalCalls.callNotPinned      100  avgt   30      123.060 ±     5.674  ns/op
CriticalCalls.callNotPinned    10000  avgt   30     3136.032 ±    46.175  ns/op
CriticalCalls.callNotPinned  1000000  avgt   30  1190692.161 ± 36254.502  ns/op
CriticalCalls.callPinned         100  avgt   30       30.722 ±     0.298  ns/op
CriticalCalls.callPinned       10000  avgt   30     2233.453 ±    23.568  ns/op
CriticalCalls.callPinned     1000000  avgt   30   220870.350 ±  1576.958  ns/op
CriticalCalls.callRecycled       100  avgt   30       38.753 ±     0.269  ns/op
CriticalCalls.callRecycled     10000  avgt   30     2683.381 ±    56.335  ns/op
CriticalCalls.callRecycled   1000000  avgt   30   314389.106 ±  5275.236  ns/op
The important comparison is between `callNotPinned`, which allocates a native segment and copies the heap segment into it, and `callPinned`, which is zero-allocation and zero-copy: `callPinned` is about 4x faster at size 100, 1.4x at size 10000, and 5.4x at size 1000000. While the allocation can sometimes be avoided (`callRecycled`), sometimes the structure of the API prevents the allocation from being amortized.
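The three strategies being compared can be sketched roughly as follows. This is not the benchmark source: the native function (`strlen`), the class name `CopyVersusPinned`, and the method bodies are assumptions chosen to make the allocate-and-copy, recycled-buffer, and pinned paths concrete. It assumes a JDK in which `Linker.Option.critical(boolean)` is available and a 64-bit platform.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class CopyVersusPinned {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final MemorySegment STRLEN_ADDR =
            LINKER.defaultLookup().find("strlen").orElseThrow();
    // Assumption: size_t is 64 bits wide on this platform.
    private static final FunctionDescriptor DESC =
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS);

    // Ordinary downcall: only native segments may be passed.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRLEN_ADDR, DESC);
    // Critical downcall that also accepts heap segments.
    private static final MethodHandle PINNED =
            LINKER.downcallHandle(STRLEN_ADDR, DESC, Linker.Option.critical(true));

    // Allocate a native segment per call and copy the array into it.
    static long callNotPinned(byte[] data) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeCopy = arena.allocate(data.length);
            MemorySegment.copy(data, 0, nativeCopy, ValueLayout.JAVA_BYTE, 0, data.length);
            return (long) PLAIN.invokeExact(nativeCopy);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Reuse a caller-provided native buffer: no allocation, but still a copy.
    static long callRecycled(byte[] data, MemorySegment recycled) {
        try {
            MemorySegment.copy(data, 0, recycled, ValueLayout.JAVA_BYTE, 0, data.length);
            return (long) PLAIN.invokeExact(recycled);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Pass the array-backed heap segment directly: no allocation, no copy.
    static long callPinned(byte[] data) {
        try {
            return (long) PINNED.invokeExact(MemorySegment.ofArray(data));
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        byte[] data = {'a', 'b', 'c', 0};
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment buffer = arena.allocate(data.length);
            System.out.println(callNotPinned(data));
            System.out.println(callRecycled(data, buffer));
            System.out.println(callPinned(data));
        }
    }
}
```

The recycled variant only works when the caller can keep a sufficiently large native buffer alive across calls, which is the amortization that some API shapes do not allow.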
Testing: `jdk_foreign`
-------------
Commit messages:
- eyeball more fixes
- ref other platforms + add back shuffle reg
- fix failing x86 test
- fix arm stubs
- fix x86_32 stubs
- Share DowncallStubGenerator impl between x64 and aarch64
- remove GCLocker calls
- fix zero compilation + disable stress test on non-debug because of missing CheckUnhandledOops flag
- fix zero for real
- fix zero + clang build
- ... and 18 more: https://git.openjdk.org/jdk/compare/1d54e73f...90fdbec0
Changes: https://git.openjdk.org/jdk/pull/16201/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16201&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8254693
Stats: 1969 lines in 60 files changed: 1169 ins; 545 del; 255 mod
Patch: https://git.openjdk.org/jdk/pull/16201.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16201/head:pull/16201
PR: https://git.openjdk.org/jdk/pull/16201