Pinning of on-heap MemorySegment

Radosław Smogura rsmogura at icloud.com
Mon Aug 28 16:56:28 UTC 2023


Hi all,

I would like to see pinning; however, I know that there are bigger concerns with it.

Pinning prevents the GC from compacting the heap and increases memory fragmentation.

I think another approach would be for ImageIO to use MemorySegment instead of operating on int arrays.
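
As a rough sketch of what I mean (assuming JDK 22, or JDK 21 with --enable-preview): today the pixels have to be copied from the int[] into a native segment before native code can see them, whereas an API that handed out a native MemorySegment directly would skip that copy. SegmentPixels and its methods are hypothetical names for illustration, not part of ImageIO, and the binarize loop is plain Java standing in for a native/SIMD kernel.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical helper, for illustration only.
final class SegmentPixels {

    // The copy that is needed today: heap int[] (e.g. from
    // BufferedImage::getRGB) into an off-heap segment owned by the arena.
    public static MemorySegment copyToNative(Arena arena, int[] pixels) {
        MemorySegment seg =
                arena.allocate((long) pixels.length * Integer.BYTES, Integer.BYTES);
        MemorySegment.copy(pixels, 0, seg, ValueLayout.JAVA_INT, 0L, pixels.length);
        return seg;
    }

    // Binarize ARGB pixels in place: white if the average of R, G and B
    // is at or above the threshold, black otherwise.
    public static void binarize(MemorySegment seg, int threshold) {
        long count = seg.byteSize() / Integer.BYTES;
        for (long i = 0; i < count; i++) {
            int argb = seg.getAtIndex(ValueLayout.JAVA_INT, i);
            int r = (argb >> 16) & 0xFF;
            int g = (argb >> 8) & 0xFF;
            int b = argb & 0xFF;
            int gray = (r + g + b) / 3;
            seg.setAtIndex(ValueLayout.JAVA_INT, i,
                    gray >= threshold ? 0xFFFFFFFF : 0xFF000000);
        }
    }
}
```

If the image were decoded straight into a segment allocated from an Arena, copyToNative would disappear and the same segment could be handed to a downcall as-is.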

For CUDA-style programming, this can bring additional benefits, such as memory mapping between host and device.

Kind regards,
Rado

> On 28 Aug 2023, at 18:43, Jorn Vernee <jorn.vernee at oracle.com> wrote:
> 
> I'd like to point to the JBS issue for this, JDK-8254693 (https://bugs.openjdk.org/browse/JDK-8254693). I've been working on this lately, but I'm not ready to share a patch yet (I'm also away currently).
> 
> GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical don't allow you to do much in the interim between them:
> 
> > Inside a critical region, native code must not call other JNI functions, or any system call that may cause the current thread to block and wait for another Java thread.
> 
> We also cannot safely return to Java code, because executing Java code while GCLocker is active can result in VM crashes. I've discussed this with GC folks a while back as well, and it seems that generally there is not much leniency around this, and we should stick to this strict contract when using GCLocker.
> 
> I've implemented something similar to what you've done in a POC a while ago [1]. This falls back to doing copies when a GC does not support pinning. But, I'm not very happy with this approach since most GCs don't support pinning, so in most cases you would not really get a benefit from it. I've been looking at a way to support this so that we can use GCLocker as well. I think the only option is adding a linker option that specifies that heap segments should be accessible during a downcall, and then activate GCLocker/pin objects (depending on GC) in the downcall stub just before invoking the target method.
> 
> Jorn
> 
> [1]: https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:Pinning
> 
>> On 28/08/2023 15:04, Yasumasa Suenaga wrote:
>> Hi Maurizio,
>> 
>> Thank you for knowing ffmasm :)
>> 
>> MS::pin/unpin in my proposal have the same semantics as GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical in JNI. They might not be used by many developers, but they help some experienced developers improve their application performance. In JNI, we call GetPrimitiveArrayCritical() to access large arrays without the cost of a memory copy. It makes sense. I just want to do the same operation in FFM.
>> 
>> Let's think about image processing. When we want to binarize a JPEG image, we would load the original image with the ImageIO API. Then we can get the pixels from BufferedImage::getRGB() as an int[]. So we have to copy the pixels into an off-heap MemorySegment if we want to process them in native code (e.g. SIMD, GPU processing). I'm sure most Java APIs and third-party libraries use primitive arrays, not MemorySegment. So I think it is better if we can pin on-heap memory and use it from native code.
>> 
>> Of course I understand that pinning might cause some worse behavior, especially in GC, so it might not be recommended for all Java developers. But I believe pinning would be welcomed by developers who are experienced in native programming (C/C++/assembly/GPGPU and so on), because they want to offload processing that needs large amounts of memory (preprocessed in Java).
>> 
>> 
>> Thanks,
>> 
>> Yasumasa
>> 
>> 
>> On 2023-08-28 18:48, Maurizio Cimadamore wrote:
>>> On 28/08/2023 08:29, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>> 
>>>> I'm very interested in FFM, especially in generating assembly code in Java and calling it with FFM [1].
>>> Hi,
>>> I've seen your work a year or so ago and have been very impressed by it :-)
>>>> 
>>>> I think one of the performance bottlenecks is MemorySegment, because all on-heap regions have to be copied into an off-heap region (native segment) when they are referenced from a foreign function. So I'm hoping to implement some sort of pinning operation for on-heap MemorySegment, like in JNI. I guess it is mentioned in last month's FFM update [2]; however, "Pinning of heap segments" does not have any links, so I guess nobody is working on this yet. Do you have any updates on pinning?
>>>> 
>>>> I've experimented with both OpenJDK 22 with pinning support [3] and ffmasm (a hand-assembler for Java powered by FFM) [4]. I added pin/unpin methods to Unsafe, and they are called by HeapMemorySegmentImpl. In the end I got about a 16x performance gain compared to the non-pinning code [5] on my laptop.
>>>> 
>>>> I guess the FFM API is reaching its goal step by step, but it is still a preview in JDK 21. I hope that pinning will be supported in JDK 22, because I believe we can leverage FFM in more fields! I'm happy to contribute or help implement pinning if needed.
>>> 
>>> While there's no doubt that in some applications and use cases pinning
>>> provides a significant performance boost, there are some challenges:
>>> 
>>> 1. scope of pinning: is pinning allowed on a per-native-call basis? Or
>>> is it something more general?
>>> 2. does the garbage collector support region-based pinning [1]?
>>> 3. why is pinning needed in the first place?
>>> 
>>> (1) and (2) are very much linked. Not all garbage collectors support
>>> fine-grained pinning mechanism. Which means that, in most of them, if
>>> you pin, you effectively block GC for the entire duration of the pin
>>> operation (GC locker mechanism). This is something that, as I'm sure
>>> you understand, is not very desirable. For this reason, it might be
>>> better to consider a pinning API which only pins for the duration of a
>>> native call (e.g. in the shape of an additional linker option). While
>>> FFM could support more complex pinning policies (e.g. pin a segment
>>> inside an Arena, so that segment is unpinned when the arena is
>>> closed), given the uneven support for fine-grained pinning across GCs,
>>> I'm not sure such a general API (which is similar to your "MS::pin"
>>> method) would be a good idea. We're doing some experiments for adding
>>> a new linker option which allows heap segments to be pinned when
>>> calling a downcall method handle. We're not yet sure of its
>>> inclusion, but it would be something worth publishing somewhere (when
>>> ready) so that developers (like you) can play with it and provide
>>> feedback.
>>> 
>>> Then there's (3). Most of the time, pinning is used in order to
>>> interact with native calls from public-facing APIs that are "stuck"
>>> using array syntax. That is, in order to be user friendly, such APIs
>>> work with arrays - but then a problem arises when trying to use the
>>> contents of the array off-heap. But what if the memory was off-heap to
>>> begin with? Then no memory copy would be required. I believe the
>>> biggest impediment for off-heap memory being used directly has to do
>>> with the fact that, for users, interacting with an `int[]` is
>>> significantly easier than interacting with a `MemorySegment`, or a
>>> `ByteBuffer`. But what if we could provide some mechanism to create an
>>> "array view" over an off-heap memory region? Now clients would be able
>>> to use the beloved `[]` syntax, even if memory access remained
>>> off-heap.
>>> 
>>> While we don't have any concrete proposal on this latter point, we do
>>> believe that the topic of making memory segments (or byte buffers)
>>> easier to access for "legacy clients" is inextricably linked to the
>>> topic of pinning of heap memory.
>>> 
>>> [1] - https://openjdk.org/jeps/423
>>> 
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Yasumasa
>>>> 
>>>> 
>>>> [1] https://github.com/YaSuenag/ffmasm
>>>> [2] https://mail.openjdk.org/pipermail/panama-dev/2023-July/019510.html
>>>> [3] https://github.com/YaSuenag/jdk/commit/f0a9b3705b3ecdf3dbb6b80cac9d53456f08f967
>>>> [4] https://github.com/YaSuenag/ffmasm/commit/925608538b936db1b311ae84e12fa0252058b7f4
>>>> [5] https://github.com/YaSuenag/ffmasm/blob/ffm-pinning/benchmarks/vectorapi/src/main/java/com/yasuenag/ffmasm/benchmark/vectorapi/VectorOpComparison.java