Unsafe vs MemorySegments / Bounds checking...

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Oct 30 18:00:49 UTC 2024


I think this discussion is spiralling out of control (which is ok :-) ).

First, I believe the inlining traces you are pointing to include 
transient failures from C1. I recommend using -XX:+PrintInlining only 
together with -XX:-TieredCompilation, to see if that makes the output a 
little better. Sometimes I also throw in -Xbatch, which makes the output 
a bit less chaotic to look at (but that can affect performance).

Stepping back, the problem at hand is that there might be some 
simplifications we could do to make the call stack of memory access var 
handles simpler. Provided we can back up some of these intuitions with 
hard evidence, and that we can simplify the calls _without losing 
functionality_, sure, that all sounds great. (Although I don't think 
there's much low-hanging fruit in there -- but I would be glad to be 
proven wrong of course!).

What we will **not** do is split the FFM API in two parts - safe and 
unsafe. This is not just about var handles - that's only a piece of the 
story. There's bulk copy, and there's also Linker access. Once there's a 
blessed way to remove checks in one place, one can ask for ways to 
disable checks everywhere else.

IMHO the most maintainable solution for your code is to use a segment 
whose bound is Long.MAX_VALUE. That won't be _exactly_ like Unsafe 
(because of the sign check), but it will be pretty close -- and at least 
you won't have to maintain long chains of adapted var handles and cross 
your fingers that everything optimizes correctly everywhere (which I 
suspect will make inlining of the methods in your library a bit more 
predictable).
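
As a rough illustration of the "segment whose bound is Long.MAX_VALUE" 
idea, here is a minimal sketch. The class and method names 
(EverythingSegment, getLong, putLong, roundTrip) are made up for this 
example and are not part of any API; only the MemorySegment, Arena and 
ValueLayout calls are real. It assumes JDK 22 or later, and since 
MemorySegment::reinterpret is a restricted method, the JVM may print a 
warning unless native access is enabled for the calling module.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class EverythingSegment {
    // Zero-based native segment whose size is Long.MAX_VALUE, so that a raw
    // address can be used directly as an offset. reinterpret is restricted.
    static final MemorySegment ALL = MemorySegment.NULL.reinterpret(Long.MAX_VALUE);

    // Reads a long at a raw address; unaligned layout, as with Unsafe-style access.
    static long getLong(long address) {
        return ALL.get(ValueLayout.JAVA_LONG_UNALIGNED, address);
    }

    // Writes a long at a raw address.
    static void putLong(long address, long value) {
        ALL.set(ValueLayout.JAVA_LONG_UNALIGNED, address, value);
    }

    // Writes through the all-encompassing segment, reads back through the
    // ordinary (bounded) segment that owns the memory.
    static long roundTrip(long value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment block = arena.allocate(ValueLayout.JAVA_LONG);
            putLong(block.address(), value);
            return block.get(ValueLayout.JAVA_LONG, 0);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

Accesses through ALL are still checked against the segment bounds, so a 
negative address is rejected (the sign check mentioned above), but 
nothing protects against touching freed or unmapped memory.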

But, from where I look, it seems to me that where we have landed is not 
that bad? There's a *safe* API to access off-heap memory that in most 
cases gives you the same performance as Unsafe. In some pathological 
cases, you can give up some safety (with MemorySegment::reinterpret and 
MAX_VALUE) to get as close to Unsafe as possible. We realize that, at 
the time being, even that might not be enough. Over time, it is possible 
that the implementation of the FFM API will be improved, or that Hotspot 
inlining will get better (or both!). If past history is any guide, 
these things have a tendency to get better over time. Fixing this by 
adding *permanent* API warts is not the right way to approach this.
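
For contrast, here is a small sketch of what the *safe* side of the API 
looks like in practice. The class and method names (SafeAccess, 
outOfBoundsRejected, useAfterCloseRejected) are hypothetical and exist 
only for this illustration; the sketch says nothing about performance, 
it only shows the checks that an ordinary arena-allocated segment keeps.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SafeAccess {
    // Returns true if an access one element past the end is rejected.
    static boolean outOfBoundsRejected() {
        try (Arena arena = Arena.ofConfined()) {
            // Room for four ints, aligned for int access.
            MemorySegment seg = arena.allocate(
                    4 * ValueLayout.JAVA_INT.byteSize(),
                    ValueLayout.JAVA_INT.byteAlignment());
            seg.setAtIndex(ValueLayout.JAVA_INT, 3, 7); // last valid index
            try {
                seg.getAtIndex(ValueLayout.JAVA_INT, 4); // one past the end
                return false;
            } catch (IndexOutOfBoundsException e) {
                return true;
            }
        }
    }

    // Returns true if an access after the arena is closed is rejected.
    static boolean useAfterCloseRejected() {
        MemorySegment seg;
        try (Arena arena = Arena.ofConfined()) {
            seg = arena.allocate(ValueLayout.JAVA_INT);
        }
        try {
            seg.get(ValueLayout.JAVA_INT, 0); // backing memory already freed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(outOfBoundsRejected());
        System.out.println(useAfterCloseRejected());
    }
}
```

A segment obtained through reinterpret with a MAX_VALUE size gives up 
both of these protections in any meaningful sense, which is the 
trade-off described above.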

Maurizio


On 30/10/2024 17:30, Brian S O'Neill wrote:
> Even when all the inlining transformations work, it's still expensive 
> because the compiler has to do a bunch of extra work. And when the 
> calling method needs to get recompiled (as it often does), all the 
> inlining transforms need to be done all over again.
>
> In the end, the code (hopefully) gets transformed down into the 
> original Unsafe method calls that would have been written instead. A 
> set of more direct VarHandles might help, which could perform the 
> basic low-level accesses and copies that the Unsafe class provides.
>
> The current trick of transforming the VarHandle provided by the 
> ValueLayout class is a hack. It only (mostly) works for reasons that 
> are known to the FFM implementers. When I try other tricks with 
> VarHandles and MemorySegments, it either doesn't work as well or the 
> performance is substantially worse. The current hack is also extremely 
> fragile, as evidenced by the JDK 23 regression.
>
>
> On 2024-10-30 10:07 AM, Johannes Lichtenberger wrote:
>> Also, the discussions are always very Hotspot specific. In 
>> general I've had better overall performance with the GraalVM JIT for 
>> my stuff, but it may lag behind in optimizations for the Foreign 
>> Function and Memory API and maybe the Vector API stuff. I'm not 
>> sure if that's still true, though.
>>
>>
>> Quân Anh Mai <anhmdq at gmail.com> wrote on 
>> Wed, 30 Oct 2024, 17:53:
>>
>>     I agree that the implementations of the MemorySegment accessors are
>>     overly complicated for such a performance-sensitive component. It
>>     seems we can and should simplify them massively.
>>
>

