Memory Segment efficient array handling
    Maurizio Cimadamore 
    maurizio.cimadamore at oracle.com
       
    Thu Apr  1 10:50:17 UTC 2021
    
    
  
Hi,
let me start by saying that I agree that, functionality- and 
usability-wise, it would be nice to have something like this (perhaps in 
MemoryAccess).
On the efficiency side of things, I'm not sure I would actually 
implement anything differently from what you have done here. You 
are creating segments, then slicing them, then using copyFrom to do a 
bulk copy. That's using the API as intended.
I would be surprised if a direct call to Unsafe (which would have to 
solve all the addressing issues that MemorySegment.ofArray already does) 
would be much faster than that. After all, all this Java code is 
optimized by the JIT compiler, which most of the time can see through 
things like this (e.g. the creation of intermediate objects).
As usual, I verify this stuff using JMH - here's a link to a benchmark I 
tweaked to add the case you are describing:
https://github.com/mcimadamore/panama-foreign/blob/intArray-bulk-bench/test/micro/org/openjdk/bench/jdk/incubator/foreign/BulkOps.java#L151
And here are the results:
```
Benchmark                    Mode  Cnt     Score     Error  Units
BulkOps.getIntArraySegments  avgt   30  7363.914 ± 154.629  ns/op
BulkOps.getIntArrayUnsafe    avgt   30  7316.283 ± 178.241  ns/op
```
As you can see there's little to tell the two apart, so IMHO, diving 
into a direct unsafe call to implement this is just not worth it (note 
that you'd have to reimplement bounds checks, liveness checks, etc. - 
which this benchmark doesn't even do!).
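As a rough sketch of what that reimplementation involves, the snippet 
below copies part of an int[] into a byte-addressed destination after 
explicit range checks. Everything named here is an assumption made for 
illustration: the class CheckedIntCopy and its method are hypothetical, a 
byte[] stands in for the segment's memory, and the final bulk copy goes 
through ByteBuffer where a hand-written version would call Unsafe. 
Liveness checks are left out, since a plain array has no counterpart for 
them.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Objects;

// Hypothetical illustration; not part of the JDK or of the benchmark.
final class CheckedIntCopy {

    // Copies numInts ints from src, starting at srcIndex, into dst at
    // byte offset dstOffsetBytes, using the platform's native byte order.
    public static void putIntArray(int[] src, int srcIndex, int numInts,
                                   byte[] dst, int dstOffsetBytes) {
        // Source range check, expressed in elements.
        Objects.checkFromIndexSize(srcIndex, numInts, src.length);
        // Byte length with overflow detection, instead of a bare "<< 2".
        int numBytes = Math.multiplyExact(numInts, Integer.BYTES);
        // Destination range check, expressed in bytes.
        Objects.checkFromIndexSize(dstOffsetBytes, numBytes, dst.length);
        // Bulk copy through an int view of the destination range.
        ByteBuffer.wrap(dst, dstOffsetBytes, numBytes)
                  .order(ByteOrder.nativeOrder())
                  .asIntBuffer()
                  .put(src, srcIndex, numInts);
    }
}
```

All of these checks are already performed by asSlice and copyFrom in the 
segment version.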
Note also that this benchmark uses a copy size of 100_000 elements, 
which you might say is a lot. I also tried with 1_000, and in that case 
the segment version seems 2x faster than the unsafe one - go figure. I 
would take these numbers with a pinch of salt; the high-level takeaway 
is that doing things at the unsafe level is NOT faster.
So, to conclude, usability-wise, these kinds of bulk access operations 
might be in the spirit of what MemoryAccess already does - e.g. provide 
static accessors which implement commonly used features - but that 
doesn't mean that these methods can be written any more efficiently 
inside the JDK.
Maurizio
On 01/04/2021 01:41, leerho wrote:
> Folks,
>
> I am in the process of refactoring our code to use FMA (from JDK16) in our
> application.  But what I find missing is the ability to do efficient
> getting and putting of arrays with MemorySegments.
>
> What I need to do often is to place part of an existing array into a
> segment at a specific offset and the reverse; getting an array of elements
> from a segment at a specific offset and placing it into an existing array
> at a specific offset.  I have looked closely at MemoryAccess and the latest
> ByteBuffer, but I have not found anything quite as flexible as the
> following.
>
> What I have ended up doing is creating an entire class of array methods
> like the following:
>
> public class MemoryArrays {
>    public static void putIntArray(int[] srcArr, long srcIndex, long numInts,
>        MemorySegment dstSeg, long dstOffsetBytes) {
>      MemorySegment srcSeg = MemorySegment.ofArray(srcArr);
>      MemorySegment srcSegSlice = srcSeg.asSlice(srcIndex << 2, numInts << 2);
>      MemorySegment dstSegSlice = dstSeg.asSlice(dstOffsetBytes, numInts << 2);
>      dstSegSlice.copyFrom(srcSegSlice);
>    }
>
>    /* ...Same as above for all primitive types... */
>
>    public static void getIntArray(MemorySegment srcSeg, long srcOffsetBytes,
>        int[] dstArr, long dstIndex, long numInts) {
>      MemorySegment srcSegSlice = srcSeg.asSlice(srcOffsetBytes, numInts << 2);
>      MemorySegment dstSeg = MemorySegment.ofArray(dstArr);
>      MemorySegment dstSegSlice = dstSeg.asSlice(dstIndex << 2, numInts << 2);
>      dstSegSlice.copyFrom(srcSegSlice);
>    }
>
>    /* ...Same as above for all primitive types... */
> }
>
> I would think that if these methods were built-in to FMA either as a
> separate class or included in MemoryAccess, it would be so much more
> efficient.  All of these separate
> calls to MemorySegment could be eliminated and the entire method inlined
> with a few lines of C++ code.
>
> If Panama is interested I would be happy to contribute such a class.
>
> Lee.
    
    