Memory Segment efficient array handling

Thu Apr 1 13:44:35 UTC 2021

bigger sizes, bulk copy wins (which is why System.arraycopy doesn't - 
always - do a loop).
>
>
> To be clear, I wasn't saying that looping was better. I was saying 
> that the amount of code you need to go through to get to the point 
> where you could do a bulk copy in the original code snippets 
> *probably* isn't efficient because of the wrapping and slicing among 
> other things (e.g. internal checks).
>
>
> Regarding the actual GC cost of wrapping and slicing and immutable 
> types in general: there is zero indication as an end programmer as to 
> whether or not the JVM is going to apply optimizations AFAIK. Netbeans 
> isn't yelling at me saying "Hey, you should write code this way to 
> take advantage of JVM optimization <X> in version <Y>!". Most of the 
> people who seem to have any clue what optimizations the JVM can and 
> will apply at any given time are the people who actively work on it, 
> and even then you/they throw out vague qualifiers like "most" or say 
> that what optimizations the JVM actually applies depends on a given 
> Java version. Is the expectation that people have to buy and read a 
> metaphorical "Java <X> for dummies" book every new Java release just so 
> they know what the current set of optimizations are and in what 
> situations they do and don't come into play?
>
>
> Maybe it's a dumb question but: why should anyone depend on the 
> **possibility** that the JVM **might** optimize objects that are 
> allocated and then discarded right away?

Well, why should anyone second-guess the VM and prematurely optimize 
code that doesn't need optimizing? This happens a lot: I see people 
writing very low-level code in the hope that it will perform better, 
when the higher-level code performs just as well.

The benchmark I shared shows (with hard evidence, as opposed to 
speculation) that there's no difference between a straight Unsafe call 
and doing the slicing + copyFrom.

If evidence emerges that points to the contrary, we will deal with that, 
of course - but that just doesn't seem to be the case.

Again: this thread has nothing (or little) to do with performance and 
everything to do with usability. Let's keep the performance/GC side out 
of the equation, please.

As for the usability bit, I've filed this:

https://bugs.openjdk.java.net/browse/JDK-8264594

We're happy to accept contributions in that direction!

Maurizio

>
>
>>
>> Maurizio
>>
>>>
>>>
>>> That said, maybe it wouldn't be a bad idea to add overloads for 
>>> copyFrom that take an index and offset into account for when the 
>>> source is a MemorySegment.
>>>
>>>
>>> On 3/31/21 7:41 PM, leerho wrote:
>>>> Folks,
>>>>
>>>> I am in the process of refactoring our code to use FMA (from JDK16) in our
>>>> application.  But what I find missing is the ability to do efficient
>>>> getting and putting of arrays with MemorySegments.
>>>>
>>>> What I need to do often is to place part of an existing array into a
>>>> segment at a specific offset and the reverse; getting an array of elements
>>>> from a segment at a specific offset and placing it into an existing array
>>>> at a specific offset.  I have looked closely at MemoryAccess and the latest
>>>> ByteBuffer, but I have not found anything quite as flexible as the
>>>> following.
>>>>
>>>> What I have ended up doing is creating an entire class of array methods
>>>> like the following:
>>>>
>>>> import jdk.incubator.foreign.MemorySegment;
>>>>
>>>> public class MemoryArrays {
>>>>>    // Copy numInts ints from srcArr, starting at srcIndex, into dstSeg at
>>>>>    // byte offset dstOffsetBytes ("<< 2" converts an int count to bytes).
>>>>>    public static void putIntArray(int[] srcArr, long srcIndex, long numInts,
>>>>>        MemorySegment dstSeg, long dstOffsetBytes) {
>>>>>      MemorySegment srcSeg = MemorySegment.ofArray(srcArr);
>>>>>      MemorySegment srcSegSlice = srcSeg.asSlice(srcIndex << 2, numInts << 2);
>>>>>      MemorySegment dstSegSlice = dstSeg.asSlice(dstOffsetBytes, numInts << 2);
>>>>>      dstSegSlice.copyFrom(srcSegSlice);
>>>>>    }
>>>>>
>>>>>    /* ...Same as above for all primitive types... */
>>>>>
>>>>>    // Copy numInts ints from srcSeg at byte offset srcOffsetBytes into
>>>>>    // dstArr, starting at dstIndex.
>>>>>    public static void getIntArray(MemorySegment srcSeg, long srcOffsetBytes,
>>>>>        int[] dstArr, long dstIndex, long numInts) {
>>>>>      MemorySegment srcSegSlice = srcSeg.asSlice(srcOffsetBytes, numInts << 2);
>>>>>      MemorySegment dstSeg = MemorySegment.ofArray(dstArr);
>>>>>      MemorySegment dstSegSlice = dstSeg.asSlice(dstIndex << 2, numInts << 2);
>>>>>      dstSegSlice.copyFrom(srcSegSlice);
>>>>>    }
>>>>>
>>>>>    /* ...Same as above for all primitive types... */
>>>>> }
>>>>>
>>>> I would think that if these methods were built into FMA, either as a
>>>> separate class or included in MemoryAccess, it would be so much more
>>>> efficient.  All of these separate calls to MemorySegment could be
>>>> eliminated and the entire method inlined with a few lines of C++ code.
>>>>
>>>> If Panama is interested I would be happy to contribute such a class.
>>>>
>>>> Lee.
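
For readers who want to experiment with the index-and-offset style of copy discussed in this thread without depending on the incubating FMA module, here is a minimal sketch of the same put/get semantics on top of java.nio.ByteBuffer. It is not the FMA API and not the overload proposed above: the class name ByteBufferArrays and the helper intView are hypothetical, the parameter names merely mirror the MemoryArrays snippet, and offsets are int rather than long because ByteBuffer is int-indexed. Math.multiplyExact is used in place of "<< 2" so that an overflowing byte count fails loudly.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

/** Hypothetical helper; mirrors the shape of the MemoryArrays methods quoted above. */
final class ByteBufferArrays {
    private ByteBufferArrays() {}

    /** Copies numInts ints from srcArr[srcIndex..] into dst at byte offset dstOffsetBytes. */
    static void putIntArray(int[] srcArr, int srcIndex, int numInts,
                            ByteBuffer dst, int dstOffsetBytes) {
        // IntBuffer.put(int[], int, int) is a bulk transfer, not a per-element loop in user code.
        intView(dst, dstOffsetBytes, numInts).put(srcArr, srcIndex, numInts);
    }

    /** Copies numInts ints from src at byte offset srcOffsetBytes into dstArr[dstIndex..]. */
    static void getIntArray(ByteBuffer src, int srcOffsetBytes,
                            int[] dstArr, int dstIndex, int numInts) {
        intView(src, srcOffsetBytes, numInts).get(dstArr, dstIndex, numInts);
    }

    /** Returns an int view over [offsetBytes, offsetBytes + numInts * 4) of buf. */
    private static IntBuffer intView(ByteBuffer buf, int offsetBytes, int numInts) {
        int lengthBytes = Math.multiplyExact(numInts, Integer.BYTES);
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer state is left untouched.
        ByteBuffer window = buf.duplicate();
        // duplicate() always starts out big-endian, so carry the caller's order over.
        window.order(buf.order());
        window.limit(Math.addExact(offsetBytes, lengthBytes));
        window.position(offsetBytes);
        return window.asIntBuffer();
    }
}
```

As with the slicing + copyFrom version, each call creates a couple of short-lived view objects; whether that matters in practice is something to measure rather than assume.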