[foreign-memaccess+abi] RFR: 8270376: Finalize API for memory copy

Thu Jul 15 09:52:34 UTC 2021

On Tue, 13 Jul 2021 20:49:25 GMT, John R Rose <jrose at openjdk.org> wrote:

>> This patch ties up some loose ends with the MemoryCopy API, and it also prepares the ground for some other related refactorings in this area.
>> 
>> The meat of this patch is represented by the various changes in MemoryCopy, where all methods were renamed to simply `copy`, and the length parameter (now called `elementCount`) is always moved to the end.
>> 
>> I've also simplified naming of parameters, as I think distinguishing between `index` and `offset` is enough (e.g. an array has an index, a segment has an offset).
>> 
>> You will see that, at the very end of the class, three more copy methods have been added, which deal fully in terms of segments. I have also moved the complex layout-based memory segment copy operation (which does the swap) as a static method in this class, as I believe the static form makes the method more regular and usable.
>> 
>> I've made some changes to our uses of `copyFrom` in the linker, to use the new methods in `MemoryCopy` when the copy operation was performing slicing in the source/target segment, which I think makes the code more readable. Of course these changes are completely optional and could be omitted as well.
>
> I agree with these simplifications.  The previous iteration on naming was useful for hammering out the API, but it won't wear well:  It's too much like keeping the training wheels on the bike forever.
> 
> Also, yes, `MemoryCopy` should be right-sized to be about copies of various sorts, not just those that involve arrays only.  Perhaps there will be byte buffer copy routines added there in the future?   Perhaps, even, stream-like operations or scatter and gathers?  We don't have to decide this today, but having a suite of static methods for easy copying makes a lot of sense.
> 
> There 's a garble in the javadoc: s/see read-only/read-only segment/
> 
> It might be worth pointing out something like this:
> 
>> Note that the first five arguments of the copy methods roughly follow the convention for System.arraycopy.  The optional trailing sixth argument controls byte order, which by default is the current native byte order.
> 
> Reasons:  Everybody that approaches an API like this needs to classify it mentally:  dest or source first?  indexes or lengths?  (etc.)  Saying that it’s like arraycopy removes angst immediately.  Also, it's good to remind people of byte order; bad stuff happens eventually when we close our eyes to byte order effects.
> 
> I agree that there should be a fluent (non-static) API for the very most common cases of copying.  But I disagree slightly with where the line is drawn.  The `copyFrom` operation should take a long offset before the source operand, to place the data at a programmable position in the destination.  Also (as a second suggestion) the `copyFrom` routine should return the updated (incremented) position.  I say this because I believe that a very common pattern of copies is to gather many smaller segments into a larger buffer.  Such a loop looks like this, with and without my suggestions:
> 
> 
> MemorySegment whole = …;
> long pos = 0;
> for (MemorySegment part : parts…) {
>    pos = whole.copyFrom(pos, part);
> }
> for (MemorySegment part : parts…) {
>    whole.copyFrom(pos, part);
>    pos += part.size();  //redundant size query, bug bait
> }
> for (MemorySegment part : parts…) {
>    var partDest = whole.asSlice(pos); //temp obj!!
>    partDest.copyFrom(part);
>    pos += part.size();  //redundant size query, bug bait
> }
> 
> 
> Regarding arrays:  I suggest replacing the byte order argument with a `ValueLayout` argument.  This makes it clearer (IMO) what is the meaning of the argument:  It's clear that it applies to the value as stored inside the MS, and not the array.  Of course, with a moment's thought, it's also obvious that's true about the ByteOrder, because the array should not ever hold anything other than native order items.  (You could also address this by changing order to dstOrder and srcOrder.)  Here's another reason to use a VL instead of a BO:  Because it's more information, you can add more extensions later on.  One that springs to mind:  a boolean value layout could imply that values copied into a Java boolean array could normalize to 0/1 values.  Another:  Vectors might use a mixed byte order convention, where the lanes are swapped one way and the lane order is swapped another way; this can be encoded in a (future) value layout but not in a BO.
> 
> Surrounding this smaller point about VL vs. BO is a larger one:  Any API that has bespoke points for each Java primitive array type is going to look out of date in Valhalla.  The way to prepare for that, IMO, is to create generic API points which use dynamic typing (such as reflection), and then define the bespoke API points as handy sugar.  That way, when we complex arrays or vector arrays, they will just play normally with the copy API.
> 
> Specific suggestions about unified array copies:
> 
> 
> /// MemoryCopy.java
> 
> // these are the primitives:
> void arraycopy(Object srcArray, int srcIndex, MemorySegment dstSegment, long dstOffset, int elementCount, ValueuLayout element);
> void arraycopy(MemorySegment srcSegment, long srcOffset, Object dstArray, int dstIndex, int elementCount, ValueuLayout element);
> 
> // these are sugar:
> void copy(char[] srcArray, int srcIndex, MemorySegment dstSegment, long dstOffset, int elementCount) {
>     arraycopy(srcArray, srcIndex, dstSegment, dstOffset, elementCount, ValueLayouts.CHAR);
> }
> void copy(MemorySegment srcSegment, long srcOffset, char[] dstArray, int dstIndex, int elementCount) {
>     arraycopy(srcSegment, srcOffset, dstArray, dstIndex, elementCount, ValueLayouts.CHAR);
> }
> //Omit these next two?
> void copy(MemorySegment srcSegment, long srcOffset, char[] dstArray, int dstIndex, int elementCount, ByteOrder srcOrder) {
>     arraycopy(srcSegment, srcOffset, dstArray, dstIndex, elementCount, ValueLayouts.CHAR.withOrder(dstOrder));
> }
> void copy(char[] srcArray, int srcIndex, MemorySegment dstSegment, long dstOffset, int elementCount, ByteOrder dstOrder) {
>     arraycopy(srcArray, srcIndex, dstSegment, dstOffset, elementCount, ValueLayouts.CHAR.withOrder(dstOrder));
> }
> 
> 
> And the handling of arrays in MC interacts with the handling of arrays in MemorySegment:
> 
> /// MemorySegment.java
> 
> // primitive
> <AT> AT toArray(Class<AT> atype, ValueLayout srcLayout);
> MemorySegment ofArray(Object arr, ValueLayout destLayout) {
>     MemorySegment res = …;
>     MemoryCopy.arraycopy(arr, 0, res, 0, Array.getLength(arr), destLayout);
>     return res;
> }
> 
> // sugar
> default char[] toCharArray() {
>     return toArray(char[].class, ValueLayouts.CHAR);
> }
> static MemorySegment ofArray(char[] arr) {
>     return ofArray(arr, ValueLayouts.CHAR);
> }
> 
> 
> Basically I'm suggesting that the API would be more coherent if you more aggressively factored all the P-array methods (where P is a primitive type) through single generic methods, akin to `System.arraycopy`.  That will make it less necessary to have multiple overloads per P-array type, since there's a way to use the generic API point to get more control.
> 
> Also, when Valhalla comes, the set of P in P-arrays will become unbounded.  You'll have flat arrays of complex float, vector128, and who knows what else.  It will be necessary at least then to rethink all of those bespoke API points for P-arrays (P in char, short, int, long etc.).
> 
> 
> P.S. (Post-script or parting shot) Although I like and accept `asSlice` just as I like and accept `subLIst`, for reducing complexity of the API, I don't like the name `asSlice`.  We have `asList` in various places, but we don't have `asSubList`, because when you "as" something you keep its contents but change the view.  But a slice is *not* the same contents as the original.  Please consider (as an independent change) `slice` instead of `asSlice`.  Everybody knows, or can learn, that a slice is a view, so there's no need to have the extra prompt of the "as" word.  Note that everybody learned that a `subList` is a view, not a copy of a series of elements.  Same story here.
> 
> P.P.S. Unrelated suggestion about `MS::spliterator`:  It should not throw if `size()` is not a multiple of the layout size; rather, it should tolerate this and simply create the first segment as a multiple of the size.  That makes the situation no worse than before, since there is still only one segment with an “odd tail”.  I’m bringing this up because that’s *exactly* how vectorized splits should work!  I.e., if the spliterator is properly tolerant, then it can be asked to split on behalf of a vectorized loop kernel, which works best when fed an exact multiple of vector lengths, but can tolerate an “odd tail” if it must.  The spliterator should return null if the original pre-split size is less than 2*N.  That’s the only case where the client code has to look around for the “odd tail”.  Naturally, an unrolled vectorized loop is best, and the best `Layout` to hand to `MS::spliterator` in that case is the size of the vector *times the unroll count*.  Again, there i
 s no cause to demand an exact multiple of that size; the spliterator shouudl be tolerant.

> @rose00 I believe I've addressed your comments. Now, when looking at the code it's clear, I think, what the primitives are. My only gripe is that this is not reflected in the API - e.g. all copy methods look the same-y when looking at the javadoc, and it's not clear to see which method is based on which. Of course we could clarify things (e.g. say that "this method is equivalent to..." in the javadoc) - but I'm wondering if the real copy primitives should be statics in `MemorySegment` - and leave `MemoryCopy` for strongly-typed sugary copies between segment and arrays.
> 
> More specifically, the methods that I don't believe belong in MemoryCopy are:
> 
>     * `copy(MemorySegment, ValueLayout, long, MemorySegment, ValueLayout, long, long)`
> 
>     * `copy(MemorySegment, ValueLayout, long, Object, int, int)`
> 
>     * `copy(Object, int, MemorySegment, ValueLayout, long, long)`
> 
> 
> But I could also claim that the following also kind of belong somewhere else:
> 
>     * `copy(MemorySegment, MemorySegment, long)`
> 
>     * `copy(MemorySegment, long, MemorySegment, long, long)`
> 
> 
> As these are just telescopic shortcuts for the general segment copy primitive above. Any thoughts?

Thinking more about this, while the new approach is, I think, beneficial in terms of making all the implementation flow into a couple of array copying routine, I don't see, as of yet, the necessity of exposing such primitives in the API. As noted, earlier, it is likely that, w/o extra work in the VM, calling these methods will be less efficient - so users will effectively have two ways for doing the same thing, one which works well, and one which doesn't.

I agree with the general idea that we should have a clear distinction between primitives and derived methods, but I think exposing an "half primitive" like we're doing now (after all the real primitive is the implementation method taking explicit byte size, stride and layout) can potentially backfire.

So, I'd like to make a suggestion to:

*  "hide" the primitive API methods:
    - `copy(MemorySegment, ValueLayout, long, Object, int, int)`
     -`copy(Object, int, MemorySegment, ValueLayout, long, long)`
* move these methods under `MemorySegment`:
   - `copy(MemorySegment, ValueLayout, long, MemorySegment, ValueLayout, long, long)`
    - `copy(MemorySegment, MemorySegment, long)`
    - `copy(MemorySegment, long, MemorySegment, long, long)`

-------------

PR: https://git.openjdk.java.net/panama-foreign/pull/568