Part 1 Proposal for JDK-8264594

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Tue Apr 27 18:12:21 UTC 2021


On 27/04/2021 19:03, leerho wrote:
> Maurizio,
>
> * Naming -- Yes it is hard.  But rather than having to follow naming 
> practices elsewhere in Java, where your thoughts ended up bypasses the 
> whole naming issue.  Elegant!
:-)
> * Forget boxed types.  Agree,
> * Destination vs Target - In my own code I have always used src/dst, 
> but I read on this forum that someone preferred target, thus src/tgt. 
> That is why I suggested target.  But I am ambivalent. So if we agree 
> on src/dst, then we should be consistent in these discussions :)
>
> I was hesitant to broach the topic of implementation until we agreed 
> on the API, but I can see that you took a holistic approach, which led 
> you to a more elegant solution.
Well, I guess I tried to look at what's really wrong with the current 
minimalistic API and it came down to:

* lack of symmetry
* need to box arrays into temporary segments
* lack of endianness options

And realized that the noise introduced by these factors was seriously 
messing up the code readability.

>
> It is ironic that I also came up with the idea of enhancing copyFrom, 
> but to a more generic copy({src args}, {dst args}) as an efficient way 
> to do the implementations of these methods, and I was going to suggest 
> it when we talked about the actual implementations.   But I like your 
> approach better.
>
> To expand on what I think you are proposing,  we would be adding to 
> MemorySegment:
>
> public MemorySegment copyInto(byte[] dstArray, int dstStartIndex, 
> dstElements) {} //assumes native order
> public MemorySegment copyInto(byte[] dstArray, int dstStartIndex, 
> dstElements, ByteOrder order) {}
> ... 6 more pairs
> public MemorySegment copyFrom(byte[] srcArray, int srcStartIndex, 
> srcElements) {} //assumes native order
> public MemorySegment copyFrom (byte[] srcArray, int srcStartIndex, 
> srcElements, ByteOrder order) {}
> --- 6 more pairs
>
> A total of 28 methods.
These are a lot of methods, for sure. I wonder if we could merge some of 
these a la System::arrayCopy - e.g. make the array parameter be Object. 
If we do that, we only need 2/4 (depending on the overloads) ?
>
> I still would prefer that the names of the arguments clearly reflect 
> source or destination and the units.  Since "length" or "size" can be 
> ambiguous.  But, alas, I can live with just "dstStart" and "length" :)
I didn't mean to make a strong proposal on argument names - we can tweak 
those as needed.
>
> We would need to document for the user that the more general use case 
> would for the user to write:
>
> ```
> MemorySegment srcSegment = ...
> int[] dstArray = ...
> srcSegment.asSlice(srcOffsetBytes, srcLengthBytes).copyInto(dstArray, 
> dstIndex, dstElements);
>
> //and
> MemorySegment dstSegment = ...
> int[] srcArray = ...
> dstSegment.asSlice(dstOffsetBytes, dstLengthBytes).copyFrom(srcArray, 
> srcIndex, srcElements);
> ```
> This use case clearly shows that the units of offset and length on the 
> slice are different from the units of offset and length in the 
> copyInto/From arguments.  It would be helpful to the user if the 
> argument names could reflect that.
>
> Although I realize that  asSlice(long offset, long newSize) already 
> exists, is it possible to enhance the argument names to something like 
> asSlice(long offsetBytes, long newSizeBytes) ? or offsetBytes, 
> lengthBytes? ... in a future release?

Sure, I can tweak the names even as part of the current integration PR:

https://github.com/openjdk/jdk/pull/3699

(in fact, we have a lot more latitude to fix javadoc and things like 
this - as they are not, strictly speaking, new features).

Maurizio


>
> Lee.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Apr 27, 2021 at 3:16 AM Maurizio Cimadamore 
> <maurizio.cimadamore at oracle.com 
> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
>     Hi Lee,
>     this is good exploratory work. I have some instant "off the hip"
>     reactions to it, which I'll address here (followed by some more
>     useful insights below):
>
>     * On naming, while I agree that naming is hard - I think we also
>     have to consider what the rest of the JDK does. For instance,
>     these bulk methods on ByteBuffers are just called put/get. I'm not
>     aware of any other similar method in the JDK with names like the
>     one you propose. One possibility, especially if we want to add
>     this as part of `MemoryAccess` would be to double down on the
>     current naming strategy. MemoryAccess uses getByte/setByte - can
>     we get away with getByteS (segment -> array) and setByteS (array
>     -> segment). This looks odd at first, but if you squint - these
>     operations are doing some sort of "multivalue" set/get. Now, the
>     signature of getByte returns a byte - but if we're in multivalue
>     world, that would be a byte[] (in the target position). The dual
>     case, setByte, takes a byte parameter, so that would be a byte[]
>     (in the source position). I think it's a stretch, but it works out
>     (and it's more or less similar to what was done for bytebuffers,
>     so likely more familiar to developers).
>
>     * I don't agree with the direction of supporting boxed array
>     types. These arrays are reference arrays, there's nothing special
>     about them, other than their names, which is vaguely
>     primitive-sounding. The main idea for MemoryAccess is to provide
>     fast primitive to manipulate memory; I don't think there's
>     anything which can be called "fast" to copy stuff from a
>     MemorySegment into a Integer[] - you have to through all the
>     4-bytes segment slices and turn them into Integer one by one. This
>     is not bulk, it should at the very least not be named in the same
>     way as the others, but I think that, realistically, it just
>     doesn't belong here.
>
>     * naming-wise I think "destination" (which abbreviates to "dst")
>     works better than "target".
>
>     But let me go back at the original issue: copying elements in bulk
>     from an existing segment to an existing array (and viceversa) is
>     perceived as too verbose. Let's try to write down the code that
>     would copy some ints from a segment into an int array:
>
>     ```
>     MemorySegment srcSegment = ...
>     int[] targetArray = ...
>     //start here
>     MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart, length);
>     MemorySegment dstSegment = MemorySegment.ofArray(targetArray);
>     MemorySegment dstSegmentSlice = dstSegment.asSlice(dstStart, length);
>     dstSegmentSlice.copyFrom(srcSegmentSlice);
>
>     ```
>
>     Now, there are some issues here - but there are two things which
>     stand out as particularly awkward:
>
>     * We have to wrap an array into an heap segment and then slice it.
>     For the source segment, we still have to slice, but that's a
>     already in segment-form, so that step doesn't feel so punitive.
>     * Any sense of "data flow" is loss in the above code. I think the
>     main culprit is that, since we only have "copyFrom", the receiver
>     segment has to be that temporary heap segment created above,
>     which, in the developer's perspective, is totally unimportant. But
>     seriously, staring at the above code, we'd have an hard time
>     understanding that the code is "just" copy bytes from the segments
>     into the array.
>
>     To address these concerns we could do two things in the
>     MemorySegment API:
>
>     * add overloads of copyFrom, so that we can copy things not only
>     "from" a segment - but also from an array (with specified indices)
>     * introduce a "copyTo" which is the same as copyFrom, but with
>     reverse polarity (e.g. copy from THIS segment to somewhere else)
>     * add byte swapping support to both copyFrom and copyInto
>
>     With these we can compress the above code as follows:
>
>     ```
>     MemorySegment srcSegment = ...
>     int[] targetArray = ...
>     //start here
>     MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart,
>     length);  // we still need to slice this
>     srcSegmentSlice.copyInto(targetArray, dstStart, length);
>
>     ```
>
>     Which, using fluent style would be:
>
>     ```
>     MemorySegment srcSegment = ...
>     int[] targetArray = ...
>     //start here
>     srcSegment.asSlice(srcStart, length)
>                        .copyInto(targetArray, dstStart, length);
>
>     ```
>
>     It's not a one lines, but it looks a lot better than where we
>     started from. A casual reader here would have no issues
>     understanding the data flow - clearly bytes go from the segment
>     into the array.
>
>
>     I think that maybe I would prefer to see something like this
>     rather than having more static accessors dumped onto the side. We
>     already have "copyFrom" in memory segment, the issue that it has
>     is that it is too limited, and its lack of symmetry (lack of
>     "copyTo") is forcing users down paths which make code less
>     readable. Let's focus on that?
>
>     Maurizio
>
>
>
>     On 27/04/2021 06:54, leerho wrote:
>>     This write-up is to propose the design approach, Part 1, for JBS
>>     JDK-8264594 <https://bugs.openjdk.java.net/browse/JDK-8264594>.
>>
>>     Feedback is appreciated.
>>
>>     Lee.
>


More information about the panama-dev mailing list