Part 1 Proposal for JDK-8264594
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Apr 27 18:12:21 UTC 2021
On 27/04/2021 19:03, leerho wrote:
> Maurizio,
>
> * Naming -- Yes it is hard. But rather than having to follow naming
> practices elsewhere in Java, where your thoughts ended up bypasses the
> whole naming issue. Elegant!
:-)
> * Forget boxed types. Agree,
> * Destination vs Target - In my own code I have always used src/dst,
> but I read on this forum that someone preferred target, thus src/tgt.
> That is why I suggested target. But I am ambivalent. So if we agree
> on src/dst, then we should be consistent in these discussions :)
>
> I was hesitant to broach the topic of implementation until we agreed
> on the API, but I can see that you took a holistic approach, which led
> you to a more elegant solution.
Well, I guess I tried to look at what's really wrong with the current
minimalistic API and it came down to:
* lack of symmetry
* need to box arrays into temporary segments
* lack of endianness options
And realized that the noise introduced by these factors was seriously
messing up the code readability.
>
> It is ironic that I also came up with the idea of enhancing copyFrom,
> but to a more generic copy({src args}, {dst args}) as an efficient way
> to do the implementations of these methods, and I was going to suggest
> it when we talked about the actual implementations. But I like your
> approach better.
>
> To expand on what I think you are proposing, we would be adding to
> MemorySegment:
>
> public MemorySegment copyInto(byte[] dstArray, int dstStartIndex,
> dstElements) {} //assumes native order
> public MemorySegment copyInto(byte[] dstArray, int dstStartIndex,
> dstElements, ByteOrder order) {}
> ... 6 more pairs
> public MemorySegment copyFrom(byte[] srcArray, int srcStartIndex,
> srcElements) {} //assumes native order
> public MemorySegment copyFrom (byte[] srcArray, int srcStartIndex,
> srcElements, ByteOrder order) {}
> --- 6 more pairs
>
> A total of 28 methods.
These are a lot of methods, for sure. I wonder if we could merge some of
these a la System::arrayCopy - e.g. make the array parameter be Object.
If we do that, we only need 2/4 (depending on the overloads) ?
>
> I still would prefer that the names of the arguments clearly reflect
> source or destination and the units. Since "length" or "size" can be
> ambiguous. But, alas, I can live with just "dstStart" and "length" :)
I didn't mean to make a strong proposal on argument names - we can tweak
those as needed.
>
> We would need to document for the user that the more general use case
> would for the user to write:
>
> ```
> MemorySegment srcSegment = ...
> int[] dstArray = ...
> srcSegment.asSlice(srcOffsetBytes, srcLengthBytes).copyInto(dstArray,
> dstIndex, dstElements);
>
> //and
> MemorySegment dstSegment = ...
> int[] srcArray = ...
> dstSegment.asSlice(dstOffsetBytes, dstLengthBytes).copyFrom(srcArray,
> srcIndex, srcElements);
> ```
> This use case clearly shows that the units of offset and length on the
> slice are different from the units of offset and length in the
> copyInto/From arguments. It would be helpful to the user if the
> argument names could reflect that.
>
> Although I realize that asSlice(long offset, long newSize) already
> exists, is it possible to enhance the argument names to something like
> asSlice(long offsetBytes, long newSizeBytes) ? or offsetBytes,
> lengthBytes? ... in a future release?
Sure, I can tweak the names even as part of the current integration PR:
https://github.com/openjdk/jdk/pull/3699
(in fact, we have a lot more latitude to fix javadoc and things like
this - as they are not, strictly speaking, new features).
Maurizio
>
> Lee.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Apr 27, 2021 at 3:16 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com
> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
> Hi Lee,
> this is good exploratory work. I have some instant "off the hip"
> reactions to it, which I'll address here (followed by some more
> useful insights below):
>
> * On naming, while I agree that naming is hard - I think we also
> have to consider what the rest of the JDK does. For instance,
> these bulk methods on ByteBuffers are just called put/get. I'm not
> aware of any other similar method in the JDK with names like the
> one you propose. One possibility, especially if we want to add
> this as part of `MemoryAccess` would be to double down on the
> current naming strategy. MemoryAccess uses getByte/setByte - can
> we get away with getByteS (segment -> array) and setByteS (array
> -> segment). This looks odd at first, but if you squint - these
> operations are doing some sort of "multivalue" set/get. Now, the
> signature of getByte returns a byte - but if we're in multivalue
> world, that would be a byte[] (in the target position). The dual
> case, setByte, takes a byte parameter, so that would be a byte[]
> (in the source position). I think it's a stretch, but it works out
> (and it's more or less similar to what was done for bytebuffers,
> so likely more familiar to developers).
>
> * I don't agree with the direction of supporting boxed array
> types. These arrays are reference arrays, there's nothing special
> about them, other than their names, which is vaguely
> primitive-sounding. The main idea for MemoryAccess is to provide
> fast primitive to manipulate memory; I don't think there's
> anything which can be called "fast" to copy stuff from a
> MemorySegment into a Integer[] - you have to through all the
> 4-bytes segment slices and turn them into Integer one by one. This
> is not bulk, it should at the very least not be named in the same
> way as the others, but I think that, realistically, it just
> doesn't belong here.
>
> * naming-wise I think "destination" (which abbreviates to "dst")
> works better than "target".
>
> But let me go back at the original issue: copying elements in bulk
> from an existing segment to an existing array (and viceversa) is
> perceived as too verbose. Let's try to write down the code that
> would copy some ints from a segment into an int array:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart, length);
> MemorySegment dstSegment = MemorySegment.ofArray(targetArray);
> MemorySegment dstSegmentSlice = dstSegment.asSlice(dstStart, length);
> dstSegmentSlice.copyFrom(srcSegmentSlice);
>
> ```
>
> Now, there are some issues here - but there are two things which
> stand out as particularly awkward:
>
> * We have to wrap an array into an heap segment and then slice it.
> For the source segment, we still have to slice, but that's a
> already in segment-form, so that step doesn't feel so punitive.
> * Any sense of "data flow" is loss in the above code. I think the
> main culprit is that, since we only have "copyFrom", the receiver
> segment has to be that temporary heap segment created above,
> which, in the developer's perspective, is totally unimportant. But
> seriously, staring at the above code, we'd have an hard time
> understanding that the code is "just" copy bytes from the segments
> into the array.
>
> To address these concerns we could do two things in the
> MemorySegment API:
>
> * add overloads of copyFrom, so that we can copy things not only
> "from" a segment - but also from an array (with specified indices)
> * introduce a "copyTo" which is the same as copyFrom, but with
> reverse polarity (e.g. copy from THIS segment to somewhere else)
> * add byte swapping support to both copyFrom and copyInto
>
> With these we can compress the above code as follows:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart,
> length); // we still need to slice this
> srcSegmentSlice.copyInto(targetArray, dstStart, length);
>
> ```
>
> Which, using fluent style would be:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> srcSegment.asSlice(srcStart, length)
> .copyInto(targetArray, dstStart, length);
>
> ```
>
> It's not a one lines, but it looks a lot better than where we
> started from. A casual reader here would have no issues
> understanding the data flow - clearly bytes go from the segment
> into the array.
>
>
> I think that maybe I would prefer to see something like this
> rather than having more static accessors dumped onto the side. We
> already have "copyFrom" in memory segment, the issue that it has
> is that it is too limited, and its lack of symmetry (lack of
> "copyTo") is forcing users down paths which make code less
> readable. Let's focus on that?
>
> Maurizio
>
>
>
> On 27/04/2021 06:54, leerho wrote:
>> This write-up is to propose the design approach, Part 1, for JBS
>> JDK-8264594 <https://bugs.openjdk.java.net/browse/JDK-8264594>.
>>
>> Feedback is appreciated.
>>
>> Lee.
>
More information about the panama-dev
mailing list