Part 1 Proposal for JDK-8264594
leerho
leerho at gmail.com
Tue Apr 27 18:03:29 UTC 2021
Maurizio,
* Naming -- Yes it is hard. But rather than having to follow naming
practices elsewhere in Java, where your thoughts ended up bypasses the
whole naming issue. Elegant!
* Forget boxed types. Agree,
* Destination vs Target - In my own code I have always used src/dst, but I
read on this forum that someone preferred target, thus src/tgt. That is why
I suggested target. But I am ambivalent. So if we agree on src/dst, then
we should be consistent in these discussions :)
I was hesitant to broach the topic of implementation until we agreed on the
API, but I can see that you took a holistic approach, which led you to a
more elegant solution.
It is ironic that I also came up with the idea of enhancing copyFrom, but
to a more generic copy({src args}, {dst args}) as an efficient way to do
the implementations of these methods, and I was going to suggest it when we
talked about the actual implementations. But I like your approach better.
To expand on what I think you are proposing, we would be adding to
MemorySegment:
public MemorySegment copyInto(byte[] dstArray, int dstStartIndex,
dstElements) {} //assumes native order
public MemorySegment copyInto(byte[] dstArray, int dstStartIndex,
dstElements, ByteOrder order) {}
... 6 more pairs
public MemorySegment copyFrom(byte[] srcArray, int srcStartIndex,
srcElements) {} //assumes native order
public MemorySegment copyFrom (byte[] srcArray, int srcStartIndex,
srcElements, ByteOrder order) {}
--- 6 more pairs
A total of 28 methods.
I still would prefer that the names of the arguments clearly reflect source
or destination and the units. Since "length" or "size" can be ambiguous.
But, alas, I can live with just "dstStart" and "length" :)
We would need to document for the user that the more general use case would
for the user to write:
```
MemorySegment srcSegment = ...
int[] dstArray = ...
srcSegment.asSlice(srcOffsetBytes, srcLengthBytes).copyInto(dstArray,
dstIndex, dstElements);
//and
MemorySegment dstSegment = ...
int[] srcArray = ...
dstSegment.asSlice(dstOffsetBytes, dstLengthBytes).copyFrom(srcArray,
srcIndex, srcElements);
```
This use case clearly shows that the units of offset and length on the
slice are different from the units of offset and length in the
copyInto/From arguments. It would be helpful to the user if the argument
names could reflect that.
Although I realize that asSlice(long offset, long newSize) already exists,
is it possible to enhance the argument names to something like asSlice(long
offsetBytes, long newSizeBytes) ? or offsetBytes, lengthBytes? ... in a
future release?
Lee.
On Tue, Apr 27, 2021 at 3:16 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:
> Hi Lee,
> this is good exploratory work. I have some instant "off the hip" reactions
> to it, which I'll address here (followed by some more useful insights
> below):
>
> * On naming, while I agree that naming is hard - I think we also have to
> consider what the rest of the JDK does. For instance, these bulk methods on
> ByteBuffers are just called put/get. I'm not aware of any other similar
> method in the JDK with names like the one you propose. One possibility,
> especially if we want to add this as part of `MemoryAccess` would be to
> double down on the current naming strategy. MemoryAccess uses
> getByte/setByte - can we get away with getByteS (segment -> array) and
> setByteS (array -> segment). This looks odd at first, but if you squint -
> these operations are doing some sort of "multivalue" set/get. Now, the
> signature of getByte returns a byte - but if we're in multivalue world,
> that would be a byte[] (in the target position). The dual case, setByte,
> takes a byte parameter, so that would be a byte[] (in the source position).
> I think it's a stretch, but it works out (and it's more or less similar to
> what was done for bytebuffers, so likely more familiar to developers).
>
> * I don't agree with the direction of supporting boxed array types. These
> arrays are reference arrays, there's nothing special about them, other than
> their names, which is vaguely primitive-sounding. The main idea for
> MemoryAccess is to provide fast primitive to manipulate memory; I don't
> think there's anything which can be called "fast" to copy stuff from a
> MemorySegment into a Integer[] - you have to through all the 4-bytes
> segment slices and turn them into Integer one by one. This is not bulk, it
> should at the very least not be named in the same way as the others, but I
> think that, realistically, it just doesn't belong here.
>
> * naming-wise I think "destination" (which abbreviates to "dst") works
> better than "target".
>
> But let me go back at the original issue: copying elements in bulk from an
> existing segment to an existing array (and viceversa) is perceived as too
> verbose. Let's try to write down the code that would copy some ints from a
> segment into an int array:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart, length);
> MemorySegment dstSegment = MemorySegment.ofArray(targetArray);
> MemorySegment dstSegmentSlice = dstSegment.asSlice(dstStart, length);
> dstSegmentSlice.copyFrom(srcSegmentSlice);
>
> ```
>
> Now, there are some issues here - but there are two things which stand out
> as particularly awkward:
>
> * We have to wrap an array into an heap segment and then slice it. For the
> source segment, we still have to slice, but that's a already in
> segment-form, so that step doesn't feel so punitive.
> * Any sense of "data flow" is loss in the above code. I think the main
> culprit is that, since we only have "copyFrom", the receiver segment has to
> be that temporary heap segment created above, which, in the developer's
> perspective, is totally unimportant. But seriously, staring at the above
> code, we'd have an hard time understanding that the code is "just" copy
> bytes from the segments into the array.
>
> To address these concerns we could do two things in the MemorySegment API:
>
> * add overloads of copyFrom, so that we can copy things not only "from" a
> segment - but also from an array (with specified indices)
> * introduce a "copyTo" which is the same as copyFrom, but with reverse
> polarity (e.g. copy from THIS segment to somewhere else)
> * add byte swapping support to both copyFrom and copyInto
>
> With these we can compress the above code as follows:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> MemorySegment srcSegmentSlice = srcSegment.asSlice(srcStart, length); //
> we still need to slice this
> srcSegmentSlice.copyInto(targetArray, dstStart, length);
>
> ```
>
> Which, using fluent style would be:
>
> ```
> MemorySegment srcSegment = ...
> int[] targetArray = ...
> //start here
> srcSegment.asSlice(srcStart, length)
> .copyInto(targetArray, dstStart, length);
>
> ```
>
> It's not a one lines, but it looks a lot better than where we started
> from. A casual reader here would have no issues understanding the data flow
> - clearly bytes go from the segment into the array.
>
>
> I think that maybe I would prefer to see something like this rather than
> having more static accessors dumped onto the side. We already have
> "copyFrom" in memory segment, the issue that it has is that it is too
> limited, and its lack of symmetry (lack of "copyTo") is forcing users down
> paths which make code less readable. Let's focus on that?
>
> Maurizio
>
>
>
> On 27/04/2021 06:54, leerho wrote:
>
> This write-up is to propose the design approach, Part 1, for JBS
> JDK-8264594 <https://bugs.openjdk.java.net/browse/JDK-8264594>.
>
> Feedback is appreciated.
>
> Lee.
>
>
More information about the panama-dev
mailing list