Part 1 Proposal for JDK-8264594

Fri Jun 4 14:09:33 UTC 2021

Maurizio,
Thank you for your reply.  I agree with your observation in the difference
between the single item case and the bulk copy.  I guess I became so
familiar with how offsets are handled in Unsafe where the single element
and bulk copy cases the references to memory are always in bytes, it seamed
so simple and natural.

In my experience anyway, reading and writing data to off-heap almost always
involves a mix of preamble elements and bulk copy arrays where it was
necessary to track the offset in bytes, where the multibyte elements may
not be correctly aligned to the base of the allocated segment, or God
forbid, the arrays.

What is most important is getting this into the JDK, and if I have to add
more overloads, I will :)

Lee.

On Fri, Jun 4, 2021 at 2:34 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

>
> On 04/06/2021 01:32, leerho wrote:
>
> When I first saw all the logical index overloads in MemoryAccess I
> shuddered.  It is a lot of API clutter as far as I’m concerned.  But just
> because we made that mistake with MemoryAccess doesn’t mean we have to
> continue propagating it.
>
> I'm sorry, I just don't see that as being a "mistake". Code like this:
>
> ```
> for (int i = 0 ; i < limit ; i++) {
>    long val = MemoryAccess.getLongAtIndex(segment, i);
>    ...
> }
> ```
>
> Is _extremely_ common. While it's true that memory segment is an unbiased
> array of bytes, there are times you want to access it in a more structured
> form - and saying "you can't do that" because it adds overloads doesn't
> seem like a great solution to me.
>
>
> Also I have explicitly named the two offsets very differently to try to
> make it clear that they are very different.  I guess I could go further and
> name the “srcIndex” -> “srcIndexLongs” for the long array case, for
> example.  But since “index” is used elsewhere as the logical index I
> assumed that this would not be necessary.
>
> But I would much rather use longer, more explicit names than propagate API
> clutter.
>
> I was thinking overnight that, perhaps, the memory copy and the single
> element cases are different. When you do single access (see above) you do
> that frequently in a loop - and you often get elements of the same type. As
> stated above, asking developers to work in terms of offsets seems a bit
> odd, when they are really accessing a flat array of element type T.
>
> In the case of bulk copy, I don't think there's the same degree of
> pressure - you will probably call the bulk copy at the beginning, or at the
> end of your computation (or both ends!) - and having an offset shift is
> more tolerable.
>
> So, I'm open to evaluating something more minimal, and see how far we can
> get with it.
>
> Maurizio
>
>
> Lee
>
> On Thu, Jun 3, 2021 at 1:26 PM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> Thanks.
>>
>> While I agree that a segment is just a bunch of bytes, I think also think
>> that having an API which takes two indices at the same time, one logical
>> and one physical is kind of evil. I bet that a lot of developers will get
>> that wrong, at least the first time.
>>
>> What you say about segments being an array of bytes, and thus not needing
>> index-based overloads can also be said for MemoryAccess class - but I
>> believe that the index-based accessors in the are quite popular.
>>
>> But overall, my worry is not performances, or symmetry with respect to
>> other APIs - my main worry is what I said above: that people will just
>> assume that indices are logical for both the segment and the array - so
>> overloads with clearly different names should help quite a bit IMHO.
>>
>>
>> Maurizio
>> On 03/06/2021 20:50, leerho wrote:
>>
>> Maurizio,
>> Here is the MemoryCopy class (w/o byte swap) and the TestMemoryCopy class
>> (w/o testing byte swap).
>>
>> I copied from the TestMemoryAccess class in that I noticed that there is
>> no package statement.  I am not sure what @SuppressWarnings are allowed in
>> your test environment, I suspect I should remove all of them.
>>
>> When the MemorySegment copyFrom(MemorySegment, MemoryLayout, ByteOrder)
>> becomes available I can add the tests for that.
>>
>> *IMHO*
>>
>> I feel strongly that there is no need to additionally overload these
>> methods with the segment offset specified in array index units.  It is so
>> trivial to convert from one to the other and it can be done easily in the
>> method argument with a simple multiply by 2, 4 or 8. This should compile
>> down to a simple shift, which becomes a single cycle CPU instruction.  So
>> this is not a performance issue.
>>
>> Furthermore, it is best that the user becomes accustomed to thinking of a
>> segment fundamentally as an array of bytes.  Once a segment is loaded with
>> some primitive array, the segment loses the context of the type of the
>> array it was loaded with; in effect, a kind of "type erasure" similar to
>> Java's generics.  This is one of the reasons that MemorySegments can be so
>> powerful.
>>
>>
>> I have been programming with the "MemorySegment" concept for a number of
>> years now and find that keeping in mind that segments are just bytes is
>> very useful.
>>
>>
>> Let me know what you think.
>>
>> Lee.
>>
>>
>>
>> On Mon, May 31, 2021 at 6:56 AM Maurizio Cimadamore <
>> maurizio.cimadamore at oracle.com> wrote:
>>
>>> Hi Lee,
>>> this looks good.
>>>
>>> One thing I note is that there's an ambiguity as to whether the segment
>>> index is expressed as a logical index, or a raw byte offset. Your snippet
>>> does the latter. If we want to follow MemoryAccess, perhaps that calls for
>>> 2 overloads (copyFromArrayAtIndex/copyFromArrayAtOffset), as I imagine both
>>> could be useful, depending on the case?
>>>
>>> The javadoc will have to say something when the segment being used is
>>> backed by the very array that is the source/target of the copy (we have
>>> some text like that in MemorySegment::copyFrom).
>>>
>>> I think it would be helpful to progress further with this, add the
>>> remaining templates (w/o ByteOrder, for now) test and see how it works in
>>> practice.
>>>
>>> I will add (or just file a simple PR, so that you can just borrow from
>>> it - should be a single method) something to do the segment copy with swap.
>>>
>>> Thanks!
>>> Maurizio
>>> On 28/05/2021 22:37, leerho wrote:
>>>
>>> Maurizio,
>>>
>>> Again sorry about the delay.
>>>
>>> Attached is a template proposal for the MemoryCopy class.
>>>
>>> I can't complete this without your proposed new copyFrom(...) method in
>>> MemorySegment. As it is written, it should work, but without the byte swap
>>> capability.
>>>
>>> I can complete the rest of the primitives like this template, if you
>>> would like with javadocs.  I could also start writing tests, but without
>>> the byte-swap.
>>>
>>> Let me know what would be most helpful.
>>>
>>> Cheers,
>>>
>>> Lee
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, May 20, 2021 at 11:36 AM leerho <leerho at gmail.com> wrote:
>>>
>>>> Of course!
>>>>
>>>> On Thu, May 20, 2021 at 9:42 AM Maurizio Cimadamore <
>>>> maurizio.cimadamore at oracle.com> wrote:
>>>>
>>>>>
>>>>> On 20/05/2021 17:13, leerho wrote:
>>>>>
>>>>> I am not sure if the *dstSegSlice* requires the *srcCopyLen*. I would
>>>>> hope that it is smart enough to realize that the input length is smaller
>>>>> than the given offset minus the segment size.
>>>>>
>>>>> asSlice has an overload that just takes an offset and infers the
>>>>> resulting size from there.
>>>>>
>>>>> But that doesn't seem what you want here - as you want the slice to
>>>>> have a specific size (the size of the input array).
>>>>>
>>>>> MemorySegment::copyFrom wants the two segments to have the same size,
>>>>> so I think you need that.
>>>>>
>>>>> In terms of performance, there's no difference between asSlice(offset)
>>>>> and asSlice(offset, size) - you have to create a new segment anyway.
>>>>>
>>>>> Maurizio
>>>>>
>>>>>
>>>>> --
> From my cell phone.
>
> --
>From my cell phone.