Part 1 Proposal for JDK-8264594

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Tue Jun 8 13:59:55 UTC 2021


Hi Lee,
thanks for this, I'll try to put together a PR in the next week or so.

Re. overloads I was thinking about how the ByteBuffer API goes about 
them, and, in I see that that API doesn't have the same issues, given 
that ByteBuffer only provides bulk copy for byte arrays. If you want to 
bulk copy a double[] you have to convert to a DoubleBuffer first and 
then copy (and then use logical indices for the buffer too).

Question: do you have concrete use cases where you'd like to copy 
contents of a double[] at non-aligned positions inside a segment?

I'm asking because, in principle we could leave the methods in 
CopyMemory cover the aligned, logical indices only part of the API. And 
ask the user to fall back to MemorySegment::copyFrom if they really 
really want unaligned (which should be uncommon?).

Cheers
Maurizio

On 08/06/2021 00:55, leerho wrote:
> Maurizio,
> I have updated MemoryCopy and TestMemoryCopy by extending the variable 
> names to be even more clear, for example:
>
>   * srcArr -> srcArray
>   * srcSeg -> srcSegment
>   * srcCopyLen -> srcCopyLengthInts
>   * srcIndex -> srcIndexLongs
>   * Etc.
>
> I also added a comment in the class javadoc describing the difference 
> between variables with the word "index" and ones with the word 
> "offset", etc.
>
> I have looked at what it would take to add overloads that refer to the 
> segment offset in array element units.  It can be done, but the names 
> of these new methods will be kinda ugly or awkward:
>
>   * copyFromArrayAtIndex -- confusing since the array uses index units
>     already
>   * copyAtIndexFromArray -- not much better
>   * copyFromArrayWithIndex -- nah
>   * copyFromArrayToSegementUsingIndex -- a little more clear, but
>     really awkward
>
> If you have any better ideas, I am open.  (But IMHO, I don't think we 
> need this :) )
>
> Please let me know what the next step should be :)
>
> Thanks!
>
> Lee.
>
> On Fri, Jun 4, 2021 at 7:09 AM leerho <leerho at gmail.com 
> <mailto:leerho at gmail.com>> wrote:
>
>     Maurizio,
>     Thank you for your reply.  I agree with your observation in the
>     difference between the single item case and the bulk copy.  I
>     guess I became so familiar with how offsets are handled in Unsafe
>     where the single element and bulk copy cases the references to
>     memory are always in bytes, it seamed so simple and natural.
>
>     In my experience anyway, reading and writing data to off-heap
>     almost always involves a mix of preamble elements and bulk copy
>     arrays where it was necessary to track the offset in bytes, where
>     the multibyte elements may not be correctly aligned to the base of
>     the allocated segment, or God forbid, the arrays.
>
>     What is most important is getting this into the JDK, and if I have
>     to add more overloads, I will :)
>
>     Lee.
>
>     On Fri, Jun 4, 2021 at 2:34 AM Maurizio Cimadamore
>     <maurizio.cimadamore at oracle.com
>     <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
>
>         On 04/06/2021 01:32, leerho wrote:
>>         When I first saw all the logical index overloads in
>>         MemoryAccess I shuddered.  It is a lot of API clutter as far
>>         as I’m concerned. But just because we made that mistake with
>>         MemoryAccess doesn’t mean we have to continue propagating it.
>
>         I'm sorry, I just don't see that as being a "mistake". Code
>         like this:
>
>         ```
>         for (int i = 0 ; i < limit ; i++) {
>            long val = MemoryAccess.getLongAtIndex(segment, i);
>            ...
>         }
>         ```
>
>         Is _extremely_ common. While it's true that memory segment is
>         an unbiased array of bytes, there are times you want to access
>         it in a more structured form - and saying "you can't do that"
>         because it adds overloads doesn't seem like a great solution
>         to me.
>
>>
>>         Also I have explicitly named the two offsets very differently
>>         to try to make it clear that they are very different.  I
>>         guess I could go further and name the “srcIndex” ->
>>         “srcIndexLongs” for the long array case, for example.  But
>>         since “index” is used elsewhere as the logical index I
>>         assumed that this would not be necessary.
>>
>>         But I would much rather use longer, more explicit names than
>>         propagate API clutter.
>
>         I was thinking overnight that, perhaps, the memory copy and
>         the single element cases are different. When you do single
>         access (see above) you do that frequently in a loop - and you
>         often get elements of the same type. As stated above, asking
>         developers to work in terms of offsets seems a bit odd, when
>         they are really accessing a flat array of element type T.
>
>         In the case of bulk copy, I don't think there's the same
>         degree of pressure - you will probably call the bulk copy at
>         the beginning, or at the end of your computation (or both
>         ends!) - and having an offset shift is more tolerable.
>
>         So, I'm open to evaluating something more minimal, and see how
>         far we can get with it.
>
>         Maurizio
>
>>
>>         Lee
>>
>>         On Thu, Jun 3, 2021 at 1:26 PM Maurizio Cimadamore
>>         <maurizio.cimadamore at oracle.com
>>         <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>
>>             Thanks.
>>
>>             While I agree that a segment is just a bunch of bytes, I
>>             think also think that having an API which takes two
>>             indices at the same time, one logical and one physical is
>>             kind of evil. I bet that a lot of developers will get
>>             that wrong, at least the first time.
>>
>>             What you say about segments being an array of bytes, and
>>             thus not needing index-based overloads can also be said
>>             for MemoryAccess class - but I believe that the
>>             index-based accessors in the are quite popular.
>>
>>             But overall, my worry is not performances, or symmetry
>>             with respect to other APIs - my main worry is what I said
>>             above: that people will just assume that indices are
>>             logical for both the segment and the array - so overloads
>>             with clearly different names should help quite a bit IMHO.
>>
>>
>>             Maurizio
>>
>>             On 03/06/2021 20:50, leerho wrote:
>>>             Maurizio,
>>>             Here is the MemoryCopy class (w/o byte swap) and the
>>>             TestMemoryCopy class (w/o testing byte swap).
>>>
>>>             I copied from the TestMemoryAccess class in that I
>>>             noticed that there is no package statement.  I am not
>>>             sure what @SuppressWarnings are allowed in your test
>>>             environment, I suspect I should remove all of them.
>>>
>>>             When the MemorySegment copyFrom(MemorySegment,
>>>             MemoryLayout, ByteOrder) becomes available I can add the
>>>             tests for that.
>>>
>>>             *IMHO*
>>>
>>>                 I feel strongly that there is no need to
>>>                 additionally overload these methods with the segment
>>>                 offset specified in array index units.  It is so
>>>                 trivial to convert from one to the other and it can
>>>                 be done easily in the method argument with a simple
>>>                 multiply by 2, 4 or 8. This should compile down to a
>>>                 simple shift, which becomes a single cycle CPU
>>>                 instruction.  So this is not a performance issue.
>>>
>>>                 Furthermore, it is best that the user becomes
>>>                 accustomed to thinking of a segment fundamentally as
>>>                 an array of bytes.  Once a segment is loaded with
>>>                 some primitive array, the segment loses the context
>>>                 of the type of the array it was loaded with; in
>>>                 effect, a kind of "type erasure" similar to Java's
>>>                 generics.  This is one of the reasons that
>>>                 MemorySegments can be so powerful.
>>>
>>>
>>>                 I have been programming with the "MemorySegment"
>>>                 concept for a number of years now and find that
>>>                 keeping in mind that segments are just bytes is very
>>>                 useful.
>>>
>>>
>>>             Let me know what you think.
>>>
>>>             Lee.
>>>
>>>
>>>
>>>             On Mon, May 31, 2021 at 6:56 AM Maurizio Cimadamore
>>>             <maurizio.cimadamore at oracle.com
>>>             <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>>
>>>                 Hi Lee,
>>>                 this looks good.
>>>
>>>                 One thing I note is that there's an ambiguity as to
>>>                 whether the segment index is expressed as a logical
>>>                 index, or a raw byte offset. Your snippet does the
>>>                 latter. If we want to follow MemoryAccess, perhaps
>>>                 that calls for 2 overloads
>>>                 (copyFromArrayAtIndex/copyFromArrayAtOffset), as I
>>>                 imagine both could be useful, depending on the case?
>>>
>>>                 The javadoc will have to say something when the
>>>                 segment being used is backed by the very array that
>>>                 is the source/target of the copy (we have some text
>>>                 like that in MemorySegment::copyFrom).
>>>
>>>                 I think it would be helpful to progress further with
>>>                 this, add the remaining templates (w/o ByteOrder,
>>>                 for now) test and see how it works in practice.
>>>
>>>                 I will add (or just file a simple PR, so that you
>>>                 can just borrow from it - should be a single method)
>>>                 something to do the segment copy with swap.
>>>
>>>                 Thanks!
>>>                 Maurizio
>>>
>>>                 On 28/05/2021 22:37, leerho wrote:
>>>>                 Maurizio,
>>>>
>>>>                 Again sorry about the delay.
>>>>
>>>>                 Attached is a template proposal for the MemoryCopy
>>>>                 class.
>>>>
>>>>                 I can't complete this without your proposed new
>>>>                 copyFrom(...) method in MemorySegment. As it is
>>>>                 written, it should work, but without the byte swap
>>>>                 capability.
>>>>
>>>>                 I can complete the rest of the primitives like this
>>>>                 template, if you would like with javadocs.  I could
>>>>                 also start writing tests, but without the byte-swap.
>>>>
>>>>                 Let me know what would be most helpful.
>>>>
>>>>                 Cheers,
>>>>
>>>>                 Lee
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                 On Thu, May 20, 2021 at 11:36 AM leerho
>>>>                 <leerho at gmail.com <mailto:leerho at gmail.com>> wrote:
>>>>
>>>>                     Of course!
>>>>
>>>>                     On Thu, May 20, 2021 at 9:42 AM Maurizio
>>>>                     Cimadamore <maurizio.cimadamore at oracle.com
>>>>                     <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>>>
>>>>
>>>>                         On 20/05/2021 17:13, leerho wrote:
>>>>>                         I am not sure if the /dstSegSlice/
>>>>>                         requires the /srcCopyLen/. I would hope
>>>>>                         that it is smart enough to realize that
>>>>>                         the input length is smaller than the given
>>>>>                         offset minus the segment size.
>>>>
>>>>                         asSlice has an overload that just takes an
>>>>                         offset and infers the resulting size from
>>>>                         there.
>>>>
>>>>                         But that doesn't seem what you want here -
>>>>                         as you want the slice to have a specific
>>>>                         size (the size of the input array).
>>>>
>>>>                         MemorySegment::copyFrom wants the two
>>>>                         segments to have the same size, so I think
>>>>                         you need that.
>>>>
>>>>                         In terms of performance, there's no
>>>>                         difference between asSlice(offset) and
>>>>                         asSlice(offset, size) - you have to create
>>>>                         a new segment anyway.
>>>>
>>>>                         Maurizio
>>>>
>>>>
>>         -- 
>>         From my cell phone.
>
>     -- 
>     From my cell phone.
>


More information about the panama-dev mailing list