Part 1 Proposal for JDK-8264594
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Jun 8 13:59:55 UTC 2021
Hi Lee,
thanks for this, I'll try to put together a PR in the next week or so.
Re. overloads I was thinking about how the ByteBuffer API goes about
them, and, in I see that that API doesn't have the same issues, given
that ByteBuffer only provides bulk copy for byte arrays. If you want to
bulk copy a double[] you have to convert to a DoubleBuffer first and
then copy (and then use logical indices for the buffer too).
Question: do you have concrete use cases where you'd like to copy
contents of a double[] at non-aligned positions inside a segment?
I'm asking because, in principle we could leave the methods in
CopyMemory cover the aligned, logical indices only part of the API. And
ask the user to fall back to MemorySegment::copyFrom if they really
really want unaligned (which should be uncommon?).
Cheers
Maurizio
On 08/06/2021 00:55, leerho wrote:
> Maurizio,
> I have updated MemoryCopy and TestMemoryCopy by extending the variable
> names to be even more clear, for example:
>
> * srcArr -> srcArray
> * srcSeg -> srcSegment
> * srcCopyLen -> srcCopyLengthInts
> * srcIndex -> srcIndexLongs
> * Etc.
>
> I also added a comment in the class javadoc describing the difference
> between variables with the word "index" and ones with the word
> "offset", etc.
>
> I have looked at what it would take to add overloads that refer to the
> segment offset in array element units. It can be done, but the names
> of these new methods will be kinda ugly or awkward:
>
> * copyFromArrayAtIndex -- confusing since the array uses index units
> already
> * copyAtIndexFromArray -- not much better
> * copyFromArrayWithIndex -- nah
> * copyFromArrayToSegementUsingIndex -- a little more clear, but
> really awkward
>
> If you have any better ideas, I am open. (But IMHO, I don't think we
> need this :) )
>
> Please let me know what the next step should be :)
>
> Thanks!
>
> Lee.
>
> On Fri, Jun 4, 2021 at 7:09 AM leerho <leerho at gmail.com
> <mailto:leerho at gmail.com>> wrote:
>
> Maurizio,
> Thank you for your reply. I agree with your observation in the
> difference between the single item case and the bulk copy. I
> guess I became so familiar with how offsets are handled in Unsafe
> where the single element and bulk copy cases the references to
> memory are always in bytes, it seamed so simple and natural.
>
> In my experience anyway, reading and writing data to off-heap
> almost always involves a mix of preamble elements and bulk copy
> arrays where it was necessary to track the offset in bytes, where
> the multibyte elements may not be correctly aligned to the base of
> the allocated segment, or God forbid, the arrays.
>
> What is most important is getting this into the JDK, and if I have
> to add more overloads, I will :)
>
> Lee.
>
> On Fri, Jun 4, 2021 at 2:34 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com
> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
>
> On 04/06/2021 01:32, leerho wrote:
>> When I first saw all the logical index overloads in
>> MemoryAccess I shuddered. It is a lot of API clutter as far
>> as I’m concerned. But just because we made that mistake with
>> MemoryAccess doesn’t mean we have to continue propagating it.
>
> I'm sorry, I just don't see that as being a "mistake". Code
> like this:
>
> ```
> for (int i = 0 ; i < limit ; i++) {
> long val = MemoryAccess.getLongAtIndex(segment, i);
> ...
> }
> ```
>
> Is _extremely_ common. While it's true that memory segment is
> an unbiased array of bytes, there are times you want to access
> it in a more structured form - and saying "you can't do that"
> because it adds overloads doesn't seem like a great solution
> to me.
>
>>
>> Also I have explicitly named the two offsets very differently
>> to try to make it clear that they are very different. I
>> guess I could go further and name the “srcIndex” ->
>> “srcIndexLongs” for the long array case, for example. But
>> since “index” is used elsewhere as the logical index I
>> assumed that this would not be necessary.
>>
>> But I would much rather use longer, more explicit names than
>> propagate API clutter.
>
> I was thinking overnight that, perhaps, the memory copy and
> the single element cases are different. When you do single
> access (see above) you do that frequently in a loop - and you
> often get elements of the same type. As stated above, asking
> developers to work in terms of offsets seems a bit odd, when
> they are really accessing a flat array of element type T.
>
> In the case of bulk copy, I don't think there's the same
> degree of pressure - you will probably call the bulk copy at
> the beginning, or at the end of your computation (or both
> ends!) - and having an offset shift is more tolerable.
>
> So, I'm open to evaluating something more minimal, and see how
> far we can get with it.
>
> Maurizio
>
>>
>> Lee
>>
>> On Thu, Jun 3, 2021 at 1:26 PM Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com
>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>
>> Thanks.
>>
>> While I agree that a segment is just a bunch of bytes, I
>> think also think that having an API which takes two
>> indices at the same time, one logical and one physical is
>> kind of evil. I bet that a lot of developers will get
>> that wrong, at least the first time.
>>
>> What you say about segments being an array of bytes, and
>> thus not needing index-based overloads can also be said
>> for MemoryAccess class - but I believe that the
>> index-based accessors in the are quite popular.
>>
>> But overall, my worry is not performances, or symmetry
>> with respect to other APIs - my main worry is what I said
>> above: that people will just assume that indices are
>> logical for both the segment and the array - so overloads
>> with clearly different names should help quite a bit IMHO.
>>
>>
>> Maurizio
>>
>> On 03/06/2021 20:50, leerho wrote:
>>> Maurizio,
>>> Here is the MemoryCopy class (w/o byte swap) and the
>>> TestMemoryCopy class (w/o testing byte swap).
>>>
>>> I copied from the TestMemoryAccess class in that I
>>> noticed that there is no package statement. I am not
>>> sure what @SuppressWarnings are allowed in your test
>>> environment, I suspect I should remove all of them.
>>>
>>> When the MemorySegment copyFrom(MemorySegment,
>>> MemoryLayout, ByteOrder) becomes available I can add the
>>> tests for that.
>>>
>>> *IMHO*
>>>
>>> I feel strongly that there is no need to
>>> additionally overload these methods with the segment
>>> offset specified in array index units. It is so
>>> trivial to convert from one to the other and it can
>>> be done easily in the method argument with a simple
>>> multiply by 2, 4 or 8. This should compile down to a
>>> simple shift, which becomes a single cycle CPU
>>> instruction. So this is not a performance issue.
>>>
>>> Furthermore, it is best that the user becomes
>>> accustomed to thinking of a segment fundamentally as
>>> an array of bytes. Once a segment is loaded with
>>> some primitive array, the segment loses the context
>>> of the type of the array it was loaded with; in
>>> effect, a kind of "type erasure" similar to Java's
>>> generics. This is one of the reasons that
>>> MemorySegments can be so powerful.
>>>
>>>
>>> I have been programming with the "MemorySegment"
>>> concept for a number of years now and find that
>>> keeping in mind that segments are just bytes is very
>>> useful.
>>>
>>>
>>> Let me know what you think.
>>>
>>> Lee.
>>>
>>>
>>>
>>> On Mon, May 31, 2021 at 6:56 AM Maurizio Cimadamore
>>> <maurizio.cimadamore at oracle.com
>>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>>
>>> Hi Lee,
>>> this looks good.
>>>
>>> One thing I note is that there's an ambiguity as to
>>> whether the segment index is expressed as a logical
>>> index, or a raw byte offset. Your snippet does the
>>> latter. If we want to follow MemoryAccess, perhaps
>>> that calls for 2 overloads
>>> (copyFromArrayAtIndex/copyFromArrayAtOffset), as I
>>> imagine both could be useful, depending on the case?
>>>
>>> The javadoc will have to say something when the
>>> segment being used is backed by the very array that
>>> is the source/target of the copy (we have some text
>>> like that in MemorySegment::copyFrom).
>>>
>>> I think it would be helpful to progress further with
>>> this, add the remaining templates (w/o ByteOrder,
>>> for now) test and see how it works in practice.
>>>
>>> I will add (or just file a simple PR, so that you
>>> can just borrow from it - should be a single method)
>>> something to do the segment copy with swap.
>>>
>>> Thanks!
>>> Maurizio
>>>
>>> On 28/05/2021 22:37, leerho wrote:
>>>> Maurizio,
>>>>
>>>> Again sorry about the delay.
>>>>
>>>> Attached is a template proposal for the MemoryCopy
>>>> class.
>>>>
>>>> I can't complete this without your proposed new
>>>> copyFrom(...) method in MemorySegment. As it is
>>>> written, it should work, but without the byte swap
>>>> capability.
>>>>
>>>> I can complete the rest of the primitives like this
>>>> template, if you would like with javadocs. I could
>>>> also start writing tests, but without the byte-swap.
>>>>
>>>> Let me know what would be most helpful.
>>>>
>>>> Cheers,
>>>>
>>>> Lee
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, May 20, 2021 at 11:36 AM leerho
>>>> <leerho at gmail.com <mailto:leerho at gmail.com>> wrote:
>>>>
>>>> Of course!
>>>>
>>>> On Thu, May 20, 2021 at 9:42 AM Maurizio
>>>> Cimadamore <maurizio.cimadamore at oracle.com
>>>> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>>>>
>>>>
>>>> On 20/05/2021 17:13, leerho wrote:
>>>>> I am not sure if the /dstSegSlice/
>>>>> requires the /srcCopyLen/. I would hope
>>>>> that it is smart enough to realize that
>>>>> the input length is smaller than the given
>>>>> offset minus the segment size.
>>>>
>>>> asSlice has an overload that just takes an
>>>> offset and infers the resulting size from
>>>> there.
>>>>
>>>> But that doesn't seem what you want here -
>>>> as you want the slice to have a specific
>>>> size (the size of the input array).
>>>>
>>>> MemorySegment::copyFrom wants the two
>>>> segments to have the same size, so I think
>>>> you need that.
>>>>
>>>> In terms of performance, there's no
>>>> difference between asSlice(offset) and
>>>> asSlice(offset, size) - you have to create
>>>> a new segment anyway.
>>>>
>>>> Maurizio
>>>>
>>>>
>> --
>> From my cell phone.
>
> --
> From my cell phone.
>
More information about the panama-dev
mailing list