RFR: JDK-8198285: More consistent Access API for arraycopy

Sun Jun 3 12:27:06 UTC 2018

Hi Erik

Thanks for the review. See my comments inline below:

> Thanks for doing this. A few comments...
> 
> This:
> ArrayAccess<>::arraycopy_from_native<jbyte>(...
> 
> ...won't reliably compile without saying "template", like this:
> ArrayAccess<>::template arraycopy_from_native<jbyte>(...
> 
> ...because the template member is qualified on another template class.
> My suggestion is to remove the explicit <jbyte> and let that be inferred
> from the address instead.

Ok, I've done that for all the cases where it's possible. The heap->heap
cases cannot do that because there's no address pointer to infer from.
I've added the 'template' keyword there.

> And things like this:
>  bool RuntimeDispatch<decorators, T,
> BARRIER_ARRAYCOPY>::arraycopy_init(arrayOop src_obj, size_t
> src_offset_in_bytes, const T* src_raw, arrayOop dst_obj, size_t
> dst_offset_in_bytes, T* dst_raw, size_t length) {
> 
> ...as a single line, could do with some newlines to make it reasonably
> wide.

Ok, I've broken those lines down to have src args in one line, dst args
in another. This should make it more readable.

> Now that we copy both from heap to heap, from heap to native, and native
> to heap, we have to be a bit more careful in modref
> oop_arraycopy_in_heap. In the covariant case of arraycopy, you
> unconditionally apply the pre and post barriers on the destination.
> However, if the destination is native
> (ArrayAccess::arraycopy_to_native), that does not feel like it will go
> particularly well, unless I have missed something.

oop arrays are never copied from/to native.

> Also your new ArrayAccess class inherits from HeapAccess<IN_HEAP_ARRAY |
> decorators>, which is nice. But its member functions call
> HeapAccess<decorators>::arraycopy, which will miss out on the
> IN_HEAP_ARRAY decorator, which might lead to not using precise marking
> in barrier backends. If I were you, I would probably typedef a
> BaseAccess for HeapAccess<IN_HEAP_ARRAY | decorators> inside of
> ArrayAccess, and call that in the member functions.

I am not sure how to do this typedef trick, but I added | IN_HEAP_ARRAY
to all the calls. It also required to extend the verify check on the
decorators to include IN_HEAP_ARRAY.

Differential:
http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/
Full:
http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/

What do you think?

Cheers,
Roman

> Thanks,
> /Erik
> 
> On 2018-06-01 00:25, Roman Kennke wrote:
>> Hi Erik,
>>
>> It took me a while to get back to this.
>>
>> I wrote a little wrapper around the extended arraycopy API that allows
>> to basically write exactly what you suggested:
>>
>>
>>> ArrayAccess<ARRAYCOPY_ATOMIC>::arraycopy_from_native(ptr, obj, index,
>>> length);
>> I agree that this seems the cleanest solution.
>>
>> The backend still sees both obj+off OR raw-ptr, but I think this is ok.
>>
>> Differential:
>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/
>> Full:
>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/
>>
>> What do you think?
>>
>> Roman
>>
>>> Each required parameter is clear when you read the API.
>>>
>>> But I am open to suggestions of course.
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2018-04-25 17:59, Roman Kennke wrote:
>>>>> 1) Resolve the addresses as we do today, but you think you get better
>>>>> Shenandoah performance if we either
>>>>> 2) Have 3 different access calls, (from heap to native, from native to
>>>>> heap, from heap to heap) for copying, or
>>>>> 3) Have 1 access call that can encode the above 3 variants, but looks
>>>>> ugly at the call sites.
>>>> There's also the idea to pass Address arguments to arraycopy, one for
>>>> src, one for dst, and have 2 different subclasses: one for obj+offset
>>>> (heap access) and one with direct pointer (raw). Your comment gave me
>>>> the idea to also provide arrayOop+idx. This would look clean on the
>>>> caller side I think.
>>>>
>>>> It would also be useful on the GC side: BarrierSets would specialize
>>>> only in the variants that they are interested in, for example, in case
>>>> of Shenandoah:
>>>> 1. arraycopy(HeapAddress,HeapAddress) Java->Java
>>>> 2. arraycopy(HeapAddress,RawAddress)  Java->native
>>>> 3. arraycopy(RawAddress,HeapAddress)  native->Java
>>>>
>>>> other barriersets can ignore the exact type and only call the args
>>>> address->resolve() or so to get the actual raw address.
>>>>
>>>> This would *also* be beneficial for the other APIs: instead of having
>>>> all the X() and X_at() variants, we can just use one X variant that
>>>> either takes RawAddress or HeapAddress.
>>>>
>>>> I made a little (not yet compiling/working) prototype of this a while
>>>> ago:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch
>>>>
>>>>
>>>> What do you think? Would it make sense to go further down that road?
>>>>
>>>> Roman
>>>>
>>>>> You clearly went for 3, which leaves the callsites looking rather hard
>>>>> to read. It is for example not obvious for me what is going on here
>>>>> (javaClasses.cpp line 313):
>>>>>
>>>>> HeapAccess<>::arraycopy<jbyte>(NULL, 0, reinterpret_cast<const
>>>>> jbyte*>(utf8_str), value(h_obj()),
>>>>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>>>>>
>>>>> ...without looking very carefully at the long list of arguments
>>>>> encoding
>>>>> what is actually going on (copy from native to the heap). What is
>>>>> worse
>>>>> is that this will probably not compile without adding the template
>>>>> keyword to the call (since you have a qualified template member
>>>>> function
>>>>> behind a template class), like this:
>>>>> HeapAccess<>::template arraycopy<jbyte>(NULL, 0,
>>>>> reinterpret_cast<const
>>>>> jbyte*>(utf8_str), value(h_obj()),
>>>>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>>>>>
>>>>> ...which as a public API leaves me feeling a bit like this:     :C
>>>>>
>>>>> May I suggest adding an array access helper. The idea is to keep a
>>>>> single call through Access, and a single backend point for array copy,
>>>>> but let the helper provide the three different types of copying as
>>>>> different functions, so that the call sites still look pretty and easy
>>>>> to read and follow.
>>>>>
>>>>> This helper could additionally have load_at, and store_at use array
>>>>> indices as opposed to offsets, and hence hide the offset
>>>>> calculations we
>>>>> perform today (typically involving checking if we are using compressed
>>>>> oops or not).
>>>>>
>>>>> I am thinking something along the lines of
>>>>> ArrayAccess<>::arraycopy_to_native(readable_arguments),
>>>>> ArrayAccess<>::arraycopy_from_native(readable_arguments),
>>>>> ArrayAccess<>::arraycopy(readable_arguments), which translates to some
>>>>> form of Access<>::arraycopy(unreadable_arguments). And for example
>>>>> ArrayAccess<>::load_at(obj, index) would translate to some kind of
>>>>> HeapAccess<IN_HEAP_ARRAY>::load_at(obj, offset_for_index(index)) as a
>>>>> bonus, making everyone using the API jump with happiness.
>>>>>
>>>>> What do you think about this idea? Good or bad? I guess the
>>>>> question is
>>>>> whether this helper should be in access.hpp, or somewhere else
>>>>> (like in
>>>>> arrayOop). Thoughts are welcome.
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>>
>>>>> On 2018-04-11 19:54, Roman Kennke wrote:
>>>>>>     Currently, the arraycopy API in access.hpp gets the src and dst
>>>>>> oops,
>>>>>> plus the src and dst addresses. In order to be most useful to garbage
>>>>>> collectors, it should receive the src and dst oops together with the
>>>>>> src
>>>>>> and dst offsets instead, and let the Access API / GC calculate the
>>>>>> src
>>>>>> and dst addresses.
>>>>>>
>>>>>> For example, Shenandoah needs to resolve the src and dst objects for
>>>>>> arraycopy, and then apply the corresponding offsets. With the current
>>>>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then
>>>>>> resolve obj, then re-add the calculate ptr-diff. This is fragile
>>>>>> because
>>>>>> we also may resolve obj in the runtime before calculating ptr
>>>>>> (e.g. via
>>>>>> arrayOop::base()). If we then pass in the original obj and a ptr
>>>>>> calculated from another copy of the same obj, the above resolution
>>>>>> logic
>>>>>> would not work. This is currently the case for obj-arraycopy.
>>>>>>
>>>>>> I propose to change the API to accept obj+offset, in addition to ptr
>>>>>> for
>>>>>> both src and dst. Only one or the other should be used. Heap accesses
>>>>>> should use obj+offset and pass NULL for raw-ptr, off-heap accesses
>>>>>> (or
>>>>>> heap accesses that are already resolved.. use with care) should pass
>>>>>> NULL+0 for obj+offset and the raw-ptr. Notice that this also
>>>>>> allows the
>>>>>> API to be used for Java<->native array bulk transfers.
>>>>>>
>>>>>> An alternative would be to break the API up into 4 variants:
>>>>>>
>>>>>> Java->Java transfer:
>>>>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t
>>>>>> len)
>>>>>>
>>>>>> Java->Native transfer:
>>>>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len)
>>>>>>
>>>>>> Native->Java transfer:
>>>>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len)
>>>>>>
>>>>>> 'Unsafe' transfer:
>>>>>> arraycopy(S* src_raw, D* dst_raw, size_t len)
>>>>>>
>>>>>>
>>>>>> But that seemed to be too much boilerplate copy+pasting for my taste.
>>>>>> (See how having this overly complicated template layer hurts us?)
>>>>>>
>>>>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for
>>>>>> almost every Access API, we may want to abstract that and take an
>>>>>> Address type argument, which would be either HeapAddress(obj,
>>>>>> offset) or
>>>>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the
>>>>>> actual address, or specialize for HeapAddress variants and resolve
>>>>>> the
>>>>>> objs and then resolve the address. This would also allow us to get
>>>>>> rid
>>>>>> of almost half of the API (all the *_at variants would go) and some
>>>>>> other simplifications. However, this seemed to explode the scope of
>>>>>> this
>>>>>> RFE, and would be better handled in another RFE.
>>>>>>
>>>>>> This changes makes both typeArrayKlass and objArrayKlass use the
>>>>>> changed
>>>>>> API, plus I identified all (hopefully) places where we do bulk
>>>>>> Java<->native array transfers and make them use the API too. Gets us
>>>>>> rid
>>>>>> of a bunch of memcpy calls :-)
>>>>>>
>>>>>> Please review the change:
>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/
>>>>>>
>>>>>> Thanks, Roman
>>>>>>
>>
>