Part 1 Proposal for JDK-8264594

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Wed Apr 28 09:29:55 UTC 2021


Hi Lee
IMHO some of the disadvantages you mention about 2 are a bit accidental 
- and not dissimilar to what happens here:

https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/util/Collection.html#toArray(T%5B%5D)

But I can understand that this argument might not persuade you.

For endianness, I think that there is a difference between:

* I have a segment and I want to read an int
* I have a segment and I want to copy parts of it in an _existing_ array

The former is super common; the latter feels like something you reach 
for in more advanced use cases, honestly. My feeling is that adding a 
"more friendly" overload to what is a pretty advanced method anyway 
is a sort of middle ground that would double the size of the API 
w/o really making it more usable (I don't think users like you would be 
bothered by an extra `nativeOrder()` suffix). At least that's what I was 
assuming.
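
To make the two cases concrete, here is a minimal sketch. It uses 
java.nio.ByteBuffer as a stand-in for a segment, and the class and 
method names (EndianCopySketch, readInt, copyToExisting) are made up 
for illustration; none of this is the proposed MemorySegment API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper class; ByteBuffer stands in for a memory segment.
final class EndianCopySketch {

    // Case 1: read a single int at a byte offset with an explicit byte order.
    static int readInt(ByteBuffer buf, int offsetBytes, ByteOrder order) {
        // duplicate() leaves the caller's position and order untouched
        return buf.duplicate().order(order).getInt(offsetBytes);
    }

    // Case 2: copy `elements` ints, starting at a byte offset, into an
    // existing array at `dstIndex`, again with an explicit byte order.
    static void copyToExisting(ByteBuffer buf, int offsetBytes,
                               int[] dst, int dstIndex, int elements,
                               ByteOrder order) {
        ByteBuffer view = buf.duplicate().order(order);
        view.position(offsetBytes);
        // the int view starts at the current position and inherits the order
        view.asIntBuffer().get(dst, dstIndex, elements);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {0, 0, 0, 1, 0, 0, 0, 2});
        System.out.println(readInt(buf, 0, ByteOrder.BIG_ENDIAN));
        System.out.println(readInt(buf, 0, ByteOrder.LITTLE_ENDIAN));

        int[] dst = new int[3];
        copyToExisting(buf, 0, dst, 1, 2, ByteOrder.BIG_ENDIAN);
        System.out.println(Arrays.toString(dst));
    }
}
```

The second method already carries five parameters besides the source, 
which is why an extra overload without the byte order buys little.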

As for (1) - some comments:

* you seem to assume that "native implementation" == better performance. 
I don't think that's the case. And I don't see why you think it'd be 
more elegant that way. The main thing that needs to be done, if we go 
down this path, is to keep profiling information sharp - that is, if you 
call copyToArray with a byte[] and then with a short[] you want C2 to 
work with different type profile information. For this there is a 
technique in hotspot, called argument type profiling, which is used for 
pseudo static methods in Unsafe (and in MemoryAccess) which I think we 
can use here.

* I don't see why it has to be a "static" method, or why a static method 
is perceived to be better. The issue with static methods is that they 
don't mix with slicing:

MemorySegment.copyToArray(sourceSegment.asSlice(), ....)

this is worse than:

sourceSegment.asSlice().copyToArray(...)

(btw, this avoids, at least in part, some of the type profiling 
pot-holes highlighted above, as receiver type information is always 
speculated upon, w/o doing anything special). So, two instance methods 
copyToArray and copyFromArray don't sound less elegant to me (actually 
more so).

* Accepting an Object parameter is inconsistent with e.g. the 
MemorySegment.of(...) factories, which take sharp array types. In 
principle those factories could be compressed down into one (in fact the 
API started off that way, but we moved away from it because of 
performance issues).
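
As an illustration of the Object-vs-sharp-type trade-off discussed in 
the points above, here is a rough sketch. ByteBuffer again stands in 
for a segment, and the names (ObjectVsSharpSketch, copyToArrayUntyped, 
copyToIntArray) are hypothetical, not part of any proposed API. The 
Object-accepting variant has to discover the array type at run time, 
while the sharp variant has it fixed in the signature.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch; ByteBuffer stands in for a memory segment.
final class ObjectVsSharpSketch {

    // System.arraycopy-style: one method, Object parameter, and a
    // run-time dispatch on the actual array type.
    static void copyToArrayUntyped(ByteBuffer src, Object dstArray,
                                   int dstIndex, int elements) {
        ByteBuffer view = src.duplicate().order(ByteOrder.nativeOrder());
        if (dstArray instanceof byte[]) {
            view.get((byte[]) dstArray, dstIndex, elements);
        } else if (dstArray instanceof short[]) {
            view.asShortBuffer().get((short[]) dstArray, dstIndex, elements);
        } else if (dstArray instanceof int[]) {
            view.asIntBuffer().get((int[]) dstArray, dstIndex, elements);
        } else {
            // inappropriate array types can only be rejected at run time
            throw new IllegalArgumentException("unsupported destination: " + dstArray);
        }
    }

    // Sharp variant: the element type is part of the signature, so a
    // wrong array type is a compile-time error and no dispatch is needed.
    static void copyToIntArray(ByteBuffer src, int[] dstArray,
                               int dstIndex, int elements) {
        src.duplicate().order(ByteOrder.nativeOrder())
           .asIntBuffer().get(dstArray, dstIndex, elements);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder());
        buf.putInt(0, 7).putInt(4, 9);

        int[] a = new int[2];
        copyToArrayUntyped(buf, a, 0, 2);
        int[] b = new int[2];
        copyToIntArray(buf, b, 0, 2);
        System.out.println(a[0] + "," + a[1] + " " + b[0] + "," + b[1]);
    }
}
```

Whether the instanceof chain costs anything in practice depends on how 
well the JIT can profile the argument type at each call site, which is 
the profiling concern raised above.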


In both approaches, I think if we use an instance method we should just 
drop the length parameter from the method and infer the length from the 
receiver segment. Here's a table I prepared yesterday night thinking 
about all possible combinations (this assumes approach (1)):

* from array to segment
     - segment does not exist: MemorySegment.of(array) [.asSlice(...)]
     - segment does exist: segment [.asSlice(...)] .copyFrom(array, index, order)
* from segment to array
     - array does not exist: segment [.asSlice(...)] .toXYZArray()
     - array does exist: segment [.asSlice(...)] .toXYZArray(array, index, order)

(this still doesn't seem to be that bad - if toXYZArray name bothers 
you, just replace that with copyToArray - the rest stays the same).
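
The four combinations in the table could look roughly like this. This 
is only a sketch under stated assumptions: Seg is a made-up toy class 
backed by a ByteBuffer, handling int arrays only, and its of() copies 
the array rather than wrapping it. The point is that the element count 
is inferred from the size of the receiver (slice), so no length 
parameter is needed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical toy segment; NOT the real MemorySegment API.
final class Seg {
    private final ByteBuffer bytes; // position 0, limit == size in bytes

    private Seg(ByteBuffer bytes) { this.bytes = bytes; }

    // array -> segment, segment does not exist (copies; assumption of this sketch)
    static Seg of(int[] array) {
        ByteBuffer b = ByteBuffer.allocate(array.length * Integer.BYTES)
                                 .order(ByteOrder.nativeOrder());
        b.asIntBuffer().put(array);
        return new Seg(b);
    }

    // a slice shares its content with the original segment
    Seg asSlice(int offsetBytes, int lengthBytes) {
        ByteBuffer d = bytes.duplicate();
        d.limit(offsetBytes + lengthBytes);
        d.position(offsetBytes);
        return new Seg(d.slice());
    }

    // array -> segment, segment exists: length inferred from the receiver
    void copyFrom(int[] array, int index, ByteOrder order) {
        int elements = bytes.remaining() / Integer.BYTES;
        bytes.duplicate().order(order).asIntBuffer().put(array, index, elements);
    }

    // segment -> array, array does not exist
    int[] toIntArray() {
        int[] result = new int[bytes.remaining() / Integer.BYTES];
        toIntArray(result, 0, ByteOrder.nativeOrder());
        return result;
    }

    // segment -> array, array exists: length inferred from the receiver
    void toIntArray(int[] array, int index, ByteOrder order) {
        int elements = bytes.remaining() / Integer.BYTES;
        bytes.duplicate().order(order).asIntBuffer().get(array, index, elements);
    }

    public static void main(String[] args) {
        Seg segment = Seg.of(new int[] {1, 2, 3, 4});
        int[] existing = new int[5];
        segment.asSlice(4, 8).toIntArray(existing, 2, ByteOrder.nativeOrder());
        System.out.println(Arrays.toString(existing));
        segment.asSlice(0, 8).copyFrom(new int[] {9, 8, 7}, 1, ByteOrder.nativeOrder());
        System.out.println(Arrays.toString(segment.toIntArray()));
    }
}
```

With slicing, the byte length is stated once (in asSlice) and the 
array side only needs a start index.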

Maurizio

On 28/04/2021 01:43, leerho wrote:
> Maurizio,
>
> This is great.  You have opened up several new possibilities and I'd 
> like to explore some of them.
>
> 1. The use of /Object/, a la /System.arraycopy(Object src, int srcPos, 
> Object dest, int destPos, int length)./
> There are several interesting ideas (and concerns) that spin off of 
> this idea.
>
>  1. First, is that System.arraycopy() is a native method so all the
>     work is under the covers, which I cannot see (I don't know how to
>     find the C++ code that implements this method.)
>       * Advantage: This would greatly reduce the number of methods
>         needed (as you pointed out).
>       * Concern: Performance.  I would think we would be moving work
>         that could be done at compile time (e.g., what type of array
>         is it?) to run time.   Or would Hotspot figure this out ?
>       * Advantage: Potentially this could allow boxed array types as
>         the native code could just figure this out and do the right thing.
>       * Concern: because of its similarity to /System.arraycopy()/,
>         the user might be tempted to use it for array types that are
>         completely inappropriate.  (Just throw an exception.)
>  2. This /Object/ idea also led me to: What if we extended the C++
>     code underneath arraycopy() to allow one or both of the Objects to
>     be a MemorySegment?
>       * Make it clear that both Pos arguments and the length argument
>         would have to be in bytes and these arguments would have to
>         become longs.  This fact alone will kill this idea as it
>         raises all kinds of issues. Terrible idea, forget this one!
>  3. What about this idea. Two /static void arrayCopy-type/ methods in
>     MemorySegment: (or 4 if we overload with ByteOrder):
>       * /copyToArray(MemorySegment srcSegment, long srcOffsetBytes,
>         Object dstArray, int dstIndex, int dstElements)/;
>       * /copyFromArray(Object srcArray, int srcIndex, int srcElements,
>         MemorySegment dstSegment, long dstOffsetBytes)/;
>       * If these are implemented as native methods AND if there is not
>         a runtime performance burden, THIS would be elegant!
>
> 2. "split" the overloads between copyFrom and toXYZ arrays.
>
>   * Is this what you had in mind? (7 or 14 of these pairs)
>       o /int[] dstArray = srcSegment.asSlice(long offsetBytes, long
>         lengthBytes).toIntArray(int[] dstArray, int dstIndex, int
>         dstElements)/;
>         Because toIntArray returns an array, this changes the behavior
>         whereby the array is provided and returned. Clunky.
>         This requires specifying the length twice, once in bytes, the
>         other in elements. :-(
>       o /dstSegment.asSlice(long offsetBytes, long
>         lengthBytes).copyFrom(int[] srcArray, int srcIndex, int
>         srcElements)/;
>         This also requires specifying the length twice, once in bytes,
>         the other in elements.  :-(
>   * Perhaps you meant this? (7 or 14 of these pairs)
>       o /int[] dstArray = srcSegment.toIntArray(long
>         srcOffsetBytes, int[] dstArray, int dstIndex, int dstElements)/;
>         This still has a clunky return of the dstArray that isn't
>         really needed. Perhaps this variant would be a void return?
>       o /dstSegment.copyFrom(int[] srcArray, int srcIndex, int
>         srcElements, long dstOffsetBytes)/;
>
> I think I like 1.3 the best.
>
>   * Only 2 or 4 new methods (let's assume 4)
>   * But it may require the most under-the-covers work, which would be
>     in C++.  I'm hoping some of the code from System.arraycopy() could
>     be leveraged.
>
> *Note*: I think there is a good reason why MemoryAccess overloads all 
> the methods with ByteOrder set to the default: and that is because 
> 90% of the use-cases and users want the default and don't really want 
> to always be specifying ByteOrder, even if it is only a static method 
> and quite fast; it would still be viewed as unnecessary!  If we are 
> choosing between only 2 or 4 added methods, overloading with the 
> default ByteOrder should not be a big deal.
>
> Cheers,
>
> Lee.
>
>
>
> On Tue, Apr 27, 2021 at 2:13 PM Maurizio Cimadamore 
> <maurizio.cimadamore at oracle.com> wrote:
>
>
>     On 27/04/2021 21:50, Maurizio Cimadamore wrote:
>     > is just one native call away, after all.
>
>     Whoops - I meant "static" not, "native", of course :-)
>
>     Maurizio
>


More information about the panama-dev mailing list