RFR(s): 8060192: Add default method Collection.toArray(generator)

Tue Dec 12 02:00:15 UTC 2017

On Dec 7, 2017, at 5:03 PM, Martin Buchholz <martinrb at google.com> wrote:
> 
> (I'm still trying to love this new API)
> 
> The changes to the jsr166 tck are fine.
> 
> I'm not convinced that the new implementation for ArrayList is progress.
> The current implementation of toArray(T[]) does
> 
>            // Make a new array of a's runtime type, but my contents:
>            return (T[]) Arrays.copyOf(elementData, size, a.getClass());
> 
> which does not have to pay the cost of zeroing the array, so is potentially
> faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
> does jmh say?)

The VM does not do pre-zeroing.  We've done experiments over the years
with that and it has never beaten pre-garbaged memory, with zero filling
at creation time.  The reason is that we can vectorize the zero filling at
any time *and* omit zeroing at all in some important cases like copyOf.
OTOH, pre-zeroing assumes there is background bandwidth available
for pre-processing cache lines that are not yet in use, which isn't always
true; sometimes the mutator needs all the cache bandwidth.  All of which
is to say that the VM is likely *never* to do pre-zeroing, unless there is
some special hardware feature that does so, in bulk, in main memory.
Which AFAIK doesn't exist today.

This particular set of performance problems is (IMO) sensitive to the
number of passes over the input and output arrays, as well as over
any intermediate arrays.  Zeroing isn't a separate pass in the case
of copyOf, but in any two phase array creation protocol, zeroing *is*
likely to require a separate pass.  What do I mean by "two phase"?
I mean a scheme where in phase one an empty array is created,
and in phase two the array is filled with content.  If the two phases
are not reliably juxtaposed (as they are in copyOf) the VM must
conservatively fill the empty array with zeros.  Different code shapes
have (inherently) different inlining characteristics.  In the case of
toArray(IntFunction), three things must happen to drop the zeroing
pass (as in Arrays.copyOf):  First, the particular IntFunction must
fully inline (that's phase one), and second, the (phase two) loop
that populates the new array must be juxtaposed with the inlined
code of the IntFunction call, without intermediate logic that might
discourage the optimizer.  Third, both code shapes must use
intrinsics that are recognized by the optimizer.  These three
conditions are carefully engineered in the body of Arrays.copyOf,
but appear "emergently" ("accidentally") via the toArray(IF) API.
So the optimization isn't impossible, but it is subject to more risk,
and/or will require more optimizer work to support that code shape.

Bottom line:  Arrays.copyOf is not as arbitrary as one may think;
it is designed that way because it is easy to optimize.  And it covers
two interesting use cases in the present discussion, which are
(a) copy my internal array wholesale, and (b) make me a fresh
array from a zero-length exemplar.

> If we're not getting type safety and not getting significantly better
> performance, all we have is a more natural API.  But toArray(IntFunction)
> also seems not very natural to me,

Whether or not it bears on the present API discussion, I think it
would be fruitful to step back and ask ourselves the big picture
question here, "What is the natural shape of an array factory?"

I submit to you that such a factory is *not* an IntFunction, because
that only creates half of an array (or 0.01% of one), the empty
version that needs to be populated.  A natural array factory API
will provide two pieces of data:  First a size (an accurate, not a
provisional one like List.size), and second a series of values to
put in the array.  *That* structure is the natural, might I say
categorical, argument to an array factory.  Also a List factory.
As you can probably tell, I'm taking the value-based view of
things here.

(Proof by generalization:  If/when we eventually create frozen
arrays, and flattened arrays, and volatile arrays, and hybrid
instance-with-sized-tail arrays, it will not always be an option
to create the empty array in a first phase, and then fill it from
the outside.  Frozen arrays are the obvious counterexample,
but any immutable data structure will do, as will a private
object tail, even if mutable.)

The interface would look something like one of these:

interface ArrayContent<T> { int size(); T get(int i); }
interface ArrayContent<T> { int size(); Iterator<T> iterator(); }
interface ArrayContent<T> { int count(); Stream<T> stream(); }
interface ArrayContent<T> { Spliterator<T> elements(); } //must be SIZED

That last suggests that we don't need a new API at all,
that perhaps the natural input to an array factory is just
a Spliterator with the SIZED characteristic, and a range
check on the estimateSize() result.  Or maybe a Stream.

There are obvious bootstrapping issues here, but I don't
think they are complex, because the source object can
create a tiny Stream or Spliterator or ArrayContent that
provides a trivial view on that source object's internals.

For ArrayList (or its sublists) it would be enough to return
a three-field Spliterator thingy that peeks at a left-edge
shrinking slice of an arbitrary T[] array.  It's a little like
what Claes and I were discussing for List.of (and that
is a related problem).

Because of its categoricity, the above ArrayContent thingy
is also useful for creating things which aren't literally arrays
but look enough like them (have the same contents).
We could design an API that subsumes and generalizes
both toArray and asList, by allowing the lambda to specify
what sort of aggregate to produce:

interface Collection<T> {
   /** Produces a copy of the sequence of values in this collection. */
   <C> C copy(SequenceCopier<T,C> copier);
}
// SequenceCopier<T,C> == Function<SequenceContent<T>, C>
// SequenceContent<T> == ArrayContent<T> or Stream<T> or Spliterator<T>

The idea is that every major sequence-like type can provide its own
"copier" factory function which will take a standard SequenceContent
and package it up.  And every collection or stream (or sequence)
would also provide a copy operation which builds an internally
appropriate SequenceContent cursor and calls the user-supplied
factory method, as "return copier.apply(myContent);".

class Arrays {
  // people who want x.toArray can also use x.copy(Arrays::make):
  static <T> T[] make(SequenceContent<T,T[]> content) { … }

  // ArrayList.copy uses this one internally as "myContent":
  static <T> SequenceContent<T> slice(T[] a, int start, int end) { … }
}
class List<T> {
  // people who want x.asUnmodifiableList can also use x.copy(List::make):
  default SequenceCopier<T,List<T>> make(SequenceContent<T,T[]> content) { … }

  // external access to a sized snapshot of a list:
  default SequenceContent<T> sizedContent() { … }
}

For my money, *that's* the natural API point we should be
looking for.  (Hence it's not necessarily relevant to 8060192.)
It's a nice little three-way dance between a source collection,
a lightweight snapshot of its contents, and a lambda that
creates a destination value.  There are enough small bikesheds
here to keep us trying on new colors until spring, but I think
that goes with the territory:  There are three things that fit
together here, and they all need nice names.

To repeat:  These points are in response to Martin's
musing about a "natural" factory API for arrays, and
don't necessarily apply to this RFE.  I *do* think that
the two-phase array creation protocol (first create
empty, then fill later) is fundamentally weak, both in
its lack of generality and its optimizability, but there
may be reasons I'm unaware of why people would
love to have more stuff in that vein.

> and we'll have to live with the
> toArray(new String[0]) wart forever anyways.  (But is it really so bad?)
> (Maybe toArray(Class<componentType>) is actually more natural?)

(I wish I had this API point:  T[] Class<T>.emptyArray() which would
cache the darn empty array.  It's T[] so it has to throw on primitives until
we get Valhalla generics.  I have written many a private-static-final
to hold an empty array for this sort of thing.  Would make toArray more
tolerable IMO.  In other dreams, empty varargs lists would use this array
as well.)

— John