ArrayFactory SAM type / toArray

Wed Sep 19 16:45:44 PDT 2012

On 09/19/2012 05:13 PM, Brian Goetz wrote:
>> I don't find these arguments convincing.  There's no race (any more than
>> there is for any bulk operation) as the allocation is done by the object
>> itself.  The allocation stuff is pretty much a red herring: most users
>> don't preallocate the array. So it seems to me that using factories here
>> might amount to needless complexity and inconsistency.
>
> I agree with you that most users don't pre-allocate the array.  Which
> makes the existing form of toArray even more unfortunate!  Because then
> the allocation always involves multiple reflective calls.  (Some of
> which are sometimes optimized by some VMs in some conditions, but none
> of which are always optimized by all VMs in all conditions.)  So the
> performance will always be worse in the toArray(T[]) formulation.
>
> The fundamental problem is that the client knows how best to create the
> array (the client can say "new Foo" but the library cannot say "new T",
> and therefore has to fall back to reflection), but the library knows
> best how big the array should be.
>
> This is a classic example of the sort of differences in APIs you get
> when designing an API with or without closures.  The client knows how;
> the library knows how much; ideally we'd like for the client to pass
> that knowledge into the library.  The approximations we settled for
> when it was hard to combine these are unfortunate; we can do better now.
>
> I'd find David's suggestion of toArray(Class) more compelling (in some
> sense it is the most "right" in that it doesn't conflate "what" with
> "how") except I don't buy that the intrinsification of reflective array
> allocation in some VMs in some compilation modes in some situations
> makes all the reflective costs go away.

It should be:

   <S super T> S[] toArray(Class<S> clazz);

Likewise if the ArrayFactory deal is the way we go it probably ought to be:

   <S super T> S[] toArray(ArrayFactory<S> factory);

...because it seems to me you might have a wildcarded stream, or a more
specific stream type than the one you intend to capture.  Restricting it
to the one actual type is, in a way, as bad as returning Object[].
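
To make the widening concrete, here is a minimal sketch. Java does not accept a `super` bound on a method type parameter, so the `<S super T>` signatures above cannot be written literally. The sketch assumes a static helper taking `Collection<? extends S>` instead, which should have the same effect. The method name `newArray` and the class name `ToArraySketch` are hypothetical, and sizing from `size()` before iterating is a simplification of what the collection would do internally.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ToArraySketch {
    /** Hypothetical SAM type: the client knows how to create the array. */
    @FunctionalInterface
    interface ArrayFactory<S> {
        S[] newArray(int size);
    }

    /**
     * Stand-in for the proposed toArray(ArrayFactory). The library side
     * decides how big the array is; the factory decides how to allocate it.
     * S may be any supertype of the collection's element type.
     */
    static <S> S[] toArray(Collection<? extends S> source, ArrayFactory<S> factory) {
        S[] result = factory.newArray(source.size());
        int i = 0;
        for (S element : source) {
            result[i++] = element;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        // Capture a wider array type than the element type of the source.
        Number[] numbers = toArray(ints, n -> new Number[n]);
        System.out.println(Arrays.toString(numbers)); // prints [1, 2, 3]
    }
}
```

With the signature restricted to the one actual type, the call above could only produce an `Integer[]`.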

> We're creating a new API here.  All things being equal, we should lean
> on consistency with existing APIs when we can, but obviously that is
> just a guideline (someday we're going to have to contend with the fact
> that an int isn't big enough to store the size of collections).  The
> existing toArray signatures are the best we could have done at the time
> (and that was a very different time), but that doesn't mean we shouldn't
> seek to do any better.
>
> Here are what the client callsites might look like in various cases:
>
>   // status quo
>   Foo[] foos = ...toArray(new Foo[0]);            // ugh reflection
>   Foo[] foos = ...toArray(new Foo[xyz.size()]);   // ugh ugly and racy
>
>   // proposed
>   Foo[] foos = ...toArray(n -> new Foo[n]);
>
>   // David's alternative
>   Foo[] foos = ...toArray(Foo.class);
>
> I don't see the "complexity of factories" being a big problem here -- if
> people can deal with lambdas at all, this is a pretty simple case, and
> it's only a few characters longer than the "new Foo[0]" version.  I think
> the lambda code reads pretty naturally.  (Actually I find the "new
> Foo[0]" the most confusing -- why would I pass in a new empty array?)
>
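
For comparison with the call sites quoted above, here is a rough sketch of what the toArray(Class) variant has to do underneath. Because the library cannot write "new S[n]", it goes through `java.lang.reflect.Array.newInstance`, which is the reflective cost under discussion. The class name `ReflectiveToArraySketch` and the static-helper shape are assumptions for illustration, not a proposed API.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;

class ReflectiveToArraySketch {
    /** Stand-in for toArray(Class): allocation happens via reflection. */
    @SuppressWarnings("unchecked")
    static <S> S[] toArray(Collection<? extends S> source, Class<S> componentType) {
        // Unchecked cast: newInstance returns Object, typed only at run time.
        S[] result = (S[]) Array.newInstance(componentType, source.size());
        int i = 0;
        for (S element : source) {
            result[i++] = element;
        }
        return result;
    }

    public static void main(String[] args) {
        Collection<String> names = Arrays.asList("a", "b");

        // Status quo: the empty array only carries the component type.
        String[] viaEmpty = names.toArray(new String[0]);
        // Status quo: presized by the caller, who has to ask for the size.
        String[] viaSized = names.toArray(new String[names.size()]);
        // Class-based alternative, also capturing a wider type.
        CharSequence[] viaClass = toArray(names, CharSequence.class);

        System.out.println(Arrays.toString(viaEmpty));
        System.out.println(Arrays.toString(viaSized));
        System.out.println(Arrays.toString(viaClass));
    }
}
```

All three calls yield the same elements; they differ in who allocates the array and whether reflection is involved.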

-- 
- DML