ArrayFactory SAM type / toArray
Brian Goetz
brian.goetz at oracle.com
Wed Sep 19 15:13:43 PDT 2012
> I don't find these arguments convincing. There's no race (any more than
> there is for any bulk operation) as the allocation is done by the object
> itself. The allocation stuff is pretty much a red herring: most users
> don't preallocate the array. So it seems to me that using factories here
> might amount to needless complexity and inconsistency.
I agree with you that most users don't pre-allocate the array. Which
makes the existing form of toArray even more unfortunate! Because then
the allocation always involves multiple reflective calls. (Some of
which are sometimes optimized by some VMs in some conditions, but none
of which are always optimized by all VMs in all conditions.) So the
performance will always be worse in the toArray(T[]) formulation.
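To make the reflective path concrete, here is a rough sketch of what a toArray(T[]) implementation has to do when the caller passes a too-small array. The class and method names are made up for illustration, and this is an approximation of the pattern rather than the actual JDK source:

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, for illustration only; not the JDK implementation.
final class ReflectiveToArraySketch {

    @SuppressWarnings("unchecked")
    static <T> T[] toArray(Collection<?> source, T[] a) {
        int size = source.size();
        T[] result = a;
        if (a.length < size) {
            // The library cannot write "new T[size]", so it first has to
            // recover the element type from the array it was handed...
            Class<?> componentType = a.getClass().getComponentType();
            // ...and then allocate the real array reflectively.
            result = (T[]) Array.newInstance(componentType, size);
        }
        int i = 0;
        for (Object element : source) {
            result[i++] = (T) element;
        }
        if (i < result.length) {
            result[i] = null; // mark the end when the array has spare room
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "c");
        // The zero-length array is discarded; it only carries the type.
        String[] copy = toArray(names, new String[0]);
        System.out.println(Arrays.toString(copy));
    }
}
```

The point of the sketch is that the "new Foo[0]" argument is only a type carrier: the array that comes back is a different one, allocated through java.lang.reflect.Array.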
The fundamental problem is that the client knows how best to create the
array (the client can say "new Foo" but the library cannot say "new T",
and therefore has to fall back to reflection), but the library knows
best how big the array should be.
This is a classic example of the sort of differences in APIs you get
when designing an API with or without closures. The client knows how;
the library knows how much; ideally we'd like for the client to pass
that knowledge into the library. The approximations we got when it was
hard to combine the two are unfortunate; we can do better now.
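As a minimal sketch of that division of labor: the library side could look something like the following. FactoryBag is a made-up collection, and java.util.function.IntFunction is used here only as a stand-in for the ArrayFactory SAM type under discussion; both are assumptions for the sake of the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical collection, for illustration only.
final class FactoryBag<T> {
    private final List<T> items = new ArrayList<>();

    FactoryBag<T> add(T item) {
        items.add(item);
        return this;
    }

    // IntFunction<T[]> stands in for the proposed ArrayFactory SAM type.
    T[] toArray(IntFunction<T[]> factory) {
        // The library knows "how much": it picks the size itself...
        int size = items.size();
        // ...and the client knows "how": the factory does a plain "new Foo[n]".
        T[] result = factory.apply(size);
        int i = 0;
        for (T item : items) {
            result[i++] = item;
        }
        return result;
    }

    public static void main(String[] args) {
        FactoryBag<String> bag = new FactoryBag<String>().add("x").add("y");
        String[] viaLambda = bag.toArray(n -> new String[n]);
        String[] viaConstructorRef = bag.toArray(String[]::new);
        System.out.println(Arrays.toString(viaLambda));
        System.out.println(Arrays.toString(viaConstructorRef));
    }
}
```

No reflection is involved anywhere in this version, and the caller never has to ask for the size.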
I'd find David's suggestion of toArray(Class) more compelling (in some
sense it is the most "right" in that it doesn't conflate "what" with
"how") except I don't buy that the intrinsification of reflective array
allocation in some VMs in some compilation modes in some situations
makes all the reflective costs go away.
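For comparison, a toArray(Class) variant might be sketched as below. Again the helper name is hypothetical and this is only one way it could be written; the thing to notice is that the allocation still goes through java.lang.reflect.Array.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, for illustration only.
final class ClassTokenToArray {

    @SuppressWarnings("unchecked")
    static <T> T[] toArray(Collection<? extends T> source, Class<T> type) {
        // The class token says "what", but the library still cannot write
        // "new T[n]", so the allocation remains a reflective call.
        T[] result = (T[]) Array.newInstance(type, source.size());
        int i = 0;
        for (T element : source) {
            result[i++] = element;
        }
        return result;
    }

    public static void main(String[] args) {
        Integer[] numbers = toArray(Arrays.asList(1, 2, 3), Integer.class);
        System.out.println(Arrays.toString(numbers));
    }
}
```

This drops the throwaway array from the call site, but whether the reflective allocation is cheap still depends on what the VM chooses to optimize.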
We're creating a new API here. All things being equal, we should lean
on consistency with existing APIs when we can, but obviously that is
just a guideline (someday we're going to have to contend with the fact
that an int isn't big enough to store the size of collections). The
existing toArray signatures are the best we could have done at the time
(and that was a very different time), but that doesn't mean we shouldn't
seek to do any better.
Here are what the client callsites might look like in various cases:
// status quo
Foo[] foos = ...toArray(new Foo[0]); // ugh reflection
Foo[] foos = ...toArray(new Foo[xyz.size()]); // ugh ugly and racy
// proposed
Foo[] foos = ...toArray(n -> new Foo[n]);
// David's alternative
Foo[] foos = ...toArray(Foo.class);
I don't see the "complexity of factories" being a big problem here -- if
people can deal with lambdas at all, this is a pretty simple case, and
it's only a few characters longer than the "new Foo[0]" version. I think
the lambda code reads pretty naturally. (Actually I find the "new
Foo[0]" the most confusing -- why would I pass in a new empty array?)