Excessive copying of Collection.toArray()

Stuart Marks stuart.marks at oracle.com
Thu Sep 4 03:55:13 UTC 2025



>> There have been previous attempts to reduce copying here and there, but they do suffer from the problems you have described. See https://github.com/openjdk/jdk/pull/12212
> 
> That's a valuable link. It touches on the same problem and potential
> solutions. It's a shame it was abandoned. This superficially simple
> problem touches performance, security, and bootstrapping and that
> scares people off.

I think that PR is a good example of why we don't like to start off with code. As 
you noted there are a bunch of subtleties to the problem. Starting off with a PR 
with code mixes the fundamental issues with coding issues and the whole thing goes 
off into the weeds. If the code was written under the assumption that it was trying 
to solve problem A, but the real problem is problem B, then it turns into an uphill 
struggle to convert the existing code from solving problem A to solving problem B. 
Or worse, the code change bounces around trying to solve several problems but solves 
none of them well. (There have been several other examples of this phenomenon.)

> But I would still plead that there is some low-hanging fruit for
> improvement here. There are defensive clones and "is it an ArrayList
> (but not ArrayList subclass!!!)" optimizations copy-pasted in half a
> dozen scattered places. If that optimization is still valuable, it
> would benefit from a reusable utility method, where it can be
> centrally maintained, documented, and evangelized. If not, it should
> be deleted.

Agreed, a simple-though-incomplete approach might provide some useful improvements 
in the short term.

> Also, the knowledge of why the defensive clone is made at all, and why
> it must be of type Object[] and not a covariant subclass, these are
> currently arcane secrets held only by those who know the history of
> the associated bug reports. That's another part of why I argue for a
> central method that can be named and discussed. (I originally
> requested it in a webbug report, and was told it would be better
> discussed on the mailing list instead.)

Two issues here.

The need for an extra defensive copy was established and fixed via the security 
vulnerability process and as such, all discussion of it was embargoed. Sorry. 
However, the sources are all out there and if you look at the code prior to commit 
[1] you can see the vulnerable code. Also, the eagle-eyed Heinz Kabutz noticed this 
and wrote about it in his Java Specialists' newsletter [2]. (Leave it to Heinz to 
relate the ArrayList.class check to the particular local origin of Cretan 
vegetables.) But I think you've discerned the issue yourself already.

The second issue is that, by specification, the array returned by the no-arg 
toArray() method must have a component type of Object. The specification previously 
wasn't clear about this, and this led to a variety of bugs. See [3] and bugs linked 
from there. Now, *why* is it specified this way? Well, the static return type of 
toArray() is Object[], which leads callers to believe they can store any object into 
it. If toArray() were to return an array with some other component type, they'd get 
ArrayStoreException. So, the spec requires that you get what it says on the tin.

s'marks



[1]: https://github.com/openjdk/jdk/commit/343ecd806bb05065d31124573b3e58a864cfc617

[2]: 
https://www.javaspecialists.eu/archive/Issue290-CopyOnWriteArrayList-and-Collection.toArray.html

[3]: https://bugs.openjdk.org/browse/JDK-8160406



More information about the core-libs-dev mailing list