RFR: 8180352: Add Stream.toList() method
Donald Raab
donraab at gmail.com
Thu Feb 4 17:59:40 UTC 2021
I have been reading previous threads, the original bug request, and exploring the javadoc and implementation of toList() on Stream in JDK 16. I don’t want to waste time rehashing previous discussions, but I want to understand and prioritize the motivation of this change, and propose what I believe is a safer alternative name for this method based on the current implementation: Stream.toUnmodifiableList().
The original justification for Stream.toList() as described in the bug request filed in 2017 was: "It makes the coding just easier, saves some time and place.”
see: https://bugs.openjdk.java.net/browse/JDK-8180352
Summarizing my interpretation of the eventual motivations for the change, over various threads (please correct as needed):
1. Convenience (easier to discover API and less code for developers to write - potential refactoring of large number of unmodifiable usages of Collectors.toList())
2. Performance (less copying during parallel operations and pre-sizing for serial operations)
3. Unmodifiability (List result but thread-safe and usable in static contexts and as hash keys - unsafe for mutating operations)
4. Support for nulls (Inconsistent with Collectors.toUnmodifiableList() but consistent with other Stream methods)
List is a mutable interface. It’s contract allows for “conditional mutability”. A convention was established in 2014 with Collectors.toList() returning a mutable List (ArrayList). The spec in Collectors.toList() says there is no guarantee on the type, but there appears to be no appetite to undo this. The current Collectors.toList() convention was then further strengthened with the addition of Collectors.toUnmodifiableList() in Java 10 in 2018.
The problem as I see it is that we are currently only looking inside of the JDK for convention. The convention of using the toList() name has existed in other libraries and application code for as long as List has been available as an interface. The problem of inconsistency of convention within the JDK itself can be argued away as a historical mistake, but harder to claim for the "established convention" in other libraries.
We learned an important lesson in JDK 15 when the addition of a default isEmpty method to CharSequence broke Eclipse Collections. See this excellent blog from Stuart Marks: https://stuartmarks.wordpress.com/2020/09/22/incompatibilities-with-jdk-15-charsequence-isempty/.
The Stream interface has existed for seven years now. The method toList() is a prime candidate for anyone who wanted to optimize for convenience over the past seven years.
I found this default definition of toList() on the Folds interface in Cyclops:
https://github.com/aol/cyclops/blob/dc26bdad0f2b8a5aa87de4aa946ed52a0dbf00de/cyclops/src/main/java/com/oath/cyclops/types/foldable/Folds.java#L99
I discovered these occurrences of toList() in the jOOQ library.
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/Collectable.java#L682
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/SeqImpl.java#L557
Incidentally, this interface also defines toUnmodifiableList().
https://github.com/jOOQ/jOOL/blob/main/jOOL/src/main/java/org/jooq/lambda/Collectable.java#L692
jOOQ did not define it’s toList() method as a default method on Collectable, so it should not trigger the default method issue we discovered in JDK 15, but the specifications of toList() on Collectable and Stream are incompatible. Which specification should win?
Eclipse Collections consistent convention in all of its interfaces is that toList returns a MutableList [1]. MutableList extends java.util.List, which is a mutable interface. The Stream.toList() method will violate the principle of least surprise for Eclipse Collections users based on long time established conventions for the the library. There may be other libraries and applications that have defined their own toList() methods to return List as well.
Lastly, Craig Motlin explains the performance optimization we use for toList in our parallel implementation in his talk in 2014: https://www.infoq.com/presentations/java-streams-scala-parallel-collections/
Thanks,
Don
[1] Example usages of Eclipse Collections toList:
// toList result is mutable for all of these usages with Eclipse Collections
List list1 = mutableSet.toList();
List list2 = mutableSet.asLazy().toList();
List list3 = mutableSet.asParallel(Executors.newWorkStealingPool(), 10).toList();
List list4 = mutableSet.stream().collect(Collectors.toList());
List list5 = mutableSet.stream().collect(Collectors2.toList());
// toList result is currently unmodifiable in Stream
List list6 = mutableSet.stream().toList();
More information about the core-libs-dev
mailing list