RFR: 8180352: Add Stream.toList() method

Brian Goetz brian.goetz at oracle.com
Wed Nov 4 17:58:59 UTC 2020


> As for nullity topic, I really welcome that the proposed toList() is
> null-tolerant but it is worth mentioning that existing terminal
> operations like findFirst(), findAny(), min() and max() are already
> null-hostile.

The min() and max() methods are not null-hostile.  If you pass a 
null-friendly comparator (such as returned by the `nullsFirst()` and 
`nullsLast()` comparator combinators), nulls are not a problem.  The 
null hostility comes not from streams, but from the behaviors that 
clients pass to stream methods, whether they be min() or max() or 
map().  If the behaviors passed to _any_ stream method are null-hostile, 
then (usually) so will be that stream pipeline.  But this is something 
entirely under the control of the user; users should ensure that their 
domain and the behaviors that operate on that domain are consistent.
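
To make the comparator point concrete, here is a minimal sketch (the class name `NullFriendlyMin` and the sample data are made up for illustration).  The same min() call fails or succeeds depending only on the comparator passed in.  It uses `nullsLast()` rather than `nullsFirst()` so that the selected minimum is itself non-null, since the result comes back wrapped in an Optional.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class NullFriendlyMin {
    // Illustrative data containing a null element.
    static final List<Integer> DATA = Arrays.asList(3, null, 1);

    // Null-hostile behavior: naturalOrder() dereferences its arguments,
    // so the NullPointerException comes from the comparator, not the stream.
    static Optional<Integer> minNullHostile() {
        return DATA.stream().min(Comparator.<Integer>naturalOrder());
    }

    // Null-friendly behavior: nulls sort last, so the minimum is non-null.
    static Optional<Integer> minNullFriendly() {
        return DATA.stream()
                   .min(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
    }

    public static void main(String[] args) {
        try {
            minNullHostile();
        } catch (NullPointerException e) {
            System.out.println("naturalOrder(): NullPointerException");
        }
        System.out.println("nullsLast(): " + minNullFriendly());
    }
}
```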

What we're talking about is what _streams_ should do.  Streams should 
not gratuitously narrow the domain.  Since `toList()` will be built into 
streams, we have to define this clearly, and there's no justification 
for making this method null-hostile.  The arguments made so far that 
toList() should somehow be null-hostile appear to be nothing more than 
weak wrappers around "I hate nulls, so let's make new methods 
null-hostile."
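
As a small sketch of what a null-tolerant toList() looks like from the caller's side, assuming a JDK that ships the method (it was added in JDK 16); the class name `ToListNulls` is hypothetical:

```java
import java.util.List;
import java.util.stream.Stream;

public class ToListNulls {
    // The stream does not narrow the domain: the null element is
    // carried through to the resulting list unchanged.
    static List<String> collectWithNull() {
        return Stream.of("a", null, "b").toList();
    }

    public static void main(String[] args) {
        List<String> list = collectWithNull();
        System.out.println(list.size());          // three elements
        System.out.println(list.get(1) == null);  // the null survived
    }
}
```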

Remi says:

> You know that you can not change the implementation of 
> Collectors.toList(), and you also know that if there is a method 
> Stream.toList(), people will replace the calls to 
> .collect(Collectors.toList()) by a call to Stream.toList() for the 
> very same reason but you want the semantics of Stream.toList() to be 
> different from the semantics of Collectors.toList().

This is what I call a "for consistency" argument.  Now, we all know that 
consistency is a good starting point, but almost invariably, when 
someone says "you should do X for consistency with Y", that "for 
consistency" argument turns out to be little more than a thin wrapper 
around "I prefer Y, and I found a precedent for it." Yes, people will be 
tempted to make this assumption -- at first. (And most of the time, that 
will be fine -- the most common thing people do after collecting to a 
list is iterate the list.)   But it is a complete inversion to say that 
the built-in method must be consistent with any given existing 
Collector, even if that collector has a similar name.  The built-in 
method should provide sensible default to-list behavior, and if you want 
_any other_ to-list behavior -- a mutable list, a different type of 
list, a null-hostile list, a list that drops prime-numbered elements, 
whatever -- you use the more general tool, which is collect(), which 
lets you do whatever you want, and comes with a variety of 
pre-configured options.  And this is the most sensible default behavior 
for a built-in to-list operation.
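
For illustration, a few of the "other to-list behaviors" expressed through collect(), using collectors that exist in java.util.stream.Collectors (toUnmodifiableList() needs JDK 10 or later).  The class and method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OtherToListBehaviors {
    // A mutable list: ask for a specific, modifiable implementation.
    static ArrayList<String> mutable(Stream<String> s) {
        return s.collect(Collectors.toCollection(ArrayList::new));
    }

    // A different type of list.
    static LinkedList<String> linked(Stream<String> s) {
        return s.collect(Collectors.toCollection(LinkedList::new));
    }

    // A null-hostile list: this collector rejects null elements.
    static List<String> nullHostile(Stream<String> s) {
        return s.collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        List<String> m = mutable(Stream.of("a", "b"));
        m.add("c"); // allowed, the list is mutable
        System.out.println(m);
        System.out.println(linked(Stream.of("a", "b")));
        try {
            nullHostile(Stream.of("a", null));
        } catch (NullPointerException e) {
            System.out.println("toUnmodifiableList(): NullPointerException");
        }
    }
}
```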

findFirst/findAny are indeed sad corner cases, because we didn't have a 
good way of representing "maybe absent nullable value."  (If that case 
were more important, we might have done more work to support it.)  But I 
think it would be a poor move to try and extrapolate from this behavior; 
this behavior is merely a hostage to circumstance.
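
A short sketch of that corner case: findFirst() returns an Optional, which cannot hold null, so a null first element results in a NullPointerException.  The wrapping shown in `wrappedFindFirst()` is only one possible way a caller could represent "maybe absent nullable value" on their own; it is an illustration, not something proposed in this thread, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class FindFirstNull {
    // Illustrative data whose first element is null.
    static final List<String> DATA = Arrays.asList(null, "a");

    // Throws NullPointerException: the selected element is null and
    // Optional has no way to represent "present but null".
    static Optional<String> plainFindFirst() {
        return DATA.stream().findFirst();
    }

    // Caller-side workaround: outer Optional = "was there an element?",
    // inner Optional = "was that element non-null?".
    static Optional<Optional<String>> wrappedFindFirst() {
        return DATA.stream().map(Optional::ofNullable).findFirst();
    }

    public static void main(String[] args) {
        try {
            plainFindFirst();
        } catch (NullPointerException e) {
            System.out.println("findFirst(): NullPointerException");
        }
        Optional<Optional<String>> first = wrappedFindFirst();
        System.out.println(first.isPresent());        // an element exists
        System.out.println(first.get().isPresent());  // but it is null
    }
}
```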



