RFR: 8180352: Add Stream.toList() method
Brian Goetz
brian.goetz at oracle.com
Wed Nov 4 17:58:59 UTC 2020
> As for nullity topic, I really welcome that the proposed toList() is
> null-tolerant but it worth mentioning that existing terminal
> operations like findFirst(), findAny(), min() and max() are already
> null-hostile.
The min() and max() methods are not null hostile. If you pass a
null-friendly comparator (such as returned by the `nullsFirst()` and
`nullsLast()` comparator combinators), nulls are not a problem. The
null hostility comes not from streams, but from the behaviors that
clients pass to stream methods, whether they be min() or max() or
map(). If the behaviors passed to _any_ stream method are null-hostile,
then (usually) so will be that stream pipeline. But this is something
entirely under the control of the user; users should ensure that their
domain and the behaviors that operate on that domain are consistent.
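To make the comparator point concrete, here is a minimal sketch. The class and method names are made up for illustration. The same stream that fails with a null-hostile comparator works once the comparator is wrapped with `nullsFirst()` or `nullsLast()`.

```java
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical demo class; not part of the JDK or of the proposal.
final class NullFriendlyMinMax {

    // max() with a comparator that orders nulls before every non-null value.
    static String maxNullsFirst(String... values) {
        Comparator<String> cmp = Comparator.nullsFirst(Comparator.<String>naturalOrder());
        return Stream.of(values).max(cmp).orElse(null);
    }

    // min() with a comparator that orders nulls after every non-null value.
    static String minNullsLast(String... values) {
        Comparator<String> cmp = Comparator.nullsLast(Comparator.<String>naturalOrder());
        return Stream.of(values).min(cmp).orElse(null);
    }

    // The plain natural-order comparator is the null-hostile part:
    // it throws as soon as it has to compare against a null.
    static boolean naturalOrderThrows(String... values) {
        try {
            Stream.of(values).max(Comparator.<String>naturalOrder());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(maxNullsFirst("b", null, "a"));      // b
        System.out.println(minNullsLast("b", null, "a"));       // a
        System.out.println(naturalOrderThrows("b", null, "a")); // true
    }
}
```

One caveat on this sketch: the comparator only decides whether nulls can be compared. If the selected element itself is null, min() and max() still throw, because the result is wrapped in an Optional.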
What we're talking about is what _streams_ should do. Streams should
not gratuitously narrow the domain. Since `toList()` will be built into
streams, we have to define this clearly, and there's no justification
for making this method null-hostile. The arguments made so far that
toList() should somehow be null-hostile appear to be nothing more than
weak wrappers around "I hate nulls, so let's make new methods
null-hostile."
Remi says:
> You know that you can not change the implementation of
> Collectors.toList(), and you also know that if there is a method
> Stream.toList(), people will replace the calls to
> .collect(Collectors.toList()) by a call to Stream.toList() for the
> very same reason but you want the semantics of Stream.toList() to be
> different from the semantics of Collectors.toList().
This is what I call a "for consistency" argument. Now, we all know that
consistency is a good starting point, but almost invariably, when
someone says "you should do X for consistency with Y", that "for
consistency" argument turns out to be little more than a thin wrapper
around "I prefer Y, and I found a precedent for it." Yes, people will be
tempted to make this assumption -- at first. (And most of the time, that
will be fine -- the most common thing people do after collecting to a
list is iterate the list.) But it is a complete inversion to say that
the built-in method must be consistent with any given existing
Collector, even if that collector has a similar name. The built-in
method should provide sensible default to-list behavior, and if you want
_any other_ to-list behavior -- a mutable list, a different type of
list, a null-hostile list, a list that drops prime-numbered elements,
whatever -- you use the more general tool, which is collect(), which
lets you do whatever you want, and comes with a variety of
pre-configured options. And this is the most sensible default behavior
for a built-in to-list operation.
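As a rough illustration of the split between the built-in default and the general tool, the sketch below assumes a JDK in which `Stream.toList()` is available (16 or later). The class and method names are invented for the example, and the collectors shown are just two of the pre-configured options.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
final class ToListChoices {

    // Built-in default: nulls in the stream end up in the list.
    static List<String> builtIn() {
        return Stream.of("a", null, "b").toList();
    }

    // Want a mutable list of a specific type? Use collect().
    static List<String> mutableArrayList() {
        return Stream.of("a", null, "b")
                .collect(Collectors.toCollection(ArrayList::new));
    }

    // Want a null-hostile list? Use collect() with a collector that rejects nulls.
    static boolean nullHostileCollectorThrows() {
        try {
            Stream.of("a", null, "b").collect(Collectors.toUnmodifiableList());
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(builtIn());                    // [a, null, b]
        List<String> mutable = mutableArrayList();
        mutable.add("c");
        System.out.println(mutable);                      // [a, null, b, c]
        System.out.println(nullHostileCollectorThrows()); // true
    }
}
```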
findFirst/findAny are indeed sad corner cases, because we didn't have a
good way of representing "maybe absent nullable value." (If that case
were more important, we might have done more work to support it.) But I
think it would be a poor move to try and extrapolate from this behavior;
this behavior is merely a hostage to circumstance.
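For readers who have not run into this corner case, a small sketch of what it looks like in practice. The class name and the nested-Optional workaround are illustrative assumptions of this example, not something the API prescribes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class; names are illustrative only.
final class FindFirstCornerCase {

    // findFirst() has to wrap its result in an Optional, which cannot
    // hold null, so selecting a null element throws.
    static boolean findFirstThrowsOnNull(List<String> values) {
        try {
            values.stream().findFirst();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // One possible workaround (an assumption of this sketch): wrap each
    // element first. Outer empty means "no elements"; inner empty means
    // "the first element was null".
    static Optional<Optional<String>> firstNullable(List<String> values) {
        return values.stream().map(Optional::ofNullable).findFirst();
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(null, "a");
        System.out.println(findFirstThrowsOnNull(values));             // true
        System.out.println(firstNullable(values).isPresent());         // true
        System.out.println(firstNullable(values).get().isPresent());   // false
    }
}
```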
More information about the core-libs-dev
mailing list