Stream implements Iterable, thoughts from Guava

Fri Apr 26 14:47:16 UTC 2019

(Kevin and I have been hoping in vain to spend more time on this, so here's
a dump of the additional long email I wrote last month but hadn't sent.
Obviously still no pressure on you to pick this back up.)

Last time, I presented anecdotes about how "Stream implements Iterable"
could cause problems for Iterable-accepting methods. My hope has been to
weigh those against the benefits of using Stream in foreach. Now, I'm back
to argue that the benefits to foreach aren't as large as they initially
sound.

As the proposal points out, it's always possible to fall back to an
old-style for() loop with an Iterator variable. While I agree that this is
"an uncomfortable step backwards," I would argue three things:

1. It's good for there to be some friction.
2. Some people don't even think of this possibility.
3. The step doesn't actually make code much longer.

In more detail:

1. It's good for there to be some friction.

This has already been brought up, at least that people might prefer for()
over forEach(), so I don't want to dwell on it. Anecdote: When Kevin
reviewed all calls to our old Stream-to-Iterable adapter (only 15 at that
point), the 1 caller who was using it for a for() loop was reimplementing
collect(toMap()). So I just want to emphasize that this concern applies to
more than just forEach(). (Of course you're familiar with the same basic
problem from forEach() itself, which people sometimes use instead of
collect(), etc.)

(Naturally we don't want *too* much friction, so one way to frame this part
of the debate is "How much is too much?" In my remaining points, I'll argue
that it's less friction than it appears.)

2. Some people don't even think of this possibility.

That's not a *good* thing, of course: We'd rather that users who really
want to break out of Stream-land be able to find the way to do so. But I
think this is one reason that the demand for "foreach for Streams" hasn't
gone away: People just don't know about the closest alternative. Perhaps,
if we more widely encouraged the old-style loop, it would reduce demand for
foreach.

For example, while the proposal calls out the old-style loop as a
possibility, it's not mentioned in
https://bugs.openjdk.java.net/browse/JDK-8148917 -- nor in most of the
pages linked from there. That's fair when the problem is phrased as "How do
I use Stream with foreach?" but that's arguably an XY problem: What someone
wanted is {checked exceptions, break/continue/return, etc.}; what that
person tried is foreach, but that might not be the best solution.

3. The step doesn't actually make code much longer.

The proposal presents some code, but I'd emphasize two things:

- Anyone with "foreach for Streams" will also have `var`.

- Users are likely to already want to declare a variable for their Stream.

So I think the comparison looks like this:

var names =
    annotations.stream()
        .map(ASTHelpers::getSymbol)
        .filter(Objects::nonNull)
        .map(Symbol::getSimpleName)
        .map(a -> "@" + a);
for (var name : names) {
  doSomething(name);
}

vs.

var names =
    annotations.stream()
        .map(ASTHelpers::getSymbol)
        .filter(Objects::nonNull)
        .map(Symbol::getSimpleName)
        .map(a -> "@" + a)
        .iterator();
while (names.hasNext()) {
  doSomething(names.next());
}

(code is adapted from
https://github.com/google/error-prone/blob/61cb540c08ec63faa56dccce00049cff1f8b41ea/core/src/main/java/com/google/errorprone/bugpatterns/AnnotationPosition.java#L230,
the first Stream use I found in Google open-source code)

To be fair:

- Sometimes users will want to extract each element into a variable.

- Sometimes users *won't* have a variable for the Stream. This seems likely
if the Stream is returned from a method that "does most of the work,"
rather than being created and used inline (if that distinction makes
sense?). My impression in Google code is that most streams are created
inline, but I admit I don't have numbers, and as always, we might not be
representative. (In particular, I think that we and you give somewhat
different guidance on when a method should return Stream.)

So sometimes the comparison looks like this:

for (var name : makeTheStream()) {
  doSomething(name);
}

vs.

for (var names = makeTheStream().iterator(); names.hasNext(); ) {
  var name = names.next();
  doSomething(name);
}

That's definitely worse. But again, it's worth noting that `var` still
helps. Thanks to `var`, foreach is less important today than it was when it
was first introduced.

To be clear: foreach is still good! I still like it! But we're no longer
competing with:

for (Iterator<Something<Foo>> names = makeTheStream().iterator();
    names.hasNext();
    ) {
  Something<Foo> name = names.next();
  doSomething(name);
}

If we're considering, then, what's changed since the original decision to
*not* make Stream implement Iterable, I'd include `var`. And I'd argue that
it improves old-style for() loops enough that we don't much need foreach
over Stream.

So the question again, as I see it, is how often foreach is useful for
Stream (and how much effort that saves) versus how often Iterable-ness will
be used when it "shouldn't" be (and how bad that is).