Two questions about forEach

Thu Aug 9 04:10:52 PDT 2012

I see your point,  you are saying:

             forEach means eager

is an obvius semantic. 

Well maybe, but I must confess I wouldn't say that forEach() is more eager
that map() if I hadn't read the Brian's paper.
I would had preferred forEach() being lazy, meaning return the incomming
Iterable, and provide someting more explicit like 

		.evaluate()  

But anyway, this is no so important, with map() you can do everything.

-----Mensaje original-----
De: Mike Duigou [mailto:mike.duigou at oracle.com] 
Enviado el: jueves, 09 de agosto de 2012 9:23
Para: Ryan Musgrave
CC: Jose; lambda-dev at openjdk.java.net
Asunto: Re: Two questions about forEach

On Aug 8 2012, at 22:54 , Ryan Musgrave wrote:

> In your example:
> 
> bar.stream()
>   .forEach( { f -> f.refresh();})
>   .filter( Foo::IsActive )
>   .findFirst();
> 
> I would expect the operations to apply in order. It shouldn't matter 
> whether or not they are lazy, eager, or if some steps can be optimised 
> out as long as the results are correct.

Does every item get refreshed in order to complete findFirst? Do you really
want findFirst to inspect every element? The semantics and efficiency are
important. forEach is intended to provide the semantic which matches it's
name, the lambda is executed for each item unconditionally and eagerly. I
don't want to relie upon static analysis tools to "maybe" find this problem.
There are other fluent alternatives to using forEach in the middle of a
stream.

If you do have examples of situation you would use a lazy forEach (not a map
or a filter) I'd like to hear them. In general though we're trying to
structure the api such that both the sources of elements and the elements
are immutable through all sequences of operations.

I would like to be able to return something from forEach but the source
stream ain't it. Originally forEach did return the stream but that was only
because methods with void returns make me "itchy"--methods should return
*something* if they can do so cheaply rather than void. Returning the stream
obscures the eager nature of forEach. Element count has been suggested since
it would be quite cheap and potentially useful.  

Another option is to abuse map() by providing a mapper that returns the same
object it is provided but also mutates them as desired much like what a lazy
forEach might do.

> The important part is the semantics. A static analysis tool could 
> quite easily warn a developer that they are using an eager operation 
> in the middle of lazy operations on a stream.

Typing, ie. not returning a stream, has the same effect. forEach is as
consistent as we can make it with the other eager ops.

> The .into() example could be the same too:
> 
> foo.stream()
>   .map(x -> x.convertToXml())
>   .into(xmlResults)
>   .map(XlstHelper::convertToFormatA)
>   .filter(x -> x != null)
>   .findFirst();
> if(LOG.isDebug()) {
>  xmlResults.LogResults();
> }
> 
> Is the only reason that this is prevented because the goal is to 
> prevent bad practice/multiple iteration of a collection?

into()'s return result is intended for collecting the result of a new
operation. So you say:

List<XmlDocument> docs = foo.stream()
  .map(x -> x.convertToXml())
  .into(ArrayList<>::new);

XmlDocument someDoc = docs.stream().map(XlstHelper::convertToFormatA)
  .filter(x -> x != null)
  .findFirst().getOrThrow(() -> new NoSuchElementException);

Having to do the ArrayList creation in a separate step before the stream may
be slightly annoying. If you prefer a single long statement you can always
do:

List<XmlDocument> docs = new ArrayList<>(); XmlDocument someDoc =
foo.stream()
  .map(x -> x.convertToXml())
  .into(docs)
  .stream() // turn result back into a stream.
  .map(XlstHelper::convertToFormatA)
  .filter(x -> x != null)
  .findFirst().getOrThrow(() -> new NoSuchElementException);

This form has "less meat, more bun" it's more verbose and still two
statements. This will work but I don't imagine people will use it.

Mike

> 
> Regards,
> Ryan
> 
> On Thu, Aug 9, 2012 at 10:47 AM, Mike Duigou <mike.duigou at oracle.com>
wrote:
>> 
>> On Aug 8 2012, at 03:51 , Jose wrote:
>> 
>>> 2. Why forEach() returns void?. In this way the pipeline is sealed 
>>> at the end, no more pieces can be attached to it.
>>> It would better returning the incomming iterable (its elements maybe 
>>> modified by the lambda)
>> 
>> The problem is that forEach(), like into(), is an eager operation. For
consistency with other eager operations forEach "consumes" the elements. If
you find you are wanting to put forEach() in the middle of a pipeline you
*probably* want to be using map instead which is properly lazy. forEach in
the middle of a pipeline would remove the opportunity for lazy behaviour
later in the pipeline.
>> 
>> Consider (imagining that forEach returned the stream) :
>>    Set<ServerConnection> bar;
>> 
>>    ServerConnection server = bar.stream().forEach(f-> 
>> f.refresh();}).filter(Foo::isActive).findFirst();
>> 
>> This snippet starts with a pool of server connections, refreshes each
connection, filters out the active ones, and returns the first active.
>> 
>> Using forEach for the refresh() step means that *every* server is
refreshed before even finding the first active server. This is probably a
huge waste of effort. Rather than calling forEach it would be better to make
the filter stage do the refresh (yes, it's a more complicated lambda, but
the benefit is that only a smaller set of servers must be refresh()ed).
>> 
>> If there are examples of forEach in the middle of a pipeline that do 
>> make sense to you, feel free to ask for refactoring suggestions. :-)
>> 
>> Mike
>>