forEach/forEachOrdered/forEachRemaining ... iterate?

Mon Jun 24 01:31:29 PDT 2013

On 06/13/2013 05:50 PM, Brian Goetz wrote:
> It's a mess for sure, but I am not sure whether this is a good path 
> out of the forest.  There are only so many candidate names, and the 
> "try and give every unique semantics a unique name" eventually 
> degenerates into them all being some form of "forEach #352", where the 
> user code is still not readable without a decoder chart.
>
> Here are some of the dimensions covered here:
>
>  - data arity (Consumer vs BiConsumer)
>  - data projection (element, map key, map value)
>  - sequential vs parallel
>  - whether to prefer encounter (spatial) or temporal order
>  - more...
>
> It is impossible for a name to be readable and writable and still 
> precisely convey all the dimensions here; forEach is fuzzy, 
> forEachKeyParallelInEncounterOrder is precise but awful.  I don't know 
> what the difference between iterate and forEach should be; I could 
> learn, but it will never be obvious because of the way our brains map 
> words to concepts.
>
> Without overloading names, it is already spiraling out of control.  
> The goal here was for "forEach" to be the natural behavior for the 
> thing being iterated; for a sequential stream, that is a sequential 
> encounter-order traversal and for a parallel stream, that is a 
> parallel temporal-order traversal.  Variants of forEach (like 
> forEachOrdered) allow you to select the less natural variant.  (*I 
> would rather get rid of forEachRemaining and make them "forEach" also.*)

+1

I think that the majority of usages will iterate over the whole set of 
elements in one go, without any external iteration steps before 
calling forEachRemaining. I recently played with constructing an API 
based on Iterator, and this "forEachRemaining" really stands out when I 
read it, making me wonder why the "Remaining" part is important enough to 
be spelled out explicitly, taking up precious space in code and 
obscuring the really important part, the "forEach".

Even if the semantics of Iterator.forEach were not explicitly described 
in the javadoc, many users would implicitly assume the "Remaining" 
semantics, since they are used to the fact that with the Iterator API, you 
can't go back once you consume an element. So I think the javadoc is 
more than enough to clear up any doubts and the "Remaining" method 
suffix could be dropped.
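
To make the "Remaining" semantics concrete, here is a minimal sketch of the 
less common case, where one element is consumed by external iteration 
before the rest is handed to forEachRemaining. The class and method names 
(ForEachRemainingDemo, drainAfterFirst) are made up for illustration and 
are not part of any proposed API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachRemainingDemo {
    // Hypothetical helper: consumes one element externally, then passes
    // whatever is left to Iterator.forEachRemaining.
    static List<String> drainAfterFirst(List<String> source) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = source.iterator();
        if (it.hasNext()) {
            it.next(); // consumed externally; forEachRemaining will not see it
        }
        it.forEachRemaining(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        // prints [b, c]
        System.out.println(drainAfterFirst(Arrays.asList("a", "b", "c")));
    }
}
```

A reader who knows that an Iterator cannot go back would probably expect 
exactly this result from a method named plain "forEach" as well.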

Regards, Peter

>
>
>
>
> On 6/13/2013 10:47 AM, Peter Levart wrote:
>> I know it's a little late, but let's look at the current situation. There
>> are 5 methods in the current APIs with similar names and signatures, plus 1
>> additional method with a slightly different signature in Map, plus some
>> similarly named methods in ConcurrentHashMap:
>>
>> interface Stream<T> {
>>       void forEach(Consumer<? super T> action)
>>       void forEachOrdered(Consumer<? super T> action)
>> }
>>
>> interface Iterable<T> {
>>       void forEach(Consumer<? super T> action)
>> }
>>
>> interface Iterator<T> {
>>       void forEachRemaining(Consumer<? super T> action)
>> }
>>
>> interface Spliterator<T> {
>>       void forEachRemaining(Consumer<? super T> action)
>> }
>>
>> interface Map<K, V> {
>>       void forEach(BiConsumer<? super K, ? super V> action)
>> }
>>
>> class ConcurrentHashMap<K, V> ... {
>>       void forEachKey(long parallelismThreshold,
>>                       Consumer<? super K> action)
>>       <U> void forEachKey(long parallelismThreshold,
>>                           Function<? super K, ? extends U> transformer,
>>                           Consumer<? super U> action)
>>       void forEachValue(long parallelismThreshold,
>>                         Consumer<? super V> action)
>>       <U> void forEachValue(long parallelismThreshold,
>>                             Function<? super V, ? extends U> transformer,
>>                             Consumer<? super U> action)
>> }
>>
>>
>> I'm wondering if the following alternative would be easier to read and
>> reason about in code:
>>
>> interface Stream<T> {
>>       void forEach(Consumer<? super T> action)
>>       void iterate(Consumer<? super T> action)
>> }
>>
>> interface Iterable<T> {
>>       void iterate(Consumer<? super T> action)
>> }
>>
>> interface Iterator<T> {
>>       void iterateRemaining(Consumer<? super T> action)
>> }
>>
>> interface Spliterator<T> {
>>       void iterateRemaining(Consumer<? super T> action)
>> }
>>
>> interface Map<K, V> {
>>       void iterate(BiConsumer<? super K, ? super V> action)
>> }
>>
>> class ConcurrentHashMap<K, V> ... {
>>       void forEachKey(long parallelismThreshold,
>>                       Consumer<? super K> action)
>>       <U> void forEachKey(long parallelismThreshold,
>>                           Function<? super K, ? extends U> transformer,
>>                           Consumer<? super U> action)
>>       void forEachValue(long parallelismThreshold,
>>                         Consumer<? super V> action)
>>       <U> void forEachValue(long parallelismThreshold,
>>                             Function<? super V, ? extends U> transformer,
>>                             Consumer<? super U> action)
>> }
>>
>>
>> Why? I'm a little concerned about the duality of Stream.forEach() and
>> Iterable.forEach(). While the latter is always sequential and
>> encounter-ordered, the former can be executed out of order and/or
>> in parallel. I think that by naming the sequential variants differently,
>> the reader of code need not be concerned about the type of the target
>> that the method is invoked upon, just the name. This enables fast
>> browsing of code where the eye can quickly glance over iterate()s but
>> slows down on each forEach()...
>>
>> How do others feel about the (re)use of forEach... names in different
>> APIs? Would it be more difficult to find the right method if iterate()
>> was used instead of forEach() for iteration?
>>
>>
>> Regards, Peter
>>
>>