forEach/forEachOrdered/forEachRemaining ... iterate?

Mon Jun 24 02:41:27 PDT 2013

On 06/24/2013 10:31 AM, Peter Levart wrote:
> On 06/13/2013 05:50 PM, Brian Goetz wrote:
>> It's a mess for sure, but I am not sure whether this is a good path
>> out of the forest.  There are only so many candidate names, and the
>> "try and give every unique semantics a unique name" eventually
>> degenerates into them all being some form of "forEach #352", where the
>> user code is still not readable without a decoder chart.
>>
>> Here are some of the dimensions covered here:
>>
>>   - data arity (Consumer vs BiConsumer)
>>   - data projection (element, map key, map value)
>>   - sequential vs parallel
>>   - whether to prefer encounter (spatial) or temporal order
>>   - more...
>>
>> It is impossible for a name to be readable and writable and still
>> precisely convey all the dimensions here; forEach is fuzzy,
>> forEachKeyParallelInEncounterOrder is precise but awful.  I don't know
>> what the difference between iterate and forEach should be; I could
>> learn, but it will never be obvious because of the way our brains map
>> words to concepts.
>>
>> Without overloading names, it is already spiraling out of control.
>> The goal here was for "forEach" to be the natural behavior for the
>> thing being iterated; for a sequential stream, that is a sequential
>> encounter-order traversal and for a parallel stream, that is a
>> parallel temporal-order traversal.  Variants of forEach (like
>> forEachOrdered) allow you to select the less natural variant.  (*I
>> would rather get rid of forEachRemaining and make them "forEach" also.*)
> +1
>
> I think that the majority of usages will be iterating over the whole set of
> elements in one go, not doing any external iteration steps before
> calling forEachRemaining. I recently played with constructing an API
> based on Iterator and this "forEachRemaining" really stands out when I
> read it, making me wonder why the "Remaining" part is so important to be
> explicitly spelled out and to be taking precious space in code,
> obfuscating the really important part, the "forEach".

There is no point in using the same name for things that are
semantically different.
For example,
   iterator.forEachRemaining(e -> {...});
   iterator.remove();
is valid code.
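
A runnable sketch of the two lines above, assuming an ArrayList as the
backing collection (the class and method names are made up for
illustration). With OpenJDK's ArrayList iterator, remove() after
forEachRemaining() removes the last element visited. Other iterators may
behave differently, and later revisions of the Iterator javadoc, as far
as I know, leave this combination unspecified, so read it as an
illustration of the "remaining" semantics rather than a pattern to rely on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RemoveAfterForEachRemaining {
    /**
     * Copies the input into an ArrayList, visits every element through
     * forEachRemaining, then calls remove() on the same iterator.
     * Returns what is left in the copy.
     */
    static List<String> visitAllThenRemove(List<String> input, List<String> visited) {
        List<String> list = new ArrayList<>(input);
        Iterator<String> iterator = list.iterator();
        iterator.forEachRemaining(visited::add); // consumes everything that is left
        iterator.remove();                       // acts on the last element consumed
        return list;
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        List<String> left = visitAllThenRemove(Arrays.asList("a", "b", "c"), visited);
        System.out.println("visited=" + visited + " left=" + left);
    }
}
```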

As for the name, it is sometimes useful to handle the first element
separately from the following ones:
if (!iterator.hasNext()) {
    return "";
}
StringBuilder builder = new StringBuilder();
builder.append(iterator.next());
iterator.forEachRemaining(element -> {
    builder.append(", ").append(element);
});
return builder.toString();

>
> Even if the semantics of Iterator.forEach were not explicitly described
> in javadoc, many users would implicitly assume the "Remaining"
> semantics, since they are used to the fact that with Iterator API, you
> can't go back once you consume an element. So I think the javadoc is
> more than enough to clear up any doubts and the "Remaining" method
> suffix could be dropped.

If you have only one method, forEach, and a class that implements both
Iterator and Iterable (why it does so is not the problem here :), then
you are in trouble when you have to write forEach, because iterating
over an Iterable and iterating over an Iterator do not have the same
semantics.
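
To make the clash concrete, here is a small sketch using the method
names as they stand today. Cursor is a hypothetical class, not something
from the JDK; it implements both interfaces, and its iterator() is
assumed to start a fresh traversal from the beginning, as an Iterable
usually does. After one call to next(), the two methods visit different
elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorAndIterable {
    /** Hypothetical class that is both an Iterator and an Iterable. */
    static final class Cursor<T> implements Iterator<T>, Iterable<T> {
        private final List<T> source;
        private int index;

        Cursor(List<T> source) { this.source = source; }

        @Override public boolean hasNext() { return index < source.size(); }

        @Override public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            return source.get(index++);
        }

        // Iterable side: every call starts a new traversal at the first element.
        @Override public Iterator<T> iterator() { return new Cursor<>(source); }
    }

    /** Consumes one element, then uses Iterable.forEach. */
    static List<String> seenByForEach() {
        Cursor<String> cursor = new Cursor<>(Arrays.asList("a", "b", "c"));
        cursor.next();
        List<String> seen = new ArrayList<>();
        cursor.forEach(seen::add);          // goes through iterator(): all elements
        return seen;
    }

    /** Consumes one element, then uses Iterator.forEachRemaining. */
    static List<String> seenByForEachRemaining() {
        Cursor<String> cursor = new Cursor<>(Arrays.asList("a", "b", "c"));
        cursor.next();
        List<String> seen = new ArrayList<>();
        cursor.forEachRemaining(seen::add); // only what has not been consumed yet
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("forEach:          " + seenByForEach());
        System.out.println("forEachRemaining: " + seenByForEachRemaining());
    }
}
```

If both methods were called forEach, Cursor would inherit two default
methods with the same signature, the compiler would force it to override
forEach, and the override could honour only one of the two behaviours.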

>
> Regards, Peter

cheers,
Rémi

>
>>
>>
>>
>> On 6/13/2013 10:47 AM, Peter Levart wrote:
>>> I know it's a little late, but let's look at current situation. There
>>> are 5 methods in current APIs with similar names and signatures + 1
>>> additional with a little different signature in Map + some similarly
>>> named methods in ConcurrentHashMap:
>>>
>>> interface Stream<T> {
>>>        void forEach(Consumer<? super T> action)
>>>        void forEachOrdered(Consumer<? super T> action)
>>> }
>>>
>>> interface Iterable<T> {
>>>        void forEach(Consumer<? super T> action)
>>> }
>>>
>>> interface Iterator<T> {
>>>        void forEachRemaining(Consumer<? super T> action)
>>> }
>>>
>>> interface Spliterator<T> {
>>>        void forEachRemaining(Consumer<? super T> action)
>>> }
>>>
>>> interface Map<K, V> {
>>>        void forEach(BiConsumer<? super K, ? super V> action)
>>> }
>>>
>>> class ConcurrentHashMap<K, V> ... {
>>>        void forEachKey(long parallelismThreshold,
>>>                        Consumer<? super K> action)
>>>        <U> void forEachKey(long parallelismThreshold,
>>>                            Function<? super K, ? extends U> transformer,
>>>                            Consumer<? super U> action)
>>>        void forEachValue(long parallelismThreshold,
>>>                          Consumer<? super V> action)
>>>        <U> void forEachValue(long parallelismThreshold,
>>>                              Function<? super V, ? extends U> transformer,
>>>                              Consumer<? super U> action)
>>> }
>>>
>>>
>>> I'm wondering if the following alternative would be easier to read and
>>> reason-about in code:
>>>
>>> interface Stream<T> {
>>>        void forEach(Consumer<? super T> action)
>>>        void iterate(Consumer<? super T> action)
>>> }
>>>
>>> interface Iterable<T> {
>>>        void iterate(Consumer<? super T> action)
>>> }
>>>
>>> interface Iterator<T> {
>>>        void iterateRemaining(Consumer<? super T> action)
>>> }
>>>
>>> interface Spliterator<T> {
>>>        void iterateRemaining(Consumer<? super T> action)
>>> }
>>>
>>> interface Map<K, V> {
>>>        void iterate(BiConsumer<? super K, ? super V> action)
>>> }
>>>
>>> class ConcurrentHashMap<K, V> ... {
>>>        void forEachKey(long parallelismThreshold,
>>>                        Consumer<? super K> action)
>>>        <U> void forEachKey(long parallelismThreshold,
>>>                            Function<? super K, ? extends U> transformer,
>>>                            Consumer<? super U> action)
>>>        void forEachValue(long parallelismThreshold,
>>>                          Consumer<? super V> action)
>>>        <U> void forEachValue(long parallelismThreshold,
>>>                              Function<? super V, ? extends U> transformer,
>>>                              Consumer<? super U> action)
>>> }
>>>
>>>
>>> Why? I'm a little concerned about the duality of Stream.forEach() and
>>> Iterable.forEach(). While the latter is always sequential and
>>> encounter-ordered, the former can be executed out of order and/or
>>> in parallel. I think that by naming the sequential variants differently,
>>> the reader of code need not be concerned about the type of the target
>>> that the method is invoked upon, just the name. This enables fast
>>> browsing of code where the eye can quickly glance over iterate()s but
>>> slows down on each forEach()...
>>>
>>> How do others feel about the (re)use of forEach... names in different
>>> APIs? Would it be more difficult to find the right method if iterate()
>>> was used instead of forEach() for iteration?
>>>
>>>
>>> Regards, Peter
>>>
>>>
>