Natural Encounter Order (was RE: Stream#generate() vs. iterate())
Stuart Marks
stuart.marks at oracle.com
Mon Oct 14 22:55:22 PDT 2013
Brian is of course correct but I'd like to quibble about terminology.
The term "encounter order" is defined in the java.util.stream package docs. The
term "natural encounter order" is new to me, though, and while I'm pretty sure I
know what Brian means by it, I don't think it's a good term. It's too easily
confused with "natural order" which is a long-standing term that means something
different; see java.lang.Comparable.
Note that "encounter order" is a property of a stream and is derived from the
stream's source (or possibly upstream operations), whereas "natural order" is a
property of the elements themselves.
I'd recommend avoiding "natural encounter order" and stick to simply "ordered"
or "unordered".
s'marks
On 10/6/13 2:38 AM, Brian Goetz wrote:
> Yes, you've hit the point exactly. If you don't care about order in the answer, you can always call unordered - and this is cheap - if you are going to do a parallel op that might be constrained by ordering. But the key is, you, the user , are the only one who knows whether you care about ordering or not. So we can't guess. If you don't care, tell us so, and we can give you a possibly faster execution.
>
> Sent from my iPhone
>
> On Oct 6, 2013, at 9:44 AM, Samir Talwar <samir at noodlesandwich.com> wrote:
>
>> A few things.
>>
>> Firstly, in your example you used the `range` method, which is both ordered
>> and sorted by definition. While it hasn't been explicitly sorted, not
>> keeping it sorted as a steam would confuse and surprise pretty much
>> everyone.
>>
>> On that topic: "ordered" and "sorted" mean different things, "ordered"
>> meaning the same thing as "indexed", although with less attention paid to
>> implementation. As you saw, a TreeSet is both ordered and sorted, whereas a
>> HashSet is neither. Lists and arrays are ordered but not necessarily
>> sorted; for example, `ImmutableList.of(1, 5, 2, 6, 3)` is ordered, but
>> clearly unsorted.
>>
>> Lastly, there's no harm in throwing in a call to `unordered` whenever
>> you're not fussed about order for whatever reason. My understanding is that
>> if the stream is already unordered, it will just return itself.
>>
>> – Samir.
>> On 6 Oct 2013 09:28, "Millies, Sebastian" <Sebastian.Millies at softwareag.com>
>> wrote:
>>
>>> Thanks for that information. Part of my confusion may be just a language
>>> thing. I'd call the property
>>> of having a first, second etc. element "indexed", not "ordered". In
>>> German, "ordered" is more akin to
>>> "sorted". (But perhaps even in English, cf. SQL ORDER BY clause?).
>>>
>>> For that reason, I'd not have expected to have to unorder() a stream that
>>> had never been
>>> explicitly sort() 'ed.
>>>
>>> Or in other words, I'd have expected sorted()/unordered() and
>>> sequential()/parallel() to
>>> be pairs of operations that converted along the two dimensions, with
>>> "unordered sequential" always
>>> the default unless explicitly coded otherwise.
>>>
>>> Setting aside the choice of words, is there some rigorous definition of
>>> "natural encounter order" that
>>> will let me know whether a stream will in fact be ordered or unordered
>>> when I create it?
>>> Consider the following:
>>>
>>> TreeSet<String> t = new TreeSet<>();
>>> HashSet<String> h = new HashSet<>();
>>> Set<String> s;
>>>
>>> s = t;
>>> Stream<String> tStream = s.parallelStream();
>>>
>>> assert(tStream.spliterator().hasCharacteristics(Spliterator.ORDERED)) ;
>>>
>>> s = h;
>>> Stream<String> hStream = s.parallelStream();
>>> assert( !
>>> hStream.spliterator().hasCharacteristics(Spliterator.ORDERED));
>>>
>>> Usually having a Set reference I will not know whether the object has been
>>> created
>>> a TreeSet or HashSet. Similarly, having a Map.EntrySet I will usually not
>>> know whether the
>>> underlying Map was a LinkedHashMap. So for good parallel performance I
>>> will be forced
>>> to always explicitly call unordered(), or conduct that complicated test on
>>> the spliterator's
>>> characteristics. Unless I'm overlooking something, I am not happy with
>>> the concept "natural
>>> encounter order", because it seems unknowable.
>>>
>>> -- Sebastian
>>>
>>>> -----Original Message-----
>>>> From: Brian Goetz [mailto:brian.goetz at oracle.com]
>>>> Sent: Saturday, October 05, 2013 9:56 PM
>>>> To: Millies, Sebastian
>>>> Cc: lambda-dev at openjdk.java.net
>>>> Subject: Re: Stream#generate() vs. iterate()
>>>>
>>>> Sequential and unordered are orthogonal characteristics.
>>>>
>>>> Ordered/unordered means that the source has a natural encounter order,
>>> and that
>>>> stream operations should respect that order. For example, lists and
>>> arrays have a
>>>> natural encounter order; they have a first element, a second element,
>>> etc. Whereas
>>>> a HashSet has no natural encounter order; processing the elements in one
>>> order is as
>>>> good as any other.
>>>>
>>>> Sequential/parallel has to do with whether the stream operations will
>>> execute
>>>> sequentially or in parallel when the terminal operation is initiated.
>>>>
>>>> generate() returns a sequential, unordered stream. You can turn that
>>> into a parallel,
>>>> unordered stream with generate(f).parallel().
>>>>
>>>> On Oct 5, 2013, at 6:54 PM, Millies, Sebastian wrote:
>>>>
>>>>> I'm a bit confused, perhaps it's just terminology:
>>>>>
>>>>> Looking at the Javadoc (in b106) for Stream#generate(Supplier) it says
>>> it returns a
>>>> sequential stream.
>>>>> In your post you say it returns an unordered stream. In what way can a
>>> sequential
>>>> stream be unordered?
>>>>>
>>>>> -- Sebastian
>>>>>
>>>>> -----Original Message-----
>>>>> From: lambda-dev-bounces at openjdk.java.net [mailto:lambda-dev-
>>>> bounces at openjdk.java.net] On Behalf Of Brian Goetz
>>>>> Sent: Saturday, October 05, 2013 6:13 PM
>>>>> To: Arne Siegel
>>>>> Cc: lambda-dev at openjdk.java.net
>>>>> Subject: Re: stream.parallel().limit() not usable
>>>>>
>>>>> [snip]
>>>>>
>>>>> You might also do better with Stream.generate, since it creates an
>>> unordered
>>>> stream:
>>>>>
>>>>> Stream.generate(generatorFunction)
>>>>> .parallel()
>>>>> ...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Software AG – Sitz/Registered office: Uhlandstraße 12, 64297
>>> Darmstadt, Germany
>>>> – Registergericht/Commercial register: Darmstadt HRB 1562 -
>>> Vorstand/Management
>>>> Board: Karl-Heinz Streibich (Vorsitzender/Chairman), Dr. Wolfram Jost,
>>> Arnd
>>>> Zinnhardt; - Aufsichtsratsvorsitzender/Chairman of the Supervisory
>>> Board: Dr. Andreas
>>>> Bereczky - http://www.softwareag.com
>>
>
More information about the lambda-dev
mailing list