Streams and Spliterator characteristics confusion
Paul Sandoz
paul.sandoz at oracle.com
Fri Jun 27 09:43:16 UTC 2014
On Jun 27, 2014, at 9:06 AM, Kasper Nielsen <kasperni at gmail.com> wrote:
> Hi,
>
> I'm trying to understand how the Spliterator.characteristics are maintained
> across stream operations and I'm a bit confused. Maybe someone here can
> clear it up for me
>
Internally in the stream pipeline we keep track of certain characteristics for optimization purposes and those are conveniently used to determine the characteristics of the Spliterator, so there are some idiosyncrasies poking through.
> s.sorted().spliterator() -> Spliterator.SORTED = true
> But if I use specify a comparator the stream is not sorted
> s.sorted((a,b) -> 1).spliterator() -> Spliterator.SORTED = false
>
Right, there is an optimization internally that ensures if the upstream stream is already sorted than the sort operation becomes a nop e.g.
s.sorted().sorted();
This optimization cannot apply when a comparator is passed in since we don't know if two comparators are identical in their behaviour e.g:
s.sorted((a, b) -> a.compareTo(b)).sorted(Compatators.naturalOrder()).sorted()
> s.distinct().spliterator() -> Spliterator.DISTINCT = true
> but limiting the number of distinct elements makes the stream non distinct
> s.distinct().limit(10).spliterator() -> Spliterator.DISTINCT = false
I don't observe that (see program below).
> On the other hand something like Spliterator.SORTED is maintained when I
> invoke limit
> s.sorted().limit(10).spliterator() -> Spliterator.SORTED = true
>
>
> A flag such as Spliterator.NONNULL is also cleared in situations where it
> should not be.
That is because it is not tracked in the pipeline as there is no gain optimisation-wise (if it was it would be cleared for map/flatMap operations and preserved for other ops like filter as you say below).
It's not that difficult to support this and should add no measurable performance cost, we deliberately left space in the bit fields, however since spliterator() is an escape-hatch for doing stuff that cannot be done by other operations i think the value of supporting NONULL is marginal.
> In my opinion an operation such as filter(predicate) should not flip the
> flag.
>
> Finally, are there any relationship between Spliterator.SORTED and
> Spliterator.ORDERED.
> I would think that a stream cannot be sorted without it also being ordered?
> but s.sorted().unordered().spliterator returns ORDERED=false and SORTED=true
>
Strictly speaking this spliterator is not conforming as per the specification of Spltierator.ORDERED.
The reason is for the following example the last sorted is still a nop:
s.sorted().unordered().sorted()
I think in this case we can post-fix the flags to re-set ORDERED, if it was cleared, and if SORTED is still set.
Paul.
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
public class Test {
public static void main(String[] args) {
List<Integer> l = Arrays.asList(1, 2, 3, 4);
x("l.stream().distinct()",
l.stream().distinct().spliterator());
x("l.stream().distinct().limit(10)",
l.stream().distinct().limit(10).spliterator());
x("l.stream().sorted()",
l.stream().sorted().spliterator());
x("l.stream().sorted().limit(10)",
l.stream().sorted().limit(10).spliterator());
x("l.stream().unordered()",
l.stream().sorted().unordered().spliterator());
}
static void x(String sd, Spliterator<?> s) {
int cs = s.characteristics();
StringBuilder sb = new StringBuilder(sd);
sb.append(": ");
x(sb, cs, Spliterator.SORTED, "SORTED");
sb.append(", ");
x(sb, cs, Spliterator.DISTINCT, "DISTINCT");
sb.append(", ");
x(sb, cs, Spliterator.ORDERED, "ORDERED");
System.out.println(sb);
}
static void x(StringBuilder sb, int cs, int ct, String s) {
if ((cs & ct) == 0) {
sb.append("~");
}
sb.append(s);
}
}
More information about the core-libs-dev
mailing list