RFR: 8153293 - Stream API: Preserve SORTED and DISTINCT characteristics for boxed() and asLongStream() operations
Stefan Zobel
spliterator at gmail.com
Mon Apr 4 19:28:25 UTC 2016
Hi Tagir,
good catch! I like the proposal.
>
> (different longs can be converted into the same double, so DISTINCT is
> not preserved here; not sure whether this is possible for ints)
>
I think IntStream.asDoubleStream() can also preserve DISTINCT as
different ints can't be mapped to the same double.
Math.ulp((double) Integer.MIN_VALUE) ~ 4.7E-7
in contrast to
Math.ulp((double) Long.MIN_VALUE) = 2048.0
So there are more than enough doubles in the vicinity of large int
values. It's only when the ulp exceeds 1.0 that distinct integral values
have to be mapped to the same double (for longs that starts at 2^53,
about 9.0E15). Anyone, please correct me if I'm wrong.
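A minimal sketch of the ulp argument above, for anyone who wants to check it locally. The class and helper names (UlpCheck, collide, intRoundTrips) are made up for illustration and are not part of the patch; only Math.ulp and plain casts are used.

```java
// Hypothetical helper class; not part of the JDK or of the patch.
class UlpCheck {

    // 2^53: above this magnitude the spacing between adjacent doubles is 2.0 or more.
    static final long EXACT_LIMIT = 1L << 53;

    // True if two long values are converted to the same double.
    static boolean collide(long a, long b) {
        return (double) a == (double) b;
    }

    // True if an int survives the conversion to double and back unchanged.
    static boolean intRoundTrips(int value) {
        return (int) (double) value == value;
    }

    public static void main(String[] args) {
        // Spacing of doubles near the extreme int and long values.
        System.out.println(Math.ulp((double) Integer.MIN_VALUE)); // 2^-21
        System.out.println(Math.ulp((double) Long.MIN_VALUE));    // 2^11
        System.out.println(Math.ulp((double) EXACT_LIMIT));       // 2^1

        // Adjacent ints stay distinct as doubles, even at the extremes.
        System.out.println(collide(Integer.MAX_VALUE - 1, Integer.MAX_VALUE));
        System.out.println(intRoundTrips(Integer.MIN_VALUE));

        // Adjacent longs start to collide at 2^53.
        System.out.println(collide(EXACT_LIMIT, EXACT_LIMIT + 1));
    }
}
```

The int checks only probe the extremes, where the ulp is largest; since a double carries a 53-bit significand, every 32-bit int is exactly representable.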
Regards,
Stefan
2016-04-01 18:25 GMT+02:00 Tagir F. Valeev <amaembo at gmail.com>:
> Hello!
>
> Please review and sponsor the following patch:
> http://cr.openjdk.java.net/~tvaleev/webrev/8153293/r1/
>
> The patch preserves more characteristics on primitive stream
> operations:
> IntStream/LongStream/DoubleStream.boxed() preserves SORTED and DISTINCT
> IntStream.asLongStream() preserves SORTED and DISTINCT
> IntStream.asDoubleStream() and LongStream.asDoubleStream() preserve SORTED
> (different longs can be converted into the same double, so DISTINCT is
> not preserved here; not sure whether this is possible for ints)
>
> Fixing the boxed() case is especially important, as distinct() for
> primitive streams is implemented as boxed().distinct() followed by
> unboxing, so the actual distinct() operation cannot take advantage of
> the DISTINCT flag (skip the operation altogether) or the SORTED flag
> (switch to a more efficient implementation).
>
> Here's a small JMH benchmark that measures the performance boost for a
> quite common operation: sorting the input numbers and keeping only the
> distinct ones:
> http://cr.openjdk.java.net/~tvaleev/webrev/8153293/jmh/
>
> new Random(1).ints(size).sorted().distinct().toArray()
>
> I've got the following results.
>
> 9ea+111:
>
> Benchmark                          (size)  Mode  Cnt      Score      Error  Units
> SortDistinctTest.sortedDistinct        10  avgt   30      0,612 ±    0,004  us/op
> SortDistinctTest.sortedDistinct      1000  avgt   30     92,848 ±    1,039  us/op
> SortDistinctTest.sortedDistinct    100000  avgt   30  32147,205 ± 3487,422  us/op
>
> 9ea+111 patched:
>
> Benchmark                          (size)  Mode  Cnt      Score      Error  Units
> SortDistinctTest.sortedDistinct        10  avgt   30      0,435 ±    0,001  us/op
> SortDistinctTest.sortedDistinct      1000  avgt   30     40,555 ±    0,772  us/op
> SortDistinctTest.sortedDistinct    100000  avgt   30   9031,651 ±   73,956  us/op
>
> With best regards,
> Tagir Valeev.
>
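For readers who want to see which flags a given JDK build reports for the operations discussed in the quoted message, here is a rough sketch. The class and method names (CharacteristicsCheck, describe, sortedDistinct) are invented for illustration. What the boxed() and asLongStream() lines print depends on whether the build contains the patch, so no output is assumed for them.

```java
import java.util.Spliterator;
import java.util.stream.IntStream;

// Hypothetical helper class; not part of the JDK or of the patch.
class CharacteristicsCheck {

    // Reports whether a spliterator advertises SORTED and DISTINCT.
    static String describe(Spliterator<?> s) {
        return "SORTED=" + s.hasCharacteristics(Spliterator.SORTED)
                + " DISTINCT=" + s.hasCharacteristics(Spliterator.DISTINCT);
    }

    // The pipeline shape measured in the quoted benchmark: sort, then drop duplicates.
    static int[] sortedDistinct(int[] input) {
        return IntStream.of(input).sorted().distinct().toArray();
    }

    public static void main(String[] args) {
        // IntStream.range is a source that is both SORTED and DISTINCT.
        System.out.println("source:       "
                + describe(IntStream.range(0, 10).spliterator()));

        // Whether the flags survive these conversions depends on the JDK build.
        System.out.println("boxed:        "
                + describe(IntStream.range(0, 10).boxed().spliterator()));
        System.out.println("asLongStream: "
                + describe(IntStream.range(0, 10).asLongStream().spliterator()));

        System.out.println(java.util.Arrays.toString(
                sortedDistinct(new int[] {3, 1, 3, 2})));
    }
}
```

The result of sortedDistinct is the same with or without the patch; only the work done inside distinct() is expected to differ.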