RFR 8148115: Stream.findFirst for unordered source optimization

Paul Sandoz paul.sandoz at oracle.com
Tue Jan 26 10:38:47 UTC 2016


Hi Tagir,

Looks good. Just small things.


FindOps.java
—

 293             if (!this.mustFindFirst) {

 310             if (this.mustFindFirst) {


Small thing, no need for “this”.


LambdaTestHelpers
—

 404     public static<T> void assertContains(Optional<T> actual, Iterator<T> it) {
 405         if (actual.isPresent()) {
 406             T r = actual.get();
 407             boolean contained = false;
 408             while (!contained && it.hasNext()) {
 409                 contained = Objects.equals(r, it.next());
 410             }
 411             assertTrue(contained);
 412         }
 413         else {
 414             assertFalse(it.hasNext());
 415         }
 416     }


You can now use Optional.ifPresentOrElse, which was added in 9:

  ifPresentOrElse(
    t -> { …},
    () -> assertFalse(it.hasNext()):

Up to you. Personally i prefer that to the isPresent then get. It potentially works better if there was also a non-optional receiving method to reduce the lambda size.

—

I was wondering if the findFirst tests would require updating.

For primitive streams, you are correct there are no unordered providers of data. in certain cases unordered is induced e.g. forEach testing.

A focused fix would be to exercise with say s.filter().unordered(), a more general fix would be as you say to add another provider for int, long and double (backed by say HashSet, or more simply/efficiently using unordered()).

We have to carefully review that because that may increase test times. We don’t have to apply to all data sets, plus we could also refine the ref data sets too in that regard, thus keep the overall actions similar or even smaller.

Shall we separate that in a new issue?

—

I will take this and other patches (close/limit etc) you propose in bulk.

Thanks,
Paul.

> On 23 Jan 2016, at 13:43, Tagir F. Valeev <amaembo at gmail.com> wrote:
> 
> Hello, Paul!
> 
> Thank you for review. I implemented it according to your suggestions,
> here's issue:
> https://bugs.openjdk.java.net/browse/JDK-8148115
> And webrev:
> http://cr.openjdk.java.net/~tvaleev/webrev/8148115/r1/
> 
> This patch required some changes in existing tests as FindFirstOpTest
> asserted that findFirst() result must be equal to .iterator().next()
> even if the source is HashSet. I extracted the relevant part of
> FindAnyOpTest into assertContains(Optional, Iterator) method and
> placed it into LambdaTestHelpers.
> 
> The corresponding changes in primitive tests are unnecessary, because
> it seems that primitive data providers never produce unordered
> streams. I can try to add unordered primitive sources and correct
> these tests as well if it's necessary.
> 
> With best regards,
> Tagir Valeev.
> 
> 
> PS> Hi Tagir,
> 
> PS> Thanks for looking at this.
> 
> PS> I would prefer if a boolean was passed into the task constructor e.g.:
> 
> PS> @Override
> PS> public <P_IN> O evaluateParallel(PipelineHelper<T> helper,
> PS>                                  Spliterator<P_IN> spliterator) {
> PS>     // This takes into account the upstream ops flags and the terminal
> PS>     // op flags and therefore takes into account findFirst or findAny
> PS>     boolean mustFindFirst =
> PS> StreamOpFlag.ORDERED.isKnown(helper.getStreamAndOpFlags());
> PS>     return new FindTask<>(this, mustFindFirst, helper, spliterator).invoke();
> PS> }
> 
> PS> so we keep the flag evaluation logic within the evaluate method.
> 
> PS> Then change FindOp.mustFindFirst to be an int of the opFlags calculated in the constructor:
> 
> PS>   this.opFlags = StreamOpFlag.IS_SHORT_CIRCUIT | (mustFindFirst ? 0 : StreamOpFlag.NOT_ORDERED);
> 
> PS> Paul.
> 
>>> On 22 Jan 2016, at 07:19, Tagir F. Valeev <amaembo at gmail.com> wrote:
>>> 
>>> Hello!
>>> 
>>> Seems that currently Stream.findFirst is not optimized for unordered
>>> source. I think it should work as findAny in this case. Here's a small
>>> patch which fixes this:
>>> 
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/find_patch.txt
>>> 
>>> Simple JMH test:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/FindTest.java
>>> 
>>> Original:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/jmh_out_orig.txt
>>> # JMH 1.11.2 (released 85 days ago)
>>> # VM version: JDK 9-ea, VM 9-ea+99-2015-12-23-183325.javare.4146.nc
>>> Benchmark                    Mode  Cnt      Score      Error  Units
>>> FindTest.findAny             avgt   30  12439,631 ± 1787,866  us/op
>>> FindTest.findAnyUnordered    avgt   30  12923,080 ± 1072,537  us/op
>>> FindTest.findFirst           avgt   30  48047,467 ± 2713,489  us/op
>>> FindTest.findFirstUnordered  avgt   30  52648,893 ± 3934,682  us/op
>>> 
>>> Patched:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/jmh_out_patched.txt
>>> Benchmark                    Mode  Cnt      Score      Error  Units
>>> FindTest.findAny             avgt   30  11312,238 ±  386,627  us/op
>>> FindTest.findAnyUnordered    avgt   30  12136,953 ± 1536,817  us/op
>>> FindTest.findFirst           avgt   30  47517,776 ± 2844,607  us/op
>>> FindTest.findFirstUnordered  avgt   30  13147,492 ± 1140,592  us/op
>>> 
>>> If you think it's a reasonable thing to patch, I can log an issue,
>>> generate webrev and check whether jtreg tests pass.
>>> 
>>> With best regards,
>>> Tagir Valeev.
>>> 
> 




More information about the core-libs-dev mailing list