RFR 8148115: Stream.findFirst for unordered source optimization
Paul Sandoz
paul.sandoz at oracle.com
Tue Jan 26 10:38:47 UTC 2016
Hi Tagir,
Looks good. Just small things.
FindOps.java
—
293 if (!this.mustFindFirst) {
310 if (this.mustFindFirst) {
Small thing, no need for “this”.
LambdaTestHelpers
—
404 public static<T> void assertContains(Optional<T> actual, Iterator<T> it) {
405 if (actual.isPresent()) {
406 T r = actual.get();
407 boolean contained = false;
408 while (!contained && it.hasNext()) {
409 contained = Objects.equals(r, it.next());
410 }
411 assertTrue(contained);
412 }
413 else {
414 assertFalse(it.hasNext());
415 }
416 }
You can now use Optional.ifPresentOrElse, which was added in 9:
ifPresentOrElse(
t -> { …},
() -> assertFalse(it.hasNext()):
Up to you. Personally i prefer that to the isPresent then get. It potentially works better if there was also a non-optional receiving method to reduce the lambda size.
—
I was wondering if the findFirst tests would require updating.
For primitive streams, you are correct there are no unordered providers of data. in certain cases unordered is induced e.g. forEach testing.
A focused fix would be to exercise with say s.filter().unordered(), a more general fix would be as you say to add another provider for int, long and double (backed by say HashSet, or more simply/efficiently using unordered()).
We have to carefully review that because that may increase test times. We don’t have to apply to all data sets, plus we could also refine the ref data sets too in that regard, thus keep the overall actions similar or even smaller.
Shall we separate that in a new issue?
—
I will take this and other patches (close/limit etc) you propose in bulk.
Thanks,
Paul.
> On 23 Jan 2016, at 13:43, Tagir F. Valeev <amaembo at gmail.com> wrote:
>
> Hello, Paul!
>
> Thank you for review. I implemented it according to your suggestions,
> here's issue:
> https://bugs.openjdk.java.net/browse/JDK-8148115
> And webrev:
> http://cr.openjdk.java.net/~tvaleev/webrev/8148115/r1/
>
> This patch required some changes in existing tests as FindFirstOpTest
> asserted that findFirst() result must be equal to .iterator().next()
> even if the source is HashSet. I extracted the relevant part of
> FindAnyOpTest into assertContains(Optional, Iterator) method and
> placed it into LambdaTestHelpers.
>
> The corresponding changes in primitive tests are unnecessary, because
> it seems that primitive data providers never produce unordered
> streams. I can try to add unordered primitive sources and correct
> these tests as well if it's necessary.
>
> With best regards,
> Tagir Valeev.
>
>
> PS> Hi Tagir,
>
> PS> Thanks for looking at this.
>
> PS> I would prefer if a boolean was passed into the task constructor e.g.:
>
> PS> @Override
> PS> public <P_IN> O evaluateParallel(PipelineHelper<T> helper,
> PS> Spliterator<P_IN> spliterator) {
> PS> // This takes into account the upstream ops flags and the terminal
> PS> // op flags and therefore takes into account findFirst or findAny
> PS> boolean mustFindFirst =
> PS> StreamOpFlag.ORDERED.isKnown(helper.getStreamAndOpFlags());
> PS> return new FindTask<>(this, mustFindFirst, helper, spliterator).invoke();
> PS> }
>
> PS> so we keep the flag evaluation logic within the evaluate method.
>
> PS> Then change FindOp.mustFindFirst to be an int of the opFlags calculated in the constructor:
>
> PS> this.opFlags = StreamOpFlag.IS_SHORT_CIRCUIT | (mustFindFirst ? 0 : StreamOpFlag.NOT_ORDERED);
>
> PS> Paul.
>
>>> On 22 Jan 2016, at 07:19, Tagir F. Valeev <amaembo at gmail.com> wrote:
>>>
>>> Hello!
>>>
>>> Seems that currently Stream.findFirst is not optimized for unordered
>>> source. I think it should work as findAny in this case. Here's a small
>>> patch which fixes this:
>>>
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/find_patch.txt
>>>
>>> Simple JMH test:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/FindTest.java
>>>
>>> Original:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/jmh_out_orig.txt
>>> # JMH 1.11.2 (released 85 days ago)
>>> # VM version: JDK 9-ea, VM 9-ea+99-2015-12-23-183325.javare.4146.nc
>>> Benchmark Mode Cnt Score Error Units
>>> FindTest.findAny avgt 30 12439,631 ± 1787,866 us/op
>>> FindTest.findAnyUnordered avgt 30 12923,080 ± 1072,537 us/op
>>> FindTest.findFirst avgt 30 48047,467 ± 2713,489 us/op
>>> FindTest.findFirstUnordered avgt 30 52648,893 ± 3934,682 us/op
>>>
>>> Patched:
>>> http://cr.openjdk.java.net/~tvaleev/patches/findFirst/jmh_out_patched.txt
>>> Benchmark Mode Cnt Score Error Units
>>> FindTest.findAny avgt 30 11312,238 ± 386,627 us/op
>>> FindTest.findAnyUnordered avgt 30 12136,953 ± 1536,817 us/op
>>> FindTest.findFirst avgt 30 47517,776 ± 2844,607 us/op
>>> FindTest.findFirstUnordered avgt 30 13147,492 ± 1140,592 us/op
>>>
>>> If you think it's a reasonable thing to patch, I can log an issue,
>>> generate webrev and check whether jtreg tests pass.
>>>
>>> With best regards,
>>> Tagir Valeev.
>>>
>
More information about the core-libs-dev
mailing list