From dl at cs.oswego.edu Mon Jul 1 03:32:58 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 06:32:58 -0400 Subject: CompletionStage In-Reply-To: References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> Message-ID: <51D15ADA.2050808@cs.oswego.edu> On 06/30/13 17:00, Sam Pullara wrote: > On Sun, Jun 30, 2013 at 1:34 PM, Doug Lea
They can (re)throw any exception they like when completed exceptionally. > > > This is really ugly. Thanks again for spotting this problem. Adding onExceptionalCompletion to cope, which pretty much forces adding onNormalCompletion for symmetry: /** * Returns a new CompletionStage with the same result * or exception as this stage, and when this stage completes, * executes the given action only if this stage * completes exceptionally. This would be equivalent in effect to * {@code exceptionally(ex -> { action.accept(ex); throw ex; })} * if this were expressible without type check errors. * * @param action the action to perform if this CompletionStage * completed exceptionally */ public CompletionStage onExceptionalCompletion (Consumer action); /** * Returns a new CompletionStage with the same result * or exception as this stage, and when this stage completes, * executes the given action only if this stage * completes normally. This is equivalent in effect to * {@code thenApply(x -> { action.accept(x); return x; })}. 
* * @param action the action to perform if this CompletionStage * completed normally */ public CompletionStage onNormalCompletion (Consumer action); From dl at cs.oswego.edu Mon Jul 1 05:45:51 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 08:45:51 -0400 Subject: CompletionStage In-Reply-To: References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D0A326.9060709@cs.oswego.edu> <51D0B915.9050801@cs.oswego.edu> Message-ID: <51D179FF.40906@cs.oswego.edu> On 06/30/13 20:24, Sam Pullara wrote: > Experimenting a bit with the API to see if I could use it for cancellation, here > is a program that I would like to work much differently than it does with the > current system: > > @Test > public void testCancellation() throws ExecutionException, > InterruptedException { > AtomicBoolean cancelled = new AtomicBoolean(); > AtomicBoolean handled = new AtomicBoolean(); > AtomicBoolean handleCalledWithValue = new AtomicBoolean(); > CompletableFuture other = supplyAsync(() -> "Doomed value"); > CompletableFuture future = supplyAsync(() -> { > sleep(1000); > return "Doomed value"; > }).exceptionally(t -> { > cancelled.set(true); > return null; > }).thenCombine(other, (a, b) -> a + ", " + b).handle((v, t) -> { > if (t == null) { > handleCalledWithValue.set(true); > } > handled.set(true); > return null; > }); > sleep(100); > future.cancel(true); > sleep(1000); > try { > future.get(); > fail("Should have thrown"); > } catch (CancellationException ce) { > System.out.println("future cancelled: " + future.isCancelled()); > System.out.println("other cancelled: " + other.isCancelled()); > System.out.println("exceptionally called: " + cancelled.get()); > System.out.println("handle called: " + handled.get()); > System.out.println("handle called with value: " + > handleCalledWithValue.get()); > } > } > I think that variable "future" is not bound to the stage 
you have in mind? (The joys of fluency...) Try it with: ... CompletableFuture future = CompletableFuture.supplyAsync(() -> { sleep(1000); return "Doomed value"; }); future.cancel(true); future.exceptionally(... // or, as of now, you could do instead future.onExceptionalCompletion(t -> { ... -Doug From paul.sandoz at oracle.com Mon Jul 1 06:46:27 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 1 Jul 2013 15:46:27 +0200 Subject: Bikeshed: Spliterator "fail-fast" Message-ID: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> Hi, The Spliterator doc states: *

A Spliterator that does not report {@code IMMUTABLE} or * {@code CONCURRENT} is expected to have a documented policy concerning: * when the spliterator binds to the element source; and detection of * structural interference of the element source detected after binding. ... * After binding a Spliterator should, on a best-effort basis, throw * {@link ConcurrentModificationException} if structural interference is * detected. Spliterators that do this are called fail-fast. As Mike pointed out to me "fail-fast" is not accurate since the implementations for bulk traversal, specifically forEachRemaining, can throw a CME after traversal has completed. - fail-finally - fail-ultimately - fail-eventually ? Paul. From brian.goetz at oracle.com Mon Jul 1 07:20:43 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 01 Jul 2013 10:20:43 -0400 Subject: MumbleCloseable In-Reply-To: <51D03359.2020707@cs.oswego.edu> References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> Message-ID: <51D1903B.4030306@oracle.com> >> Bikeshed opportunity: should the annotation be nested or at top level? > > I think nested, but if so, it seems cruel to give it such a long name: > > @MayHoldCloseableResource.DefinitelyHoldsCloseableResource > > So the remaining bikeshed opportunity is what's shorter > but still crystal-clear? We probably can't get away with > just: > > @MayHoldCloseableResource.Yes > > Any ideas? 
Alan suggested: .HoldsResource From dl at cs.oswego.edu Mon Jul 1 07:24:04 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 10:24:04 -0400 Subject: MumbleCloseable In-Reply-To: <51D1903B.4030306@oracle.com> References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> <51D1903B.4030306@oracle.com> Message-ID: <51D19104.9040207@cs.oswego.edu> On 07/01/13 10:20, Brian Goetz wrote: >>> Bikeshed opportunity: should the annotation be nested or at top level? >> >> I think nested, but if so, it seems cruel to give it such a long name: >> >> @MayHoldCloseableResource.DefinitelyHoldsCloseableResource >> >> So the remaining bikeshed opportunity is what's shorter >> but still crystal-clear? We probably can't get away with >> just: >> >> @MayHoldCloseableResource.Yes >> >> Any ideas? > > Alan suggested: .HoldsResource > I suggest: Take it and claim success. -Doug From forax at univ-mlv.fr Mon Jul 1 07:45:36 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 01 Jul 2013 16:45:36 +0200 Subject: Bikeshed: Spliterator "fail-fast" In-Reply-To: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> References: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> Message-ID: <51D19610.50307@univ-mlv.fr> On 07/01/2013 03:46 PM, Paul Sandoz wrote: > Hi, > > The Spliterator doc states: > > *

A Spliterator that does not report {@code IMMUTABLE} or > * {@code CONCURRENT} is expected to have a documented policy concerning: > * when the spliterator binds to the element source; and detection of > * structural interference of the element source detected after binding. > ... > * After binding a Spliterator should, on a best-effort basis, throw > * {@link ConcurrentModificationException} if structural interference is > * detected. Spliterators that do this are called fail-fast. > > As Mike pointed out to me "fail-fast" is not accurate since the implementations for bulk traversal, specifically forEachRemaining, can throw a CME after traversal has completed. > > - fail-finally > - fail-ultimately > - fail-eventually > > ? fail-slow :) > > Paul. Rémi From dl at cs.oswego.edu Mon Jul 1 08:21:35 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 11:21:35 -0400 Subject: CompletionStage In-Reply-To: <51D15ADA.2050808@cs.oswego.edu> References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D15ADA.2050808@cs.oswego.edu> Message-ID: <51D19E7F.4080203@cs.oswego.edu> On 07/01/13 06:32, Doug Lea wrote: > On 06/30/13 17:00, Sam Pullara wrote: >> On Sun, Jun 30, 2013 at 1:34 PM, Doug Lea

> They can (re)throw any exception they like when completed exceptionally. >> >> >> This is really ugly. > > Thanks again for spotting this problem. Adding > onExceptionalCompletion to cope, which pretty > much forces adding onNormalCompletion for symmetry: (I should have known that anything requiring dealing with the delicate support for exceptions might take a few tries...) No, because this then branches inconsistently when you contemplate asyncs. Better to have a unified handle()-like version, and allow all three forms: /** * Returns a new CompletionStage with the same result or exception * as this stage, and when this stage completes, executes the * given action with the result (or {@code null} if none) and the * exception (or {@code null} if none) of this stage. * * @param action the action to perform */ public CompletionStage onCompletion (BiConsumer action); /** * Returns a new CompletionStage with the same result or exception * as this stage, and when this stage completes, executes the * given action using this stage's default asynchronous execution * facility, with the result (or {@code null} if none) and the * exception (or {@code null} if none) of this stage as arguments. * * @param action the action to perform */ public CompletionStage onCompletionAsync (BiConsumer action); /** * Returns a new CompletionStage with the same result or exception * as this stage, and when this stage completes, executes the * given action using the supplied Executor, with the result (or * {@code null} if none) and the exception (or {@code null} if * none) of this stage as arguments. 
* * @param action the action to perform */ public CompletionStage onCompletionAsync (BiConsumer action, Executor executor); From spullara at gmail.com Mon Jul 1 09:16:04 2013 From: spullara at gmail.com (Sam Pullara) Date: Mon, 1 Jul 2013 09:16:04 -0700 Subject: CompletionStage In-Reply-To: <51D179FF.40906@cs.oswego.edu> References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D0A326.9060709@cs.oswego.edu> <51D0B915.9050801@cs.oswego.edu> <51D179FF.40906@cs.oswego.edu> Message-ID: On Jul 1, 2013, at 5:45 AM, Doug Lea
wrote: > > I think that variable "future" is not bound to the stage you have in mind? > (The joys of fluency...) > > Try it with: > ... > CompletableFuture future = CompletableFuture.supplyAsync(() -> { > sleep(1000); > return "Doomed value"; > }); > future.cancel(true); > future.exceptionally(... > > // or, as of now, you could do instead > future.onExceptionalCompletion(t -> { Makes some sense except for cancellation. I want that cancel() propagated through all the related futures in order to implement polling myself without threading the cancellation indicator through every layer. Sam > ... > > -Doug > > > From dl at cs.oswego.edu Mon Jul 1 09:31:21 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 12:31:21 -0400 Subject: CompletionStage In-Reply-To: References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D0A326.9060709@cs.oswego.edu> <51D0B915.9050801@cs.oswego.edu> <51D179FF.40906@cs.oswego.edu> Message-ID: <51D1AED9.9050201@cs.oswego.edu> On 07/01/13 12:16, Sam Pullara wrote: > > On Jul 1, 2013, at 5:45 AM, Doug Lea
wrote: >> >> I think that variable "future" is not bound to the stage you have in mind? >> (The joys of fluency...) >> >> Try it with: >> ... >> CompletableFuture future = CompletableFuture.supplyAsync(() -> { >> sleep(1000); >> return "Doomed value"; >> }); >> future.cancel(true); >> future.exceptionally(... >> >> // or, as of now, you could do instead >> future.onExceptionalCompletion(t -> { > > Makes some sense except for cancellation. I want that cancel() propagated through all the related futures in order to implement polling myself without threading the cancellation indicator through every layer. > Automatically cancelling the "other" in thenCombine and related methods would not be universally popular, but we now have a good answer for this! Feel free to create your own CompletionStage implementation that does so. Sound good? In other news, I think that with the multi-use onCompletion, plus allowing all three (plain, async, custom) of handle, we need to get rid of the problematic "exceptionally" method. A few people will be unhappy, but better than living with complaints about its javac-won't-let-me-rethrow problems for years. -Doug From spullara at gmail.com Mon Jul 1 09:38:21 2013 From: spullara at gmail.com (Sam Pullara) Date: Mon, 1 Jul 2013 09:38:21 -0700 Subject: CompletionStage In-Reply-To: <51D1AED9.9050201@cs.oswego.edu> References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D0A326.9060709@cs.oswego.edu> <51D0B915.9050801@cs.oswego.edu> <51D179FF.40906@cs.oswego.edu> <51D1AED9.9050201@cs.oswego.edu> Message-ID: <555A3CA4-4D02-4FAF-A8BA-7466754FA1E9@gmail.com> On Jul 1, 2013, at 9:31 AM, Doug Lea
wrote: > Automatically cancelling the "other" in thenCombine and related > methods would not be universally popular, but we now have a > good answer for this! Feel free to create your own CompletionStage > implementation that does so. Sound good? It is nice that I can put my Promise under the interface. Not sure how many people will type their APIs that way. > In other news, I think that with the multi-use onCompletion, > plus allowing all three (plain, async, custom) of handle, > we need to get rid of the problematic "exceptionally" > method. A few people will be unhappy, but better than > living with complaints about its javac-won't-let-me-rethrow > problems for years. As long as there is still at least one method that lets you transform an exception to a value, that makes sense to me. Sam From dl at cs.oswego.edu Mon Jul 1 11:54:10 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 01 Jul 2013 14:54:10 -0400 Subject: CompletionStage In-Reply-To: <555A3CA4-4D02-4FAF-A8BA-7466754FA1E9@gmail.com> References: <519FBD77.9030400@oracle.com> <51CC6BB7.3000804@cs.oswego.edu> <51CCCAE7.8020208@cs.oswego.edu> <51D02B6C.6000103@cs.oswego.edu> <51D09640.4040700@cs.oswego.edu> <51D0A326.9060709@cs.oswego.edu> <51D0B915.9050801@cs.oswego.edu> <51D179FF.40906@cs.oswego.edu> <51D1AED9.9050201@cs.oswego.edu> <555A3CA4-4D02-4FAF-A8BA-7466754FA1E9@gmail.com> Message-ID: <51D1D052.9090105@cs.oswego.edu> On 07/01/13 12:38, Sam Pullara wrote: >> In other news, I think that with the multi-use onCompletion, >> plus allowing all three (plain, async, custom) of handle, >> we need to get rid of the problematic "exceptionally" >> method. A few people will be unhappy, but better than >> living with complaints about its javac-won't-let-me-rethrow >> problems for years. > > As long as there is still at least one method that lets you transform an exception to a value, that makes sense to me. > Yes. onCompletion renamed as whenComplete (which might have been your original suggestion?!) 
preserves outcome, handle can transform it. They make a perfect pair. I'm about to claim success for now and move to next phase. Stay tuned for c-i post. -Doug From brian.goetz at oracle.com Mon Jul 1 12:13:41 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 01 Jul 2013 15:13:41 -0400 Subject: Fwd: hg: lambda/lambda/jdk: JDK-8017513: Support for closeable streams In-Reply-To: <20130701191012.751B8486D6@hg.openjdk.java.net> References: <20130701191012.751B8486D6@hg.openjdk.java.net> Message-ID: <51D1D4E5.4070107@oracle.com> Implementation and spec for MumbleCloseable, please review. -------- Original Message -------- Subject: hg: lambda/lambda/jdk: JDK-8017513: Support for closeable streams Date: Mon, 01 Jul 2013 19:09:57 +0000 From: brian.goetz at oracle.com To: lambda-dev at openjdk.java.net Changeset: 64e40c435d66 Author: briangoetz Date: 2013-07-01 15:09 -0400 URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/64e40c435d66 JDK-8017513: Support for closeable streams ! src/share/classes/java/nio/file/Files.java + src/share/classes/java/util/MayHoldCloseableResource.java ! src/share/classes/java/util/stream/AbstractPipeline.java ! src/share/classes/java/util/stream/BaseStream.java - src/share/classes/java/util/stream/CloseableStream.java - src/share/classes/java/util/stream/DelegatingStream.java ! src/share/classes/java/util/stream/DoublePipeline.java ! src/share/classes/java/util/stream/DoubleStream.java ! src/share/classes/java/util/stream/IntPipeline.java ! src/share/classes/java/util/stream/IntStream.java ! src/share/classes/java/util/stream/LongPipeline.java ! src/share/classes/java/util/stream/LongStream.java ! src/share/classes/java/util/stream/ReferencePipeline.java ! src/share/classes/java/util/stream/Stream.java ! src/share/classes/java/util/stream/Streams.java ! test/java/nio/Files/FilesLambdaTest.java ! test/java/nio/file/Files/StreamTest.java ! test/java/util/stream/bootlib/java/util/stream/DoubleStreamTestScenario.java ! 
test/java/util/stream/bootlib/java/util/stream/IntStreamTestScenario.java ! test/java/util/stream/bootlib/java/util/stream/LongStreamTestScenario.java ! test/java/util/stream/bootlib/java/util/stream/StreamTestScenario.java + test/java/util/stream/test/org/openjdk/tests/java/util/stream/StreamCloseTest.java From brian.goetz at oracle.com Mon Jul 1 13:39:38 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 01 Jul 2013 16:39:38 -0400 Subject: Make BaseStream public? Message-ID: <51D1E90A.1030508@oracle.com> I am currently crawling through the docs getting ready for another pass on the specification. The good news is that the class hierarchy has simplified a lot since the first pass. The bad news is I am trying to tack a course between the twin evils of duplicated documentation and excessive linking (each level of linking loses some percentage of the readers.) The classes Stream, IntStream, LongStream, and DoubleStream all implement BaseStream, which describes behaviors common to all streams (closeability, iterator/spliterator, sequential/parallel, ordering). Currently BaseStream is package-private, but I am thinking that it might be valuable to elevate it to public, for possibly two reasons: - Having a common supertype allows library code to abstract over all stream types. Our tests use this ability, for example. - Most of the documentation for the stream classes (sequential and parallel behavior, ordering, etc) is generic to all stream types, so putting it in one base class (where it can be @inheritDoc'ed or linked) is more natural than cutting and pasting it in N places, or dumping it all in the package doc. From paul.sandoz at oracle.com Tue Jul 2 01:42:28 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 2 Jul 2013 10:42:28 +0200 Subject: Make BaseStream public? 
In-Reply-To: <51D1E90A.1030508@oracle.com> References: <51D1E90A.1030508@oracle.com> Message-ID: <8602569D-8A32-47A6-ACD6-013853B1DA4B@oracle.com> On Jul 1, 2013, at 10:39 PM, Brian Goetz wrote: > I am currently crawling through the docs getting ready for another pass on the specification. The good news is that the class hierarchy has simplified a lot since the first pass. The bad news is I am trying to tack a course between the twin evils of duplicated documentation and excessive linking (each level of linking loses some percentage of the readers.) > > The classes Stream, IntStream, LongStream, and DoubleStream all implement BaseStream, which describes behaviors common to all streams (closeability, iterator/spliterator, sequential/parallel, ordering). Currently BaseStream is package-private, but I am thinking that it might be valuable to elevate it to public, for possibly two reasons: > > - Having a common supertype allows library code to abstract over all stream types. Our tests use this ability, for example. > > - Most of the documentation for the stream classes (sequential and parallel behavior, ordering, etc) is generic to all stream types, so putting it in one base class (where it can be @inheritDoc'ed or linked) is more natural than cutting and pasting it in N places, or dumping it all in the package doc. > +1 If/when there is an SPI there could be value in making a test framework available publicly as well without resorting to boot classpath tricks. I am struggling to recall why we did not do that earlier. At one point we had some methods/types on BaseStream we did not want to expose publicly, but they all withered away as the implementation got simpler. Paul. From brian.goetz at oracle.com Tue Jul 2 08:31:24 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 02 Jul 2013 11:31:24 -0400 Subject: Make BaseStream public? 
In-Reply-To: <8602569D-8A32-47A6-ACD6-013853B1DA4B@oracle.com> References: <51D1E90A.1030508@oracle.com> <8602569D-8A32-47A6-ACD6-013853B1DA4B@oracle.com> Message-ID: <51D2F24C.20104@oracle.com> Paul and I groveled through old notes and came to the conclusion that the reasons BaseStream was not public earlier had to do with bad interactions (limits of generics) with abstractions that no longer exist, including StreamShape and MapStream. So the proximate reasons we originally made it nonpublic have gone away. BaseStream currently holds the following methods: iterator/spliterator sequential/parallel/isParallel unordered close/onClose We could (not saying we should) consider adding the following methods common to all Stream types, which if you squint a bit might fit in this club: limit/substream count On 7/2/2013 4:42 AM, Paul Sandoz wrote: > > On Jul 1, 2013, at 10:39 PM, Brian Goetz wrote: > >> I am currently crawling through the docs getting ready for another pass on the specification. The good news is that the class hierarchy has simplified a lot since the first pass. The bad news is I am trying to tack a course between the twin evils of duplicated documentation and excessive linking (each level of linking loses some percentage of the readers.) >> >> The classes Stream, IntStream, LongStream, and DoubleStream all implement BaseStream, which describes behaviors common to all streams (closeability, iterator/spliterator, sequential/parallel, ordering). Currently BaseStream is package-private, but I am thinking that it might be valuable to elevate it to public, for possibly two reasons: >> >> - Having a common supertype allows library code to abstract over all stream types. Our tests use this ability, for example. 
>> >> - Most of the documentation for the stream classes (sequential and parallel behavior, ordering, etc) is generic to all stream types, so putting it in one base class (where it can be @inheritDoc'ed or linked) is more natural than cutting and pasting it in N places, or dumping it all in the package doc. >> > > +1 > > If/when there is an SPI there could be value in making a test framework available publicly as well without resorting to boot classpath tricks. > > I am struggling to recall why we did not do that earlier. At one point we had some methods/types on BaseStream we did not want to expose publicly, but they all withered away as the implementation got simpler. > > Paul. > From brian.goetz at oracle.com Tue Jul 2 12:44:42 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 02 Jul 2013 15:44:42 -0400 Subject: Bikeshed: what do we call the distinguished method of a SAM? Message-ID: <51D32DAA.5080400@oracle.com> Working on the spec for the SAMs. In consultation with Doug, converging on a style that casts a SAM as *representing* an abstract entity such as a function or operation, and treating the SAM *as if* it were that function. For example: /** * Represents a function that accepts one argument and produces a result. * * @param <T> the type of the input to the function * @param <R> the type of the result of the function * * @since 1.8 */ @FunctionalInterface public interface Function<T, R> { /** * Applies this function to an argument. * * @param t the function argument * @return the function result */ R apply(T t); I think one thing that is missing is tying together the sole SAM method with the SAM class. This is obvious in a SAM with no default or static methods (and no methods from Object), but starts to get lost in the noise as the method count adds up. I'm thinking of something like: * This is a functional interface whose _____ method is {@link #apply}. For some value of ____. What do we call the primary SAM method? The implementation method? 
The primary SAM method? The abstract method? From brian.goetz at oracle.com Tue Jul 2 12:55:14 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 02 Jul 2013 15:55:14 -0400 Subject: Bikeshed: what do we call the distinguished method of a SAM? In-Reply-To: <51D32DAA.5080400@oracle.com> References: <51D32DAA.5080400@oracle.com> Message-ID: <51D33022.9080607@oracle.com> How about: *

This is a functional interface * whose functional abstract method is {@link #apply(Object)}. where this is defined in package info: * Functional interfaces provide target types for lambda expressions * and method references. Each functional interface has a single abstract method, * called the functional abstract method for that functional interface, * to which the lambda expression's parameter and return types are matched or * adapted. Functional interfaces can provide a target type in multiple contexts, * such as assignment context, method invocation, or cast context: On 7/2/2013 3:44 PM, Brian Goetz wrote: > Working on the spec for the SAMs. In consultation with Doug, converging > on a style that casts a SAM as *representing* an abstract entity such as > a function or operation, and treating the SAM *as if* it were that > function. For example: > > /** > * Represents a function that accepts one argument and produces a result. > * > * @param the type of the input to the function > * @param the type of the result of the function > * > * @since 1.8 > */ > @FunctionalInterface > public interface Function { > > /** > * Applies this function to an argument. > * > * @param t the function argument > * @return the function result > */ > R apply(T t); > > I think one thing that is missing is tying together the sole SAM method > with the SAM class. This is obvious in a SAM with no default or static > methods (and no methods from Object), but starts to get lost in the > noise as the method count adds up. > > I'm thinking of something like: > > * This is a functional interface whose _____ method > is {@link #apply}. > > For some value of ____. What do we call the primary SAM method? The > implementation method? The primary SAM method? The abstract method? > From spullara at gmail.com Tue Jul 2 13:31:14 2013 From: spullara at gmail.com (Sam Pullara) Date: Tue, 2 Jul 2013 13:31:14 -0700 Subject: Bikeshed: what do we call the distinguished method of a SAM? 
In-Reply-To: <51D33022.9080607@oracle.com> References: <51D32DAA.5080400@oracle.com> <51D33022.9080607@oracle.com> Message-ID: <336909AB-82F8-4D52-B535-6CD9F2FAE974@gmail.com> Looks good to me. Sam On Jul 2, 2013, at 12:55 PM, Brian Goetz wrote: > How about: > > *

This is a functional interface > * whose functional abstract method is {@link #apply(Object)}. > > where this is defined in package info: > > * Functional interfaces provide target types for lambda expressions > * and method references. Each functional interface has a single abstract method, > * called the functional abstract method for that functional interface, > * to which the lambda expression's parameter and return types are matched or > * adapted. Functional interfaces can provide a target type in multiple contexts, > * such as assignment context, method invocation, or cast context: > > > On 7/2/2013 3:44 PM, Brian Goetz wrote: >> Working on the spec for the SAMs. In consultation with Doug, converging >> on a style that casts a SAM as *representing* an abstract entity such as >> a function or operation, and treating the SAM *as if* it were that >> function. For example: >> >> /** >> * Represents a function that accepts one argument and produces a result. >> * >> * @param the type of the input to the function >> * @param the type of the result of the function >> * >> * @since 1.8 >> */ >> @FunctionalInterface >> public interface Function { >> >> /** >> * Applies this function to an argument. >> * >> * @param t the function argument >> * @return the function result >> */ >> R apply(T t); >> >> I think one thing that is missing is tying together the sole SAM method >> with the SAM class. This is obvious in a SAM with no default or static >> methods (and no methods from Object), but starts to get lost in the >> noise as the method count adds up. >> >> I'm thinking of something like: >> >> * This is a functional interface whose _____ method >> is {@link #apply}. >> >> For some value of ____. What do we call the primary SAM method? The >> implementation method? The primary SAM method? The abstract method? 
>> From kevinb at google.com Tue Jul 2 13:52:13 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 2 Jul 2013 13:52:13 -0700 Subject: Bikeshed: what do we call the distinguished method of a SAM? In-Reply-To: <336909AB-82F8-4D52-B535-6CD9F2FAE974@gmail.com> References: <51D32DAA.5080400@oracle.com> <51D33022.9080607@oracle.com> <336909AB-82F8-4D52-B535-6CD9F2FAE974@gmail.com> Message-ID: I wonder if "functional method" says just as much? On Tue, Jul 2, 2013 at 1:31 PM, Sam Pullara wrote: > Looks good to me. > > Sam > > On Jul 2, 2013, at 12:55 PM, Brian Goetz wrote: > > > How about: > > > > *

This is a functional > interface > > * whose functional abstract method is {@link #apply(Object)}. > > > > where this is defined in package info: > > > > * Functional interfaces provide target types for lambda > expressions > > * and method references. Each functional interface has a single > abstract method, > > * called the functional abstract method for that functional > interface, > > * to which the lambda expression's parameter and return types are > matched or > > * adapted. Functional interfaces can provide a target type in multiple > contexts, > > * such as assignment context, method invocation, or cast context: > > > > > > On 7/2/2013 3:44 PM, Brian Goetz wrote: > >> Working on the spec for the SAMs. In consultation with Doug, converging > >> on a style that casts a SAM as *representing* an abstract entity such as > >> a function or operation, and treating the SAM *as if* it were that > >> function. For example: > >> > >> /** > >> * Represents a function that accepts one argument and produces a > result. > >> * > >> * @param the type of the input to the function > >> * @param the type of the result of the function > >> * > >> * @since 1.8 > >> */ > >> @FunctionalInterface > >> public interface Function { > >> > >> /** > >> * Applies this function to an argument. > >> * > >> * @param t the function argument > >> * @return the function result > >> */ > >> R apply(T t); > >> > >> I think one thing that is missing is tying together the sole SAM method > >> with the SAM class. This is obvious in a SAM with no default or static > >> methods (and no methods from Object), but starts to get lost in the > >> noise as the method count adds up. > >> > >> I'm thinking of something like: > >> > >> * This is a functional interface whose _____ method > >> is {@link #apply}. > >> > >> For some value of ____. What do we call the primary SAM method? The > >> implementation method? The primary SAM method? The abstract method? 
> >> > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From brian.goetz at oracle.com Tue Jul 2 15:14:01 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 02 Jul 2013 18:14:01 -0400 Subject: Bikeshed: what do we call the distinguished method of a SAM? In-Reply-To: <51D32DAA.5080400@oracle.com> References: <51D32DAA.5080400@oracle.com> Message-ID: <51D350A9.80604@oracle.com> Here's a webrev of the current state of this re-spec, covers all the SAMs: http://cr.openjdk.java.net/~briangoetz/tmp/webrev/ On 7/2/2013 3:44 PM, Brian Goetz wrote: > Working on the spec for the SAMs. In consultation with Doug, converging > on a style that casts a SAM as *representing* an abstract entity such as > a function or operation, and treating the SAM *as if* it were that > function. For example: > > /** > * Represents a function that accepts one argument and produces a result. > * > * @param the type of the input to the function > * @param the type of the result of the function > * > * @since 1.8 > */ > @FunctionalInterface > public interface Function { > > /** > * Applies this function to an argument. > * > * @param t the function argument > * @return the function result > */ > R apply(T t); > > I think one thing that is missing is tying together the sole SAM method > with the SAM class. This is obvious in a SAM with no default or static > methods (and no methods from Object), but starts to get lost in the > noise as the method count adds up. > > I'm thinking of something like: > > * This is a functional interface whose _____ method > is {@link #apply}. > > For some value of ____. What do we call the primary SAM method? The > implementation method? The primary SAM method? The abstract method? 
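The "lost in the noise" concern quoted above can be illustrated with a small sketch. It assumes a hypothetical interface named Transformer (not part of the JDK or of the webrev) that mixes one abstract method with a default method, a static method, and a redeclared Object method. Only transform is abstract, so that is the method a lambda or method reference supplies; the other names are illustrative only.

```java
import java.util.Objects;

final class FunctionalMethodSketch {

    /** Hypothetical interface used only for illustration; not a JDK type. */
    @FunctionalInterface
    interface Transformer<T, R> {

        // The single abstract method: this is what a lambda body implements.
        R transform(T t);

        // A default method does not count as an abstract method.
        default <V> Transformer<T, V> then(Transformer<? super R, ? extends V> next) {
            Objects.requireNonNull(next);
            return t -> next.transform(transform(t));
        }

        // A static method does not count either.
        static <T> Transformer<T, T> identity() {
            return t -> t;
        }

        // Redeclaring a public method of Object is permitted and is ignored
        // when counting abstract methods.
        boolean equals(Object other);
    }

    static int lengthPlusOne(String s) {
        // The method reference is matched against transform(T), not the others.
        Transformer<String, Integer> length = String::length;
        Transformer<String, Integer> composed = length.then(n -> n + 1);
        return composed.transform(s);
    }

    public static void main(String[] args) {
        System.out.println(lengthPlusOne("lambda")); // prints 7
    }
}
```

With four methods visible in the interface, a sentence in the class doc naming transform as the distinguished method would save a reader from working out which one the lambda binds to.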
> From brian.goetz at oracle.com Tue Jul 2 15:26:10 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 02 Jul 2013 18:26:10 -0400 Subject: Bikeshed: what do we call the distinguished method of a SAM? In-Reply-To: References: <51D32DAA.5080400@oracle.com> <51D33022.9080607@oracle.com> <336909AB-82F8-4D52-B535-6CD9F2FAE974@gmail.com> Message-ID: <51D35382.6030007@oracle.com> Yeah, that seems better. On 7/2/2013 4:52 PM, Kevin Bourrillion wrote: > I wonder if "functional method" says just as much? > > > On Tue, Jul 2, 2013 at 1:31 PM, Sam Pullara > wrote: > > Looks good to me. > > Sam > > On Jul 2, 2013, at 12:55 PM, Brian Goetz > wrote: > > > How about: > > > > *

This is a functional > interface > > * whose functional abstract method is {@link #apply(Object)}. > > > > where this is defined in package info: > > > > * Functional interfaces provide target types for lambda > expressions > > * and method references. Each functional interface has a single > abstract method, > > * called the functional abstract method for that > functional interface, > > * to which the lambda expression's parameter and return types are > matched or > > * adapted. Functional interfaces can provide a target type in > multiple contexts, > > * such as assignment context, method invocation, or cast context: > > > > > > On 7/2/2013 3:44 PM, Brian Goetz wrote: > >> Working on the spec for the SAMs. In consultation with Doug, > converging > >> on a style that casts a SAM as *representing* an abstract entity > such as > >> a function or operation, and treating the SAM *as if* it were that > >> function. For example: > >> > >> /** > >> * Represents a function that accepts one argument and produces > a result. > >> * > >> * @param the type of the input to the function > >> * @param the type of the result of the function > >> * > >> * @since 1.8 > >> */ > >> @FunctionalInterface > >> public interface Function { > >> > >> /** > >> * Applies this function to an argument. > >> * > >> * @param t the function argument > >> * @return the function result > >> */ > >> R apply(T t); > >> > >> I think one thing that is missing is tying together the sole SAM > method > >> with the SAM class. This is obvious in a SAM with no default or > static > >> methods (and no methods from Object), but starts to get lost in the > >> noise as the method count adds up. > >> > >> I'm thinking of something like: > >> > >> * This is a functional interface whose _____ > method > >> is {@link #apply}. > >> > >> For some value of ____. What do we call the primary SAM method? > The > >> implementation method? The primary SAM method? The abstract > method? 
> >> > > > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. |kevinb at google.com > From paul.sandoz at oracle.com Wed Jul 3 07:20:52 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 3 Jul 2013 16:20:52 +0200 Subject: Bikeshed: Spliterator "fail-fast" In-Reply-To: <51D19610.50307@univ-mlv.fr> References: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> <51D19610.50307@univ-mlv.fr> Message-ID: <3C816FF0-AA1B-40BE-BAA9-CA8946E54A11@oracle.com> On Jul 1, 2013, at 4:45 PM, Remi Forax wrote: > On 07/01/2013 03:46 PM, Paul Sandoz wrote: >> Hi, >> >> The Spliterator doc states: >> >> *

A Spliterator that does not report {@code IMMUTABLE} or >> * {@code CONCURRENT} is expected to have a documented policy concerning: >> * when the spliterator binds to the element source; and detection of >> * structural interference of the element source detected after binding. >> ... >> * After binding a Spliterator should, on a best-effort basis, throw >> * {@link ConcurrentModificationException} if structural interference is >> * detected. Spliterators that do this are called fail-fast. >> >> As Mike pointed out to me "fail-fast" is not accurate since the implementations for bulk traversal, specifically forEachRemaining, can throw a CME after traversal has completed. >> >> - fail-finally >> - fail-ultimately >> - fail-eventually >> >> ? > > fail-slow :) > :-) Although when going parallel it might not be slower than that of using an iterator over the source. Paul. From paul.sandoz at oracle.com Thu Jul 4 00:21:59 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 4 Jul 2013 09:21:59 +0200 Subject: StreamSupport method consolidation Message-ID: <8F6A98D2-3B30-4D4F-A7D4-D1B7801BDA0F@oracle.com> Hi, I just pushed a change that consolidates the methods on StreamSupport: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/33275b926f0d http://hg.openjdk.java.net/lambda/lambda/jdk/file/33275b926f0d/src/share/classes/java/util/stream/StreamSupport.java There are now four each of the spliterator-based and supplier-of-spliterator-based methods, and a boolean parameter controls whether the returned stream is sequential or parallel. I contemplated the middle ground of having overloads of the spliterator-based methods with and without a boolean parameter but concluded after experimenting it was neater and less overwhelming to be consistent with 8 methods. Paul.
From brian.goetz at oracle.com Fri Jul 5 15:03:51 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 05 Jul 2013 18:03:51 -0400 Subject: Spec pass on Stream Message-ID: <51D742C7.8020704@oracle.com> Here's my latest whack at the interface doc for Stream. The intent is to cover the basic concepts (what is a stream) and the must-describes (no interference with source during pipeline execution) here, and then link to package doc for more detail. Ideally I would have liked to @inheritDoc this all from BaseStream to {,Int,Long,Double}Stream, but javadoc didn't cooperate, so there would likely be cutting and pasting across the five XxxStream interfaces. /** * A sequence of elements supporting sequential and parallel aggregate * operations. For example: * *

{@code
  *     int sum = widgets.stream()
  *                      .filter(b -> b.getColor() == RED)
  *                      .mapToInt(b -> b.getWeight())
  *                      .sum();
  * }
* * In this example, {@code widgets} is a {@code Collection}. We create * a stream of {@code Widget} objects via {@link Collection#stream Collection.stream()}, * filter it to produce a stream containing only the red widgets, and then * transform it into a stream of {@code int} values representing the weight of * each red widget. Then this stream is summed to produce a total weight. * *

To perform a computation, stream * operations are composed into a * stream pipeline. A stream pipeline consists of a source (which * might be an array, a collection, a generator function, an IO channel, * etc), zero or more intermediate operations (which transform a * stream into another stream, such as {@link Stream#filter(Predicate)}), and a * terminal operation (which produces a result or side-effect, such * as {@link IntStream#sum()} or {@link IntStream#forEach(IntConsumer)}). * Streams are lazy; computation on the source data is only performed when the * terminal operation is initiated, and source elements are consumed only * as needed. * *

Collections and streams, while bearing some superficial similarities, * have different goals. Collections are primarily concerned with the efficient * management of, and access to, their elements. By contrast, streams do not * provide a means to directly access or manipulate their elements, and are * instead concerned with declaratively describing their source and the * computational operations which will be performed in aggregate on that source. * However, if the provided stream operations do not offer the desired * functionality, the {@link #iterator()} and {@link #spliterator()} operations * can be used to perform a controlled traversal. * *

A stream pipeline, like the "widgets" example above, can be viewed as * a query on the stream source. Unless the source was explicitly * designed for concurrent modification (such as a {@link ConcurrentHashMap}), * unpredictable or erroneous behavior may result from modifying the stream * source while it is being queried. * *

Most stream operations accept parameters that describe user-specified * behavior, such as the lambda expression {@code b -> b.getWeight()} passed to * {@code mapToInt} in the example above. Such parameters are always instances * of a functional interface such * as {@link java.util.function.Function}, and are often lambda expressions or * method references. These parameters can never be null, should not modify the * stream source, and should be * effectively stateless * (their result should not depend on any state that might change during * execution of the stream pipeline.) * *

A stream should be operated on (invoke an intermediate or terminal stream * operation) only once. This rules out, for example, "forked" streams, where * the same source feeds two or more pipelines, or multiple traversals of the * same stream. A stream implementation may throw {@link IllegalStateException} * if it detects that the stream is being reused. However, since some stream * operations may return their receiver rather than a new stream object, it may * not be possible to detect reuse in all cases. * *

When executed (by initiating its terminal operation), stream pipelines may * execute either sequentially or in * parallel. This * execution mode is a property of the stream. Streams are created * with an initial choice of sequential or parallel execution. (For example, * {@link Collection#stream() Collection.stream()} creates a sequential stream, * and {@link Collection#parallelStream() Collection.parallelStream()} creates * a parallel one.) The orientation of a stream pipeline may be modified by * the {@link #sequential()} or {@link #parallel()} methods, and may be queried * with the {@link #isParallel()} method. Whether a stream pipeline executes in * sequential or parallel is determined by the orientation of the stream * instance on which the terminal operation is invoked. * * @param type of stream elements * @since 1.8 * @see java.util.stream */ From joe.bowbeer at gmail.com Fri Jul 5 18:59:43 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 5 Jul 2013 18:59:43 -0700 Subject: Spec pass on Stream In-Reply-To: <51D742C7.8020704@oracle.com> References: <51D742C7.8020704@oracle.com> Message-ID: Nice. A few nits below. 1. I think "invoking" reads better in the paren. comment below: "A stream should be operated on ([invoking] an intermediate or terminal stream operation) only once." 2. I'm confused about stream vs. stream pipeline and execution mode vs. orientation in the following sentences: Streams are created with an initial choice of sequential or parallel > execution. The orientation of a stream pipeline may be modified by the > {@link #sequential()} or {@link #parallel()} methods, and may be queried > with the {@link #isParallel()} method. Whether a stream pipeline executes > in sequential or parallel is determined by the orientation of the stream > instance on which the terminal operation is invoked. > First, I think it might be clearer to remove "orientation" and use "execution mode" or "execution mode orientation" instead. 
Second,what's the difference between a stream instance and a stream pipeline? The term "stream instance" is never defined, but I infer that it may be one of the streams created by an intermediate transform op. Stream pipeline is defined previously, but some reiteration may be useful. Here is my attempt at a fix-up, omitting the paren. sentence for brevity: Streams are created with an initial execution mode of sequential or > parallel. The execution mode of a stream pipeline may be modified by the > {@link #sequential()} or {@link #parallel()} methods, and may be queried > with the {@link #isParallel()} method. The ultimate execution mode of a > stream pipeline is determined by the execution mode of the final stream > instance on which the terminal operation is invoked. 3. As a general comment, that doesn't apply so much to the doc in this message, I think there are too many parenthetical comments in the current stream doc. Remedies include (1) removing the parens in the case of a complete sentence, (2) changing to comma-delimited phrases, and (3) moving to a separate sentence. Also try dashes for variety? For example: "A stream should be operated on — invoking an intermediate or terminal stream operation — only once." --Joe On Fri, Jul 5, 2013 at 3:03 PM, Brian Goetz wrote: > Here's my latest whack at the interface doc for Stream. The intent is to > cover the basic concepts (what is a stream) and the must-describes (no > interference with source during pipeline execution) here, and then link to > package doc for more detail. > > Ideally I would have liked to @inheritDoc this all from BaseStream to > {,Int,Long,Double}Stream, but javadoc didn't cooperate, so there would > likely be cutting and pasting across the five XxxStream interfaces. > > /** > * A sequence of elements supporting sequential and parallel aggregate > * operations. For example: > * > *

{@code
>  *     int sum = widgets.stream()
>  *                      .filter(b -> b.getColor() == RED)
>  *                      .mapToInt(b -> b.getWeight())
>  *                      .sum();
>  * }
> * > * In this example, {@code widgets} is a {@code Collection}. We > create > * a stream of {@code Widget} objects via {@link Collection#stream > Collection.stream()}, > * filter it to produce a stream containing only the red widgets, and then > * transform it into a stream of {@code int} values representing the > weight of > * each red widget. Then this stream is summed to produce a total weight. > * > *

To perform a computation, stream > * operations are composed > into a > * stream pipeline. A stream pipeline consists of a source (which > * might be an array, a collection, a generator function, an IO channel, > * etc), zero or more intermediate operations (which transform a > * stream into another stream, such as {@link Stream#filter(Predicate)}), > and a > * terminal operation (which produces a result or side-effect, > such > * as {@link IntStream#sum()} or {@link IntStream#forEach(IntConsumer)}). > * Streams are lazy; computation on the source data is only performed when > the > * terminal operation is initiated, and source elements are consumed only > * as needed. > * > *

Collections and streams, while bearing some superficial similarities, > * have different goals. Collections are primarily concerned with the > efficient > * management of, and access to, their elements. By contrast, streams do > not > * provide a means to directly access or manipulate their elements, and are > * instead concerned with declaratively describing their source and the > * computational operations which will be performed in aggregate on that > source. > * However, if the provided stream operations do not offer the desired > * functionality, the {@link #iterator()} and {@link #spliterator()} > operations > * can be used to perform a controlled traversal. > * > *

A stream pipeline, like the "widgets" example above, can be viewed as > * a query on the stream source. Unless the source was explicitly > * designed for concurrent modification (such as a {@link > ConcurrentHashMap}), > * unpredictable or erroneous behavior may result from modifying the stream > * source while it is being queried. > * > *

Most stream operations accept parameters that describe user-specified > * behavior, such as the lambda expression {@code b -> b.getWeight()} > passed to > * {@code mapToInt} in the example above. Such parameters are always > instances > * of a functional > interface such > * as {@link java.util.function.Function}, and are often lambda > expressions or > * method references. These parameters can never be null, should not > modify the > * stream source, and should be > * effectively > stateless > * (their result should not depend on any state that might change during > * execution of the stream pipeline.) > * > *

A stream should be operated on (invoke an intermediate or terminal > stream > * operation) only once. This rules out, for example, "forked" streams, > where > * the same source feeds two or more pipelines, or multiple traversals of > the > * same stream. A stream implementation may throw {@link > IllegalStateException} > * if it detects that the stream is being reused. However, since some > stream > * operations may return their receiver rather than a new stream object, > it may > * not be possible to detect reuse in all cases. > * > *

When executed (by initiating its terminal operation), stream > pipelines may > * execute either sequentially or in > * parallel. This > * execution mode is a property of the stream. Streams are created > * with an initial choice of sequential or parallel execution. (For > example, > * {@link Collection#stream() Collection.stream()} creates a sequential > stream, > * and {@link Collection#parallelStream() Collection.parallelStream()} > creates > * a parallel one.) The orientation of a stream pipeline may be modified > by > * the {@link #sequential()} or {@link #parallel()} methods, and may be > queried > * with the {@link #isParallel()} method. Whether a stream pipeline > executes in > * sequential or parallel is determined by the orientation of the stream > * instance on which the terminal operation is invoked. > * > * @param type of stream elements > * @since 1.8 > * @see java.util.stream > */ > From david.holmes at oracle.com Sun Jul 7 19:10:43 2013 From: david.holmes at oracle.com (David Holmes) Date: Mon, 08 Jul 2013 12:10:43 +1000 Subject: Bikeshed: Spliterator "fail-fast" In-Reply-To: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> References: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> Message-ID: <51DA1FA3.3000709@oracle.com> Hi Paul, On 1/07/2013 11:46 PM, Paul Sandoz wrote: > Hi, > > The Spliterator doc states: > > *

A Spliterator that does not report {@code IMMUTABLE} or > * {@code CONCURRENT} is expected to have a documented policy concerning: > * when the spliterator binds to the element source; and detection of > * structural interference of the element source detected after binding. > ... > * After binding a Spliterator should, on a best-effort basis, throw > * {@link ConcurrentModificationException} if structural interference is > * detected. Spliterators that do this are called fail-fast. > > As Mike pointed out to me "fail-fast" is not accurate since the implementations for bulk traversal, specifically forEachRemaining, can throw a CME after traversal has completed. > > - fail-finally > - fail-ultimately > - fail-eventually > > ? Nothing. It is either fail-fast or else you don't say anything. Any definition of fail-fast for Spliterator should be consistent with that of Iterator. David > Paul. > From david.holmes at oracle.com Sun Jul 7 19:15:02 2013 From: david.holmes at oracle.com (David Holmes) Date: Mon, 08 Jul 2013 12:15:02 +1000 Subject: chain -> andThen? In-Reply-To: <51CDF028.2040700@oracle.com> References: <51CDA726.9080402@oracle.com> <51CDF028.2040700@oracle.com> Message-ID: <51DA20A6.1080003@oracle.com> On 29/06/2013 6:20 AM, Brian Goetz wrote: > A related issue with chain/andThen for Consumers is whether it should do: > > { a.accept(v); b.accept(v); } > > or > > { try { a.accept(v); } finally { b.accept(v); } } > The former. This applies on all the composite "actions" in the different interfaces. David > > On 6/28/2013 11:09 AM, Brian Goetz wrote: >> Working my way slowly through the API looking for rough edges... >> >> We eventually settled on "andThen" as the composition method for >> function-like things. This was chosen because it made the order >> explicit; "do my thing, and then this other thing." This method shows >> up on Comparator, XxxFunction, XxxOperator. 
>> >> There's a similar method on Consumer, called "chain", which means "let >> me have the argument, then pass it to some other Consumer". I think it >> makes sense to rename this to "andThen" as well. >> >> Consumer logIt = s -> logger.log(s); >> Consumer printIt = s -> System.out.println(s); >> Consumer logAndPrint = logIt.andThen(printIt); >> >> instead of >> >> Consumer logAndPrint = logIt.chain(printIt); >> From brian.goetz at oracle.com Mon Jul 8 11:36:03 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 08 Jul 2013 14:36:03 -0400 Subject: Cleanup: StreamBuilder Message-ID: <51DB0693.80101@oracle.com> The set of public classes and interfaces in java.util.stream now stands at: Collector Collector.Characteristics Collectors DoubleStream IntStream LongStream Stream StreamBuilder StreamBuilder.OfDouble StreamBuilder.OfInt StreamBuilder.OfLong StreamSupport As I've been working my way through the specs, it seems more natural to move the StreamBuilder classes to nested classes of XxxStream, because (a) there is now a static builder() method in those classes, (b) like the Stream classes, StreamBuilder.OfXxx does not extend StreamBuilder, and (c) it reduces the emphasis on these minor classes.
So I propose: StreamBuilder -> Stream.Builder StreamBuilder.OfXxx -> XxxStream.Builder From forax at univ-mlv.fr Mon Jul 8 11:52:26 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 08 Jul 2013 20:52:26 +0200 Subject: Cleanup: StreamBuilder In-Reply-To: <51DB0693.80101@oracle.com> References: <51DB0693.80101@oracle.com> Message-ID: <51DB0A6A.7070207@univ-mlv.fr> On 07/08/2013 08:36 PM, Brian Goetz wrote: > The set of public classes and interfaces in java.util.stream now > stands at: > > Collector > Collector.Characteristics > Collectors > DoubleStream > IntStream > LongStream > Stream > StreamBuilder > StreamBuilder.OfDouble > StreamBuilder.OfInt > StreamBuilder.OfLong > StreamSupport > > As I've been working my way through the specs, it seems more natural > to move the StreamBuilder classes to nested classes of XxxStream, > because (a) there is now a static builder() method in those classes, > (b) like the Stream classes, StreamBuilder.OfXxx does not extend > StreamBuilder, and (c) it reduces the emphasis on these minor classes. > > So I propose: > > StreamBuilder -> Stream.Builder > StreamBuilder.OfXxx -> XxxStream.Builder > yes !
Rémi From paul.sandoz at oracle.com Mon Jul 8 12:00:47 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 8 Jul 2013 21:00:47 +0200 Subject: Cleanup: StreamBuilder In-Reply-To: <51DB0693.80101@oracle.com> References: <51DB0693.80101@oracle.com> Message-ID: <5BFF33E9-DB3D-4312-A3FF-3FF3E88A702E@oracle.com> On Jul 8, 2013, at 8:36 PM, Brian Goetz wrote: > The set of public classes and interfaces in java.util.stream now stands at: > > Collector > Collector.Characteristics > Collectors > DoubleStream > IntStream > LongStream > Stream > StreamBuilder > StreamBuilder.OfDouble > StreamBuilder.OfInt > StreamBuilder.OfLong > StreamSupport > > As I've been working my way through the specs, it seems more natural to move the StreamBuilder classes to nested classes of XxxStream, because (a) there is now a static builder() method in those classes, (b) like the Stream classes, StreamBuilder.OfXxx does not extend StreamBuilder, and (c) it reduces the emphasis on these minor classes. > > So I propose: > > StreamBuilder -> Stream.Builder > StreamBuilder.OfXxx -> XxxStream.Builder > +1 It was bugging me that StreamBuilder was so disproportionately visible in the stream package javadoc. Paul. From dl at cs.oswego.edu Mon Jul 8 12:19:49 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 08 Jul 2013 15:19:49 -0400 Subject: Cleanup: StreamBuilder In-Reply-To: <51DB0693.80101@oracle.com> References: <51DB0693.80101@oracle.com> Message-ID: <51DB10D5.3020608@cs.oswego.edu> On 07/08/13 14:36, Brian Goetz wrote: > StreamBuilder -> Stream.Builder > StreamBuilder.OfXxx -> XxxStream.Builder > I vote yes. I always vote yes about nesting static classes as much as you can as one way of improving chances of coping when the world changes and you really want to do something you hadn't planned.
-Doug From tim at peierls.net Mon Jul 8 12:33:19 2013 From: tim at peierls.net (Tim Peierls) Date: Mon, 8 Jul 2013 15:33:19 -0400 Subject: Cleanup: StreamBuilder In-Reply-To: <51DB0693.80101@oracle.com> References: <51DB0693.80101@oracle.com> Message-ID: On Mon, Jul 8, 2013 at 2:36 PM, Brian Goetz wrote: > StreamBuilder -> Stream.Builder > StreamBuilder.OfXxx -> XxxStream.Builder > Good! --tim From brian.goetz at oracle.com Mon Jul 8 13:57:45 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 08 Jul 2013 16:57:45 -0400 Subject: Cleanup: StreamBuilder In-Reply-To: References: <51DB0693.80101@oracle.com> Message-ID: <51DB27C9.2070008@oracle.com> Webrev at: http://cr.openjdk.java.net/~briangoetz/JDK-8020062/webrev/ On 7/8/2013 3:33 PM, Tim Peierls wrote: > On Mon, Jul 8, 2013 at 2:36 PM, Brian Goetz > wrote: > > StreamBuilder -> Stream.Builder > StreamBuilder.OfXxx -> XxxStream.Builder > > > Good! > > --tim From brian.goetz at oracle.com Mon Jul 8 22:32:10 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 09 Jul 2013 01:32:10 -0400 Subject: Collector / Collectors In-Reply-To: <51CB3248.8050102@oracle.com> References: <51C0838B.7000703@oracle.com> <51C9F70C.4080801@oracle.com> <51C9FCFB.4050508@oracle.com> <51CB3248.8050102@oracle.com> Message-ID: <51DBA05A.8080208@oracle.com> Returning to this... We tried the collector.andThen() approach. People liked the API, but we ran afoul of type inference issues. I think the remaining option is a combinator: collector+finisher -> collector but the previously suggested name (finishing) had an "inside out" feel. But, I think a slight name tweak will do the job: groupingBy(Person::getCity, collectingAndThen(maxBy(comparing(Person::getHeight)), Optional::get))); to suggest what is going on.
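For readers who want to try the combinator form, here is a minimal sketch of the "tallest person by city" query. It assumes the combinator is exposed as the static method Collectors.collectingAndThen(collector, finisher); the Person class, the TallestByCity class and the sample data are hypothetical stand-ins, not code from the thread.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class TallestByCity {
    // Hypothetical stand-in for the Person type used in the thread.
    static final class Person {
        private final String name;
        private final String city;
        private final int height;

        Person(String name, String city, int height) {
            this.name = name;
            this.city = city;
            this.height = height;
        }

        String getName() { return name; }
        String getCity() { return city; }
        int getHeight() { return height; }
    }

    // Groups people by city and keeps the tallest person per city.
    // maxBy yields Optional<Person>; the finisher Optional::get unwraps it,
    // which is safe because a group only exists if it has at least one element.
    static Map<String, Person> tallestByCity(List<Person> people) {
        return people.stream().collect(
            Collectors.groupingBy(
                Person::getCity,
                Collectors.collectingAndThen(
                    Collectors.maxBy(Comparator.comparingInt(Person::getHeight)),
                    Optional::get)));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Ann", "Oslo", 170),
            new Person("Bob", "Oslo", 185),
            new Person("Cy", "Rome", 178));
        Map<String, Person> tallest = tallestByCity(people);
        System.out.println(tallest.get("Oslo").getName());
        System.out.println(tallest.get("Rome").getName());
    }
}
```

The point of the finisher is visible in the result type: the map values are Person rather than Optional<Person>, so callers do not have to unwrap an Optional that can never be empty.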
On 6/26/2013 2:26 PM, Brian Goetz wrote: > > 2. Making the one-arg reducing(), and minBy/maxBy, return Optional > means that queries like "tallest person by city" end up with Optional in > the value: > > Map> m > = people.collect(groupingBy(Person::getCity, > maxBy(comparing(Person::getHeight))); > > Which is doubly bad because the optionals here will *always* be present, > since otherwise there'd be no associated key. > > I can see a few ways to address this: > - Provide non-optional versions of minBy/maxBy/reducing, which would > probably have to return null for "no elements". > - Provide a way to add a finisher to a Collector, which is more > general (and could be used, say, to turn toList() into something that > always collects to an immutable list.) > > The latter could, in turn, be done in one of two ways: > > - Add an andThen(f) method to Collector. Then the above would read: > > groupingBy(Person::getCity, > maxBy(comparing(Person::getHeight)) > .andThen(Optional::get)) > > - Add a combinator: > > groupingBy(Person::getCity, > finishing(maxBy(comparing(Person::getHeight)), > Optional::get))); > > I prefer the former because it reads better in usage; using a combinator > function feels a little "inside out." From brian.goetz at oracle.com Tue Jul 9 13:10:09 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 09 Jul 2013 16:10:09 -0400 Subject: Spec review request Message-ID: <51DC6E21.2050405@oracle.com> I believe the spec for all classes in java.util.stream is ready for API review. Please see current docs here: http://cr.openjdk.java.net/~briangoetz/doctmp/doc/ Some notes: - The package docs are not yet done; when they are finished, links to the package docs will be added in appropriate places. - Ignore instances of ... in the doc. This is an artifact of @link'ing to classes in a package for which doc is not generated. - Any loose ends still in flight may have yet to be reflected.
From forax at univ-mlv.fr Tue Jul 9 16:02:54 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 10 Jul 2013 01:02:54 +0200 Subject: Bikeshed: Spliterator "fail-fast" In-Reply-To: <51DA1FA3.3000709@oracle.com> References: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> <51DA1FA3.3000709@oracle.com> Message-ID: <51DC969E.8010605@univ-mlv.fr> On 07/08/2013 04:10 AM, David Holmes wrote: > Hi Paul, > > On 1/07/2013 11:46 PM, Paul Sandoz wrote: >> Hi, >> >> The Spliterator doc states: >> >> *

A Spliterator that does not report {@code >> IMMUTABLE} or >> * {@code CONCURRENT} is expected to have a documented policy >> concerning: >> * when the spliterator binds to the element source; and >> detection of >> * structural interference of the element source detected after >> binding. >> ... >> * After binding a Spliterator should, on a best-effort basis, throw >> * {@link ConcurrentModificationException} if structural >> interference is >> * detected. Spliterators that do this are called fail-fast. >> >> As Mike pointed out to me "fail-fast" is not accurate since the >> implementations for bulk traversal, specifically forEachRemaining, >> can throw a CME after traversal has completed. >> >> - fail-finally >> - fail-ultimately >> - fail-eventually >> >> ? > > Nothing. It is either fail-fast or else you don't say anything. > > Any definition of fail-fast for Spliterator should be consistent with > that of Iterator. > > David David, I agree with you; having two semantics is too complex here. Anyway, playing devil's advocate, most implementations of Iterator are not as fail-fast as they could be: they check for modification of the collection only in next(), not in hasNext(). > >> Paul. >> Rémi From dl at cs.oswego.edu Wed Jul 10 12:09:57 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 10 Jul 2013 15:09:57 -0400 Subject: Class SplittableRandom Message-ID: <51DDB185.2040904@cs.oswego.edu> [Note: I'll be posting this separately to openjdk core-libs and concurrency-interest] We expect that using random numbers in parallel Stream computations will be common. (We know it is common in parallel computing in general.) But we had left support for it in an unsatisfactory state.
If you want to create a stream of random numbers to drive a parallel computation, you'd choose between two options, neither of them providing what you probably want: (1) Use a stream based on a single shared java.util.Random object, in which case your program will encounter stunning slowdowns when run with many cores; or (2) Use a stream based on ThreadLocalRandom, which avoids contention, but gives you no control over the use or properties of the per-thread singleton Random object. While the ThreadLocalRandom option is great for many purposes, you wouldn't want to use it in, say, a high-quality Monte Carlo simulation.

Enter Guy Steele. Guy has been working on an algorithm that addresses exactly the substantial range of uses not otherwise supported: It is, in essence, the Random number generator analog of a Spliterator. Class SplittableRandom supports method split() that creates a sub-generator that, when used in parallel with the original, maintains its statistical properties.

When Brian Goetz and I heard that this was nearing completion, we entered drop-everything mode to explore whether it could be added now in time for JDK8. We conclude that it should. We've been helping with JDK-ifying the basic algorithm, integrating java.util.stream support, etc, to enable addition as class java.util.SplittableRandom.

Just to be on the cautious side though, we are for the moment treating this in the same way we treat jsr166 candidates for potential OpenJDK integration. The initial version is available at
http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/SplittableRandom.java?view=log
With API docs at:
http://gee.cs.oswego.edu/dl/jsr166/dist/docs/java/util/SplittableRandom.html

This post serves as a request for comment, with shorter than usual turnaround (a couple of days) before considering a request to integrate into OpenJDK 8. So, please take a look.

Here are answers to some likely questions:

Q: How much faster is it than java.util.Random?
A: In sequential usages, usually at least twice as fast for long and double methods; usually only slightly faster for int methods. In parallel usages, SplittableRandom is almost arbitrarily faster. The very first simple parallel Stream program I wrote (to generate and sum nextLong()'s) ran 2900 times faster than the java.util.Random equivalent on a 32-way machine. Q: When can/should I use it instead of java.util.Random? A: Whenever you are not sharing one across Threads. Instances of SplittableRandom are not thread-safe. They are designed to be split, not shared, across threads. When class SplittableRandom applies (or you can rework your program to make it apply), it is usually a better choice. Not only is it usually faster, it also has better statistical independence and uniformity properties. Q: When can/should I use it instead of java.util.concurrent.ThreadLocalRandom? A: When you are doing structured fork/join computations, so you can explicitly split one rather than relying on the per-thread singleton instance. Q: Why is this in java.util, not java.util.concurrent? A: Because, like java.util.Spliterator, SplittableRandom is a tool for arranging isolated parallel computations that don't entail any concurrency control themselves. Q: Why isn't SplittableRandom a subclass of Random? A: Class Random requires thread-safety in its spec. It would be nonsensical for SplittableRandom to comply. Q: Why don't you at least come up with a new interface that defines methods shared with java.util.Random? A: We spent a couple of days exploring this. We think it could and probably should be done, but not now. Method names and specs of SplittableRandom are chosen to make it possible. But we encountered enough short-term obstacles to conclude that this is an unwise move for JDK8. Among the issues are that we'd need to adjust some specs and possibly some code in java.util.Random, and that we are at a loss about whether or how to generalize SplittableRandom's added Stream methods. 
In the meantime, it would be more than acceptable for SplittableRandom to be used primarily in new code (or new adaptations of old code) that wouldn't need or want to be interoperable with code using java.util.Random.

Q: Are we going to revisit with SplittableRandom all those memory contention issues we saw with ThreadLocalRandom?
A: Most likely not. Most of the memory contention issues surrounding ThreadLocalRandom arise because its instances are long-lived. SplittableRandoms will tend to be short-lived. In any case, now that we have the tools to cope (@Contended), we can evaluate and adjust if more detailed empirical analysis warrants.

From brian.goetz at oracle.com Wed Jul 10 14:12:21 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Jul 2013 17:12:21 -0400
Subject: Loose end: collect signature
Message-ID: <51DDCE35.8080901@oracle.com>

In reviewing the docs, I notice that the signature for the three-arg collect() is gratuitously out of sync with Collector.

Collector is a tuple of ( Supplier, BiConsumer, BinaryOperator ). But collect() takes a Supplier, a BiConsumer, and a BiConsumer.

I think we should change the three-arg collect() to match Collector.

From brian.goetz at oracle.com Wed Jul 10 14:14:57 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 10 Jul 2013 17:14:57 -0400
Subject: Loose end: collect signature
In-Reply-To: <51DDCE35.8080901@oracle.com>
References: <51DDCE35.8080901@oracle.com>
Message-ID: <51DDCED1.3060709@oracle.com>

Ah, now I remember why we did it this way. For three-arg collect(), BiConsumer matches the signature of existing methods like List::addAll, StringBuilder::append, BitSet::and, so it is easier to use these with method refs.

On 7/10/2013 5:12 PM, Brian Goetz wrote:
> In reviewing the docs, I notice that the signature for the three-arg
> collect() is gratuitously out of sync with Collector.
>
> Collector is a tuple of ( Supplier, BiConsumer,
> BinaryOperator ).
But collect() takes a Supplier, a > BiConsumer, and a BiConsumer. > > I think we should change the three-arg collect() to match Collector. > From brian.goetz at oracle.com Thu Jul 11 10:29:47 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 11 Jul 2013 13:29:47 -0400 Subject: MumbleCloseable In-Reply-To: <51D03359.2020707@cs.oswego.edu> References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> Message-ID: <51DEEB8B.9080106@oracle.com> A small wrinkle has emerged here; if you look at the Javadoc for Stream: http://cr.openjdk.java.net/~briangoetz/doctmp/doc/java/util/stream/Stream.html the nested annotation HoldsResource bleeds through into the Javadoc for any subtype of MHCR. This is really leaky; this should not be considered a nested member, and its presence in the docs will be confusing to users of Stream, who will wonder "WTF is this?" Options: - Kill the annotation for now, and revisit at a later time. This deprives APIs of the ability to say "Hey, I really do return a resource-ful stream" and deprives static analysis tools of the ability to perform sharper inspections. Since a Javadoc reboot is on the board for Java 9, we may be able to get finer control over doc generation like this in the future, making this a more useful tool for API designers, but right now, its pretty ugly. - Move the annotation to top level. Would need a longer name, but that's probably OK as @HoldsCloseableResource is still shorter than @MayHoldCloseableResource.HoldsResource. Crappy namespace management, but that's life in java.util. - Leave as is. In order, my preferences are #2, #1, and #3. Other opinions? On 6/30/2013 9:32 AM, Doug Lea wrote: > On 06/28/13 17:59, Brian Goetz wrote: > >> Here's the current draft spec for interface and annotation. > > Pasted below is version after an edit pass between Brian and me. > >> Bikeshed opportunity: should the annotation be nested or at top level? 
> > I think nested, but if so, it seems cruel to give it such a long name: > > @MayHoldCloseableResource.DefinitelyHoldsCloseableResource > > So the remaining bikeshed opportunity is what's shorter > but still crystal-clear? We probably can't get away with > just: > > @MayHoldCloseableResource.Yes > > Any ideas? > > ... > > /** > * An object that may (but need not) hold one or more references to > * resources that will be released when closed. Such objects may be > * used with try-with-resources or related {@code try...finally} > * constructions that ensure they are closed as soon as they are no > * longer needed. Interface MayHoldCloseableResource indicates that > * only a minority of usages warrant resource control constructions: > * those specialized to known resource-bearing instances, or those > * that must operate in complete generality. > * > *

For example, most usages of the {@link java.util.stream.Stream} > * classes operate on data sources such as an array, {@code > * Collection}, or generator function that do not require or benefit > * from explicit resource control. However, some uses of IO channels > * as data sources do -- a stream operation that opens many files may > * exhaust available system resources unless each is closed promptly, > * rather than waiting for them to be garbage collected. > * > *

Annotation {@code DefinitelyHoldsCloseableResource} may be used > * to guide users deciding whether resource-control constructions are > * warranted when using particular implementations of > * MayHoldCloseableResource. > */ > public interface MayHoldCloseableResource extends AutoCloseable { > /** > * Closes this resource, relinquishing any underlying resources. > * This method is invoked automatically on objects managed by the > * {@code try}-with-resources statement. > * > * Implementers of this interface are strongly encouraged > * to make their {@code close} methods idempotent. > * > * @see AutoCloseable#close() > */ > @Override > void close(); > > /** > * Indicates that a variable holding a {@code > MayHoldCloseableResource} or > * a method returning a {@code MayHoldCloseableResource} definitely > does > * hold a closeable resource. > */ > @Retention(RetentionPolicy.CLASS) > @Documented > @Target({ElementType.FIELD, ElementType.LOCAL_VARIABLE, > ElementType.METHOD }) > @interface DefinitelyHoldsCloseableResource { } > } > From brian.goetz at oracle.com Thu Jul 11 10:38:36 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 11 Jul 2013 13:38:36 -0400 Subject: Spec review request In-Reply-To: <51DC6E21.2050405@oracle.com> References: <51DC6E21.2050405@oracle.com> Message-ID: <51DEED9C.2050106@oracle.com> I've updated the package doc as well. So the doc set can now be considered complete and ready for review. On 7/9/2013 4:10 PM, Brian Goetz wrote: > I believe the spec for all classes in java.util.stream is ready for API > review. Please see current docs here: > > http://cr.openjdk.java.net/~briangoetz/doctmp/doc/ > > Some notes: > - The package docs are not yet done; when they are finished, links to > the package docs will be added in appropriate places. > - Ignore instances of ... in the doc. This is an > artifact of @link'ing to classes in a package for which doc is not > generated. > - Any loose ends still in flight may have yet to be reflected. 
> From forax at univ-mlv.fr Thu Jul 11 10:50:26 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 11 Jul 2013 19:50:26 +0200 Subject: MumbleCloseable In-Reply-To: <51DEEB8B.9080106@oracle.com> References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> <51DEEB8B.9080106@oracle.com> Message-ID: <51DEF062.8020309@univ-mlv.fr> On 07/11/2013 07:29 PM, Brian Goetz wrote: > A small wrinkle has emerged here; if you look at the Javadoc for Stream: > > > http://cr.openjdk.java.net/~briangoetz/doctmp/doc/java/util/stream/Stream.html > > > the nested annotation HoldsResource bleeds through into the Javadoc > for any subtype of MHCR. This is really leaky; this should not be > considered a nested member, and its presence in the docs will be > confusing to users of Stream, who will wonder "WTF is this?" > > Options: > - Kill the annotation for now, and revisit at a later time. This > deprives APIs of the ability to say "Hey, I really do return a > resource-ful stream" and deprives static analysis tools of the ability > to perform sharper inspections. Since a Javadoc reboot is on the > board for Java 9, we may be able to get finer control over doc > generation like this in the future, making this a more useful tool for > API designers, but right now, its pretty ugly. > > - Move the annotation to top level. Would need a longer name, but > that's probably OK as @HoldsCloseableResource is still shorter than > @MayHoldCloseableResource.HoldsResource. Crappy namespace management, > but that's life in java.util. > > - Leave as is. > > In order, my preferences are #2, #1, and #3. > > Other opinions? Remove the annotation and MayHoldCloseableResource and document in Stream that Stream doesn't follow the strict semantics of AutoCloseable. It will be enough for static analysis tools for 8 and post pone the inclusion of MayHoldCloseableResource in 9. 
R?mi > > On 6/30/2013 9:32 AM, Doug Lea wrote: >> On 06/28/13 17:59, Brian Goetz wrote: >> >>> Here's the current draft spec for interface and annotation. >> >> Pasted below is version after an edit pass between Brian and me. >> >>> Bikeshed opportunity: should the annotation be nested or at top level? >> >> I think nested, but if so, it seems cruel to give it such a long name: >> >> @MayHoldCloseableResource.DefinitelyHoldsCloseableResource >> >> So the remaining bikeshed opportunity is what's shorter >> but still crystal-clear? We probably can't get away with >> just: >> >> @MayHoldCloseableResource.Yes >> >> Any ideas? >> >> ... >> >> /** >> * An object that may (but need not) hold one or more references to >> * resources that will be released when closed. Such objects may be >> * used with try-with-resources or related {@code try...finally} >> * constructions that ensure they are closed as soon as they are no >> * longer needed. Interface MayHoldCloseableResource indicates that >> * only a minority of usages warrant resource control constructions: >> * those specialized to known resource-bearing instances, or those >> * that must operate in complete generality. >> * >> *

For example, most usages of the {@link java.util.stream.Stream} >> * classes operate on data sources such as an array, {@code >> * Collection}, or generator function that do not require or benefit >> * from explicit resource control. However, some uses of IO channels >> * as data sources do -- a stream operation that opens many files may >> * exhaust available system resources unless each is closed promptly, >> * rather than waiting for them to be garbage collected. >> * >> *

Annotation {@code DefinitelyHoldsCloseableResource} may be used >> * to guide users deciding whether resource-control constructions are >> * warranted when using particular implementations of >> * MayHoldCloseableResource. >> */ >> public interface MayHoldCloseableResource extends AutoCloseable { >> /** >> * Closes this resource, relinquishing any underlying resources. >> * This method is invoked automatically on objects managed by the >> * {@code try}-with-resources statement. >> * >> * Implementers of this interface are strongly encouraged >> * to make their {@code close} methods idempotent. >> * >> * @see AutoCloseable#close() >> */ >> @Override >> void close(); >> >> /** >> * Indicates that a variable holding a {@code >> MayHoldCloseableResource} or >> * a method returning a {@code MayHoldCloseableResource} definitely >> does >> * hold a closeable resource. >> */ >> @Retention(RetentionPolicy.CLASS) >> @Documented >> @Target({ElementType.FIELD, ElementType.LOCAL_VARIABLE, >> ElementType.METHOD }) >> @interface DefinitelyHoldsCloseableResource { } >> } >> From sam at shv.com Thu Jul 11 12:20:29 2013 From: sam at shv.com (Sam Pullara) Date: Thu, 11 Jul 2013 12:20:29 -0700 Subject: Concerns about parallel streams Message-ID: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. 
It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution.

Doug, what are your thoughts? How do you expect people to use it? I can imagine some heuristics that we could put in that might save us -- maybe by having a hook that decides when to really do parallel execution that gets executed every N ms with some statistics...

Sam

From brian.goetz at oracle.com Thu Jul 11 13:02:08 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 11 Jul 2013 16:02:08 -0400
Subject: Concerns about parallel streams
In-Reply-To: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com>
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com>
Message-ID: <51DF0F40.5000105@oracle.com>

One thing on my list of things to document is notes on methods that have particularly bad or surprising parallel performance. #1 on this list is limit(n) for large n when the stream is neither sized nor unordered. Another culprit is collecting to maps (since map merging is expensive). Others?

On 7/11/2013 3:20 PM, Sam Pullara wrote:
> As it stands, and it seems we are far past changing this API, it is
> simply too easy to get a parallel stream without thinking about
> whether it is the right thing to do. I think we need to extensively
> document when and why you would use parallel streams vs sequential
> streams. We should include a cost model, a benchmark that will help
> people figure out whether they should use it, and perhaps some rules
> of thumbs where it makes sense. As it stands I think that we are
> going to see some huge regressions in performance (both memory and
> cpu usage) when people call .parallel() on streams that should be
> evaluated sequentially. It would have been great to have the cost
> model built into the system that would make a good guess as to
> whether it should use parallel execution.
>
> Doug, what are your thoughts? How do you expect people to use it?
I > can imagine some heuristics that we could put in that might save us ? > maybe by having a hook that decides when to really do parallel > execution that gets executed every N ms with some statistics... > > Sam > From joe.bowbeer at gmail.com Thu Jul 11 13:12:41 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 11 Jul 2013 13:12:41 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DF0F40.5000105@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF0F40.5000105@oracle.com> Message-ID: I share Sam's concerns. In particular, the concern about memory, which may not be immediately obvious. Are there obvious places to warn about memory use? On Jul 11, 2013 1:02 PM, "Brian Goetz" wrote: > One thing on my list of things to doc is notes on methods that have > particularly bad or surprising parallel performance. #1 on this list is > limit(n) for large n when the stream is not sized or unordered. Other > culprits are collecting to maps (since map merging is expensive.) Others? > > On 7/11/2013 3:20 PM, Sam Pullara wrote: > >> As it stands, and it seems we are far past changing this API, it is >> simply too easy to get a parallel stream without thinking about >> whether it is the right thing to do. I think we need to extensively >> document when and why you would use parallel streams vs sequential >> streams. We should include a cost model, a benchmark that will help >> people figure out whether they should use it, and perhaps some rules >> of thumbs where it makes sense. As it stands I think that we are >> going to see some huge regressions in performance (both memory and >> cpu usage) when people call .parallel() on streams that should be >> evaluated sequentially. It would have been great to have the cost >> model built into the system that would make a good guess as to >> whether it should use parallel execution. >> >> Doug, what are your thoughts? How do you expect people to use it? 
I >> can imagine some heuristics that we could put in that might save us ? >> maybe by having a hook that decides when to really do parallel >> execution that gets executed every N ms with some statistics... >> >> Sam >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130711/5912250a/attachment.html From tim at peierls.net Thu Jul 11 13:24:54 2013 From: tim at peierls.net (Tim Peierls) Date: Thu, 11 Jul 2013 16:24:54 -0400 Subject: MumbleCloseable In-Reply-To: <51DEF062.8020309@univ-mlv.fr> References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> <51DEEB8B.9080106@oracle.com> <51DEF062.8020309@univ-mlv.fr> Message-ID: On Thu, Jul 11, 2013 at 1:50 PM, Remi Forax wrote: > Remove the annotation and MayHoldCloseableResource and document in Stream > that Stream doesn't follow the strict semantics of AutoCloseable. > It will be enough for static analysis tools for 8 and post pone the > inclusion of MayHoldCloseableResource in 9. > That sounds good to me. I think the whole MayHoldCloseableResource design has been too rushed. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130711/6ced4374/attachment.html From sam at shv.com Thu Jul 11 13:20:48 2013 From: sam at shv.com (Sam Pullara) Date: Thu, 11 Jul 2013 13:20:48 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DF0F40.5000105@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF0F40.5000105@oracle.com> Message-ID: I think one of the biggest issues is that the programmer is making a compile time decision when essentially only the runtime environment matters. The biggest one (absent doing I/O in your stream operations!) 
is whether you have 1 core or less actually available on the machine (concurrent requests, competing applications, VMs, etc) for your task.

My guess is that the difference in performance is because of memory usage. Doing a rough analysis of GC logs, I found that sequential streams allocate 10-100x more than the for loop and parallel uses another 10x on top of that.

How is a developer supposed to make an informed decision when applying .parallel()? Especially in library code...

Sam

Micro benchmark: https://github.com/spullara/parallelstream/blob/master/src/main/java/parallelstreams/Benchmark.java
Results are in ns per run, with the ratio to the slowest variant in parentheses.
Executed on a dual i7 2 GHz MacBook Air using today's build from lambda.

Elements: 100...
for loop: 562.63 (0.03572569261668172)
sequential stream: 2891.91 (0.18362953936887128)
parallel stream: 15748.61 (1.0)
Elements: 200...
for loop: 665.05 (0.031517030475387994)
sequential stream: 2347.97 (0.11127139620373919)
parallel stream: 21101.29 (1.0)
Elements: 400...
for loop: 1648.85 (0.07297808201110305)
sequential stream: 4069.37 (0.1801102693353079)
parallel stream: 22593.77 (1.0)
Elements: 800...
for loop: 3351.44 (0.13019962876027844)
sequential stream: 7899.36 (0.30688114346185313)
parallel stream: 25740.78 (1.0)
Elements: 1600...
for loop: 6792.7 (0.190088701717702)
sequential stream: 14291.05 (0.39992449845904654)
parallel stream: 35734.37 (1.0)
Elements: 3200...
for loop: 17991.95 (0.3807979678301864)
sequential stream: 25637.77 (0.5426210452840141)
parallel stream: 47248.02 (1.0)
Elements: 6400...
for loop: 34608.11 (0.48269659279327126)
sequential stream: 59680.89 (0.8323991763164765)
parallel stream: 71697.44 (1.0)
Elements: 12800...
for loop: 87321.16 (0.7721388554617178)
sequential stream: 109948.0 (0.9722170763684879)
parallel stream: 113089.97 (1.0)
Elements: 25600...
for loop: 96544.27 (0.4561001524046007)
parallel stream: 180886.05 (0.854552579587232)
sequential stream: 211673.4 (1.0)
Elements: 51200...
for loop: 223773.95 (0.4661786747968137) parallel stream: 397319.45 (0.8277185734621876) sequential stream: 480017.56 (1.0) Process finished with exit code 0 On Jul 11, 2013, at 1:02 PM, Brian Goetz wrote: > One thing on my list of things to doc is notes on methods that have particularly bad or surprising parallel performance. #1 on this list is limit(n) for large n when the stream is not sized or unordered. Other culprits are collecting to maps (since map merging is expensive.) Others? > > On 7/11/2013 3:20 PM, Sam Pullara wrote: >> As it stands, and it seems we are far past changing this API, it is >> simply too easy to get a parallel stream without thinking about >> whether it is the right thing to do. I think we need to extensively >> document when and why you would use parallel streams vs sequential >> streams. We should include a cost model, a benchmark that will help >> people figure out whether they should use it, and perhaps some rules >> of thumbs where it makes sense. As it stands I think that we are >> going to see some huge regressions in performance (both memory and >> cpu usage) when people call .parallel() on streams that should be >> evaluated sequentially. It would have been great to have the cost >> model built into the system that would make a good guess as to >> whether it should use parallel execution. >> >> Doug, what are your thoughts? How do you expect people to use it? I >> can imagine some heuristics that we could put in that might save us ? >> maybe by having a hook that decides when to really do parallel >> execution that gets executed every N ms with some statistics... 
>>
>> Sam
>>

From aleksey.shipilev at oracle.com Thu Jul 11 13:35:48 2013
From: aleksey.shipilev at oracle.com (Aleksey Shipilev)
Date: Fri, 12 Jul 2013 00:35:48 +0400
Subject: Concerns about parallel streams
In-Reply-To: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com>
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com>
Message-ID: <51DF1724.7020005@oracle.com>

On 07/11/2013 11:20 PM, Sam Pullara wrote:
> Doug, what are your thoughts? How do you expect people to use it? I
> can imagine some heuristics that we could put in that might save us --
> maybe by having a hook that decides when to really do parallel
> execution that gets executed every N ms with some statistics...

I am not Doug, but I have been deeply involved in figuring out the parallel performance model. In short, it can be formalized with four model parameters:
  P - number of processors (loosely, number of FJP workers)
  C - number of concurrent clients (i.e. Stream users)
  N - source size (e.g. collection.size())
  Q - operation cost, per element

Assuming an ideally splittable source and embarrassingly parallel operations, we confirmed the model depends most heavily on N*Q, which is exactly the amount of work we are presented with. At this point, break-even against a sequential stream correlates with N*Q on the order of 200-400 us, with P in (1, 32) on different machines.

That is, with a simple filter taking around 5 ns per element, the break-even is somewhere around 40K-80K elements in the source (200 us / 5 ns = 40K, 400 us / 5 ns = 80K). (Which is not really a good break-even point.)

While N is known in most cases, Q is really hard to estimate. Profiling would not really help, since operations can take different times all of a sudden. Also, we can't easily profile very fast operations with both good granularity *and* low overhead.

I am working in the background to build up a benchmark to easily figure the break-even front in (P, C, N, Q) space for a given source and the pipeline.
It should probably be available for the developers within the JDK.

Thanks,
-Aleksey.

From forax at univ-mlv.fr Thu Jul 11 13:36:25 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 11 Jul 2013 22:36:25 +0200
Subject: MumbleCloseable
In-Reply-To:
References: <51C8EC8A.7020407@oracle.com> <51C9AF65.4060503@cs.oswego.edu> <51CE0746.4010605@oracle.com> <51D03359.2020707@cs.oswego.edu> <51DEEB8B.9080106@oracle.com> <51DEF062.8020309@univ-mlv.fr>
Message-ID: <51DF1749.2030800@univ-mlv.fr>

On 07/11/2013 10:24 PM, Tim Peierls wrote:
> On Thu, Jul 11, 2013 at 1:50 PM, Remi Forax
> wrote:
>
> Remove the annotation and MayHoldCloseableResource and document in
> Stream that Stream doesn't follow the strict semantics of
> AutoCloseable.
> It will be enough for static analysis tools for 8 and post pone
> the inclusion of MayHoldCloseableResource in 9.
>
>
> That sounds good to me. I think the whole MayHoldCloseableResource
> design has been too rushed.
>
> --tim
>
>

As I said privately to Brian, the other question is whether we should have a solution that works not only for resources but for all other warnings. Basically, something like @SuppressWarnings, except that @SuppressWarnings works at the call site and what we want here is something that works at the declaration site.

Rémi

From brian.goetz at oracle.com Thu Jul 11 13:37:56 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 11 Jul 2013 16:37:56 -0400
Subject: Concerns about parallel streams
In-Reply-To:
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF0F40.5000105@oracle.com>
Message-ID: <51DF17A4.5080207@oracle.com>

I think this is, to some degree, a victim of our own success! Before streams, you had to choose seq vs par, and the code for one was massively different from the other. Now, you can make much cheaper course corrections, and can even course-correct at runtime -- this is a huge improvement. But what we don't have is a magic "figure it out for me."
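
A minimal sketch of the kind of runtime course correction mentioned above, assuming the java.util.stream API as it eventually shipped in JDK 8. The class name, the method name, and the 50,000-element threshold are hypothetical placeholders chosen for illustration; none of them come from the JDK or from this thread. The only point is that the same pipeline can be switched between sequential and parallel mode by a check made at run time:

```java
import java.util.List;
import java.util.stream.Stream;

/** Illustrative only: picks sequential or parallel mode with a runtime check. */
final class RuntimeModeChoice {
    /** Hypothetical cut-off; a real value would have to come from measurement. */
    static final int PARALLEL_THRESHOLD = 50_000;

    /** Returns a parallel stream only for large sources on multi-core machines. */
    static <T> Stream<T> streamFor(List<T> source) {
        boolean goParallel = source.size() >= PARALLEL_THRESHOLD
                && Runtime.getRuntime().availableProcessors() > 1;
        return goParallel ? source.parallelStream() : source.stream();
    }
}
```

A caller would write RuntimeModeChoice.streamFor(list).filter(...).count() and get the same result in either mode. This is still a hand-written guess rather than the magic "figure it out for me" that is missing.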
On 7/11/2013 4:20 PM, Sam Pullara wrote: > I think one of the biggest issues is that the programmer is making a compile time decision when essentially only the runtime environment matters. The biggest one (absent doing I/O in your stream operations!) js whether you have 1 core or less actually available on the machine (concurrent requests, competing applications, VMs, etc) for your task. > > My guess is that the difference in performance is because of memory usage. Doing a rough analysis of GC logs, I found that sequential streams allocate 10-100x more than the for loop and parallel uses another 10x on top of that. > > How is a developer supposed to make an informed decision when applying .parallel()? Especially in library code... > > Sam > > Micro benchmark https://github.com/spullara/parallelstream/blob/master/src/main/java/parallelstreams/Benchmark.java > Results are in ns / run > Executed on a dual i7 2ghz macbook air using todays build from lambda > > Elements: 100... > for loop: 562.63 (0.03572569261668172) > sequential stream: 2891.91 (0.18362953936887128) > parallel stream: 15748.61 (1.0) > Elements: 200... > for loop: 665.05 (0.031517030475387994) > sequential stream: 2347.97 (0.11127139620373919) > parallel stream: 21101.29 (1.0) > Elements: 400... > for loop: 1648.85 (0.07297808201110305) > sequential stream: 4069.37 (0.1801102693353079) > parallel stream: 22593.77 (1.0) > Elements: 800... > for loop: 3351.44 (0.13019962876027844) > sequential stream: 7899.36 (0.30688114346185313) > parallel stream: 25740.78 (1.0) > Elements: 1600... > for loop: 6792.7 (0.190088701717702) > sequential stream: 14291.05 (0.39992449845904654) > parallel stream: 35734.37 (1.0) > Elements: 3200... > for loop: 17991.95 (0.3807979678301864) > sequential stream: 25637.77 (0.5426210452840141) > parallel stream: 47248.02 (1.0) > Elements: 6400... 
> for loop: 34608.11 (0.48269659279327126) > sequential stream: 59680.89 (0.8323991763164765) > parallel stream: 71697.44 (1.0) > Elements: 12800... > for loop: 87321.16 (0.7721388554617178) > sequential stream: 109948.0 (0.9722170763684879) > parallel stream: 113089.97 (1.0) > Elements: 25600... > for loop: 96544.27 (0.4561001524046007) > parallel stream: 180886.05 (0.854552579587232) > sequential stream: 211673.4 (1.0) > Elements: 51200... > for loop: 223773.95 (0.4661786747968137) > parallel stream: 397319.45 (0.8277185734621876) > sequential stream: 480017.56 (1.0) > > Process finished with exit code 0 > > > On Jul 11, 2013, at 1:02 PM, Brian Goetz wrote: > >> One thing on my list of things to doc is notes on methods that have particularly bad or surprising parallel performance. #1 on this list is limit(n) for large n when the stream is not sized or unordered. Other culprits are collecting to maps (since map merging is expensive.) Others? >> >> On 7/11/2013 3:20 PM, Sam Pullara wrote: >>> As it stands, and it seems we are far past changing this API, it is >>> simply too easy to get a parallel stream without thinking about >>> whether it is the right thing to do. I think we need to extensively >>> document when and why you would use parallel streams vs sequential >>> streams. We should include a cost model, a benchmark that will help >>> people figure out whether they should use it, and perhaps some rules >>> of thumbs where it makes sense. As it stands I think that we are >>> going to see some huge regressions in performance (both memory and >>> cpu usage) when people call .parallel() on streams that should be >>> evaluated sequentially. It would have been great to have the cost >>> model built into the system that would make a good guess as to >>> whether it should use parallel execution. >>> >>> Doug, what are your thoughts? How do you expect people to use it? I >>> can imagine some heuristics that we could put in that might save us ? 
>>> maybe by having a hook that decides when to really do parallel >>> execution that gets executed every N ms with some statistics... >>> >>> Sam >>> From joe.bowbeer at gmail.com Thu Jul 11 13:52:37 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 11 Jul 2013 13:52:37 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DF1724.7020005@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF1724.7020005@oracle.com> Message-ID: Aleksey: Can you add memory parameters to your model? Including both the memory overhead of parallel streams and the memory working set of each parallel task. Some tasks may be embarrassingly parallel -- except for the unfortunate constraint that only K instances will fit in the available memory. By the way, another concern not mentioned is UI responsiveness. Adding .parallel() competes with the UI thread's ability to respond to user input. How to model UI degradation? --Joe On Thu, Jul 11, 2013 at 1:35 PM, Aleksey Shipilev < aleksey.shipilev at oracle.com> wrote: > On 07/11/2013 11:20 PM, Sam Pullara wrote: > > Doug, what are your thoughts? How do you expect people to use it? I > > can imagine some heuristics that we could put in that might save us ? > > maybe by having a hook that decides when to really do parallel > > execution that gets executed every N ms with some statistics... > > I am not Doug, but have been deeply involved in figuring out the > parallel performance model. In short, it is formalizable down to the way > of having four model parameters: > P - number of processors (loosely, number of FJP workers) > C - number of concurrent clients (i.e. Stream users) > N - source size (e.g. collection.size()) > Q - operation cost, per element > > Assuming the ideally splittable source and embarrassingly parallel > operations, we confirmed the model is most heavily dependent on N*Q, > which is exactly the amount of work we are presented with. 
At this
> point, break-even against sequential stream correlates with N*Q in order
> of 200-400 us, with P in (1, 32) on different machines.
>
> That is, with the simple filter taking around 5 ns per element, the
> break-even is somewhere around 40K-80K elements in the source. (Which is
> not really a good break-even point).
>
> While N is known in most cases, Q is really hard. The profiling would
> not really help with the operations taking different times all of a
> sudden. Also, we can't easily profile the very fast operations with both
> the good granularity *and* the low overhead.
>
> I am working in the background to build up the benchmark to easily figure
> the break-even front in (P, C, N, Q) space for a given source and the
> pipeline. It should probably be available for the developers within the
> JDK.
>
> Thanks,
> -Aleksey.
>

From sam at shv.com Thu Jul 11 14:08:35 2013
From: sam at shv.com (Sam Pullara)
Date: Thu, 11 Jul 2013 14:08:35 -0700
Subject: Concerns about parallel streams
In-Reply-To: <51DF1724.7020005@oracle.com>
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF1724.7020005@oracle.com>
Message-ID: <68E99DF5-A3F1-4B6C-AB57-B07EC67F14C6@shv.com>

Hoping Doug enters the thread soon.

How about we don't bother doing any work in parallel until at least a kernel time slice has passed running in sequential mode? On a Linux box on AWS that amount of time is 16 ms:

--------------------------------------------
#include <sched.h>
#include <stdio.h>
#include <time.h>
int main(int argc, char* argv[]) {
    struct timespec t;
    sched_rr_get_interval(0, &t);   /* round-robin quantum of the calling process */
    printf("%ld\n", t.tv_nsec);     /* tv_nsec is a long */
    return 0;
}
---------------------------------------------

That would at least stop people from screwing it up when N*Q is very small.
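To put numbers on "N*Q is very small": what follows is only a back-of-the-envelope sketch of the arithmetic in Aleksey's message quoted above (a break-even N*Q of roughly 200-400 us, and a simple filter costing about 5 ns per element). The class and method names are made up for illustration and are not part of any JDK API; the inputs are his measured figures, not guarantees for any particular machine.

```java
// Hypothetical helper, for illustration only: estimates how many elements a
// source needs before a parallel stream is likely to break even, given a
// break-even budget for N*Q and a per-element cost Q.
final class BreakEvenEstimate {

    /** Returns N such that N * costPerElementNanos equals budgetNanos (integer division). */
    static long breakEvenElements(long budgetNanos, long costPerElementNanos) {
        return budgetNanos / costPerElementNanos;
    }

    public static void main(String[] args) {
        long q = 5L;                                      // ~5 ns per element (simple filter)
        long low = breakEvenElements(200_000L, q);        // 200 us budget
        long high = breakEvenElements(400_000L, q);       // 400 us budget
        System.out.println(low + " - " + high + " elements"); // prints "40000 - 80000 elements"
    }
}
```

With a costlier operation (say Q around 500 ns) the same division gives a break-even of only a few hundred elements, which is why Q, the parameter that is hardest to know, matters as much as N.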
I'm very much in favor of making this programmatic with a callback so at least the decision to bring the parallel pipeline up can be made by something at runtime. For example, my callback would return false whenever it was called inside a server-side application that accepts concurrent requests. Maybe hang this off of the ForkJoinPool?

import java.util.Optional;

public interface ParallelStreamChecker {
    /**
     * This method is called upon execution and then once every kernel timeslice
     * until the returned Optional is present. If you return of(true) the parallel
     * pipeline will be set up and the rest of the stream will be processed in
     * parallel. If you return of(false) the stream will be processed sequentially.
     */
    Optional<Boolean> check(long startNanoTime, long numberOfElementsProcessed);
}

This also makes sense in the context of an enterprise application server that will probably replace the ForkJoin implementation and wants control over this kind of thing.

Sam

On Jul 11, 2013, at 1:35 PM, Aleksey Shipilev wrote:

> On 07/11/2013 11:20 PM, Sam Pullara wrote:
>> Doug, what are your thoughts? How do you expect people to use it? I
>> can imagine some heuristics that we could put in that might save us -
>> maybe by having a hook that decides when to really do parallel
>> execution that gets executed every N ms with some statistics...
>
> I am not Doug, but have been deeply involved in figuring out the
> parallel performance model. In short, it is formalizable down to the way
> of having four model parameters:
> P - number of processors (loosely, number of FJP workers)
> C - number of concurrent clients (i.e. Stream users)
> N - source size (e.g. collection.size())
> Q - operation cost, per element
>
> Assuming the ideally splittable source and embarrassingly parallel
> operations, we confirmed the model is most heavily dependent on N*Q,
> which is exactly the amount of work we are presented with.
At this > point, break-even against sequential stream correlates with N*Q in order > of 200-400 us, with P in (1, 32) on different machines. > > That is, with the simple filter taking around 5 ns per element, the > break-even is somewhere around 40K-80K elements in the source. (Which is > not really a good break-even point). > > While N is known in most cases, Q is really hard. The profiling would > not really help with the operations taking different times all of the > sudden. Also, we can't easily profile the very fast operations with both > the good granularity *and* the low overhead. > > I working in the background to build up the benchmark to easily figure > the break-even front in (P, C, N, Q) space for a given source and the > pipeline. It should probably be available for the developers within the JDK. > > Thanks, > -Aleksey. From dl at cs.oswego.edu Thu Jul 11 16:50:26 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 11 Jul 2013 19:50:26 -0400 Subject: Concerns about parallel streams In-Reply-To: <68E99DF5-A3F1-4B6C-AB57-B07EC67F14C6@shv.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF1724.7020005@oracle.com> <68E99DF5-A3F1-4B6C-AB57-B07EC67F14C6@shv.com> Message-ID: <51DF44C2.9080705@cs.oswego.edu> On 07/11/13 17:08, Sam Pullara wrote: > Hoping Doug enters the thread soon?. (It's great to feel needed, but today maybe a little too much :-) A couple of quick notes: If you are writing from-scratch ForkJoin programs rather than stream-based ones, you become immediately aware that you have to make some decisions about task granularity and partitioning. The rule of thumb we state in FJ is if you stay above a thousand or so instructions per leaf task, you'll have a good chance of success. The big problem when you automate this via streams is that most programmers have nearly no idea about any of the components of this otherwise straightforward guidance. And as Aleksey explained, there are few prospects for magically automating in general. 
Yet any attempt at providing any form of parameterization of hinting to control this has been defensibly rejected. As a result we have a completely opaque cost model. No sense in pretending otherwise. Despite this, the easy guidance is: If you have a lot of data, or very costly per-element computations, the best practice is to use parallel(). Otherwise, feel free to experiment with it, but don't expect any miracles. We could even give factor-of-1000-proof numbers here: A million elements. A million instructions -Doug From dl at cs.oswego.edu Thu Jul 11 17:03:19 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 11 Jul 2013 20:03:19 -0400 Subject: Concerns about parallel streams In-Reply-To: <51DF44C2.9080705@cs.oswego.edu> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF1724.7020005@oracle.com> <68E99DF5-A3F1-4B6C-AB57-B07EC67F14C6@shv.com> <51DF44C2.9080705@cs.oswego.edu> Message-ID: <51DF47C7.8030102@cs.oswego.edu> Oops, I left out... On 07/11/13 19:50, Doug Lea wrote: > Despite this, the easy guidance is: > > If you have a lot of data, or very costly per-element computations, and are not using some hopelessly sequential data structure, > the best practice is to use parallel(). Otherwise, feel free to > experiment with it, but don't expect any miracles. > > We could even give factor-of-1000-proof numbers here: > A million elements. A million instructions > -Doug From david.holmes at oracle.com Thu Jul 11 20:35:46 2013 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Jul 2013 13:35:46 +1000 Subject: Concerns about parallel streams In-Reply-To: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> Message-ID: <51DF7992.3060800@oracle.com> Sam, On 12/07/2013 5:20 AM, Sam Pullara wrote: > As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. 
I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. I think we addressed this at the start with the decision to require explicit rather than automatic parallelism. Hence I totally oppose any proposal that we run in sequential mode until we have used up a timeslice - that's the automatic parallelism path. Continuing on that explicit path, just as our libraries require explicit parallelism selection, so applications should also require/allow it. If an app chooses to always use parallel() then that is "automatic parallelism" at the app level - and that is as bad as auto-parallelism at the library level. Programmers don't have the runtime knowledge needed to determine whether parallelism will "work" - that is something that application deployers need to choose. So my advice for the docs here is two fold: a) programmers should stick with sequential unless parallel can be shown to have a significant benefit; and b) programmers should allow deployers/end-users to opt-in to parallelism where they have enabled it, rather than enabling it automatically. My 2c. Cheers, David ------ > Doug, what are your thoughts? How do you expect people to use it? I can imagine some heuristics that we could put in that might save us ? maybe by having a hook that decides when to really do parallel execution that gets executed every N ms with some statistics... 
> > Sam > From sam at shv.com Thu Jul 11 21:26:53 2013 From: sam at shv.com (Sam Pullara) Date: Thu, 11 Jul 2013 21:26:53 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DF7992.3060800@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> Message-ID: <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> On Jul 11, 2013, at 8:35 PM, David Holmes wrote: > On 12/07/2013 5:20 AM, Sam Pullara wrote: >> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. > > I think we addressed this at the start with the decision to require > explicit rather than automatic parallelism. Hence I totally oppose any > proposal that we run in sequential mode until we have used up a > timeslice - that's the automatic parallelism path. You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. 
Sam From david.holmes at oracle.com Thu Jul 11 21:29:38 2013 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Jul 2013 14:29:38 +1000 Subject: Concerns about parallel streams In-Reply-To: <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> Message-ID: <51DF8632.5060901@oracle.com> On 12/07/2013 2:26 PM, Sam Pullara wrote: > On Jul 11, 2013, at 8:35 PM, David Holmes wrote: >> On 12/07/2013 5:20 AM, Sam Pullara wrote: >>> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. >> >> I think we addressed this at the start with the decision to require >> explicit rather than automatic parallelism. Hence I totally oppose any >> proposal that we run in sequential mode until we have used up a >> timeslice - that's the automatic parallelism path. > > You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. I don't agree with treating programmers like children. If they ask for parallel they get parallel. 
Who are we to try and second guess if they know what they are asking for? David > Sam > From sam at shv.com Thu Jul 11 21:39:16 2013 From: sam at shv.com (Sam Pullara) Date: Thu, 11 Jul 2013 21:39:16 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DF8632.5060901@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> Message-ID: <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> My point is that the programmer doesn't know. It only is known whether to use parallel mode or not at runtime under the specific performance and load circumstances in that environment. Unless all of that is fully specified at compile time, how would you decide whether to use parallel or sequential? My belief is that you can have an intuition that it might be better in some circumstances so you make it possible by requesting it, but only at runtime would the system actually run something in parallel after it verifies that the conditions merit it. The P, C, N and Q are all probably variable at runtime in the vast majority of use cases unless they are designing a system to run only on a specific piece of hardware, by itself, with a known data size and predictable algorithm performance. Sam On Jul 11, 2013, at 9:29 PM, David Holmes wrote: > On 12/07/2013 2:26 PM, Sam Pullara wrote: >> On Jul 11, 2013, at 8:35 PM, David Holmes wrote: >>> On 12/07/2013 5:20 AM, Sam Pullara wrote: >>>> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. 
As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. >>> >>> I think we addressed this at the start with the decision to require >>> explicit rather than automatic parallelism. Hence I totally oppose any >>> proposal that we run in sequential mode until we have used up a >>> timeslice - that's the automatic parallelism path. >> >> You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. > > I don't agree with treating programmers like children. If they ask for > parallel they get parallel. Who are we to try and second guess if they > know what they are asking for? > > David > >> Sam >> From david.holmes at oracle.com Thu Jul 11 23:37:16 2013 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Jul 2013 16:37:16 +1000 Subject: Concerns about parallel streams In-Reply-To: <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> Message-ID: <51DFA41C.5010007@oracle.com> On 12/07/2013 2:39 PM, Sam Pullara wrote: > My point is that the programmer doesn't know. It only is known whether to use parallel mode or not at runtime under the specific performance and load circumstances in that environment. Unless all of that is fully specified at compile time, how would you decide whether to use parallel or sequential? 
My belief is that you can have an intuition that it might be better in some circumstances so you make it possible by requesting it, but only at runtime would the system actually run something in parallel after it verifies that the conditions merit it. The P, C, N and Q are all probably variable at runtime in the vast majority of use cases unless they are designing a system to run only on a specific piece of hardware, by itself, with a known data size and predictable algorithm performance. That is why I said the programmer has to be selective about what they parallelize and then require the runtime operator to opt-in to that. If I'm writing an app I can identify potential operations that would benefit from parallelism. But as you say I can't know for sure that in the final deployment this will be a good thing. Hence the deployer makes that final choice. David > Sam > > On Jul 11, 2013, at 9:29 PM, David Holmes wrote: > >> On 12/07/2013 2:26 PM, Sam Pullara wrote: >>> On Jul 11, 2013, at 8:35 PM, David Holmes wrote: >>>> On 12/07/2013 5:20 AM, Sam Pullara wrote: >>>>> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. >>>> >>>> I think we addressed this at the start with the decision to require >>>> explicit rather than automatic parallelism. 
Hence I totally oppose any >>>> proposal that we run in sequential mode until we have used up a >>>> timeslice - that's the automatic parallelism path. >>> >>> You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. >> >> I don't agree with treating programmers like children. If they ask for >> parallel they get parallel. Who are we to try and second guess if they >> know what they are asking for? >> >> David >> >>> Sam >>> From sam at shv.com Thu Jul 11 23:49:45 2013 From: sam at shv.com (Sam Pullara) Date: Thu, 11 Jul 2013 23:49:45 -0700 Subject: Concerns about parallel streams In-Reply-To: <51DFA41C.5010007@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> <51DFA41C.5010007@oracle.com> Message-ID: I don't think the deployer knows either and certainly can't make that decision for each individual stream in the system. Without a way to inject heuristics into the decision making process that make the choice based on measurements, I think we should probably not recommend its use outside of edge cases where everything is known. Deployment time is not really the same as runtime since you will not entirely know P, C, N and Q at that point though you might have a better idea about a few of them. Anyway, we don't have anything like this in the Javadocs. My concern at this point, considering the API is done, is to make sure that people understand how hard it is to use this feature correctly. 
Doug's earlier advice, paraphrased: "Millions of elements that each use millions of instructions to process, on an unshared system with an embarrassingly parallel pipeline is an ideal place to try using it.". I'm hoping that we can add the heuristics callback in JDK 9 at this point. Sam On Jul 11, 2013, at 11:37 PM, David Holmes wrote: > On 12/07/2013 2:39 PM, Sam Pullara wrote: >> My point is that the programmer doesn't know. It only is known whether to use parallel mode or not at runtime under the specific performance and load circumstances in that environment. Unless all of that is fully specified at compile time, how would you decide whether to use parallel or sequential? My belief is that you can have an intuition that it might be better in some circumstances so you make it possible by requesting it, but only at runtime would the system actually run something in parallel after it verifies that the conditions merit it. The P, C, N and Q are all probably variable at runtime in the vast majority of use cases unless they are designing a system to run only on a specific piece of hardware, by itself, with a known data size and predictable algorithm performance. > > That is why I said the programmer has to be selective about what they > parallelize and then require the runtime operator to opt-in to that. > > If I'm writing an app I can identify potential operations that would > benefit from parallelism. But as you say I can't know for sure that in > the final deployment this will be a good thing. Hence the deployer makes > that final choice. > > David > >> Sam >> >> On Jul 11, 2013, at 9:29 PM, David Holmes wrote: >> >>> On 12/07/2013 2:26 PM, Sam Pullara wrote: >>>> On Jul 11, 2013, at 8:35 PM, David Holmes wrote: >>>>> On 12/07/2013 5:20 AM, Sam Pullara wrote: >>>>>> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. 
I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. >>>>> >>>>> I think we addressed this at the start with the decision to require >>>>> explicit rather than automatic parallelism. Hence I totally oppose any >>>>> proposal that we run in sequential mode until we have used up a >>>>> timeslice - that's the automatic parallelism path. >>>> >>>> You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. >>> >>> I don't agree with treating programmers like children. If they ask for >>> parallel they get parallel. Who are we to try and second guess if they >>> know what they are asking for? 
>>> >>> David >>> >>>> Sam >>>> From david.holmes at oracle.com Thu Jul 11 23:58:51 2013 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Jul 2013 16:58:51 +1000 Subject: Concerns about parallel streams In-Reply-To: References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> <51DFA41C.5010007@oracle.com> Message-ID: <51DFA92B.2090406@oracle.com> On 12/07/2013 4:49 PM, Sam Pullara wrote: > I don't think the deployer knows either and certainly can't make that decision for each individual stream in the system. Without a way to inject heuristics into the decision making process that make the choice based on measurements, I think we should probably not recommend its use outside of edge cases where everything is known. Deployment time is not really the same as runtime since you will not entirely know P, C, N and Q at that point though you might have a better idea about a few of them. I'm using deployment/runtime interchangeably which is not completely accurate. Point is the programmer should only enable potential parallelism and some one down the line has to then choose to actually use it for runtime. Is that practical? Not if there are many such decision points - but I don't think realistic apps will have that many. If they do then the possible tuning permutations will make things intractable anyway. That said to make the selection in the code you really would want a stream(boolean parallel) method, otherwise it is going to be ugly. > Anyway, we don't have anything like this in the Javadocs. My concern at this point, considering the API is done, is to make sure that people understand how hard it is to use this feature correctly. 
Doug's earlier advice, paraphrased: "Millions of elements that each use millions of instructions to process, on an unshared system with an embarrassingly parallel pipeline is an ideal place to try using it.". I'm hoping that we can add the heuristics callback in JDK 9 at this point. Yes guidance is needed. This is a very sharp tool. David > Sam > > On Jul 11, 2013, at 11:37 PM, David Holmes wrote: > >> On 12/07/2013 2:39 PM, Sam Pullara wrote: >>> My point is that the programmer doesn't know. It only is known whether to use parallel mode or not at runtime under the specific performance and load circumstances in that environment. Unless all of that is fully specified at compile time, how would you decide whether to use parallel or sequential? My belief is that you can have an intuition that it might be better in some circumstances so you make it possible by requesting it, but only at runtime would the system actually run something in parallel after it verifies that the conditions merit it. The P, C, N and Q are all probably variable at runtime in the vast majority of use cases unless they are designing a system to run only on a specific piece of hardware, by itself, with a known data size and predictable algorithm performance. >> >> That is why I said the programmer has to be selective about what they >> parallelize and then require the runtime operator to opt-in to that. >> >> If I'm writing an app I can identify potential operations that would >> benefit from parallelism. But as you say I can't know for sure that in >> the final deployment this will be a good thing. Hence the deployer makes >> that final choice. 
>> >> David >> >>> Sam >>> >>> On Jul 11, 2013, at 9:29 PM, David Holmes wrote: >>> >>>> On 12/07/2013 2:26 PM, Sam Pullara wrote: >>>>> On Jul 11, 2013, at 8:35 PM, David Holmes wrote: >>>>>> On 12/07/2013 5:20 AM, Sam Pullara wrote: >>>>>>> As it stands, and it seems we are far past changing this API, it is simply too easy to get a parallel stream without thinking about whether it is the right thing to do. I think we need to extensively document when and why you would use parallel streams vs sequential streams. We should include a cost model, a benchmark that will help people figure out whether they should use it, and perhaps some rules of thumbs where it makes sense. As it stands I think that we are going to see some huge regressions in performance (both memory and cpu usage) when people call .parallel() on streams that should be evaluated sequentially. It would have been great to have the cost model built into the system that would make a good guess as to whether it should use parallel execution. >>>>>> >>>>>> I think we addressed this at the start with the decision to require >>>>>> explicit rather than automatic parallelism. Hence I totally oppose any >>>>>> proposal that we run in sequential mode until we have used up a >>>>>> timeslice - that's the automatic parallelism path. >>>>> >>>>> You misunderstand me. I mean if you ask explicitly for parallel mode to not actually use it until we verify that you haven't made a big error. I agree with you except that I think we should protect them from making a big mistake when it is enabled and is unnecessary. >>>> >>>> I don't agree with treating programmers like children. If they ask for >>>> parallel they get parallel. Who are we to try and second guess if they >>>> know what they are asking for? 
>>>> >>>> David >>>> >>>>> Sam >>>>> From dl at cs.oswego.edu Fri Jul 12 06:06:34 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 12 Jul 2013 09:06:34 -0400 Subject: Concerns about parallel streams In-Reply-To: <51DFA92B.2090406@oracle.com> References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> <51DFA41C.5010007@oracle.com> <51DFA92B.2090406@oracle.com> Message-ID: <51DFFF5A.9020003@cs.oswego.edu> On 07/12/13 02:58, David Holmes wrote: > That said to make the selection in the code you really would want a > stream(boolean parallel) method, otherwise it is going to be ugly. (Hey, how come none of you supported this when I first suggested it :-) Also bear in mind that we do have in place the much more programmer-controllable but non-fluent CHM bulk method API, that I put into place to address many of these concerns. Considering that a lot of parallel usages will be for big-data-ish stuff that requires Maps anyway, we have a good story for this. It might have been nicer to also have done almost the same thing for customized, re-invented ParallelArray, since this is the other most common set of parallel usages for which some users want/need explicit control. A year ago, I speculated that I might do this as a non-JDK extension, and still might, although plans for this are now also intertwined with prospects of value/struct/tuple support. 
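On the stream(boolean parallel) point above: the following is a minimal sketch, assuming the StreamSupport.stream(Spliterator, boolean) factory as it appears in the released Java 8 API (the builds under discussion in this thread may differ). The helper class and the "app.parallelStreams" system property are hypothetical names chosen for illustration; the property stands in for the deployer-level opt-in David describes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not a JDK class: lets the caller pick sequential or
// parallel evaluation with a flag instead of branching around .parallel().
final class SelectableStreams {

    /** Returns a stream over the collection; parallel only if the flag is true. */
    static <T> Stream<T> stream(Collection<T> source, boolean parallel) {
        return StreamSupport.stream(source.spliterator(), parallel);
    }

    public static void main(String[] args) {
        // Assumed opt-in switch; false unless started with -Dapp.parallelStreams=true.
        boolean parallel = Boolean.getBoolean("app.parallelStreams");
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        int sum = stream(data, parallel).mapToInt(Integer::intValue).sum();
        System.out.println("parallel=" + parallel + " sum=" + sum);
    }
}
```

The programmer marks where parallelism might help, and whoever deploys the application decides whether it is actually switched on.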
Anyway, as you might recall or surmise, from day one, my main fear about all this was: Oh great, now billions of people are all going to start blaming ME for all their performance problems :-)

-Doug

From dl at cs.oswego.edu Fri Jul 12 06:47:37 2013
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 12 Jul 2013 09:47:37 -0400
Subject: Concerns about parallel streams
In-Reply-To:
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF1724.7020005@oracle.com>
Message-ID: <51E008F9.7070206@cs.oswego.edu>

On 07/11/13 16:52, Joe Bowbeer wrote:
> Aleksey: Can you add memory parameters to your model? Including both the memory
> overhead of parallel streams and the memory working set of each parallel task.

Memory effects are always complicated. Here are a few issues:

1. The more sequentially-oriented your code, the more memory it will consume when run in parallel, for less benefit. And these effects are not small.

2. There are usage idioms that will save vast amounts of memory in parallel pipelines, for example merge-while-grouping. (MultiSets/MultiMaps: just say no!) But most programmers won't be familiar with the tradeoffs because they don't hurt as badly in most sequential usages.

3. Boxing has such an enormous impact on space (and time) that it justified all the work on int/long/double forms. People really need to use them in cases where they might not have noticed performance problems when they wrote non-stream (and thus non-parallel) versions.

4. Using more cores intrinsically uses more memory. These days, vendors tend not to ship systems with memory proportional to cores, but most are not so disproportionate as to be a huge concern for most users.

5. We still face many GC and memory-system implementation issues that limit scalability.

> By the way, another concern not mentioned is UI responsiveness. Adding
> .parallel() competes with the UI thread's ability to respond to user input.
I'd like to just say, Not our problem: The basic execution support can cope fine if the OS decides not to give us all the CPU time and instead prioritizes UI tasks. But OSes themselves are still evolving ways to do this. So these days, if you have <= 4 cores, you might feel some impact.

-Doug

From brian.goetz at oracle.com Fri Jul 12 08:40:40 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 12 Jul 2013 11:40:40 -0400
Subject: Concerns about parallel streams
In-Reply-To: <51DFFF5A.9020003@cs.oswego.edu>
References: <07794898-173A-44E9-ABD1-C5684A600E7C@shv.com> <51DF7992.3060800@oracle.com> <3332E695-34FF-4B38-A25E-994B9DFF9962@shv.com> <51DF8632.5060901@oracle.com> <8DCED358-9536-4015-B95C-9C3AA0223635@shv.com> <51DFA41C.5010007@oracle.com> <51DFA92B.2090406@oracle.com> <51DFFF5A.9020003@cs.oswego.edu>
Message-ID: <51E02378.8020208@oracle.com>

> Anyway, as you might recall or surmise, from day one,
> my main fear about all this was: Oh great, now billions
> of people are all going to start blaming ME for all their
> performance problems :-)

I think "Doug gets blamed" is just an axiom of the Java universe. No sense arguing with gravity...

But, we have seen this movie before. Remember java.util.concurrent, and how many people took away the wrong message about when to use the big new hammers? (One edition of the Deitel book changed all their examples from synchronized to explicit locks, causing a whole generation of students to learn the wrong thing, and his excuse was "but you told us how much better they were...")

Sam's point is that, while *we* know that parallelism is a sharp tool and should be used with care, by making it so easy, we have dived head-first into the moral hazard pool. While the answer is not "ok, then make it harder", we should at least make some attempt to educate about why .parallel() is not magic performance dust. There's nothing wrong with a "Parallel Performance for Dummies" section in the package doc.
(Either that, or someone should write a book.)

From forax at univ-mlv.fr Mon Jul 15 02:39:38 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Jul 2013 11:39:38 +0200
Subject: Stream of a reverse list
Message-ID: <51E3C35A.1020704@univ-mlv.fr>

How to get a stream of a list in reverse order without actually reversing the list?

Do we need a List.reverseStream(), or is there another way?

Rémi

From joe.bowbeer at gmail.com Mon Jul 15 06:28:44 2013
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Mon, 15 Jul 2013 06:28:44 -0700
Subject: Stream of a reverse list
In-Reply-To: <51E3C35A.1020704@univ-mlv.fr>
References: <51E3C35A.1020704@univ-mlv.fr>
Message-ID:

It seems like overkill to add a specific method for what is likely to be a general problem, but it's not clear to me what the best general solution is...

One option is to create an IntStream of reversed indices, using iterate(), then map that to a stream of elements.

Can generate() be used in this situation to generate the reversed stream directly?

--Joe

On Jul 15, 2013 2:41 AM, "Remi Forax" wrote:
> How to get a stream of a list in reverse order without actually reversing
> the list?
>
> Do we need a List.reverseStream(), or is there another way?
>
> Rémi

From brian.goetz at oracle.com Mon Jul 15 06:36:41 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 15 Jul 2013 09:36:41 -0400
Subject: Stream of a reverse list
In-Reply-To:
References: <51E3C35A.1020704@univ-mlv.fr>
Message-ID: <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com>

We did consider such a stream op and triaged it away as being too niche. It also requires a full barrier to get the first element. And for infinite streams it obviously blows up.

Given that it always requires a full barrier, toArray seems the best way to go.
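
For illustration, the reversed-indices option Joe describes could look roughly like the following. This is a minimal sketch that was not posted in the thread: it assumes the stream API as it eventually shipped in Java 8, and the class and method names (ReverseIndexSketch, reversedStream) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class ReverseIndexSketch {
    // Hypothetical helper (not a JDK method): streams the elements of a
    // list from last to first without copying or reversing the list.
    static <T> Stream<T> reversedStream(List<T> list) {
        int size = list.size();
        // iterate() is infinite, so limit() bounds it to exactly `size`
        // indices: size-1, size-2, ..., 0.
        return IntStream.iterate(size - 1, i -> i - 1)
                .limit(size)
                .mapToObj(list::get);
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b", "c");
        // prints [c, b, a]
        System.out.println(reversedStream(letters).collect(Collectors.toList()));
    }
}
```

Two caveats on this sketch: list.get(i) is only cheap for random-access lists, and an iterate()-based source gives the framework little to split on, so it is probably a poor fit for parallel use.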
Sent from my iPad

On Jul 15, 2013, at 9:28 AM, Joe Bowbeer wrote:

> It seems like overkill to add a specific method for what is likely to be a general problem, but it's not clear to me what the best general solution is...
>
> One option is to create an IntStream of reversed indices, using iterate(), then map that to a stream of elements.
>
> Can generate() be used in this situation to generate the reversed stream directly?
>
> --Joe
>
> On Jul 15, 2013 2:41 AM, "Remi Forax" wrote:
>> How to get a stream of a list in reverse order without actually reversing the list?
>>
>> Do we need a List.reverseStream(), or is there another way?
>>
>> Rémi

From forax at univ-mlv.fr Mon Jul 15 06:53:23 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Jul 2013 15:53:23 +0200
Subject: Stream of a reverse list
In-Reply-To: <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com>
References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com>
Message-ID: <51E3FED3.5020400@univ-mlv.fr>

On 07/15/2013 03:36 PM, Brian Goetz wrote:
> We did consider such a stream op and triaged it away as being too
> niche. It also requires a full barrier to get the first element. And
> for infinite streams obviously blows up.
>
> Given that it always requires a full barrier, toArray seems the best
> way to go.
>
> Sent from my iPad

We rejected having a reverse() method on Stream, but not the idea that one can create a Stream that iterates over the list backward. But perhaps it's better to have a method on List named reverseList() that returns a reverse view of the list, and then call stream() on it.
R?mi > > On Jul 15, 2013, at 9:28 AM, Joe Bowbeer > wrote: > >> It seems like overkill to add a specific method for what is likely to >> be a general problem, but it's not clear to me what the best general >> solution is... >> >> One option is to create an IntStream of reversed indices, using >> iterate(), then map that to a stream of elements. >> >> Can generate() be used in this situation to generate the reversed >> stream directly? >> >> --Joe >> >> On Jul 15, 2013 2:41 AM, "Remi Forax" > > wrote: >> >> How to get a stream of a list in reverse order without actually >> reversing the list ? >> >> do we need a List.reverseStream() or there is another way ? >> >> R?mi >> From joe.bowbeer at gmail.com Mon Jul 15 07:15:17 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 15 Jul 2013 07:15:17 -0700 Subject: Stream of a reverse list In-Reply-To: <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> Message-ID: Is there a general method recommended for creating a special-order stream from a collection? Is generate() appropriate here, even in the sequential case? Joe On Jul 15, 2013 6:36 AM, "Brian Goetz" wrote: > We did consider such a stream op and triaged it away as bring too niche. > It also requires a full barrier to get the first element. And for infinite > streams obviously blows up. > > Given that it always requires a full barrier, toArray seems the best way > to go. > > Sent from my iPad > > On Jul 15, 2013, at 9:28 AM, Joe Bowbeer wrote: > > It seems like overkill to add a specific method for what is likely to be a > general problem, but it's not clear to me what the best general solution > is... > > One option is to create an IntStream of reversed indices, using iterate(), > then map that to a stream of elements. > > Can generate() be used in this situation to generate the reversed stream > directly? 
> > --Joe > On Jul 15, 2013 2:41 AM, "Remi Forax" wrote: > >> How to get a stream of a list in reverse order without actually reversing >> the list ? >> >> do we need a List.reverseStream() or there is another way ? >> >> R?mi >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130715/a1f6e7fd/attachment.html From forax at univ-mlv.fr Mon Jul 15 08:24:06 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 15 Jul 2013 17:24:06 +0200 Subject: BaseStream.onClose() should take an AutoCloseable as parameter Message-ID: <51E41416.20601@univ-mlv.fr> Currently onClose takes a Runnable so the way to transform a checked exception to an unchecked one has to be done by the caller of onClose() with the risk that the unchecked exception will be different from two different callers. I strongly disagree that checked exception should be hidden that way and nobody care but at least the code that transforms the checked exception to an unchecked one should be written once inside the pipeline implementation and documented once in the Stream javadoc. R?mi From david.lloyd at redhat.com Mon Jul 15 08:38:13 2013 From: david.lloyd at redhat.com (David M. Lloyd) Date: Mon, 15 Jul 2013 10:38:13 -0500 Subject: Stream of a reverse list In-Reply-To: <51E3FED3.5020400@univ-mlv.fr> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> Message-ID: <51E41765.3070604@redhat.com> On 07/15/2013 08:53 AM, Remi Forax wrote: > On 07/15/2013 03:36 PM, Brian Goetz wrote: >> We did consider such a stream op and triaged it away as bring too >> niche. It also requires a full barrier to get the first element. And >> for infinite streams obviously blows up. >> >> Given that it always requires a full barrier, toArray seems the best >> way to go. 
>> >> Sent from my iPad > > We reject having a method reverse() on Stream but not the fact that one > can create a Stream that will iterate over the list in a backward way. > But perhaps, it's better to have a method of List named reverseList() > that return a reverse view of the list > and calls stream() on it. Perhaps a viable alternative is to add a Collections.reversedView(List) method which presents a reverse-order view of a List, rather than growing the List API for this specific purpose. This would be more generally useful, if I understand the default implementation of .stream() and .spliterator() correctly. > > R?mi > >> >> On Jul 15, 2013, at 9:28 AM, Joe Bowbeer > > wrote: >> >>> It seems like overkill to add a specific method for what is likely to >>> be a general problem, but it's not clear to me what the best general >>> solution is... >>> >>> One option is to create an IntStream of reversed indices, using >>> iterate(), then map that to a stream of elements. >>> >>> Can generate() be used in this situation to generate the reversed >>> stream directly? >>> >>> --Joe >>> >>> On Jul 15, 2013 2:41 AM, "Remi Forax" >> > wrote: >>> >>> How to get a stream of a list in reverse order without actually >>> reversing the list ? >>> >>> do we need a List.reverseStream() or there is another way ? 
>>> >>> R?mi >>> > -- - DML From forax at univ-mlv.fr Mon Jul 15 09:33:21 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 15 Jul 2013 18:33:21 +0200 Subject: Bitten by the lambda parameter name In-Reply-To: References: <51E40C99.30406@univ-mlv.fr> <51E41194.5030609@oracle.com> <51E41514.9060500@univ-mlv.fr> <51E41605.4070908@oracle.com> <51E4167E.6080701@oracle.com> Message-ID: <51E42451.9010703@univ-mlv.fr> On 07/15/2013 05:59 PM, Zhong Yu wrote: > On Mon, Jul 15, 2013 at 10:34 AM, Maurizio Cimadamore > wrote: >> On 15/07/13 16:32, Maurizio Cimadamore wrote: >>> On 15/07/13 16:28, Remi Forax wrote: >>>> On 07/15/2013 05:13 PM, Maurizio Cimadamore wrote: >>>>> On 15/07/13 15:52, Remi Forax wrote: >>>>>> This snippet not compile, >>>>>> Kind kind = ... >>>>>> partySetMap.computeIfAbsent(kind, kind -> new >>>>>> HashSet<>()).add(party); >>>>>> >>>>>> Each time I write more than a hundred lines of codes that use some >>>>>> lambdas, >>>>>> I fall into this trap. >>>>>> >>>>>> It's very annoying ! >>>>>> >>>>>> R?mi >>>>>> >>>>>> >>>>> Annoying yes - but there is a reason for it? If we provide special >>>>> scoping for lambda parameters then we will never be able to add >>>>> control abstraction syntax in a nice way; not saying that it's >>>>> something we want - but it's good to have option open at least. >>>> It's a crystal ball argument, in the future if we do that then ... >>>> It usually doesn't work because between now and the future, the way >>>> the feature will be introduced will change. >>>> >>> Well, yes and no - I remember we discussed a lot whether a lambda >>> should look (semantically) more like a block or an inner class. We >>> decided it should look like the former. This is a consequence of that > It is also very annoying that this doesn't compile > > int x = 1; > > { > int x = 2; > } > > It is too hard to give a distinct name to every variable. Remi is > right this is a big PITA. No, I'm fine with this. 
You don't need two different things to be named with the same name, so the rule on block helps to catch bugs. With a lambda, it's the opposite, you need to find two names for the same thing, and this stupid rule bugs me. R?mi > >>> decision. I think that mixing and matching semantics on a by-need >>> basis is not a good idea. >> And - one might argue the code you are trying to write is not that >> readable in the first place (adding random suffixes just to get it >> through javac is not very elegant readability-wise, but it does convery >> the concept that the two references of 'kind' which occur very close one >> to the other are indeed unrelated). > > > >> Maurizio >>> Maurizio >>>> In this peculiar case, if we add control abstraction syntax we will >>>> use a different syntax, >>>> so it's very annoying for no reason. >>>> >>>>> Maurizio >>>> R?mi >>>> >> From daniel.smith at oracle.com Mon Jul 15 11:19:16 2013 From: daniel.smith at oracle.com (Dan Smith) Date: Mon, 15 Jul 2013 12:19:16 -0600 Subject: Stream of a reverse list In-Reply-To: <51E3FED3.5020400@univ-mlv.fr> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> Message-ID: On Jul 15, 2013, at 7:53 AM, Remi Forax wrote: > On 07/15/2013 03:36 PM, Brian Goetz wrote: >> We did consider such a stream op and triaged it away as bring too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up. >> >> Given that it always requires a full barrier, toArray seems the best way to go. >> >> Sent from my iPad > > We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way. > But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list > and calls stream() on it. Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion. 
Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List.

Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. Lists are (see ListIterator). As are Deques (see Deque.descendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method -- I guess the intended idiom is to call 'list.listIterator(list.size())' and then iterate with 'ListIterator.previous'.)

-Dan

From forax at univ-mlv.fr Mon Jul 15 11:30:17 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 15 Jul 2013 20:30:17 +0200
Subject: Stream of a reverse list
In-Reply-To:
References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr>
Message-ID: <51E43FB9.50502@univ-mlv.fr>

On 07/15/2013 08:19 PM, Dan Smith wrote:
> On Jul 15, 2013, at 7:53 AM, Remi Forax wrote:
>
>> On 07/15/2013 03:36 PM, Brian Goetz wrote:
>>> We did consider such a stream op and triaged it away as being too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up.
>>>
>>> Given that it always requires a full barrier, toArray seems the best way to go.
>>>
>>> Sent from my iPad
>> We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way.
>> But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list
>> and calls stream() on it.
> Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion.

reverseStream would also call for a reverseParallelStream(), so I think I prefer a reverseList (and a reverseIterator, like in Deque) declared on List.
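
Two of the ideas in this exchange can be sketched in a few lines. The first method below is the backward-iteration idiom using List.listIterator(int), which is the existing JDK method that accepts a starting index. The second is a reverse view of the kind being proposed. This is only a sketch under assumptions: reversedView, copyBackToFront and the class name are hypothetical, and nothing like reverseList() existed in the JDK at the time of this thread.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.stream.Collectors;

final class ReverseListSketch {
    // Back-to-front traversal with the existing API: start a ListIterator
    // just past the last element and walk it with previous().
    static <T> List<T> copyBackToFront(List<T> list) {
        List<T> out = new ArrayList<>(list.size());
        ListIterator<T> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    // Hypothetical reverse *view* (not a JDK method): no copy is made, and
    // index i of the view maps to index size-1-i of the backing list.
    static <T> List<T> reversedView(List<T> list) {
        return new AbstractList<T>() {
            @Override
            public T get(int index) {
                return list.get(list.size() - 1 - index);
            }

            @Override
            public int size() {
                return list.size();
            }
        };
    }

    public static void main(String[] args) {
        List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c"));
        // prints [c, b, a]
        System.out.println(copyBackToFront(backing));
        // stream() is inherited from the Collection default method; prints c,b,a
        System.out.println(reversedView(backing).stream().collect(Collectors.joining(",")));
    }
}
```

Because the view delegates every call to the backing list, later changes to that list show through. The index arithmetic is only cheap for random-access lists; for a linked list the ListIterator idiom is the better fit, which matches Remi's remark about array-backed lists.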
> Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List. > > Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. I agree. > Lists are (see ListIterator). As are Deques (see Deque.desendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method Deque was not introduce in 1.2 but later, in 1.6 I think. > -- I guess the intended idiom is to call 'list.iterator(list.size())' and then iterate with 'ListIterator.previous'.) Yes, if the list is not backed by an array. > > ?Dan R?mi From david.lloyd at redhat.com Mon Jul 15 12:01:59 2013 From: david.lloyd at redhat.com (David M. Lloyd) Date: Mon, 15 Jul 2013 14:01:59 -0500 Subject: Stream of a reverse list In-Reply-To: References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> Message-ID: <51E44727.5000404@redhat.com> On 07/15/2013 01:19 PM, Dan Smith wrote: > On Jul 15, 2013, at 7:53 AM, Remi Forax wrote: > >> On 07/15/2013 03:36 PM, Brian Goetz wrote: >>> We did consider such a stream op and triaged it away as bring too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up. >>> >>> Given that it always requires a full barrier, toArray seems the best way to go. >>> >>> Sent from my iPad >> >> We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way. >> But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list >> and calls stream() on it. > > Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion. 
Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List. > > Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. Lists are (see ListIterator). As are Deques (see Deque.desendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method -- I guess the intended idiom is to call 'list.iterator(list.size())' and then iterate with 'ListIterator.previous'.) If that is directed at me... I did not suggest Collection, I suggested Collections, in lieu of adding a default method to List (though that's an option too). -- - DML From daniel.smith at oracle.com Tue Jul 16 12:59:14 2013 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 16 Jul 2013 13:59:14 -0600 Subject: Stream of a reverse list In-Reply-To: <51E44727.5000404@redhat.com> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> <51E44727.5000404@redhat.com> Message-ID: <2DD9C208-0CD9-4DC2-B9C8-5449F7DAF249@oracle.com> On Jul 15, 2013, at 1:01 PM, David M. Lloyd wrote: > On 07/15/2013 01:19 PM, Dan Smith wrote: >> On Jul 15, 2013, at 7:53 AM, Remi Forax wrote: >> >>> On 07/15/2013 03:36 PM, Brian Goetz wrote: >>>> We did consider such a stream op and triaged it away as bring too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up. >>>> >>>> Given that it always requires a full barrier, toArray seems the best way to go. >>>> >>>> Sent from my iPad >>> >>> We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way. >>> But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list >>> and calls stream() on it. 
>> >> Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion. Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List. >> >> Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. Lists are (see ListIterator). As are Deques (see Deque.desendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method -- I guess the intended idiom is to call 'list.iterator(list.size())' and then iterate with 'ListIterator.previous'.) > > If that is directed at me... I did not suggest Collection, I suggested Collections, in lieu of adding a default method to List (though that's an option too). Sort of a conglomeration of you mentioning Collections and Brian talking about a Stream method. But, yeah, I did realize when reading carefully that you were talking about a static method operating on Lists. The idea is it's fairly painless to "streamify" existing concepts, like reverse List and Deque iteration; much more expensive (and perhaps ill-advised) to create new concepts, like every Stream having a reverse order. ?Dan From brian.goetz at oracle.com Tue Jul 16 13:31:24 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 16 Jul 2013 16:31:24 -0400 Subject: Stream of a reverse list In-Reply-To: <2DD9C208-0CD9-4DC2-B9C8-5449F7DAF249@oracle.com> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> <51E44727.5000404@redhat.com> <2DD9C208-0CD9-4DC2-B9C8-5449F7DAF249@oracle.com> Message-ID: <994AC909-52CF-4112-98D3-CEDA402297AB@oracle.com> All lists have a reverse order because they are finite. Not all streams have a reverse order. 
More generally, we probably want more "view" combinators on collections, for things like concat, reverse, or merge without paying the price of a full copy. That's a story for another day. Sent from my iPad On Jul 16, 2013, at 3:59 PM, Dan Smith wrote: > On Jul 15, 2013, at 1:01 PM, David M. Lloyd wrote: > >> On 07/15/2013 01:19 PM, Dan Smith wrote: >>> On Jul 15, 2013, at 7:53 AM, Remi Forax wrote: >>> >>>> On 07/15/2013 03:36 PM, Brian Goetz wrote: >>>>> We did consider such a stream op and triaged it away as bring too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up. >>>>> >>>>> Given that it always requires a full barrier, toArray seems the best way to go. >>>>> >>>>> Sent from my iPad >>>> >>>> We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way. >>>> But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list >>>> and calls stream() on it. >>> >>> Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion. Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List. >>> >>> Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. Lists are (see ListIterator). As are Deques (see Deque.desendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method -- I guess the intended idiom is to call 'list.iterator(list.size())' and then iterate with 'ListIterator.previous'.) >> >> If that is directed at me... I did not suggest Collection, I suggested Collections, in lieu of adding a default method to List (though that's an option too). 
> > Sort of a conglomeration of you mentioning Collections and Brian talking about a Stream method. But, yeah, I did realize when reading carefully that you were talking about a static method operating on Lists. > > The idea is it's fairly painless to "streamify" existing concepts, like reverse List and Deque iteration; much more expensive (and perhaps ill-advised) to create new concepts, like every Stream having a reverse order. > > ?Dan From mike.duigou at oracle.com Thu Jul 18 13:29:28 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Thu, 18 Jul 2013 13:29:28 -0700 Subject: Bikeshed: Spliterator "fail-fast" In-Reply-To: <51DC969E.8010605@univ-mlv.fr> References: <05BFC7BF-93E8-4257-978D-0EC033D1DBE6@oracle.com> <51DA1FA3.3000709@oracle.com> <51DC969E.8010605@univ-mlv.fr> Message-ID: (Sorry for the delayed response. This is the final thread I processed in digging out of my post-Holiday email deluge) On Jul 9 2013, at 16:02 , Remi Forax wrote: > On 07/08/2013 04:10 AM, David Holmes wrote: >> Hi Paul, >> >> On 1/07/2013 11:46 PM, Paul Sandoz wrote: >>> Hi, >>> >>> The Spliterator doc states: >>> >>> *

A Spliterator that does not report {@code IMMUTABLE} or
>>> * {@code CONCURRENT} is expected to have a documented policy concerning:
>>> * when the spliterator binds to the element source; and detection of
>>> * structural interference of the element source detected after binding.
>>> ...
>>> * After binding a Spliterator should, on a best-effort basis, throw
>>> * {@link ConcurrentModificationException} if structural interference is
>>> * detected. Spliterators that do this are called fail-fast.
>>>
>>> As Mike pointed out to me "fail-fast" is not accurate since the implementations for bulk traversal, specifically forEachRemaining, can throw a CME after traversal has completed.
>>>
>>> - fail-finally
>>> - fail-ultimately
>>> - fail-eventually
>>>
>>> ?
>>
>> Nothing. It is either fail-fast or else you don't say anything. Still throw the CME? After the fact would seem to be better than not at all. If a CME might be thrown we should mention it.
>>
>> Any definition of fail-fast for Spliterator should be consistent with that of Iterator.
>>
>> David
>
> David, I agree with you, two semantics is too complex here.
>
> Anyway, playing the devil's advocate, most implementations of Iterator are not as fail-fast
> as they could be, they don't check the collection modification in hasNext() but only in next().

I believe that only checking in next() is a reasonable compromise for best-effort without unduly impairing performance.

One thing to note is that some CMEs are generated in defensive conditions where the data structure is attempting to avoid self-mutilation of ArrayIndexOutOfBoundsException, NPE, etc. separate from modCount checks. I would expect that Spliterators et al might have similar defensive checks even if they don't do per-element modCount tracking.

Mike

From david.lloyd at redhat.com Mon Jul 22 15:02:31 2013
From: david.lloyd at redhat.com (David M.
Lloyd) Date: Mon, 22 Jul 2013 17:02:31 -0500 Subject: Stream of a reverse list In-Reply-To: <994AC909-52CF-4112-98D3-CEDA402297AB@oracle.com> References: <51E3C35A.1020704@univ-mlv.fr> <2DB5F425-7279-405D-AB4D-DD9E8D3786A3@oracle.com> <51E3FED3.5020400@univ-mlv.fr> <51E44727.5000404@redhat.com> <2DD9C208-0CD9-4DC2-B9C8-5449F7DAF249@oracle.com> <994AC909-52CF-4112-98D3-CEDA402297AB@oracle.com> Message-ID: <51EDABF7.7060008@redhat.com> Sure. My point though (in case it wasn't clear) was that it's better to have a general reverse list view method than it is to have a reverseStream method on List (and possibly a host of other specific-purpose reversal methods in the vein of Deque's descending iterator). If that means the whole thing is put off because it sort of implies a more general effort, well, that's a judgment call I don't necessarily agree with (i.e. you draw that line here, I draw it there) but I respect it. On 07/16/2013 03:31 PM, Brian Goetz wrote: > All lists have a reverse order because they are finite. Not all streams have a reverse order. > > More generally, we probably want more "view" combinators on collections, for things like concat, reverse, or merge without paying the price of a full copy. That's a story for another day. > > > Sent from my iPad > > On Jul 16, 2013, at 3:59 PM, Dan Smith wrote: > >> On Jul 15, 2013, at 1:01 PM, David M. Lloyd wrote: >> >>> On 07/15/2013 01:19 PM, Dan Smith wrote: >>>> On Jul 15, 2013, at 7:53 AM, Remi Forax wrote: >>>> >>>>> On 07/15/2013 03:36 PM, Brian Goetz wrote: >>>>>> We did consider such a stream op and triaged it away as bring too niche. It also requires a full barrier to get the first element. And for infinite streams obviously blows up. >>>>>> >>>>>> Given that it always requires a full barrier, toArray seems the best way to go. 
>>>>>> >>>>>> Sent from my iPad >>>>> >>>>> We reject having a method reverse() on Stream but not the fact that one can create a Stream that will iterate over the list in a backward way. >>>>> But perhaps, it's better to have a method of List named reverseList() that return a reverse view of the list >>>>> and calls stream() on it. >>>> >>>> Has to be prioritized, of course, but I think 'List.reverseStream' is in principle a good suggestion. Looks a lot like the idea of having different methods on CharSequence to get char-based and int-based views of the same data. In this case, we're getting front-to-back and back-to-front views of the List. >>>> >>>> Collection is _not_ a good place to put a method like this, because Collections are not designed to support reverse-order traversal. Lists are (see ListIterator). As are Deques (see Deque.desendingIterator). (I was surprised, actually, to not find a similar List.reverseIterator method -- I guess the intended idiom is to call 'list.iterator(list.size())' and then iterate with 'ListIterator.previous'.) >>> >>> If that is directed at me... I did not suggest Collection, I suggested Collections, in lieu of adding a default method to List (though that's an option too). >> >> Sort of a conglomeration of you mentioning Collections and Brian talking about a Stream method. But, yeah, I did realize when reading carefully that you were talking about a static method operating on Lists. >> >> The idea is it's fairly painless to "streamify" existing concepts, like reverse List and Deque iteration; much more expensive (and perhaps ill-advised) to create new concepts, like every Stream having a reverse order. >> >> ?Dan -- - DML From brian.goetz at oracle.com Tue Jul 30 10:27:30 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 30 Jul 2013 10:27:30 -0700 Subject: EG meeting In-Reply-To: References: <51CCDA17.70605@oracle.com> Message-ID: Reminder: Meeting on Thursday 9A. 
I have the following attendees down: Oracle: Brian, Dan, Maurizio, Robert, Joe IBM: Dan and Ryan Google: Kevin and Colin Jetbrains: Andrey and Anna Joe B Sam Remi Vlad Crazy Bob I think this is close to the capacity of the room we reserved, so if I've missed anyone, let me know and I'll see what I can do. > > The plans are finalized, we will have an in-person meeting the Thursday of JVM Language Summit week, 9:00AM at "The Mansion" on Oracle's Santa Clara campus (same place as last year). From spullara at gmail.com Wed Jul 31 09:20:58 2013 From: spullara at gmail.com (Sam Pullara) Date: Wed, 31 Jul 2013 09:20:58 -0700 Subject: EG meeting In-Reply-To: References: <51CCDA17.70605@oracle.com> Message-ID: <2B48B275-0FAC-4A07-8A42-44C79C8B2200@gmail.com> Unfortunately, I have to bow out of this. I failed to reserve it on my calendar and now it is overrun. Sorry, Sam On Jul 30, 2013, at 10:27 AM, Brian Goetz wrote: > Reminder: Meeting on Thursday 9A. > > I have the following attendees down: > > Oracle: Brian, Dan, Maurizio, Robert, Joe > IBM: Dan and Ryan > Google: Kevin and Colin > Jetbrains: Andrey and Anna > Joe B > Sam > Remi > Vlad > Crazy Bob > > I think this is close to the capacity of the room we reserved, so if I've missed anyone, let me know and I'll see what I can do. > >>> The plans are finalized, we will have an in-person meeting the Thursday of JVM Language Summit week, 9:00AM at "The Mansion" on Oracle's Santa Clara campus (same place as last year). > From paul.sandoz at oracle.com Wed Jul 31 10:52:48 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 31 Jul 2013 18:52:48 +0100 Subject: Spliterator documentation Message-ID: Hi, Various places in java.util and java.util.concurrent need to clarify the stuff about spliterators. 
The lambda webrev linked to below tweaks stuff in java.util; assuming folks are OK with this, I can also propose something for java.util.concurrent to be considered for 166:

http://cr.openjdk.java.net/~psandoz/lambda/split-docs/webrev/

- The spliterators for empty Set/SortedSet/List/Map (+ NavigableMap/NavigableSet in tl) and singleton Set/List/Map don't conform to the reporting of characteristics for the corresponding collection; they all reuse the empty or singleton spliterator implementation. I think it is reasonable to relax the constraints on reporting certain characteristics here, since with zero or one element there is not much clients can do regarding processing based on characteristics.

- I still have not come up with an alternative name for "fail-fast". Contrary to some opinions on the related email thread about this, I do think we need to mention it. I updated Spliterator to mention bulk traversal and checking for co-modification after all elements have been traversed.

Paul.
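
To make the bulk-traversal point concrete, here is a small sketch (not from the thread; the class and method names are hypothetical) of a spliterator whose co-modification check only surfaces after forEachRemaining has already run the action. It assumes ArrayList's spliterator as shipped in OpenJDK 8, where detection is best-effort rather than guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicInteger;

final class LateCmeSketch {
    // Returns how many elements the action saw before the
    // ConcurrentModificationException surfaced, or -1 if none was thrown.
    static int visitedBeforeCme() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        Spliterator<Integer> sp = list.spliterator();
        AtomicInteger visited = new AtomicInteger();
        try {
            sp.forEachRemaining(e -> {
                // Structural interference from inside the traversal,
                // performed once, on the first element.
                if (visited.getAndIncrement() == 0) {
                    list.add(99);
                }
            });
        } catch (ConcurrentModificationException cme) {
            return visited.get();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("elements visited before CME: " + visitedBeforeCme());
    }
}
```

On the OpenJDK implementations I am aware of, the exception arrives only after the originally bound elements have been visited, which is the "after the fact" behaviour the naming discussion is about. Since the check is best-effort, the exact count should not be relied on.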