From joe.bowbeer at gmail.com Fri Mar 1 11:27:47 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 1 Mar 2013 11:27:47 -0800 Subject: enhanced type-inference In-Reply-To: References: <5102C593.5070805@oracle.com> <5102CAC6.4050108@oracle.com> Message-ID: Update: The NetBeans editor has caught-up with the graph-inference compiler, and my simple samples now display error-free, in NetBeans jdk8lambda #1665 (with jdk8 binary dist. b79). bitbucket.org/joebowbeer/anagrams bitbucket.org/joebowbeer/stringcompare bitbucket.org/joebowbeer/wordchainkata My prior nits have all been resolved as well. --Joe On Wed, Jan 30, 2013 at 3:23 PM, Joe Bowbeer wrote: > Thanks! > > I picked up these changes when I installed binary snapshot b75. > > I updated my samples(*) accordingly and now the following static imports > are working without addition type annotation: > > Collectors.toList > ConcurrentCollectors.groupBy > Character.compare > > Nits: > > 1. orElse(null) is still a nuisance. Will this be resolved? > > 2. ConcurrentCollectors.groupBy vs. Collectors.groupingBy (should be > groupBy?) > > > http://download.java.net/lambda/b75/docs/api/java/util/stream/ConcurrentCollectors.html#groupBy(java.util.function.Function) > > http://download.java.net/lambda/b75/docs/api/java/util/stream/Collectors.html#groupingBy(java.util.function.Function) > > 3. NetBeans jdk8lambda build #1612 is not as smart as the new compiler and > flags the graph-inference lines as errors. > > http://bertram2.netbeans.org:8080/job/jdk8lambda/ > > > (*) Sample projects: > > bitbucket.org/joebowbeer/anagrams > bitbucket.org/joebowbeer/stringcompare > bitbucket.org/joebowbeer/wordchainkata > > > --Joe > > > > On Fri, Jan 25, 2013 at 10:11 AM, Brian Goetz wrote: > >> More info on new type inference. >> >> >> -------- Original Message -------- >> Subject: enhanced type-inference >> Date: Fri, 25 Jan 2013 17:49:07 +0000 >> From: Maurizio Cimadamore >> > >> Organization: Oracle >> To: lambda-dev >> >> Dear lambdackers, >> I've just pushed a patch that enables a more general inference support >> for nested generic method calls/stuck expressions. This scheme has been >> available for a while in lambda-repo (when using the hidden flag >> -XDuseGraphInference), but we have now decided it's time to flip the >> switch and make it the default when using JDK 8. In the past few weeks >> I've been hunting down as many bugs in the new inference scheme as >> possible, in order to provide a smooth transition from the old world to >> the new one. I hope the transition is indeed smooth - but, given the >> nature of the change, I also expect bugs to pop up here and there, so >> please, keep throwing the kitchen sink at javac and report your >> experience back to us; without your valuable feedback and dedication we >> would never have gotten thus far. >> >> Example of things that now work: >> >> Stream si = ... >> List l1 = si.into(new ArrayList<>()); //not really - too >> late for that ;-) >> List l2 = si.collect(toList()); >> List l3 = si.collect(toCollection(**ArrayList::new)); >> >> Thanks >> Maurizio >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130301/b9969e06/attachment.html From brian.goetz at oracle.com Sun Mar 3 15:16:04 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 03 Mar 2013 18:16:04 -0500 Subject: enhanced type-inference In-Reply-To: <5109CA11.4090905@oracle.com> References: <5102C593.5070805@oracle.com> <5102CAC6.4050108@oracle.com> <5109CA11.4090905@oracle.com> Message-ID: <5133D9B4.8040308@oracle.com> >> 1. orElse(null) is still a nuisance. Will this be resolved? > > Thanks for the reminder. This is resolved now, by renaming orElse(Supplier) to orElseGet. From brian.goetz at oracle.com Mon Mar 4 07:37:55 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 04 Mar 2013 10:37:55 -0500 Subject: Collectors inventory In-Reply-To: <51288E95.4010903@univ-mlv.fr> References: <5126A74A.3040509@oracle.com> <51288E95.4010903@univ-mlv.fr> Message-ID: <5134BFD3.7090602@oracle.com> >> As I promised a long time ago, here's an overview of what's in >> Collectors currently. > > I think there are too many methods in Collectors, we should restrain > ourselves to 2 forms (3 max). Let me make sure I understand the rationale for such a rule. Having more forms has a clear advantage: the client code is simpler (e.g., free of extra noise like HashMap::new when the user doesn't care what Map he gets.) And the implementations are trivial, so the implementation complexity is not an issue. Is the sole issue here the "OMG so many Collectors" reaction when the user goes to the Javadoc page for Collectors? >> There are 12 basic forms: >> - toCollection(ctor) >> - toList() >> - toSet() >> - toStringBuilder() >> - toStringJoiner(delimiter) >> - to{Long,Double}Statistics >> >> - groupingBy(classifier, mapFactory, downstream collector) >> - groupingReduce(classifier, mapFactory, mapper, reducer) >> - mapping(mappingFn, downstream collector) >> - joiningWith(mappingFunction, mergeFunction, mapFactory) >> - partitioningBy(predicate, downstream collector) >> - partitioningReduce(predicate, mapper, reducer) To be clear, has anyone objected to any of these basic forms, or are we only talking about the variants? >> GroupingBy has four forms: >> - groupingBy(T->K) -- standard groupBy, values of resulting Map are >> Collection >> - Same, but with explicit constructors for map and for rows (so you >> can produce, say, a TreeMap> and not just a >> Map>) >> - groupingBy(T->K, Collector) -- multi-level groupBy, where >> downstream is another Collector >> - Same, but with explicit ctor for map > > You can remove the third one give, you have the one with an explicit > constructor. I think its a false economy to suggest removing this one. Think about the user code: collect(groupBy(Foo::first, groupBy(Foo::second))) is really clear. The extra map ctor: collect(groupBy(Foo::first, groupBy(Foo::second), HashMap::new)) really feels like noise when reading the code -- all for the sake of removing a trivial overload? Also, for some collectors, we may want a specialized Map implementation, one that is, say, optimized for merging. (Partition, at this point, is basically groupBy with an optimized Map implementation.) In which case the explicit HashMap::new is a performance impediment. So, while I accept that removing the non-explicit-ctor versions could reduce the number of forms, I think its a false economy -- because the resulting user code is worse. 
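To make the comparison concrete, here is a small, self-contained sketch of the two call styles side by side. (Treat this as illustrative only: names and exact signatures are still settling, I'm using the groupingBy spelling from the inventory rather than groupBy, and the Foo class and its accessors are made up for the example.)

    import java.util.*;
    import static java.util.stream.Collectors.*;

    public class GroupingForms {
        static final class Foo {
            final String first, second;
            Foo(String first, String second) { this.first = first; this.second = second; }
            String first()  { return first; }
            String second() { return second; }
        }

        public static void main(String[] args) {
            List<Foo> foos = Arrays.asList(new Foo("a", "x"), new Foo("a", "y"), new Foo("b", "x"));

            // convenience form: the caller doesn't care which Map implementation comes back
            Map<String, Map<String, List<Foo>>> byFirstThenSecond =
                foos.stream().collect(groupingBy(Foo::first, groupingBy(Foo::second)));

            // explicit-factory form: the same query, but the outer map is a TreeMap
            TreeMap<String, Map<String, List<Foo>>> sortedByFirst =
                foos.stream().collect(groupingBy(Foo::first, TreeMap::new, groupingBy(Foo::second)));

            System.out.println(byFirstThenSecond);
            System.out.println(sortedByFirst);
        }
    }

The first call site is the common case; the second only shows up when the user actually cares about the map implementation, which is exactly why keeping the convenience overload seems worth it.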
From brian.goetz at oracle.com Mon Mar 4 07:55:41 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 04 Mar 2013 10:55:41 -0500 Subject: Code review request In-Reply-To: <512AD584.2080902@oracle.com> References: <512672E6.1050708@oracle.com> <51289F3D.1010609@univ-mlv.fr> <512AD584.2080902@oracle.com> Message-ID: <5134C3FD.4060106@oracle.com> >> All protected fields should not be protected but package visible. >> Classes are package private so there is no need to use a modifier which >> offer a wider visibility. >> The same is true for constructors. > > I believe some of these may end up being public (TBD), in which case > better to define member accessibility as if they were already public as > it greatly simplifies the changes needed later. That's exactly right. These classes were intended to be public and hopefully will be some day. Capturing what should be protected now is a lot easier than baking it back in later. From forax at univ-mlv.fr Mon Mar 4 10:27:12 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 04 Mar 2013 19:27:12 +0100 Subject: Collectors inventory In-Reply-To: <5134BFD3.7090602@oracle.com> References: <5126A74A.3040509@oracle.com> <51288E95.4010903@univ-mlv.fr> <5134BFD3.7090602@oracle.com> Message-ID: <5134E780.3060507@univ-mlv.fr> On 03/04/2013 04:37 PM, Brian Goetz wrote: >>> As I promised a long time ago, here's an overview of what's in >>> Collectors currently. >> >> I think there are too many methods in Collectors, we should restrain >> ourselves to 2 forms (3 max). > > Let me make sure I understand the rationale for such a rule. > > Having more forms has a clear advantage: the client code is simpler > (e.g., free of extra noise like HashMap::new when the user doesn't > care what Map he gets.) Having to open and read the javadoc each time you want to use a Collector or worst each time you read a code that uses a Collector is a big disadvantage IMO. The whole Collector API has to fit into a humain brain. > And the implementations are trivial, so the implementation > complexity is not an issue. No, the issue is more to understand the difference between all the overloads. > Is the sole issue here the "OMG so many Collectors" reaction when > the user goes to the Javadoc page for Collectors? It's more OMG, I have to read a code that use a Collector ... > >>> There are 12 basic forms: >>> - toCollection(ctor) >>> - toList() >>> - toSet() >>> - toStringBuilder() >>> - toStringJoiner(delimiter) >>> - to{Long,Double}Statistics >>> >>> - groupingBy(classifier, mapFactory, downstream collector) >>> - groupingReduce(classifier, mapFactory, mapper, reducer) >>> - mapping(mappingFn, downstream collector) >>> - joiningWith(mappingFunction, mergeFunction, mapFactory) >>> - partitioningBy(predicate, downstream collector) >>> - partitioningReduce(predicate, mapper, reducer) > > To be clear, has anyone objected to any of these basic forms, or are > we only talking about the variants? I am talking about variants. > >>> GroupingBy has four forms: >>> - groupingBy(T->K) -- standard groupBy, values of resulting Map are >>> Collection >>> - Same, but with explicit constructors for map and for rows (so you >>> can produce, say, a TreeMap> and not just a >>> Map>) >>> - groupingBy(T->K, Collector) -- multi-level groupBy, where >>> downstream is another Collector >>> - Same, but with explicit ctor for map >> >> You can remove the third one give, you have the one with an explicit >> constructor. > > I think its a false economy to suggest removing this one. 
Think about > the user code: > > collect(groupBy(Foo::first, groupBy(Foo::second))) > > is really clear. The extra map ctor: > > collect(groupBy(Foo::first, groupBy(Foo::second), HashMap::new)) > > really feels like noise when reading the code -- all for the sake of > removing a trivial overload? Also, for some collectors, we may want a > specialized Map implementation, one that is, say, optimized for > merging. (Partition, at this point, is basically groupBy with an > optimized Map implementation.) In which case the explicit > HashMap::new is a performance impediment. If you have such Map, you should made it public, people will re-use it. Now for groupBy of groupBy, it's a corner case, for a corner case, it's usually better to be a little more verbose if you end with only one form. Again, it's easier to read and easier to write. > > So, while I accept that removing the non-explicit-ctor versions could > reduce the number of forms, I think its a false economy -- because the > resulting user code is worse. > user code is better because there is less overload (or better one) that can match. maybe later, for jdk9 or jdk10, you can add more collectors if people ask, but I think here it's important to be as simple as possible. R?mi From mike.duigou at oracle.com Mon Mar 4 12:29:43 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Mon, 4 Mar 2013 12:29:43 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong Message-ID: Optional, OptionalDouble, OptionalInt and OptionalLong are now posted for review on core-libs and lambda-dev. Any comments can be sent to core-libs-dev or this list. http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ Mike From tim at peierls.net Mon Mar 4 12:47:12 2013 From: tim at peierls.net (Tim Peierls) Date: Mon, 4 Mar 2013 15:47:12 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: I like this. Typo in all four classes: "who's result" -> "whose result" (or find better wording) --tim On Mon, Mar 4, 2013 at 3:29 PM, Mike Duigou wrote: > Optional, OptionalDouble, OptionalInt and OptionalLong are now posted > for review on core-libs and lambda-dev. > > Any comments can be sent to core-libs-dev or this list. > > http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ > > Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130304/a7155ff7/attachment.html From joe.bowbeer at gmail.com Mon Mar 4 14:44:59 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 4 Mar 2013 14:44:59 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: Last I read, the Optional hashCode and equals methods would support only the identity hashCode()/equals(), but they appear to be delegating to the value's methods, if present. Why the change? Just wondering. On Mon, Mar 4, 2013 at 12:29 PM, Mike Duigou wrote: > Optional, OptionalDouble, OptionalInt and OptionalLong are now posted > for review on core-libs and lambda-dev. > > Any comments can be sent to core-libs-dev or this list. > > http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ > > Mike -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130304/07004329/attachment.html From josh at bloch.us Mon Mar 4 14:57:24 2013 From: josh at bloch.us (Joshua Bloch) Date: Mon, 4 Mar 2013 14:57:24 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: A few minor comments based on a quick perusal of Optional.java: 39 * of code if the value is present.) Period should be outside paren. 47 private final static Optional EMPTY = new Optional<>(); Should be "private static final" according to JLS 8.3.1 Global: The "summary description" (first sentence) of the doc comment for most methods and constructors don't use the conventiontal (third-person singular) verb tense. In other words, the standard would be: 74 * Returns an empty {@code Optional}. but this code says: 74 * Return an empty {@code Optional}. Also the use of a class name as a noun is suspect. Either: 74 * Return an empty {@code Optional} instance. or: 74 * Return an empty optional. are generally preferable. Line 76 (only tangentially related to this change): 76 * @apiNote Historically, we used "Note that" to indicate that what followed was a consequence of some previous normative text, and didn't mandate any additional restrictions. 79 * Instead, use {@code isPresent()}. Shouldn't this be @link instead of @code? 84 @SuppressWarnings("unchecked") It's really confusing that the @suppressWarnings tag is on the method declaration when the the unchecked cast is in the body of the method. I'd create a local variable initialized to the cast expression, and place the warning on the local variable declaration, like so: @SuppressWarnings("unchecked") Optional result = (Optional) EMPTY; return result; 125 * Execute the specified consumer with the value if a value is present, The phrase "execute the specified consumer" is ungainly. It suggests martial law at Walmart. How about "have the specified consumer accept the value if it is present" or some such? 156 public T orElseGet(Supplier other) { Should Supplier be Supplier? That's certainly what a cursory analysis suggests, but I know how easy it is to get these things wrong:( 181 public boolean equals(Object o) { The equals behavior *must *be documented: that two Optional instances are equal if an only if (1) their present values are equal or (2) neither instance contain a value. If you don't document this, users can't depend on it and independent reimplementers aren't required to duplicate it. Oh, and speaking of reimplementation, Oracle and its amici (Microsoft , Scott McNealy , BSA , Picture Archive Council , Ralph Oman , Gene Spafford ) are still claiming that independently reimplementing an API that is described in a copyrighted document violates that copyright. Are you guys really comfortable with this? Finally: 172 public T orElseThrow(Supplier exceptionSupplier) throws V { I believe that X is a much better name than V for a type variable that represents an exception (Throwable). The name V is generally reserved for value types in key-value pairs. Josh On Mon, Mar 4, 2013 at 12:47 PM, Tim Peierls wrote: > I like this. > > Typo in all four classes: "who's result" -> "whose result" (or find better > wording) > > --tim > > > On Mon, Mar 4, 2013 at 3:29 PM, Mike Duigou wrote: > >> Optional, OptionalDouble, OptionalInt and OptionalLong are now posted >> for review on core-libs and lambda-dev. >> >> Any comments can be sent to core-libs-dev or this list. 
>> >> http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ >> >> Mike > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130304/38b1dd67/attachment.html From forax at univ-mlv.fr Wed Mar 6 01:47:56 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 06 Mar 2013 10:47:56 +0100 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: <513710CC.3010903@univ-mlv.fr> Ok, let be nuclear on this, There is no good reason to introduce Optional in java.util. It doen't work like Google's Guava Optional despite having the same name, it doesn't work like Scala's Option despite having a similar name, moreover the lambda pipeline face a similar issue with the design of collectors (see stream.collect()) but solve that similar problem with a different design, so the design of Optional is not even consistent with the rest of the stream API. So why do we want something like Optional, we want it to be able to represent the fact that as Mike states a returning result can have no value by example Colections.emptyList().stream().findFirst() should 'return' no value. As Stephen Colebourne said, Optional is a bad name because Scala uses Option [1] which can used in the same context, as result of a filter/map etc. but Option in Scala is a way to mask null. Given the name proximity, people will start to use Optional like Option in Scala and we will see methods returning things like Optional>>. Google's Guava, which is a popular library, defines a class named Optional, but allow to store null unlike the current proposed implementation, this will generate a lot of confusions and frustrations. In fact, we don't need Optional at all, because we don't need to return a value that can represent a value or no value, the idea is that methods like findFirst should take a lambda as parameter letting the user to decide what value should be returned by findFirst if there is a value and if there is no value. So instead of stream.findFirst().orElse(null) you will write stream.findFirst(orNull) with orNull() defined as like that public static Optionalizer orNull() { return (isPresent, element) -> isPresent? element: null; } The whole design is explained here [2] and is similar to the way Collectors are defined [3], it's basically the lambda way of thinking, instead of creating an object representing the different states resulting of a call to findFirst, findFirst takes a lambda as parameter which is fed with the states of a call. cheers, R?mi [1] http://www.scala-lang.org/api/current/index.html#scala.Option [2] http://mail.openjdk.java.net/pipermail/lambda-libs-spec-observers/2013-February/001470.html [3] http://hg.openjdk.java.net/lambda/lambda/jdk/file/tip/src/share/classes/java/util/stream/Collectors.java On 03/04/2013 09:29 PM, Mike Duigou wrote: > Hello All; > > This patch introduces Optional container objects to be used by the lambda streams libraries for returning results. > > The reference Optional type, as defined, intentionally does not allow null values. null may be used with the Optional.orElse() method. > > All of the Optional types define hashCode() and equals implementations. Use of Optional types in collections should be generally discouraged but having useful equals() and hashCode() is ever so convenient. 
> > http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ > > Mike > From forax at univ-mlv.fr Wed Mar 6 03:58:43 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 06 Mar 2013 12:58:43 +0100 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> Message-ID: <51372F73.70004@univ-mlv.fr> On 03/06/2013 11:54 AM, Jed Wesley-Smith wrote: > Really, this is a lot of fuss over nothing. > > There is actually no fundamental difference between Scala's Option, Guava's Optional, Fugue's Option, Java's Optional and Haskell's Maybe ? they are modelling the same thing, the possibility of a value not being present. > > The fact that there may be minor differences in api or semantics around whether null is a legal value are minor in the scheme of things (and yes, null is a pretty stupid legal value of a Some IMHO). > > Stephen's example is ludicrous, why have a list of optional values? You'd flatten down into just a list ? and an optional list only makes sense if the enclosed list is guaranteed to be non-empty, otherwise you just return an empty list! People like shooting their own feet. http://cs.calstatela.edu/wiki/index.php/Courses/CS_460/Fall_2012/Week_8/gamePlay.combat.BattleAnalysis > > If we are going to use potential straw-men as arguments we can stall all progress. Please concentrate on the important matters, let's disavow null as a valid value and save us all a billion dollars Also Scala Option is not the only way to solve the null problem. The JSR308 annotation @Nullable/@NonNull are recognized by Eclipse and IntelliJ at least. > . > > cheers, > jed. cheers, R?mi > > On 06/03/2013, at 8:47 PM, Remi Forax wrote: > >> Ok, let be nuclear on this, >> There is no good reason to introduce Optional in java.util. >> >> It doen't work like Google's Guava Optional despite having the same >> name, it doesn't work like Scala's Option despite having a similar name, >> moreover the lambda pipeline face a similar issue with the design of >> collectors (see stream.collect()) but solve that similar problem with a >> different design, so the design of Optional is not even consistent with >> the rest of the stream API. >> >> So why do we want something like Optional, we want it to be able to >> represent the fact that as Mike states a returning result can have no >> value by example Colections.emptyList().stream().findFirst() should >> 'return' no value. >> >> As Stephen Colebourne said, Optional is a bad name because Scala uses >> Option [1] which can used in the same context, as result of a filter/map >> etc. but Option in Scala is a way to mask null. Given the name >> proximity, people will start to use Optional like Option in Scala and we >> will see methods returning things like Optional>>. >> >> Google's Guava, which is a popular library, defines a class named >> Optional, but allow to store null unlike the current proposed >> implementation, this will generate a lot of confusions and frustrations. >> >> In fact, we don't need Optional at all, because we don't need to return >> a value that can represent a value or no value, >> the idea is that methods like findFirst should take a lambda as >> parameter letting the user to decide what value should be returned by >> findFirst if there is a value and if there is no value. 
>> So instead of >> stream.findFirst().orElse(null) >> you will write >> stream.findFirst(orNull) >> with orNull() defined as like that >> public static Optionalizer orNull() { >> return (isPresent, element) -> isPresent? element: null; >> } >> >> The whole design is explained here [2] and is similar to the way >> Collectors are defined [3], >> it's basically the lambda way of thinking, instead of creating an object >> representing the different states resulting of a call to findFirst, >> findFirst takes a lambda as parameter which is fed with the states of a >> call. >> >> cheers, >> R?mi >> >> [1] http://www.scala-lang.org/api/current/index.html#scala.Option >> [2] >> http://mail.openjdk.java.net/pipermail/lambda-libs-spec-observers/2013-February/001470.html >> [3] >> http://hg.openjdk.java.net/lambda/lambda/jdk/file/tip/src/share/classes/java/util/stream/Collectors.java >> >> >> On 03/04/2013 09:29 PM, Mike Duigou wrote: >>> Hello All; >>> >>> This patch introduces Optional container objects to be used by the lambda streams libraries for returning results. >>> >>> The reference Optional type, as defined, intentionally does not allow null values. null may be used with the Optional.orElse() method. >>> >>> All of the Optional types define hashCode() and equals implementations. Use of Optional types in collections should be generally discouraged but having useful equals() and hashCode() is ever so convenient. >>> >>> http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ >>> >>> Mike >>> >> From dl at cs.oswego.edu Wed Mar 6 04:09:47 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 06 Mar 2013 07:09:47 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <513710CC.3010903@univ-mlv.fr> References: <513710CC.3010903@univ-mlv.fr> Message-ID: <5137320B.60001@cs.oswego.edu> (Restricting to lambda-libs list...) On 03/06/13 04:47, Remi Forax wrote: > Ok, let be nuclear on this, > There is no good reason to introduce Optional in java.util. We agree about most of the rationale for not using Optional. But there are still people who say they want it. I don't think it is productive at this point to argue about features supporting an Optional-laden programming style. But we never seem to hit closure about features supporting an Optional-free style. So I'd like to re-propose a simple compromise. In the same way that there are Optional and basis-returning versions of reduce: T reduce(T identity, BinaryOperator reducer); Optional reduce(BinaryOperator reducer); (Where the basis-returning one can in turn be used to avoid Optional-returning min(), etc). We should do the same at least for find, or more in keeping with current API, findFirst and findAny: T findFirst(Predicate predicate, T ifNone); T findAny(Predicate predicate, T ifNone); People wanting to avoid Optional can then then get all of the derived versions (allMatch, plain findAny, etc) easily enough. Surprisingly enough, that's the only missing feature that would otherwise enable a completely Optional-free usage style of the Stream API. We have both proposed variants of this several times, but they don't seem to go anywhere. It would be nice to have a calm final discussion about why we would NOT do such an apparently sensible thing! 
-Doug From forax at univ-mlv.fr Wed Mar 6 08:43:10 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 06 Mar 2013 17:43:10 +0100 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5137320B.60001@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: <5137721E.7030502@univ-mlv.fr> On 03/06/2013 01:09 PM, Doug Lea wrote: > (Restricting to lambda-libs list...) > > On 03/06/13 04:47, Remi Forax wrote: >> Ok, let be nuclear on this, >> There is no good reason to introduce Optional in java.util. > > We agree about most of the rationale for not using Optional. > But there are still people who say they want it. > I don't think it is productive at this point to > argue about features supporting an Optional-laden > programming style. But we never seem to hit closure > about features supporting an Optional-free style. Hi Doug, while I agree that it's not productive to hope that people will not use Optional. like the status quo, Optional or Option are defined in Scala lib or in Guava lib and users can opt-in if they want. The design I propose let Guava by example to provide a simple Optionalizer (again not a good name), that returns an Optional. Something like Optional s = stream.findFirst(asOptional()); with asOptional defined like this: public static Optionalizer> asOptional() { return (isPresent, element) -> isPresent? Optional.of(element): Optional.absent(); } cheers, R?mi > So I'd like to re-propose a simple compromise. > In the same way that there are Optional and > basis-returning versions of reduce: > > T reduce(T identity, BinaryOperator reducer); > Optional reduce(BinaryOperator reducer); > > (Where the basis-returning one can in turn be used to > avoid Optional-returning min(), etc). We should do the > same at least for find, or more in keeping with current > API, findFirst and findAny: > > T findFirst(Predicate predicate, T ifNone); > T findAny(Predicate predicate, T ifNone); > > People wanting to avoid Optional can then then > get all of the derived versions (allMatch, plain > findAny, etc) easily enough. > > Surprisingly enough, that's the only missing > feature that would otherwise enable a completely > Optional-free usage style of the Stream API. > > We have both proposed variants of this several times, > but they don't seem to go anywhere. It would be nice > to have a calm final discussion about why we would NOT > do such an apparently sensible thing! > > -Doug > From tim at peierls.net Wed Mar 6 09:50:18 2013 From: tim at peierls.net (Tim Peierls) Date: Wed, 6 Mar 2013 12:50:18 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5137320B.60001@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: On Wed, Mar 6, 2013 at 7:09 AM, Doug Lea
wrote: > T findFirst(Predicate predicate, T ifNone); > T findAny(Predicate predicate, T ifNone); > > People wanting to avoid Optional can then then > get all of the derived versions (allMatch, plain > findAny, etc) easily enough. > > Surprisingly enough, that's the only missing > feature that would otherwise enable a completely > Optional-free usage style of the Stream API. > > We have both proposed variants of this several times, > but they don't seem to go anywhere. It would be nice > to have a calm final discussion about why we would NOT > do such an apparently sensible thing! I've had too much coffee to be calm, and I have no way of ensuring finality, but the foremost reason I see for not allowing an Optional-free usage style is that people will adopt it rather than use Optional. They will see it as a license to put null everywhere, and they'll get NPEs way downstream and blame it on Java. Optional should be (and currently is) a very limited abstraction, one that is only good for holding a potential result, testing for its presence, retrieving it if it is present, and providing an alternative if not. We should resist the temptation to make it into something more or make it into a knock-off of the similar Scala type. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/96d5386b/attachment.html From spullara at gmail.com Wed Mar 6 10:12:40 2013 From: spullara at gmail.com (Sam Pullara) Date: Wed, 6 Mar 2013 10:12:40 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: I am for removing it and in favor of providing a default if it doesn't have nearly the same functionality as the Scala Option. The way Optional is written right now I would tell people not to use it anyway and it would just be a wart on this API. Sam On Mar 6, 2013, at 9:50 AM, Tim Peierls wrote: > On Wed, Mar 6, 2013 at 7:09 AM, Doug Lea
wrote: > T findFirst(Predicate predicate, T ifNone); > T findAny(Predicate predicate, T ifNone); > > People wanting to avoid Optional can then then > get all of the derived versions (allMatch, plain > findAny, etc) easily enough. > > Surprisingly enough, that's the only missing > feature that would otherwise enable a completely > Optional-free usage style of the Stream API. > > We have both proposed variants of this several times, > but they don't seem to go anywhere. It would be nice > to have a calm final discussion about why we would NOT > do such an apparently sensible thing! > > I've had too much coffee to be calm, and I have no way of ensuring finality, but the foremost reason I see for not allowing an Optional-free usage style is that people will adopt it rather than use Optional. They will see it as a license to put null everywhere, and they'll get NPEs way downstream and blame it on Java. > > Optional should be (and currently is) a very limited abstraction, one that is only good for holding a potential result, testing for its presence, retrieving it if it is present, and providing an alternative if not. We should resist the temptation to make it into something more or make it into a knock-off of the similar Scala type. > > --tim > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/64f3ca62/attachment-0001.html From brian.goetz at oracle.com Wed Mar 6 10:55:35 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 13:55:35 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5137320B.60001@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: > On 03/06/13 04:47, Remi Forax wrote: >> Ok, let be nuclear on this, >> There is no good reason to introduce Optional in java.util. We already went around on this several times, made a decision, and no new information has come to light recently to warrant reopening the "do we want optional" discussion. This is just raising the same arguments we've seen before. Let's move on. On Mar 6, 2013, at 7:09 AM, Doug Lea wrote: > We agree about most of the rationale for not using Optional. > But there are still people who say they want it. > I don't think it is productive at this point to > argue about features supporting an Optional-laden > programming style. But we never seem to hit closure > about features supporting an Optional-free style. This is a reasonable discussion to have. Doug's list is not quite exhaustive -- there are also Option-bearing methods on the primitive streams such as min and max -- but its close. > So I'd like to re-propose a simple compromise. > In the same way that there are Optional and > basis-returning versions of reduce: I am OK with adding these as they are dirt-simple and give people a reasonable path to an option-free lifestyle. I am not OK with imposing an option-free lifestyle on everyone. > We have both proposed variants of this several times, > but they don't seem to go anywhere. Because they were only proposed as being *instead of* the Optional-bearing version, in which role they were deficient. > It would be nice > to have a calm final discussion about why we would NOT > do such an apparently sensible thing! Agreed. Willing to have a calm final discussion on these. 
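For reference, here is roughly what the two styles look like at the call site today. (Sketch only: the sample data and the "<none>" default are made up, and the one-call findFirst(predicate, ifNone) form is Doug's proposal, not something in the repo, so the Optional-free version below is spelled with orElse.)

    import java.util.*;

    public class FindStyles {
        public static void main(String[] args) {
            List<String> words = Arrays.asList("alpha", "beta", "gamma");

            // Optional-bearing style: absence is represented explicitly
            Optional<String> first = words.stream()
                                          .filter(s -> s.startsWith("b"))
                                          .findFirst();
            System.out.println(first.isPresent() ? first.get() : "<none>");

            // Optional-free style as it can be written today; Doug's proposed
            // findFirst(predicate, ifNone) would fold the filter/findFirst/orElse
            // chain into a single call and never materialize the Optional
            String firstOrDefault = words.stream()
                                         .filter(s -> s.startsWith("b"))
                                         .findFirst()
                                         .orElse("<none>");
            System.out.println(firstOrDefault);
        }
    }

Note the second form cannot distinguish an empty stream from a stream whose first match happens to equal the default value, which is the trade-off being discussed.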
From dl at cs.oswego.edu Wed Mar 6 11:24:53 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 06 Mar 2013 14:24:53 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: <51379805.20501@cs.oswego.edu> On 03/06/13 13:55, Brian Goetz wrote: > > This is a reasonable discussion to have. Doug's list is not quite exhaustive -- there are also Option-bearing methods on the primitive streams such as min and max -- but its close. > I just did a recheck and it seems that all are somehow derivable (e.g., long min as map to long, then reduce(MAX_VALUE, ...) except for findFirst/findAny (but including versions for int, long, double). >> It would be nice >> to have a calm final discussion about why we would NOT >> do such an apparently sensible thing! > > Agreed. Willing to have a calm final discussion on these. > OK. My position is just to be prudent and conservative in the face of looming deadlines: The range of applicability, recommended usages, and API of Optional are still controversial. We've just seen at least 6 different opinions about it in 24 hrs. Adding a couple of methods that do not require its use, but still obtain all of the base functionality of Streams is a small insurance policy in the face of uncertainty about how it will be received by a wide audience. -Doug From joe.bowbeer at gmail.com Wed Mar 6 11:31:35 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 6 Mar 2013 11:31:35 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: I might be OK with Doug's suggestion. I want to see a complete proposal. I think Remi's sugaring looks OK. For Option lovers, one way to view this: it enables someone to provide their own Option instead of the one we provide. Right? If not, then I'm less favorable. On Mar 6, 2013 10:56 AM, "Brian Goetz" wrote: > > On 03/06/13 04:47, Remi Forax wrote: > >> Ok, let be nuclear on this, > >> There is no good reason to introduce Optional in java.util. > > We already went around on this several times, made a decision, and no new > information has come to light recently to warrant reopening the "do we want > optional" discussion. This is just raising the same arguments we've seen > before. Let's move on. > > On Mar 6, 2013, at 7:09 AM, Doug Lea wrote: > > > We agree about most of the rationale for not using Optional. > > But there are still people who say they want it. > > I don't think it is productive at this point to > > argue about features supporting an Optional-laden > > programming style. But we never seem to hit closure > > about features supporting an Optional-free style. > > This is a reasonable discussion to have. Doug's list is not quite > exhaustive -- there are also Option-bearing methods on the primitive > streams such as min and max -- but its close. > > > So I'd like to re-propose a simple compromise. > > In the same way that there are Optional and > > basis-returning versions of reduce: > > I am OK with adding these as they are dirt-simple and give people a > reasonable path to an option-free lifestyle. I am not OK with imposing an > option-free lifestyle on everyone. > > > We have both proposed variants of this several times, > > but they don't seem to go anywhere. 
> > Because they were only proposed as being *instead of* the Optional-bearing > version, in which role they were deficient. > > > It would be nice > > to have a calm final discussion about why we would NOT > > do such an apparently sensible thing! > > Agreed. Willing to have a calm final discussion on these. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/c0f12f8f/attachment.html From brian.goetz at oracle.com Wed Mar 6 11:34:21 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 14:34:21 -0500 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: <9CB98193-8955-4181-B553-F3F2D3507886@oracle.com> > For Option lovers, one way to view this: it enables someone to provide their own Option instead of the one we provide. Right? If not, then I'm less favorable. > No, not right. It prevents people from distinguishing between a stream that is empty and a stream containing only the "orElse" value. Just like Map.get() prevents distinguishing between "not there" and "mapped to null." -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/30ee6695/attachment.html From joe.bowbeer at gmail.com Wed Mar 6 11:46:09 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 6 Mar 2013 11:46:09 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <9CB98193-8955-4181-B553-F3F2D3507886@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <9CB98193-8955-4181-B553-F3F2D3507886@oracle.com> Message-ID: In this case, I do not favor the addition. I'll believe the Option lovers that Options are cool and that our version is good enough. The alternative is OK too but I think it would be worse to include both as long as our Option is good enough. If our Option is not good enough then we should adopt the alternative instead. So far, in my one limited use (findFirst) or Optional works OK. Still wondering about equals and hashCode... On Mar 6, 2013 11:34 AM, "Brian Goetz" wrote: > For Option lovers, one way to view this: it enables someone to provide > their own Option instead of the one we provide. Right? If not, then I'm > less favorable. > > No, not right. It prevents people from distinguishing between a stream > that is empty and a stream containing only the "orElse" value. Just like > Map.get() prevents distinguishing between "not there" and "mapped to null." > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/0e77be87/attachment.html From forax at univ-mlv.fr Wed Mar 6 11:44:53 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 06 Mar 2013 20:44:53 +0100 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <9CB98193-8955-4181-B553-F3F2D3507886@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <9CB98193-8955-4181-B553-F3F2D3507886@oracle.com> Message-ID: <51379CB5.8020003@univ-mlv.fr> On 03/06/2013 08:34 PM, Brian Goetz wrote: >> >> For Option lovers, one way to view this: it enables someone to >> provide their own Option instead of the one we provide. 
Right? If >> not, then I'm less favorable. >> > No, not right. It prevents people from distinguishing between a > stream that is empty and a stream containing only the "orElse" value. > Just like Map.get() prevents distinguishing between "not there" and > "mapped to null." > I don't know if 'it' is my proposal or not. If it is, yes you can use any Option implementations you want because you know if the value is present or not and if the value is present, you know the value. So you can return the Option implementation you want. R?mi From Donald.Raab at gs.com Wed Mar 6 12:32:05 2013 From: Donald.Raab at gs.com (Raab, Donald) Date: Wed, 6 Mar 2013 15:32:05 -0500 Subject: Collectors inventory In-Reply-To: <5126A74A.3040509@oracle.com> References: <5126A74A.3040509@oracle.com> Message-ID: <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Some suggestions: 1. I would rename overloaded forms for map and mapping for primitive types to: a. map, mapInt, mapLong, mapDouble on Stream b. mapping, mappingInt, mappingLong, mappingDouble on Collector 2. Rename joinWith to toMap 3. Move toStatistics directly to IntStream, LongStream, DoubleStream instead of using stream.collect(toStatistics()) The primitive form renames will make it less confusing to the reader of the code as to what they are getting, especially if someone codes in a fluent style. When the primitive streams came out in one of the binary releases, I was confused editing our kata exercises, because the code had moved from being a Stream to a DoubleStream, and the protocols for these are slightly different. Being able to compose Collectors/reducers is an interesting idea. I think being able to move some of this protocol to a builder approach may make it easier to discover and use rather than having a lot of overloaded static forms on Collectors. > -----Original Message----- > From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- > libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz > Sent: Thursday, February 21, 2013 6:02 PM > To: lambda-libs-spec-experts at openjdk.java.net > Subject: Collectors inventory > > As I promised a long time ago, here's an overview of what's in > Collectors currently. > > There are 12 basic forms: > - toCollection(ctor) > - toList() > - toSet() > - toStringBuilder() > - toStringJoiner(delimiter) > - to{Long,Double}Statistics > > - groupingBy(classifier, mapFactory, downstream collector) > - groupingReduce(classifier, mapFactory, mapper, reducer) > - mapping(mappingFn, downstream collector) > - joiningWith(mappingFunction, mergeFunction, mapFactory) > - partitioningBy(predicate, downstream collector) > - partitioningReduce(predicate, mapper, reducer) > > The toXxx forms should be obvious. 
> > Mapping has four versions, analogous to Stream.map: > - mapping(T -> U, Collector) > - mapping(T -> int, Collector.OfInt) > - mapping(T -> long, Collector.OfLong) > - mapping(T -> double, Collector.OfDouble) > > GroupingBy has four forms: > - groupingBy(T->K) -- standard groupBy, values of resulting Map are > Collection > - Same, but with explicit constructors for map and for rows (so you > can produce, say, a TreeMap> and not just a > Map>) > - groupingBy(T->K, Collector) -- multi-level groupBy, where > downstream is another Collector > - Same, but with explicit ctor for map > > GroupingReduce has four forms: > - groupingReduce(T->K, BinaryOperator) // simple reduce > - groupingReduce(T->K, Function, BinaryOperator) // map- > reduce > - above two with explicit map ctors > > JoiningWith has four forms: > - joiningWith(T->U) > - same, but with explicit Map ctor > - same, but with merge function for handling duplicates > - same, with both explicit map ctor and merge function > > PartitioningBy has three forms: > - partitioningBy(Predicate) > - Same, but with explicit constructor for Collection (so you can get > a Map>) > - partitioningBy(Predicate, Collector) // multi-level > > PartitioningReduce has two forms: > - predicate + reducer > - predicate + mapper + reducer > > Impl note: in any category, all but one are one-liners that delegate to > the general form. > > Plus, all the Map-bearing ones have a concurrent and non-concurrent > version. > From brian.goetz at oracle.com Wed Mar 6 14:47:29 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 17:47:29 -0500 Subject: map and flatMap names (was: Collectors inventory) In-Reply-To: <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <156C5C65-7D58-4966-9877-7E5F116B2509@oracle.com> Breaking this into multiple messages. > 1. I would rename overloaded forms for map and mapping for primitive types to: > a. map, mapInt, mapLong, mapDouble on Stream So, we've run into some issues with the compiler being able to tell the difference between the various flatMap overloadings. We don't have that problem with the map overloadings, but it would be good to be consistent in our naming. To be specific, I think what you're suggesting is: - rename map(ToIntMapper) and friends to something like mapToInt() - rename flatMap(FlatMapper.ToInt) and friends to something like flatMapToInt() I would be OK with this. I think there's some readability value that is gained, since the return type (IntStream) does not usually appear in the code since the user is likely to keep chaining, having "Int" appear somewhere makes it clear that we're switching stream shapes. On Mar 6, 2013, at 3:32 PM, Raab, Donald wrote: > Some suggestions: > > 1. I would rename overloaded forms for map and mapping for primitive types to: > a. map, mapInt, mapLong, mapDouble on Stream > b. mapping, mappingInt, mappingLong, mappingDouble on Collector > 2. Rename joinWith to toMap > 3. Move toStatistics directly to IntStream, LongStream, DoubleStream instead of using stream.collect(toStatistics()) > > The primitive form renames will make it less confusing to the reader of the code as to what they are getting, especially if someone codes in a fluent style. 
When the primitive streams came out in one of the binary releases, I was confused editing our kata exercises, because the code had moved from being a Stream to a DoubleStream, and the protocols for these are slightly different. > > Being able to compose Collectors/reducers is an interesting idea. I think being able to move some of this protocol to a builder approach may make it easier to discover and use rather than having a lot of overloaded static forms on Collectors. > >> -----Original Message----- >> From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- >> libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz >> Sent: Thursday, February 21, 2013 6:02 PM >> To: lambda-libs-spec-experts at openjdk.java.net >> Subject: Collectors inventory >> >> As I promised a long time ago, here's an overview of what's in >> Collectors currently. >> >> There are 12 basic forms: >> - toCollection(ctor) >> - toList() >> - toSet() >> - toStringBuilder() >> - toStringJoiner(delimiter) >> - to{Long,Double}Statistics >> >> - groupingBy(classifier, mapFactory, downstream collector) >> - groupingReduce(classifier, mapFactory, mapper, reducer) >> - mapping(mappingFn, downstream collector) >> - joiningWith(mappingFunction, mergeFunction, mapFactory) >> - partitioningBy(predicate, downstream collector) >> - partitioningReduce(predicate, mapper, reducer) >> >> The toXxx forms should be obvious. >> >> Mapping has four versions, analogous to Stream.map: >> - mapping(T -> U, Collector) >> - mapping(T -> int, Collector.OfInt) >> - mapping(T -> long, Collector.OfLong) >> - mapping(T -> double, Collector.OfDouble) >> >> GroupingBy has four forms: >> - groupingBy(T->K) -- standard groupBy, values of resulting Map are >> Collection >> - Same, but with explicit constructors for map and for rows (so you >> can produce, say, a TreeMap> and not just a >> Map>) >> - groupingBy(T->K, Collector) -- multi-level groupBy, where >> downstream is another Collector >> - Same, but with explicit ctor for map >> >> GroupingReduce has four forms: >> - groupingReduce(T->K, BinaryOperator) // simple reduce >> - groupingReduce(T->K, Function, BinaryOperator) // map- >> reduce >> - above two with explicit map ctors >> >> JoiningWith has four forms: >> - joiningWith(T->U) >> - same, but with explicit Map ctor >> - same, but with merge function for handling duplicates >> - same, with both explicit map ctor and merge function >> >> PartitioningBy has three forms: >> - partitioningBy(Predicate) >> - Same, but with explicit constructor for Collection (so you can get >> a Map>) >> - partitioningBy(Predicate, Collector) // multi-level >> >> PartitioningReduce has two forms: >> - predicate + reducer >> - predicate + mapper + reducer >> >> Impl note: in any category, all but one are one-liners that delegate to >> the general form. >> >> Plus, all the Map-bearing ones have a concurrent and non-concurrent >> version. >> > From brian.goetz at oracle.com Wed Mar 6 14:49:46 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 17:49:46 -0500 Subject: Collectors inventory In-Reply-To: <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <7893BD78-67FF-4464-BA67-BED49C5ABC02@oracle.com> Breaking this into multiple messages. > 2. Rename joinWith to toMap Why didn't I think of this! I like this. Much clearer. 
We currently have four forms, which form the cross product of: { with merge function, without } x { with explicit ctor, without } If it makes the "too many Collectors" contingent happier, we can reduce this to two forms: Collector> toMap(mapper) Collector> toMap(mapper, merger, mapCtor) From brian.goetz at oracle.com Wed Mar 6 14:51:05 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 17:51:05 -0500 Subject: Collectors inventory In-Reply-To: <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: Breaking this into multiple messages. > 3. Move toStatistics directly to IntStream, LongStream, DoubleStream instead of using stream.collect(toStatistics()) No objection to adding these to IntStream and friends (one-liner), but I think it also has to stay in Collectors, so that you can do things like gather statistics on properties of object streams, such as "sales volume statistics by salesman" queries. From joe.bowbeer at gmail.com Wed Mar 6 14:57:38 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 6 Mar 2013 14:57:38 -0800 Subject: Collectors inventory In-Reply-To: References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: I don't know what toStatistics does and can't guess, so I don't favor making it top-level on primitive streams. I'd rather encapsulate this inside Collectors where there is a chance that some overview documentation might explain what these more arcane (user-friendly?) Collectors methods do. I'm all in favor or name changes that make them more self-explanatory. toMap seems like a step in that direction. Joe On Wed, Mar 6, 2013 at 2:51 PM, Brian Goetz wrote: > Breaking this into multiple messages. > > > 3. Move toStatistics directly to IntStream, LongStream, DoubleStream > instead of using stream.collect(toStatistics()) > > No objection to adding these to IntStream and friends (one-liner), but I > think it also has to stay in Collectors, so that you can do things like > gather statistics on properties of object streams, such as "sales volume > statistics by salesman" queries. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/b83e7c0b/attachment.html From brian.goetz at oracle.com Wed Mar 6 14:59:33 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 6 Mar 2013 17:59:33 -0500 Subject: Collectors inventory In-Reply-To: References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <664501BD-6ECF-4B64-8D14-5A677B1E35B2@oracle.com> Its just like the toStatistics of ParallelArray; it reduces the stream to a Statistics object which has count, sum, min, and max methods. On Mar 6, 2013, at 5:57 PM, Joe Bowbeer wrote: > I don't know what toStatistics does and can't guess, so I don't favor making it top-level on primitive streams. > > I'd rather encapsulate this inside Collectors where there is a chance that some overview documentation might explain what these more arcane (user-friendly?) Collectors methods do. > > I'm all in favor or name changes that make them more self-explanatory. toMap seems like a step in that direction. > > Joe > > > On Wed, Mar 6, 2013 at 2:51 PM, Brian Goetz wrote: > Breaking this into multiple messages. 
> > > 3. Move toStatistics directly to IntStream, LongStream, DoubleStream instead of using stream.collect(toStatistics()) > > No objection to adding these to IntStream and friends (one-liner), but I think it also has to stay in Collectors, so that you can do things like gather statistics on properties of object streams, such as "sales volume statistics by salesman" queries. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/1969f147/attachment-0001.html From joe.bowbeer at gmail.com Wed Mar 6 15:05:38 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 6 Mar 2013 15:05:38 -0800 Subject: Collectors inventory In-Reply-To: <664501BD-6ECF-4B64-8D14-5A677B1E35B2@oracle.com> References: <5126A74A.3040509@oracle.com> <6712820CB52CFB4D842561213A77C05404C62F132F@GSCMAMP09EX.firmwide.corp.gs.com> <664501BD-6ECF-4B64-8D14-5A677B1E35B2@oracle.com> Message-ID: I'm just saying it's not what I want to see when I'm typing dot-crtl-space after a stream in my IDE... Something like toParallelArray().toStatistics() is OK. Joe On Wed, Mar 6, 2013 at 2:59 PM, Brian Goetz wrote: > Its just like the toStatistics of ParallelArray; it reduces the stream to > a Statistics object which has count, sum, min, and max methods. > > > On Mar 6, 2013, at 5:57 PM, Joe Bowbeer wrote: > > I don't know what toStatistics does and can't guess, so I don't favor > making it top-level on primitive streams. > > I'd rather encapsulate this inside Collectors where there is a chance that > some overview documentation might explain what these more arcane > (user-friendly?) Collectors methods do. > > I'm all in favor or name changes that make them more self-explanatory. > toMap seems like a step in that direction. > > Joe > > > On Wed, Mar 6, 2013 at 2:51 PM, Brian Goetz wrote: > >> Breaking this into multiple messages. >> >> > 3. Move toStatistics directly to IntStream, LongStream, DoubleStream >> instead of using stream.collect(toStatistics()) >> >> No objection to adding these to IntStream and friends (one-liner), but I >> think it also has to stay in Collectors, so that you can do things like >> gather statistics on properties of object streams, such as "sales volume >> statistics by salesman" queries. >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130306/5bf3d2ab/attachment.html From brian.goetz at oracle.com Thu Mar 7 11:07:03 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 07 Mar 2013 14:07:03 -0500 Subject: Collectors.joiningWith Message-ID: <5138E557.1090102@oracle.com> On Don's suggestion, I replaced the four forms of Collectors.joiningWith with two forms of toMap: Collector> toMap(Function mapper) Collector toMap(Function mapper, Supplier mapSupplier, BinaryOperator mergeFunction) The four forms we had were: { has merge function, not } x { has ctor, not } and these were replace with two forms: has nothing, has everything. I think the toMap name makes more sense anyway. We'll have to expose an entry point for the default merge function (which throws), currently is called Collectors.throwingMerger(). 
So if you want the default behavior but with TreeMap, you'd do: TreeMap m = s.collect(toMap(func, TreeMap::new, throwingMerger())); From brian.goetz at oracle.com Thu Mar 7 15:09:20 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 07 Mar 2013 18:09:20 -0500 Subject: Collectors inventory In-Reply-To: <5126A74A.3040509@oracle.com> References: <5126A74A.3040509@oracle.com> Message-ID: <51391E20.3030409@oracle.com> Update on this: On 2/21/2013 6:01 PM, Brian Goetz wrote: > As I promised a long time ago, here's an overview of what's in > Collectors currently. > > There are 12 basic forms: > - toCollection(ctor) > - toList() > - toSet() > - toStringBuilder() > - toStringJoiner(delimiter) > - to{Long,Double}Statistics To this group, we'll add: toMap -- two forms (explicit map ctor and not) toConcurrentMap -- two forms (same) and get rid of the four forms of joiningWith in each of Collectors and ConcurrentCollectors. This leaves us with the more complex mess of: > - groupingBy(classifier, mapFactory, downstream collector) > - groupingReduce(classifier, mapFactory, mapper, reducer) > - partitioningBy(predicate, downstream collector) > - partitioningReduce(predicate, mapper, reducer) for which a new story is being prepared, stay tuned. From mike.duigou at oracle.com Fri Mar 8 12:08:26 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 8 Mar 2013 12:08:26 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: Corrected. Thank you. On Mar 4 2013, at 12:47 , Tim Peierls wrote: > I like this. > > Typo in all four classes: "who's result" -> "whose result" (or find better wording) > > --tim > > On Mon, Mar 4, 2013 at 3:29 PM, Mike Duigou wrote: > Optional, OptionalDouble, OptionalInt and OptionalLong are now posted for review on core-libs and lambda-dev. > > Any comments can be sent to core-libs-dev or this list. > > http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ > > Mike > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130308/896292f3/attachment.html From mike.duigou at oracle.com Fri Mar 8 12:16:35 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 8 Mar 2013 12:16:35 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: We talked to Kevin about their experiences with Guava's Optional. His response was that they felt reasonable hashCode/equals methods were obligatory and without them users would, if not immediately then eventually, curse us for not providing them. The implementations are added with grudging reluctance. Mike On Mar 4 2013, at 14:44 , Joe Bowbeer wrote: > Last I read, the Optional hashCode and equals methods would support only the identity hashCode()/equals(), but they appear to be delegating to the value's methods, if present. > > Why the change? Just wondering. > > > On Mon, Mar 4, 2013 at 12:29 PM, Mike Duigou wrote: > Optional, OptionalDouble, OptionalInt and OptionalLong are now posted for review on core-libs and lambda-dev. > > Any comments can be sent to core-libs-dev or this list. > > http://cr.openjdk.java.net/~mduigou/JDK-8001642/0/webrev/ > > Mike > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130308/7d53d0d9/attachment.html From mike.duigou at oracle.com Fri Mar 8 12:24:31 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 8 Mar 2013 12:24:31 -0800 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: Message-ID: Thank you for the corrections! Mike From forax at univ-mlv.fr Sun Mar 10 06:26:51 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 10 Mar 2013 14:26:51 +0100 Subject: Spliterator.IMMUTABLE Message-ID: <513C8A1B.9090301@univ-mlv.fr> I've just discovered Spliterator.IMMUTABLE, I think this flag has the wrong name, given it doesn't mean that the Spliterator is immutable (usually tryAdvance change the state of the Spliterator) but the fact that it doesn't act as a view of the collection that creates it. Maybe DETACHED is a better word ? cheers, R?mi From dl at cs.oswego.edu Sun Mar 10 06:35:24 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 10 Mar 2013 09:35:24 -0400 Subject: Spliterator.IMMUTABLE In-Reply-To: <513C8A1B.9090301@univ-mlv.fr> References: <513C8A1B.9090301@univ-mlv.fr> Message-ID: <513C8C1C.20003@cs.oswego.edu> On 03/10/13 09:26, Remi Forax wrote: > I've just discovered Spliterator.IMMUTABLE, > I think this flag has the wrong name, given it doesn't mean that the Spliterator > is immutable (usually tryAdvance change the state of the Spliterator) but the > fact that it doesn't act as a view of the collection that creates it. > ... and CONCURRENT doesn't mean that the Spliterator is concurrent. And so on. As stated in the javadocs, spliterator characteristics apply to the elements and/or their sources, not the Spliterators themselves. Maybe this could be better clarified in the javadocs. -Doug From forax at univ-mlv.fr Sun Mar 10 06:35:09 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 10 Mar 2013 14:35:09 +0100 Subject: Spliterator.IMMUTABLE In-Reply-To: <513C8C1C.20003@cs.oswego.edu> References: <513C8A1B.9090301@univ-mlv.fr> <513C8C1C.20003@cs.oswego.edu> Message-ID: <513C8C0D.7050708@univ-mlv.fr> On 03/10/2013 02:35 PM, Doug Lea wrote: > On 03/10/13 09:26, Remi Forax wrote: >> I've just discovered Spliterator.IMMUTABLE, >> I think this flag has the wrong name, given it doesn't mean that the >> Spliterator >> is immutable (usually tryAdvance change the state of the Spliterator) >> but the >> fact that it doesn't act as a view of the collection that creates it. >> > > ... and CONCURRENT doesn't mean that the Spliterator is concurrent. > And so on. As stated in the javadocs, spliterator characteristics apply > to the elements and/or their sources, not the Spliterators themselves. > Maybe this could be better clarified in the javadocs. Spliterator.IMMUTABLE doesn't means that the source is immutable too, it means that even if the elements of the source changed after creation, the element pushed by the Spliterator will not changed. maybe SNAPSHOT is better ? 
> > -Doug > > > R?mi From dl at cs.oswego.edu Sun Mar 10 07:15:16 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 10 Mar 2013 10:15:16 -0400 Subject: Spliterator.IMMUTABLE In-Reply-To: <513C8C0D.7050708@univ-mlv.fr> References: <513C8A1B.9090301@univ-mlv.fr> <513C8C1C.20003@cs.oswego.edu> <513C8C0D.7050708@univ-mlv.fr> Message-ID: <513C9574.3090109@cs.oswego.edu> On 03/10/13 09:35, Remi Forax wrote: > On 03/10/2013 02:35 PM, Doug Lea wrote: >> On 03/10/13 09:26, Remi Forax wrote: >>> I've just discovered Spliterator.IMMUTABLE, >>> I think this flag has the wrong name, given it doesn't mean that the Spliterator >>> is immutable (usually tryAdvance change the state of the Spliterator) but the >>> fact that it doesn't act as a view of the collection that creates it. >>> >> >> ... and CONCURRENT doesn't mean that the Spliterator is concurrent. >> And so on. As stated in the javadocs, spliterator characteristics apply >> to the elements and/or their sources, not the Spliterators themselves. >> Maybe this could be better clarified in the javadocs. > > Spliterator.IMMUTABLE doesn't means that the source is immutable too, > it means that even if the elements of the source changed after creation, > the element pushed by the Spliterator will not changed. > This is too Collections-centric a view. Spliterator.characteristics are intended to apply across anything you can define a Spliterator for. (They include a few properties that are not yet exploited much in Streams.) As far as a user of a Spliterator is concerned, there are three cases of potential interest here: CONCURRENT: The structure (e.g., number, order) and/or elements are allowed to change dynamically and in such cases are traversed under a given defined semantics. IMMUTABLE: The structure and elements cannot change during traversal. [Other] Any change in structure and/or elements represents a usage error, in which case the Spliterator is expected to have a documented course of action upon any detected change (normally a best-effort ConcurrentModificationException check). Now, whether an IMMUTABLE spliterator arises because the source is immutable (for example a raw input stream) or because it is a snapshot (for example CopyOnWriteArrayList) doesn't matter. Does that help? -Doug From brian.goetz at oracle.com Sun Mar 10 14:20:14 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 10 Mar 2013 17:20:14 -0400 Subject: Collectors inventory In-Reply-To: <5126A74A.3040509@oracle.com> References: <5126A74A.3040509@oracle.com> Message-ID: <513CF90E.9020803@oracle.com> OK, I've revamped Collectors in a way that may avoid the overload that Kevin, Remi, and Joe were concerned about. At the same time, I've integrated concurrent collection into the model in a more obvious way. The key problem is grouping-by. There are essentially sixteen forms of groupingBy: { concurrent, not } x { with explicit Map constructors, not } x { simple group-by, cascaded group-by (downstream collector), simple reduce, map-reduce } Its pretty hard to argue than any of these dimensions can be obviously jettisoned. And simply pruning around the edges (e.g., "get rid of this variant") doesn't do the job. Nor does "only provide the most general form", which guarantees that no one will be able to use it at all. With the help of Don and his team last week, I came up with an alternate framing for groupingBy (and also partitioningBy, which has the same problems). 
The key is to introduce an additional type, call it GroupingCollector, off of which we can hang some of the variants, and this lets us reduce the number of top-level collectors. The current inventory, under this scheme (which I'll check in soon) is: - to{Collection,List,Set} - toString{Builder,Joiner} - to{Int,Long,Double}Statistics - toMap(mappingFn) // was mappedTo - toMap(mappingFn, mapCtor) - toConcurrentMap(mappingFn) // was ConcurrentCollectors.mappedTo - toConcurrentMap(mappingFn, mapCtor) - mapping(mappingFn, downstreamCollector) // plus primitive forms - groupingBy(classifierFn) - groupingBy(classifierFn, mapCtor) - groupingByConcurrent(classifierFn) - groupingByConcurrent(classifierFn, mapCtor) - partitioningBy(predicate) - partitioningByConcurrent(predicate) This is a significant reduction in top-level forms -- we drop from 16 groupingXxx forms to four, a similar reduction for partitioning forms, and -- most importantly ConcurrentCollectors *just goes away*. Where it moves to is that the return type of groupingBy gets more complicated. Instead of returning a simple Collector, it returns a GroupingCollector. In its current form, GroupingCollector implements Collector -- meaning you can use groupingBy(f) as a plain collector -- but the more advanced forms (cascading, reducing) are hanging as extra methods off the GroupingCollector. For example: // Simple form -- people by city Map> m = people.stream().collect(groupingBy(Person::getCity)); // Two-level form -- people by state, city // Uses .then(otherCollector) method Map>> m = people.stream() .collect(groupingBy(Person::getState) .then(groupingBy(Person::getCity))); // Reducing form -- count of people by city // Uses .thenReducing(mapper, reducer) method Map m = people.stream() .collect(groupingBy(Person::getState) .thenReducing(p -> 1, Integer::sum)); The methods that appear on GroupingCollector are: .then(Collector downstream) -- cascaded groupBy .thenReducing(BinaryOperator) -- reduce .thenReducing(Function, BinaryOperator) -- map/reduce Partitioning is similar except the thenReducing methods need an identity argument too. public static interface GroupingCollector extends Collector>> { Collector> then(Collector downstream); Collector> thenReducing(BinaryOperator reducer); Collector> thenReducing(Function mapper, BinaryOperator reducer); } } The slightly weird thing about this is that a GroupingCollector is both a Collector (for the simple form) and a factory for collectors (for the cascaded forms). This makes the user code better (a simple group by is just collecting(groupingBy(f))), but makes the type harder to understand. We can adjust this tradeoff by severing the "extends Collector" and adding another method for "get me a simple collector", but I'm not sure this is an improvement. This would probably look like: groupingBy(fn).toList() or some such. One variant we did jettison is the one where you provide an explicit Collection ctor, so you could group into a Set instead of a List. (You can still get this with groupingBy(f).then(toCollection(ctor)). If we did the above transformation, this could come back as: groupingBy(fn).toCollection(ctor) or some such. Overall this seems a much more approachable set of Collectors. Still a few fine details to work out, including: - Does "GroupingCollector extends Collector" simplify or complicate? - Naming of everything - Do we want to add back the "grouping to explicit collection" form. 
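For concreteness, the two forms described above but not shown -- the partitioning reduce with an identity, and grouping into an explicit collection -- might read roughly as follows. This is only a sketch against the proposed GroupingCollector API: the identity-bearing thenReducing signature is assumed rather than settled, and Person::getAge (plus getCity returning String) are hypothetical accessors used purely for illustration.

    // Partition with a reduce, assuming a thenReducing(identity, mapper, reducer) form:
    // count of adults vs. minors
    Map<Boolean, Integer> counts =
        people.stream()
              .collect(partitioningBy(p -> p.getAge() >= 18)
                           .thenReducing(0, p -> 1, Integer::sum));

    // Group into an explicit collection via then(toCollection(ctor)):
    // people by city, with sorted-set rows instead of Lists
    Map<String, TreeSet<Person>> byCity =
        people.stream()
              .collect(groupingBy(Person::getCity)
                           .then(toCollection(TreeSet::new)));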
From brian.goetz at oracle.com Sun Mar 10 16:22:02 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 10 Mar 2013 19:22:02 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5137320B.60001@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> Message-ID: <513D159A.7050008@oracle.com> I've posted a survey for the EG at: https://www.surveymonkey.com/s/NSXMYC2 where people can express their preference between: - Leave things as they are (Optional-bearing methods for findXxx and reduce); - Add, as Doug suggests, non-optional versions of these too. Implementation / spec complexity is a non-issue here -- the implementations are trivial. The sole issue is whether the API is better with one version or with both. The password has been communicated directly to the EG; contact me if you didn't get it. Usual survey rules: enter your name with your response, all results will be made public after the survey closes. I'll set a closing time of 6PM PT Wednesday of this week. On 3/6/2013 7:09 AM, Doug Lea wrote: > (Restricting to lambda-libs list...) > > On 03/06/13 04:47, Remi Forax wrote: >> Ok, let be nuclear on this, >> There is no good reason to introduce Optional in java.util. > > We agree about most of the rationale for not using Optional. > But there are still people who say they want it. > I don't think it is productive at this point to > argue about features supporting an Optional-laden > programming style. But we never seem to hit closure > about features supporting an Optional-free style. > So I'd like to re-propose a simple compromise. > In the same way that there are Optional and > basis-returning versions of reduce: > > T reduce(T identity, BinaryOperator reducer); > Optional reduce(BinaryOperator reducer); > > (Where the basis-returning one can in turn be used to > avoid Optional-returning min(), etc). We should do the > same at least for find, or more in keeping with current > API, findFirst and findAny: > > T findFirst(Predicate predicate, T ifNone); > T findAny(Predicate predicate, T ifNone); > > People wanting to avoid Optional can then then > get all of the derived versions (allMatch, plain > findAny, etc) easily enough. > > Surprisingly enough, that's the only missing > feature that would otherwise enable a completely > Optional-free usage style of the Stream API. > > We have both proposed variants of this several times, > but they don't seem to go anywhere. It would be nice > to have a calm final discussion about why we would NOT > do such an apparently sensible thing! > > -Doug > From brian.goetz at oracle.com Sun Mar 10 16:51:09 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 10 Mar 2013 19:51:09 -0400 Subject: Survey on map/flatMap disambiguation Message-ID: <513D1C6D.40906@oracle.com> I've posted a survey for the EG at: https://www.surveymonkey.com/s/NT5DW7G where people can express their opinion on the issue of flatMap disambiguation (see thread entitled "flatMap ambiguity"). The password has been communicated directly to the EG; contact me if you didn't get it. Usual survey rules: enter your name with your response, all results will be made public after the survey closes. I'll set a closing time of 6PM PT Wednesday of this week. 
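To make the choice in the Optional survey above concrete, the two styles Doug contrasts would look roughly like this, given some List<String> strings. This is a sketch only: it assumes the current filter/findFirst/orElse shape of the API, and the predicate-taking findFirst overload is the proposed form quoted above (generic parameters guessed), not something present in current builds.

    // Optional-bearing style (current API):
    String firstWithOptional = strings.stream()
                                      .filter(s -> s.startsWith("a"))
                                      .findFirst()
                                      .orElse("none");

    // Optional-free style under the proposed T findFirst(Predicate<T> predicate, T ifNone):
    String firstOrDefault = strings.stream().findFirst(s -> s.startsWith("a"), "none");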
From brian.goetz at oracle.com Sun Mar 10 17:07:07 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 10 Mar 2013 20:07:07 -0400 Subject: Arrays methods Message-ID: <513D202B.7070702@oracle.com> Here's a summary of stream-related methods that are currently in java.util.Arrays. All have one-line implementations. Object/int/long/double versions of: stream(T[] array) stream(T[] array, int start, int end) parallelStream(T[]) parallelStream(T[] array, int start, int end) spliterator(T[] array) spliterator(T[] array, int start, int end) Object/all primitive versions of: indices(array) The first group seems basically required but the last group seems like we can get rid of it. (It expands to Streams.intRange(0, array.length)). What we're missing are methods for in-place parallel mutation of arrays, such as: Arrays.fill(T[] array, int -> T generator) Arrays.fillParallel(T[] array, int -> T generator) One can easily simulate these with intRange(0, length).forEach(i -> { array[i] = generator.apply(i); }) but (a) this is harder to read than the above fill forms and (b) it is less obvious how to discover this idiom. The indices was an attempt to make that easier but is not really any better: Arrays.indices(array).forEach(i -> { array[i] = generator.apply(i); }) So I think we should ditch the indices() methods but consider adding array fill methods. There'd be at least 9 x 2 (array types x {seq,par}) and possibly x 2 more (whole array, subarray). Though they're all trivial. From sam at sampullara.com Sun Mar 10 17:15:34 2013 From: sam at sampullara.com (Sam Pullara) Date: Sun, 10 Mar 2013 17:15:34 -0700 Subject: Arrays methods In-Reply-To: <513D202B.7070702@oracle.com> References: <513D202B.7070702@oracle.com> Message-ID: The fill methods look good to me. Sam On Sun, Mar 10, 2013 at 5:07 PM, Brian Goetz wrote: > Here's a summary of stream-related methods that are currently in > java.util.Arrays. All have one-line implementations. > > Object/int/long/double versions of: > stream(T[] array) > stream(T[] array, int start, int end) > parallelStream(T[]) > parallelStream(T[] array, int start, int end) > spliterator(T[] array) > spliterator(T[] array, int start, int end) > > Object/all primitive versions of: > indices(array) > > The first group seems basically required but the last group seems like we > can get rid of it. (It expands to Streams.intRange(0, array.length)). > > What we're missing are methods for in-place parallel mutation of arrays, > such as: > > Arrays.fill(T[] array, int -> T generator) > Arrays.fillParallel(T[] array, int -> T generator) > > One can easily simulate these with > > intRange(0, length).forEach(i -> { array[i] = generator.apply(i); }) > > but (a) this is harder to read than the above fill forms and (b) it is less > obvious how to discover this idiom. The indices was an attempt to make that > easier but is not really any better: > > Arrays.indices(array).forEach(i -> { array[i] = generator.apply(i); }) > > So I think we should ditch the indices() methods but consider adding array > fill methods. There'd be at least 9 x 2 (array types x {seq,par}) and > possibly x 2 more (whole array, subarray). Though they're all trivial. > From joe.bowbeer at gmail.com Sun Mar 10 17:25:31 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Sun, 10 Mar 2013 17:25:31 -0700 Subject: Arrays methods In-Reply-To: <513D202B.7070702@oracle.com> References: <513D202B.7070702@oracle.com> Message-ID: indices(array) could be useful for sparse arrays, if that is in the cards. 
I think that Streams.intRange(0, array.length) is a large enough expression that indices has some utility. And I think the relationship between indices() and fill() is overstated. OTOH, I have not yet used indices, so I cannot defend it very strongly. On Sun, Mar 10, 2013 at 5:07 PM, Brian Goetz wrote: > Here's a summary of stream-related methods that are currently in > java.util.Arrays. All have one-line implementations. > > Object/int/long/double versions of: > stream(T[] array) > stream(T[] array, int start, int end) > parallelStream(T[]) > parallelStream(T[] array, int start, int end) > spliterator(T[] array) > spliterator(T[] array, int start, int end) > > Object/all primitive versions of: > indices(array) > > The first group seems basically required but the last group seems like we > can get rid of it. (It expands to Streams.intRange(0, array.length)). > > What we're missing are methods for in-place parallel mutation of arrays, > such as: > > Arrays.fill(T[] array, int -> T generator) > Arrays.fillParallel(T[] array, int -> T generator) > > One can easily simulate these with > > intRange(0, length).forEach(i -> { array[i] = generator.apply(i); }) > > but (a) this is harder to read than the above fill forms and (b) it is > less obvious how to discover this idiom. The indices was an attempt to > make that easier but is not really any better: > > Arrays.indices(array).forEach(**i -> { array[i] = generator.apply(i); }) > > So I think we should ditch the indices() methods but consider adding array > fill methods. There'd be at least 9 x 2 (array types x {seq,par}) and > possibly x 2 more (whole array, subarray). Though they're all trivial. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130310/efa2be14/attachment.html From paul.sandoz at oracle.com Mon Mar 11 01:58:34 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 11 Mar 2013 09:58:34 +0100 Subject: Arrays methods In-Reply-To: <513D202B.7070702@oracle.com> References: <513D202B.7070702@oracle.com> Message-ID: <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> On Mar 11, 2013, at 1:07 AM, Brian Goetz wrote: > Here's a summary of stream-related methods that are currently in java.util.Arrays. All have one-line implementations. > > Object/int/long/double versions of: > stream(T[] array) > stream(T[] array, int start, int end) > parallelStream(T[]) > parallelStream(T[] array, int start, int end) > spliterator(T[] array) > spliterator(T[] array, int start, int end) > > Object/all primitive versions of: > indices(array) > > The first group seems basically required but the last group seems like we can get rid of it. (It expands to Streams.intRange(0, array.length)). > > What we're missing are methods for in-place parallel mutation of arrays, such as: > > Arrays.fill(T[] array, int -> T generator) > Arrays.fillParallel(T[] array, int -> T generator) > > One can easily simulate these with > > intRange(0, length).forEach(i -> { array[i] = generator.apply(i); }) > > but (a) this is harder to read than the above fill forms and (b) it is less obvious how to discover this idiom. The indices was an attempt to make that easier but is not really any better: > > Arrays.indices(array).forEach(i -> { array[i] = generator.apply(i); }) > > So I think we should ditch the indices() methods but consider adding array fill methods. Yes, indices while capturing a useful idiom seems to go only half way i.e. 
the presumption being if one generated indices of the array one would likely want to update the elements of the array at those indices. FWIW we could add range methods that take one parameter: range(array.length) I think that would be better than indices. Paul. > There'd be at least 9 x 2 (array types x {seq,par}) and possibly x 2 more (whole array, subarray). Though they're all trivial. > From paul.sandoz at oracle.com Mon Mar 11 02:04:58 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 11 Mar 2013 10:04:58 +0100 Subject: Spliterator.IMMUTABLE In-Reply-To: <513C9574.3090109@cs.oswego.edu> References: <513C8A1B.9090301@univ-mlv.fr> <513C8C1C.20003@cs.oswego.edu> <513C8C0D.7050708@univ-mlv.fr> <513C9574.3090109@cs.oswego.edu> Message-ID: <241311A5-3B60-4D59-A379-70A75D315F49@oracle.com> On Mar 10, 2013, at 3:15 PM, Doug Lea
wrote: > IMMUTABLE: The structure and elements cannot change during traversal. > > [Other] Any change in structure and/or elements represents a usage error, > in which case the Spliterator is expected to have a documented course > of action upon any detected change (normally a best-effort ConcurrentModificationException check). > > Now, whether an IMMUTABLE spliterator arises because the source is > immutable (for example a raw input stream) or because it > is a snapshot (for example CopyOnWriteArrayList) doesn't matter. > An example of the former are the Spliterators for ranges (which also happen to report DISTINCT, and SORTED when step > 0). Paul. From brian.goetz at oracle.com Mon Mar 11 11:46:20 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 11 Mar 2013 14:46:20 -0400 Subject: Arrays methods In-Reply-To: <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> Message-ID: <513E267C.4090709@oracle.com> > FWIW we could add range methods that takes one parameter: > > range(array.length) > > I think that would be better than indices. And only one is needed there rather than 9. Current plan, unless someone objects: - remove Arrays.indices() methods - add Arrays.fill() methods - add Streams.intRange(n) method as Paul suggests From joe.bowbeer at gmail.com Mon Mar 11 12:04:31 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 11 Mar 2013 12:04:31 -0700 Subject: Arrays methods In-Reply-To: <513E267C.4090709@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> Message-ID: I wonder whether a range of one parameter is worth more than it confuses. For one thing, '0, ' is trivial. For another, it is explicit, which avoids the lingering question of whether the range starts at 0 or 1. Finally, I think there is already potential confusion due to the optional 'step' parameter. On Mar 11, 2013 11:46 AM, "Brian Goetz" wrote: > FWIW we could add range methods that takes one parameter: >> >> range(array.length) >> >> I think that would be better than indices. >> > > And only one is needed there rather than 9. > > Current plan, unless someone objects: > - remove Arrays.indices() methods > - add Arrays.fill() methods > - add Streams.intRange(n) method as Paul suggests > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130311/d8d95d02/attachment.html From brian.goetz at oracle.com Mon Mar 11 12:34:08 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 11 Mar 2013 15:34:08 -0400 Subject: Arrays methods In-Reply-To: <513E267C.4090709@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> Message-ID: <513E31B0.9000504@oracle.com> > - add Arrays.fill() methods We have an unfortunate collection. We already have: void fill(Object[] a, Object val) { If we added void fill(T[], IntFunction gen) then existing calls to fill(array, null) would become ambiguous. Doh. (But the other 17 forms are not problematic.) Any suggestions for alternate names? 
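To spell the clash out with a tiny example (a sketch, not from the thread; generic parameters on the proposed overload are guessed, and the exact diagnostic depends on the compiler's overload-resolution rules):

    // Existing:  void fill(Object[] a, Object val)
    // Proposed:  <T> void fill(T[] array, IntFunction<T> generator)
    String[] names = new String[10];
    Arrays.fill(names, null);
    // Before the addition this call can only mean "store null in every slot".
    // Afterwards the null literal is applicable to both the Object parameter and
    // the IntFunction parameter, so existing call sites like this one are exactly
    // the problem case described above.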
From brian.goetz at oracle.com Mon Mar 11 14:54:22 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 11 Mar 2013 17:54:22 -0400 Subject: Code review request In-Reply-To: <512672E6.1050708@oracle.com> References: <512672E6.1050708@oracle.com> Message-ID: <513E528E.3040400@oracle.com> I've posted an updated webrev incorporating comments received to date: http://cr.openjdk.java.net/~briangoetz/jdk-8008670.2/webrev/ On 2/21/2013 2:17 PM, Brian Goetz wrote: > At > http://cr.openjdk.java.net/~briangoetz/jdk-8008670/webrev/ > > I've posted a webrev for about half the classes in java.util.stream. > None of these are public classes, so there are no public API issues > here, but plenty of internal API issues, naming issues (ooh, a > bikeshed), and code quality issues. > From brian.goetz at oracle.com Wed Mar 13 12:25:27 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2013 15:25:27 -0400 Subject: Arrays methods In-Reply-To: <513E31B0.9000504@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> Message-ID: <5140D2A7.1070508@oracle.com> > If we added > > void fill(T[], IntFunction gen) > > then existing calls to > > fill(array, null) > > would become ambiguous. Doh. (But the other 17 forms are not > problematic.) > > Any suggestions for alternate names? Arrays.generate(array, fn) Arrays.fillApplying(array, fn) Arrays.initialize(array, fn) Arrays.setAll(array, fn) ... From mike.duigou at oracle.com Wed Mar 13 13:28:17 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Wed, 13 Mar 2013 13:28:17 -0700 Subject: Arrays methods In-Reply-To: <5140D2A7.1070508@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> Message-ID: <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> Arrays.indexFill(array, fn) Arrays.indexedFill(array, fn) Arrays.fillIndexed(array, fn) Arrays.indexedSet(array, fn) I think it might be better to stay away from "fill" names because the current fill methods all have the property that every array element is assigned the same value. This new operation allows a different value to be assigned to each element. Mike On Mar 13 2013, at 12:25 , Brian Goetz wrote: >> If we added >> >> void fill(T[], IntFunction gen) >> >> then existing calls to >> >> fill(array, null) >> >> would become ambiguous. Doh. (But the other 17 forms are not >> problematic.) >> >> Any suggestions for alternate names? > > Arrays.generate(array, fn) > Arrays.fillApplying(array, fn) > Arrays.initialize(array, fn) > Arrays.setAll(array, fn) > > ... From brian.goetz at oracle.com Wed Mar 13 13:35:34 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2013 16:35:34 -0400 Subject: Arrays methods In-Reply-To: <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> Message-ID: <5140E316.6070007@oracle.com> indexedFill (and parallelIndexedFill) would work for me. 
On 3/13/2013 4:28 PM, Mike Duigou wrote: > Arrays.indexFill(array, fn) > Arrays.indexedFill(array, fn) > Arrays.fillIndexed(array, fn) > Arrays.indexedSet(array, fn) > > I think it might be better to stay away from "fill" names because the current fill methods all have the property that every array element is assigned the same value. This new operation allows a different value to be assigned to each element. > > Mike > > On Mar 13 2013, at 12:25 , Brian Goetz wrote: > >>> If we added >>> >>> void fill(T[], IntFunction gen) >>> >>> then existing calls to >>> >>> fill(array, null) >>> >>> would become ambiguous. Doh. (But the other 17 forms are not >>> problematic.) >>> >>> Any suggestions for alternate names? >> >> Arrays.generate(array, fn) >> Arrays.fillApplying(array, fn) >> Arrays.initialize(array, fn) >> Arrays.setAll(array, fn) >> >> ... > From joe.bowbeer at gmail.com Wed Mar 13 14:30:35 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 13 Mar 2013 14:30:35 -0700 Subject: Arrays methods In-Reply-To: <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> Message-ID: I agree with the critique of 'fill' names. I like 'set' names. On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou wrote: > Arrays.indexFill(array, fn) > Arrays.indexedFill(array, fn) > Arrays.fillIndexed(array, fn) > Arrays.indexedSet(array, fn) > > I think it might be better to stay away from "fill" names because the > current fill methods all have the property that every array element is > assigned the same value. This new operation allows a different value to be > assigned to each element. > > Mike > > On Mar 13 2013, at 12:25 , Brian Goetz wrote: > > >> If we added > >> > >> void fill(T[], IntFunction gen) > >> > >> then existing calls to > >> > >> fill(array, null) > >> > >> would become ambiguous. Doh. (But the other 17 forms are not > >> problematic.) > >> > >> Any suggestions for alternate names? > > > > Arrays.generate(array, fn) > > Arrays.fillApplying(array, fn) > > Arrays.initialize(array, fn) > > Arrays.setAll(array, fn) > > > > ... > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130313/8c84a893/attachment.html From brian.goetz at oracle.com Wed Mar 13 14:44:54 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2013 17:44:54 -0400 Subject: Arrays methods In-Reply-To: References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> Message-ID: <5140F356.8020000@oracle.com> Fill implies "set all elements"; a set name would probably have to say "setAll": Arrays.setAll(array, fn) Arrays.parallelSetAll(array, fn) OK? On 3/13/2013 5:30 PM, Joe Bowbeer wrote: > I agree with the critique of 'fill' names. > > I like 'set' names. > > > > > On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou > wrote: > > Arrays.indexFill(array, fn) > Arrays.indexedFill(array, fn) > Arrays.fillIndexed(array, fn) > Arrays.indexedSet(array, fn) > > I think it might be better to stay away from "fill" names because > the current fill methods all have the property that every array > element is assigned the same value. 
This new operation allows a > different value to be assigned to each element. > > Mike > > On Mar 13 2013, at 12:25 , Brian Goetz wrote: > > >> If we added > >> > >> void fill(T[], IntFunction gen) > >> > >> then existing calls to > >> > >> fill(array, null) > >> > >> would become ambiguous. Doh. (But the other 17 forms are not > >> problematic.) > >> > >> Any suggestions for alternate names? > > > > Arrays.generate(array, fn) > > Arrays.fillApplying(array, fn) > > Arrays.initialize(array, fn) > > Arrays.setAll(array, fn) > > > > ... > > From mike.duigou at oracle.com Wed Mar 13 15:39:39 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Wed, 13 Mar 2013 15:39:39 -0700 Subject: Arrays methods In-Reply-To: <5140F356.8020000@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> <5140F356.8020000@oracle.com> Message-ID: <1F520CCC-2A59-4C57-A0E3-C18FE5514A03@oracle.com> Yes On 2013-03-13, at 14:44, Brian Goetz wrote: > Fill implies "set all elements"; a set name would probably have to say "setAll": > > Arrays.setAll(array, fn) > Arrays.parallelSetAll(array, fn) > > OK? > > On 3/13/2013 5:30 PM, Joe Bowbeer wrote: >> I agree with the critique of 'fill' names. >> >> I like 'set' names. >> >> >> >> >> On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou > > wrote: >> >> Arrays.indexFill(array, fn) >> Arrays.indexedFill(array, fn) >> Arrays.fillIndexed(array, fn) >> Arrays.indexedSet(array, fn) >> >> I think it might be better to stay away from "fill" names because >> the current fill methods all have the property that every array >> element is assigned the same value. This new operation allows a >> different value to be assigned to each element. >> >> Mike >> >> On Mar 13 2013, at 12:25 , Brian Goetz wrote: >> >> >> If we added >> >> >> >> void fill(T[], IntFunction gen) >> >> >> >> then existing calls to >> >> >> >> fill(array, null) >> >> >> >> would become ambiguous. Doh. (But the other 17 forms are not >> >> problematic.) >> >> >> >> Any suggestions for alternate names? >> > >> > Arrays.generate(array, fn) >> > Arrays.fillApplying(array, fn) >> > Arrays.initialize(array, fn) >> > Arrays.setAll(array, fn) >> > >> > ... >> >> From brian.goetz at oracle.com Thu Mar 14 09:26:06 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 14 Mar 2013 12:26:06 -0400 Subject: Arrays methods In-Reply-To: <1F520CCC-2A59-4C57-A0E3-C18FE5514A03@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> <5140F356.8020000@oracle.com> <1F520CCC-2A59-4C57-A0E3-C18FE5514A03@oracle.com> Message-ID: <5141FA1E.1070303@oracle.com> Next roadblock: no suitable SAMs for int -> long, int -> double. Add? On 3/13/2013 6:39 PM, Mike Duigou wrote: > Yes > > > > On 2013-03-13, at 14:44, Brian Goetz wrote: > >> Fill implies "set all elements"; a set name would probably have to say "setAll": >> >> Arrays.setAll(array, fn) >> Arrays.parallelSetAll(array, fn) >> >> OK? >> >> On 3/13/2013 5:30 PM, Joe Bowbeer wrote: >>> I agree with the critique of 'fill' names. >>> >>> I like 'set' names. 
>>> >>> >>> >>> >>> On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou >> > wrote: >>> >>> Arrays.indexFill(array, fn) >>> Arrays.indexedFill(array, fn) >>> Arrays.fillIndexed(array, fn) >>> Arrays.indexedSet(array, fn) >>> >>> I think it might be better to stay away from "fill" names because >>> the current fill methods all have the property that every array >>> element is assigned the same value. This new operation allows a >>> different value to be assigned to each element. >>> >>> Mike >>> >>> On Mar 13 2013, at 12:25 , Brian Goetz wrote: >>> >>> >> If we added >>> >> >>> >> void fill(T[], IntFunction gen) >>> >> >>> >> then existing calls to >>> >> >>> >> fill(array, null) >>> >> >>> >> would become ambiguous. Doh. (But the other 17 forms are not >>> >> problematic.) >>> >> >>> >> Any suggestions for alternate names? >>> > >>> > Arrays.generate(array, fn) >>> > Arrays.fillApplying(array, fn) >>> > Arrays.initialize(array, fn) >>> > Arrays.setAll(array, fn) >>> > >>> > ... >>> >>> From brian.goetz at oracle.com Thu Mar 14 10:04:47 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 14 Mar 2013 13:04:47 -0400 Subject: Survey on map/flatMap disambiguation In-Reply-To: <513D1C6D.40906@oracle.com> References: <513D1C6D.40906@oracle.com> Message-ID: <5142032F.9070601@oracle.com> The survey is closed, results are here: https://www.surveymonkey.com/sr.aspx?sm=9UyN8RRvMX8BnpTdd4rYgDlXU9uUVALNDjNn_2fY2e9_2fo_3d The sense of the EG was strongly in favor of disambiguating both map and flatMap; several argued that they liked the "less magic" aspect of it, and the explicitness of where we go from reference to primitive streams and back. This does create a possibility for performance bugs, where users do: stuff.map(Foo::size).reduce(0, Integer::sum) instead of stuff.mapToInt(Foo::size).reduce(0, Integer::sum) Both will now compile, but the former will be boxed and the latter won't be. The previous status quo saved users from themselves in this case. Will make the following changes: Stream.map -> {map,mapTo{Int,Long,Double}} Stream.flatMap -> {flatMap,flatMapTo{Int,Long,Double}} {Int,Long,Double}Stream.map -> map,mapToObj On 3/10/2013 7:51 PM, Brian Goetz wrote: > I've posted a survey for the EG at: > > https://www.surveymonkey.com/s/NT5DW7G > > where people can express their opinion on the issue of flatMap > disambiguation (see thread entitled "flatMap ambiguity"). > > The password has been communicated directly to the EG; contact me if you > didn't get it. > > Usual survey rules: enter your name with your response, all results will > be made public after the survey closes. I'll set a closing time of 6PM > PT Wednesday of this week. From brian.goetz at oracle.com Thu Mar 14 10:06:54 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 14 Mar 2013 13:06:54 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <513D159A.7050008@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> Message-ID: <514203AE.3070501@oracle.com> I've closed the survey, results are at: https://www.surveymonkey.com/sr.aspx?sm=c2NqWp6wXUxCUlr6SY05nYEyYIr7ShzH3IgL4OXPIYM_3d Here, we did not reach a clear consensus. However, I think some people may have misunderstood the question. I'll let Doug, as proponent of this approach, take another swing at what is being proposed here, and why this might achieve best-of-both-worlds. 
On 3/10/2013 7:22 PM, Brian Goetz wrote: > I've posted a survey for the EG at: > > https://www.surveymonkey.com/s/NSXMYC2 > > where people can express their preference between: > - Leave things as they are (Optional-bearing methods for findXxx and > reduce); > - Add, as Doug suggests, non-optional versions of these too. > > Implementation / spec complexity is a non-issue here -- the > implementations are trivial. The sole issue is whether the API is > better with one version or with both. > > The password has been communicated directly to the EG; contact me if you > didn't get it. > > Usual survey rules: enter your name with your response, all results will > be made public after the survey closes. I'll set a closing time of 6PM > PT Wednesday of this week. > > > On 3/6/2013 7:09 AM, Doug Lea wrote: >> (Restricting to lambda-libs list...) >> >> On 03/06/13 04:47, Remi Forax wrote: >>> Ok, let be nuclear on this, >>> There is no good reason to introduce Optional in java.util. >> >> We agree about most of the rationale for not using Optional. >> But there are still people who say they want it. >> I don't think it is productive at this point to >> argue about features supporting an Optional-laden >> programming style. But we never seem to hit closure >> about features supporting an Optional-free style. >> So I'd like to re-propose a simple compromise. >> In the same way that there are Optional and >> basis-returning versions of reduce: >> >> T reduce(T identity, BinaryOperator reducer); >> Optional reduce(BinaryOperator reducer); >> >> (Where the basis-returning one can in turn be used to >> avoid Optional-returning min(), etc). We should do the >> same at least for find, or more in keeping with current >> API, findFirst and findAny: >> >> T findFirst(Predicate predicate, T ifNone); >> T findAny(Predicate predicate, T ifNone); >> >> People wanting to avoid Optional can then then >> get all of the derived versions (allMatch, plain >> findAny, etc) easily enough. >> >> Surprisingly enough, that's the only missing >> feature that would otherwise enable a completely >> Optional-free usage style of the Stream API. >> >> We have both proposed variants of this several times, >> but they don't seem to go anywhere. It would be nice >> to have a calm final discussion about why we would NOT >> do such an apparently sensible thing! >> >> -Doug >> From kevinb at google.com Thu Mar 14 10:26:08 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 14 Mar 2013 10:26:08 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: <5142032F.9070601@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> Message-ID: Ouch. That is pretty gnarly. Did we realize that before we voted? On Thu, Mar 14, 2013 at 10:04 AM, Brian Goetz wrote: > The survey is closed, results are here: > > https://www.surveymonkey.com/**sr.aspx?sm=**9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA > **LNDjNn_2fY2e9_2fo_3d > > The sense of the EG was strongly in favor of disambiguating both map and > flatMap; several argued that they liked the "less magic" aspect of it, and > the explicitness of where we go from reference to primitive streams and > back. > > This does create a possibility for performance bugs, where users do: > > stuff.map(Foo::size).reduce(0, Integer::sum) > > instead of > > stuff.mapToInt(Foo::size).**reduce(0, Integer::sum) > > Both will now compile, but the former will be boxed and the latter won't > be. The previous status quo saved users from themselves in this case. 
> > Will make the following changes: > > Stream.map -> {map,mapTo{Int,Long,Double}} > Stream.flatMap -> {flatMap,flatMapTo{Int,Long,**Double}} > {Int,Long,Double}Stream.map -> map,mapToObj > > > > On 3/10/2013 7:51 PM, Brian Goetz wrote: > >> I've posted a survey for the EG at: >> >> https://www.surveymonkey.com/**s/NT5DW7G >> >> where people can express their opinion on the issue of flatMap >> disambiguation (see thread entitled "flatMap ambiguity"). >> >> The password has been communicated directly to the EG; contact me if you >> didn't get it. >> >> Usual survey rules: enter your name with your response, all results will >> be made public after the survey closes. I'll set a closing time of 6PM >> PT Wednesday of this week. >> > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130314/f1395351/attachment.html From joe.bowbeer at gmail.com Thu Mar 14 12:03:35 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 14 Mar 2013 12:03:35 -0700 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <514203AE.3070501@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> Message-ID: Three comments, by the block who opposed the additional mechanism, expressed their view very clearly. This view is that there should only be one good way to do this, otherwise we have failed. Doug's view, when he expressed it before, was that we should add another way in case we are wrong (and to satisfy those who prefer the other way). If there is a "best of both worlds" argument, I have yet to hear it. On Thu, Mar 14, 2013 at 10:06 AM, Brian Goetz wrote: > I've closed the survey, results are at: > > > https://www.surveymonkey.com/**sr.aspx?sm=**c2NqWp6wXUxCUlr6SY05nYEyYIr7Sh > **zH3IgL4OXPIYM_3d > > Here, we did not reach a clear consensus. However, I think some people > may have misunderstood the question. I'll let Doug, as proponent of this > approach, take another swing at what is being proposed here, and why this > might achieve best-of-both-worlds. > > > On 3/10/2013 7:22 PM, Brian Goetz wrote: > >> I've posted a survey for the EG at: >> >> https://www.surveymonkey.com/**s/NSXMYC2 >> >> where people can express their preference between: >> - Leave things as they are (Optional-bearing methods for findXxx and >> reduce); >> - Add, as Doug suggests, non-optional versions of these too. >> >> Implementation / spec complexity is a non-issue here -- the >> implementations are trivial. The sole issue is whether the API is >> better with one version or with both. >> >> The password has been communicated directly to the EG; contact me if you >> didn't get it. >> >> Usual survey rules: enter your name with your response, all results will >> be made public after the survey closes. I'll set a closing time of 6PM >> PT Wednesday of this week. >> >> >> On 3/6/2013 7:09 AM, Doug Lea wrote: >> >>> (Restricting to lambda-libs list...) >>> >>> On 03/06/13 04:47, Remi Forax wrote: >>> >>>> Ok, let be nuclear on this, >>>> There is no good reason to introduce Optional in java.util. >>>> >>> >>> We agree about most of the rationale for not using Optional. >>> But there are still people who say they want it. 
>>> I don't think it is productive at this point to >>> argue about features supporting an Optional-laden >>> programming style. But we never seem to hit closure >>> about features supporting an Optional-free style. >>> So I'd like to re-propose a simple compromise. >>> In the same way that there are Optional and >>> basis-returning versions of reduce: >>> >>> T reduce(T identity, BinaryOperator reducer); >>> Optional reduce(BinaryOperator reducer); >>> >>> (Where the basis-returning one can in turn be used to >>> avoid Optional-returning min(), etc). We should do the >>> same at least for find, or more in keeping with current >>> API, findFirst and findAny: >>> >>> T findFirst(Predicate predicate, T ifNone); >>> T findAny(Predicate predicate, T ifNone); >>> >>> People wanting to avoid Optional can then then >>> get all of the derived versions (allMatch, plain >>> findAny, etc) easily enough. >>> >>> Surprisingly enough, that's the only missing >>> feature that would otherwise enable a completely >>> Optional-free usage style of the Stream API. >>> >>> We have both proposed variants of this several times, >>> but they don't seem to go anywhere. It would be nice >>> to have a calm final discussion about why we would NOT >>> do such an apparently sensible thing! >>> >>> -Doug >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130314/201158fe/attachment.html From joe.bowbeer at gmail.com Thu Mar 14 12:11:37 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 14 Mar 2013 12:11:37 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> Message-ID: The survey (and proposal) did not mention the affect on Int/Long/Double streams either. mapToInt, mapToObj, map... Is this verbose strangeness introduced in the map namespace worth having consistency with flatMap? I'd rather sacrifice flatMap than change map. I think the results of this survey should be tabled until a complete proposal is presented. On Thu, Mar 14, 2013 at 10:26 AM, Kevin Bourrillion wrote: > Ouch. That is pretty gnarly. Did we realize that before we voted? > > > > On Thu, Mar 14, 2013 at 10:04 AM, Brian Goetz wrote: > >> The survey is closed, results are here: >> >> https://www.surveymonkey.com/**sr.aspx?sm=** >> 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**LNDjNn_2fY2e9_2fo_3d >> >> The sense of the EG was strongly in favor of disambiguating both map and >> flatMap; several argued that they liked the "less magic" aspect of it, and >> the explicitness of where we go from reference to primitive streams and >> back. >> >> This does create a possibility for performance bugs, where users do: >> >> stuff.map(Foo::size).reduce(0, Integer::sum) >> >> instead of >> >> stuff.mapToInt(Foo::size).**reduce(0, Integer::sum) >> >> Both will now compile, but the former will be boxed and the latter won't >> be. The previous status quo saved users from themselves in this case. >> >> Will make the following changes: >> >> Stream.map -> {map,mapTo{Int,Long,Double}} >> Stream.flatMap -> {flatMap,flatMapTo{Int,Long,**Double}} >> {Int,Long,Double}Stream.map -> map,mapToObj >> >> >> >> On 3/10/2013 7:51 PM, Brian Goetz wrote: >> >>> I've posted a survey for the EG at: >>> >>> https://www.surveymonkey.com/**s/NT5DW7G >>> >>> where people can express their opinion on the issue of flatMap >>> disambiguation (see thread entitled "flatMap ambiguity"). 
>>> >>> The password has been communicated directly to the EG; contact me if you >>> didn't get it. >>> >>> Usual survey rules: enter your name with your response, all results will >>> be made public after the survey closes. I'll set a closing time of 6PM >>> PT Wednesday of this week. >>> >> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130314/f87306aa/attachment.html From brian.goetz at oracle.com Thu Mar 14 12:21:56 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 14 Mar 2013 15:21:56 -0400 Subject: Survey on map/flatMap disambiguation In-Reply-To: References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> Message-ID: <51422354.5090108@oracle.com> Right. For the primitive streams, I applied the "add a suffix if you are changing types", so IntStream.map() is for int->int, and mapToObj() is for int->Object. What compelled me about the results of this survey is that multiple people explicitly called out that they liked the mapToXxx not for consistency, but for the explicitness of it. Because the types are lost to method chaining, .mapToInt() carries a lot of information about the fact that there's a type switcheroo going on. Whereas overloaded map() is magic. On 3/14/2013 3:11 PM, Joe Bowbeer wrote: > The survey (and proposal) did not mention the affect on Int/Long/Double > streams either. mapToInt, mapToObj, map... > > Is this verbose strangeness introduced in the map namespace worth having > consistency with flatMap? I'd rather sacrifice flatMap than change map. > > I think the results of this survey should be tabled until a complete > proposal is presented. > > > > On Thu, Mar 14, 2013 at 10:26 AM, Kevin Bourrillion > wrote: > > Ouch. That is pretty gnarly. Did we realize that before we voted? > > > > On Thu, Mar 14, 2013 at 10:04 AM, Brian Goetz > > wrote: > > The survey is closed, results are here: > > https://www.surveymonkey.com/__sr.aspx?sm=__9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA__LNDjNn_2fY2e9_2fo_3d > > > The sense of the EG was strongly in favor of disambiguating both > map and flatMap; several argued that they liked the "less magic" > aspect of it, and the explicitness of where we go from reference > to primitive streams and back. > > This does create a possibility for performance bugs, where users do: > > stuff.map(Foo::size).reduce(0, Integer::sum) > > instead of > > stuff.mapToInt(Foo::size).__reduce(0, Integer::sum) > > Both will now compile, but the former will be boxed and the > latter won't be. The previous status quo saved users from > themselves in this case. > > Will make the following changes: > > Stream.map -> {map,mapTo{Int,Long,Double}} > Stream.flatMap -> {flatMap,flatMapTo{Int,Long,__Double}} > {Int,Long,Double}Stream.map -> map,mapToObj > > > > On 3/10/2013 7:51 PM, Brian Goetz wrote: > > I've posted a survey for the EG at: > > https://www.surveymonkey.com/__s/NT5DW7G > > > where people can express their opinion on the issue of flatMap > disambiguation (see thread entitled "flatMap ambiguity"). > > The password has been communicated directly to the EG; > contact me if you > didn't get it. > > Usual survey rules: enter your name with your response, all > results will > be made public after the survey closes. I'll set a closing > time of 6PM > PT Wednesday of this week. > > > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. 
|kevinb at google.com > > > From joe.bowbeer at gmail.com Thu Mar 14 12:31:06 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 14 Mar 2013 12:31:06 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: <51422354.5090108@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51422354.5090108@oracle.com> Message-ID: Does the use of mapToObj() allow a boxed() to be removed? That could be a feature. On Thu, Mar 14, 2013 at 12:21 PM, Brian Goetz wrote: > Right. For the primitive streams, I applied the "add a suffix if you are > changing types", so IntStream.map() is for int->int, and mapToObj() is for > int->Object. > > What compelled me about the results of this survey is that multiple people > explicitly called out that they liked the mapToXxx not for consistency, but > for the explicitness of it. Because the types are lost to method chaining, > .mapToInt() carries a lot of information about the fact that there's a type > switcheroo going on. Whereas overloaded map() is magic. > > > On 3/14/2013 3:11 PM, Joe Bowbeer wrote: > >> The survey (and proposal) did not mention the affect on Int/Long/Double >> streams either. mapToInt, mapToObj, map... >> >> Is this verbose strangeness introduced in the map namespace worth having >> consistency with flatMap? I'd rather sacrifice flatMap than change map. >> >> I think the results of this survey should be tabled until a complete >> proposal is presented. >> >> >> >> On Thu, Mar 14, 2013 at 10:26 AM, Kevin Bourrillion > > wrote: >> >> Ouch. That is pretty gnarly. Did we realize that before we voted? >> >> >> >> On Thu, Mar 14, 2013 at 10:04 AM, Brian Goetz >> > wrote: >> >> The survey is closed, results are here: >> >> https://www.surveymonkey.com/_**_sr.aspx?sm=__** >> 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**__LNDjNn_2fY2e9_2fo_3d >> >> > 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**LNDjNn_2fY2e9_2fo_3d >> > >> >> The sense of the EG was strongly in favor of disambiguating both >> map and flatMap; several argued that they liked the "less magic" >> aspect of it, and the explicitness of where we go from reference >> to primitive streams and back. >> >> This does create a possibility for performance bugs, where users >> do: >> >> stuff.map(Foo::size).reduce(0, Integer::sum) >> >> instead of >> >> stuff.mapToInt(Foo::size).__**reduce(0, Integer::sum) >> >> >> Both will now compile, but the former will be boxed and the >> latter won't be. The previous status quo saved users from >> themselves in this case. >> >> Will make the following changes: >> >> Stream.map -> {map,mapTo{Int,Long,Double}} >> Stream.flatMap -> {flatMap,flatMapTo{Int,Long,__**Double}} >> >> {Int,Long,Double}Stream.map -> map,mapToObj >> >> >> >> On 3/10/2013 7:51 PM, Brian Goetz wrote: >> >> I've posted a survey for the EG at: >> >> https://www.surveymonkey.com/_**_s/NT5DW7G >> >> >> > >> >> where people can express their opinion on the issue of flatMap >> disambiguation (see thread entitled "flatMap ambiguity"). >> >> The password has been communicated directly to the EG; >> contact me if you >> didn't get it. >> >> Usual survey rules: enter your name with your response, all >> results will >> be made public after the survey closes. I'll set a closing >> time of 6PM >> PT Wednesday of this week. >> >> >> >> >> -- >> Kevin Bourrillion | Java Librarian | Google, Inc. |kevinb at google.com >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130314/b6276118/attachment-0001.html From joe.bowbeer at gmail.com Thu Mar 14 12:56:16 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 14 Mar 2013 12:56:16 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51422354.5090108@oracle.com> Message-ID: Actually, what I'd need is a reduceFromObj, not mapToObj. Either way, I think this change needs to be seen in the context of sample code. The survey describes the benefits from one point of view, but leaves out some important details, and omits a gnarly issue. As a result, I think you're reading too much into these comments. (Are they really independent verification? Or merely a rephrasing of the proposed benefits?) >From my point of view, it's important that the terms 'map' and 'reduce' remain constant. This gives the code a familiar functional look. Unfortunately, it is a problem trying to read code that changes between int and Obj streams, but I'm not convinced that these extra characters in the method names are going to help much. I think an IDE will be indispensable either way. Joe On Thu, Mar 14, 2013 at 12:31 PM, Joe Bowbeer wrote: > Does the use of mapToObj() allow a boxed() to be removed? That could be a > feature. > > > > On Thu, Mar 14, 2013 at 12:21 PM, Brian Goetz wrote: > >> Right. For the primitive streams, I applied the "add a suffix if you are >> changing types", so IntStream.map() is for int->int, and mapToObj() is for >> int->Object. >> >> What compelled me about the results of this survey is that multiple >> people explicitly called out that they liked the mapToXxx not for >> consistency, but for the explicitness of it. Because the types are lost to >> method chaining, .mapToInt() carries a lot of information about the fact >> that there's a type switcheroo going on. Whereas overloaded map() is magic. >> >> >> On 3/14/2013 3:11 PM, Joe Bowbeer wrote: >> >>> The survey (and proposal) did not mention the affect on Int/Long/Double >>> streams either. mapToInt, mapToObj, map... >>> >>> Is this verbose strangeness introduced in the map namespace worth having >>> consistency with flatMap? I'd rather sacrifice flatMap than change map. >>> >>> I think the results of this survey should be tabled until a complete >>> proposal is presented. >>> >>> >>> >>> On Thu, Mar 14, 2013 at 10:26 AM, Kevin Bourrillion >> > wrote: >>> >>> Ouch. That is pretty gnarly. Did we realize that before we voted? >>> >>> >>> >>> On Thu, Mar 14, 2013 at 10:04 AM, Brian Goetz >>> > wrote: >>> >>> The survey is closed, results are here: >>> >>> https://www.surveymonkey.com/_**_sr.aspx?sm=__** >>> 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**__LNDjNn_2fY2e9_2fo_3d >>> >>> >> 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**LNDjNn_2fY2e9_2fo_3d >>> > >>> >>> The sense of the EG was strongly in favor of disambiguating both >>> map and flatMap; several argued that they liked the "less magic" >>> aspect of it, and the explicitness of where we go from reference >>> to primitive streams and back. >>> >>> This does create a possibility for performance bugs, where users >>> do: >>> >>> stuff.map(Foo::size).reduce(0, Integer::sum) >>> >>> instead of >>> >>> stuff.mapToInt(Foo::size).__**reduce(0, Integer::sum) >>> >>> >>> Both will now compile, but the former will be boxed and the >>> latter won't be. The previous status quo saved users from >>> themselves in this case. 
>>> >>> Will make the following changes: >>> >>> Stream.map -> {map,mapTo{Int,Long,Double}} >>> Stream.flatMap -> {flatMap,flatMapTo{Int,Long,__**Double}} >>> >>> {Int,Long,Double}Stream.map -> map,mapToObj >>> >>> >>> >>> On 3/10/2013 7:51 PM, Brian Goetz wrote: >>> >>> I've posted a survey for the EG at: >>> >>> https://www.surveymonkey.com/_**_s/NT5DW7G >>> >>> >>> > >>> >>> where people can express their opinion on the issue of >>> flatMap >>> disambiguation (see thread entitled "flatMap ambiguity"). >>> >>> The password has been communicated directly to the EG; >>> contact me if you >>> didn't get it. >>> >>> Usual survey rules: enter your name with your response, all >>> results will >>> be made public after the survey closes. I'll set a closing >>> time of 6PM >>> PT Wednesday of this week. >>> >>> >>> >>> >>> -- >>> Kevin Bourrillion | Java Librarian | Google, Inc. |kevinb at google.com >>> >>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130314/183a0bf5/attachment.html From brian.goetz at oracle.com Thu Mar 14 14:54:55 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 14 Mar 2013 17:54:55 -0400 Subject: Arrays methods In-Reply-To: <5141FA1E.1070303@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> <5140F356.8020000@oracle.com> <1F520CCC-2A59-4C57-A0E3-C18FE5514A03@oracle.com> <5141FA1E.1070303@oracle.com> Message-ID: <5142472F.6090507@oracle.com> Here's what these would look like: New SAMS: {Int,Long,Double}To{Int,Long,Double}Function New Arrays methods: /** * Initialize all elements of the specified array, using the provided generator function to compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position * @param Type of elements of the array */ public void setAll(T[] array, IntFunction generator) { Streams.intRange(0, array.length).forEach(i -> { array[i] = generator.apply(i); }); } /** * Initialize all elements of the specified array, in parallel, using the provided generator function to * compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position * @param Type of elements of the array */ public void parallelSetAll(T[] array, IntFunction generator) { Streams.intRange(0, array.length).parallel().forEach(i -> { array[i] = generator.apply(i); }); } /** * Initialize all elements of the specified array, using the provided generator function to compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void setAll(int[] array, IntUnaryOperator generator) { Streams.intRange(0, array.length).forEach(i -> { array[i] = generator.applyAsInt(i); }); } /** * Initialize all elements of the specified array, in parallel, using the provided generator function to * compute each element. 
* @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void parallelSetAll(int[] array, IntUnaryOperator generator) { Streams.intRange(0, array.length).parallel().forEach(i -> { array[i] = generator.applyAsInt(i); }); } /** * Initialize all elements of the specified array, using the provided generator function to compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void setAll(long[] array, IntToLongFunction generator) { Streams.intRange(0, array.length).forEach(i -> { array[i] = generator.applyAsLong(i); }); } /** * Initialize all elements of the specified array, in parallel, using the provided generator function to * compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void parallelSetAll(long[] array, IntToLongFunction generator) { Streams.intRange(0, array.length).parallel().forEach(i -> { array[i] = generator.applyAsLong(i); }); } /** * Initialize all elements of the specified array, using the provided generator function to compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void setAll(double[] array, IntToDoubleFunction generator) { Streams.intRange(0, array.length).forEach(i -> { array[i] = generator.applyAsDouble(i); }); } /** * Initialize all elements of the specified array, in parallel, using the provided generator function to * compute each element. * @param array Array to be initialized * @param generator Function accepting an index and producing the desired value for that position */ public void parallelSetAll(double[] array, IntToDoubleFunction generator) { Streams.intRange(0, array.length).parallel().forEach(i -> { array[i] = generator.applyAsDouble(i); }); } On 3/14/2013 12:26 PM, Brian Goetz wrote: > Next roadblock: no suitable SAMs for int -> long, int -> double. > > Add? > > On 3/13/2013 6:39 PM, Mike Duigou wrote: >> Yes >> >> >> >> On 2013-03-13, at 14:44, Brian Goetz wrote: >> >>> Fill implies "set all elements"; a set name would probably have to >>> say "setAll": >>> >>> Arrays.setAll(array, fn) >>> Arrays.parallelSetAll(array, fn) >>> >>> OK? >>> >>> On 3/13/2013 5:30 PM, Joe Bowbeer wrote: >>>> I agree with the critique of 'fill' names. >>>> >>>> I like 'set' names. >>>> >>>> >>>> >>>> >>>> On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou >>> > wrote: >>>> >>>> Arrays.indexFill(array, fn) >>>> Arrays.indexedFill(array, fn) >>>> Arrays.fillIndexed(array, fn) >>>> Arrays.indexedSet(array, fn) >>>> >>>> I think it might be better to stay away from "fill" names because >>>> the current fill methods all have the property that every array >>>> element is assigned the same value. This new operation allows a >>>> different value to be assigned to each element. >>>> >>>> Mike >>>> >>>> On Mar 13 2013, at 12:25 , Brian Goetz wrote: >>>> >>>> >> If we added >>>> >> >>>> >> void fill(T[], IntFunction gen) >>>> >> >>>> >> then existing calls to >>>> >> >>>> >> fill(array, null) >>>> >> >>>> >> would become ambiguous. Doh. (But the other 17 forms are not >>>> >> problematic.) >>>> >> >>>> >> Any suggestions for alternate names? 
>>>> > >>>> > Arrays.generate(array, fn) >>>> > Arrays.fillApplying(array, fn) >>>> > Arrays.initialize(array, fn) >>>> > Arrays.setAll(array, fn) >>>> > >>>> > ... >>>> >>>> From forax at univ-mlv.fr Thu Mar 14 14:59:07 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 14 Mar 2013 22:59:07 +0100 Subject: Arrays methods In-Reply-To: <5142472F.6090507@oracle.com> References: <513D202B.7070702@oracle.com> <52084429-74F5-4D4C-BE0A-B91681DAF072@oracle.com> <513E267C.4090709@oracle.com> <513E31B0.9000504@oracle.com> <5140D2A7.1070508@oracle.com> <94865406-9844-4F0D-87E1-0AC25AB3ECB0@oracle.com> <5140F356.8020000@oracle.com> <1F520CCC-2A59-4C57-A0E3-C18FE5514A03@oracle.com> <5141FA1E.1070303@oracle.com> <5142472F.6090507@oracle.com> Message-ID: <5142482B.4040206@univ-mlv.fr> On 03/14/2013 10:54 PM, Brian Goetz wrote: > Here's what these would look like: > > New SAMS: {Int,Long,Double}To{Int,Long,Double}Function > > New Arrays methods: and with the wildcards: > > /** > * Initialize all elements of the specified array, using the > provided generator function to compute each element. > * @param array Array to be initialized > * @param generator Function accepting an index and producing the > desired value for that position > * @param Type of elements of the array > */ > public void setAll(T[] array, IntFunction > generator) { > Streams.intRange(0, array.length).forEach(i -> { array[i] = > generator.apply(i); }); > } > > /** > * Initialize all elements of the specified array, in parallel, > using the provided generator function to > * compute each element. > * @param array Array to be initialized > * @param generator Function accepting an index and producing the > desired value for that position > * @param Type of elements of the array > */ > public void parallelSetAll(T[] array, IntFunction > generator) { > Streams.intRange(0, array.length).parallel().forEach(i -> { > array[i] = generator.apply(i); }); > } R?mi > > On 3/14/2013 12:26 PM, Brian Goetz wrote: >> Next roadblock: no suitable SAMs for int -> long, int -> double. >> >> Add? >> >> On 3/13/2013 6:39 PM, Mike Duigou wrote: >>> Yes >>> >>> >>> >>> On 2013-03-13, at 14:44, Brian Goetz wrote: >>> >>>> Fill implies "set all elements"; a set name would probably have to >>>> say "setAll": >>>> >>>> Arrays.setAll(array, fn) >>>> Arrays.parallelSetAll(array, fn) >>>> >>>> OK? >>>> >>>> On 3/13/2013 5:30 PM, Joe Bowbeer wrote: >>>>> I agree with the critique of 'fill' names. >>>>> >>>>> I like 'set' names. >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Mar 13, 2013 at 1:28 PM, Mike Duigou >>>> > wrote: >>>>> >>>>> Arrays.indexFill(array, fn) >>>>> Arrays.indexedFill(array, fn) >>>>> Arrays.fillIndexed(array, fn) >>>>> Arrays.indexedSet(array, fn) >>>>> >>>>> I think it might be better to stay away from "fill" names because >>>>> the current fill methods all have the property that every array >>>>> element is assigned the same value. This new operation allows a >>>>> different value to be assigned to each element. >>>>> >>>>> Mike >>>>> >>>>> On Mar 13 2013, at 12:25 , Brian Goetz wrote: >>>>> >>>>> >> If we added >>>>> >> >>>>> >> void fill(T[], IntFunction gen) >>>>> >> >>>>> >> then existing calls to >>>>> >> >>>>> >> fill(array, null) >>>>> >> >>>>> >> would become ambiguous. Doh. (But the other 17 forms are >>>>> not >>>>> >> problematic.) >>>>> >> >>>>> >> Any suggestions for alternate names? 
>>>>> > >>>>> > Arrays.generate(array, fn) >>>>> > Arrays.fillApplying(array, fn) >>>>> > Arrays.initialize(array, fn) >>>>> > Arrays.setAll(array, fn) >>>>> > >>>>> > ... >>>>> >>>>> From dl at cs.oswego.edu Thu Mar 14 16:36:04 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 14 Mar 2013 19:36:04 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <514203AE.3070501@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> Message-ID: <51425EE4.5010301@cs.oswego.edu> On 03/14/13 13:06, Brian Goetz wrote: > I've closed the survey, results are at: > > > https://www.surveymonkey.com/sr.aspx?sm=c2NqWp6wXUxCUlr6SY05nYEyYIr7ShzH3IgL4OXPIYM_3d > > > Here, we did not reach a clear consensus. However, I think some people may have > misunderstood the question. I'll let Doug, as proponent of this approach, take > another swing at what is being proposed here, and why this might achieve > best-of-both-worlds. The argument is straightforward. Nothing very best-ish about it though: 1. It is possible to obtain all functionality of all Stream methods without encountering Optional, except for findAny/findFirst. 2. Optional remains controversial. Some people hate it. Some people love it. Why single out findAny/findFirst as a battleground? -Doug (PS: As always, I think Optional is so great as to be essential if you have Value types. Oh, we don't have Value types...) > > On 3/10/2013 7:22 PM, Brian Goetz wrote: >> I've posted a survey for the EG at: >> >> https://www.surveymonkey.com/s/NSXMYC2 >> >> where people can express their preference between: >> - Leave things as they are (Optional-bearing methods for findXxx and >> reduce); >> - Add, as Doug suggests, non-optional versions of these too. >> >> Implementation / spec complexity is a non-issue here -- the >> implementations are trivial. The sole issue is whether the API is >> better with one version or with both. >> >> The password has been communicated directly to the EG; contact me if you >> didn't get it. >> >> Usual survey rules: enter your name with your response, all results will >> be made public after the survey closes. I'll set a closing time of 6PM >> PT Wednesday of this week. >> >> >> On 3/6/2013 7:09 AM, Doug Lea wrote: >>> (Restricting to lambda-libs list...) >>> >>> On 03/06/13 04:47, Remi Forax wrote: >>>> Ok, let be nuclear on this, >>>> There is no good reason to introduce Optional in java.util. >>> >>> We agree about most of the rationale for not using Optional. >>> But there are still people who say they want it. >>> I don't think it is productive at this point to >>> argue about features supporting an Optional-laden >>> programming style. But we never seem to hit closure >>> about features supporting an Optional-free style. >>> So I'd like to re-propose a simple compromise. >>> In the same way that there are Optional and >>> basis-returning versions of reduce: >>> >>> T reduce(T identity, BinaryOperator reducer); >>> Optional reduce(BinaryOperator reducer); >>> >>> (Where the basis-returning one can in turn be used to >>> avoid Optional-returning min(), etc). 
We should do the >>> same at least for find, or more in keeping with current >>> API, findFirst and findAny: >>> >>> T findFirst(Predicate predicate, T ifNone); >>> T findAny(Predicate predicate, T ifNone); >>> >>> People wanting to avoid Optional can then then >>> get all of the derived versions (allMatch, plain >>> findAny, etc) easily enough. >>> >>> Surprisingly enough, that's the only missing >>> feature that would otherwise enable a completely >>> Optional-free usage style of the Stream API. >>> >>> We have both proposed variants of this several times, >>> but they don't seem to go anywhere. It would be nice >>> to have a calm final discussion about why we would NOT >>> do such an apparently sensible thing! >>> >>> -Doug >>> > From joe.bowbeer at gmail.com Fri Mar 15 03:26:19 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 15 Mar 2013 03:26:19 -0700 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <51425EE4.5010301@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> Message-ID: Doug, I think your point that Optional and non-Optional forms of reduce are already provided is significant. I noticed that your proposed versions of findFirst and findAny have a Predicate argument, but the Optional forms do not: T findFirst(Predicate predicate, T ifNone); Why is this? On Thu, Mar 14, 2013 at 4:36 PM, Doug Lea
wrote: > On 03/14/13 13:06, Brian Goetz wrote: > >> I've closed the survey, results are at: >> >> >> https://www.surveymonkey.com/**sr.aspx?sm=** >> c2NqWp6wXUxCUlr6SY05nYEyYIr7Sh**zH3IgL4OXPIYM_3d >> >> >> Here, we did not reach a clear consensus. However, I think some people >> may have >> misunderstood the question. I'll let Doug, as proponent of this >> approach, take >> another swing at what is being proposed here, and why this might achieve >> best-of-both-worlds. >> > > The argument is straightforward. Nothing very best-ish about it though: > > 1. It is possible to obtain all functionality of all Stream > methods without encountering Optional, except for findAny/findFirst. > > 2. Optional remains controversial. Some people hate it. Some > people love it. Why single out findAny/findFirst as a battleground? > > -Doug > > (PS: As always, I think Optional is so great as to be essential if you > have Value types. Oh, we don't have Value types...) > > > >> On 3/10/2013 7:22 PM, Brian Goetz wrote: >> >>> I've posted a survey for the EG at: >>> >>> https://www.surveymonkey.com/**s/NSXMYC2 >>> >>> where people can express their preference between: >>> - Leave things as they are (Optional-bearing methods for findXxx and >>> reduce); >>> - Add, as Doug suggests, non-optional versions of these too. >>> >>> Implementation / spec complexity is a non-issue here -- the >>> implementations are trivial. The sole issue is whether the API is >>> better with one version or with both. >>> >>> The password has been communicated directly to the EG; contact me if you >>> didn't get it. >>> >>> Usual survey rules: enter your name with your response, all results will >>> be made public after the survey closes. I'll set a closing time of 6PM >>> PT Wednesday of this week. >>> >>> >>> On 3/6/2013 7:09 AM, Doug Lea wrote: >>> >>>> (Restricting to lambda-libs list...) >>>> >>>> On 03/06/13 04:47, Remi Forax wrote: >>>> >>>>> Ok, let be nuclear on this, >>>>> There is no good reason to introduce Optional in java.util. >>>>> >>>> >>>> We agree about most of the rationale for not using Optional. >>>> But there are still people who say they want it. >>>> I don't think it is productive at this point to >>>> argue about features supporting an Optional-laden >>>> programming style. But we never seem to hit closure >>>> about features supporting an Optional-free style. >>>> So I'd like to re-propose a simple compromise. >>>> In the same way that there are Optional and >>>> basis-returning versions of reduce: >>>> >>>> T reduce(T identity, BinaryOperator reducer); >>>> Optional reduce(BinaryOperator reducer); >>>> >>>> (Where the basis-returning one can in turn be used to >>>> avoid Optional-returning min(), etc). We should do the >>>> same at least for find, or more in keeping with current >>>> API, findFirst and findAny: >>>> >>>> T findFirst(Predicate predicate, T ifNone); >>>> T findAny(Predicate predicate, T ifNone); >>>> >>>> People wanting to avoid Optional can then then >>>> get all of the derived versions (allMatch, plain >>>> findAny, etc) easily enough. >>>> >>>> Surprisingly enough, that's the only missing >>>> feature that would otherwise enable a completely >>>> Optional-free usage style of the Stream API. >>>> >>>> We have both proposed variants of this several times, >>>> but they don't seem to go anywhere. It would be nice >>>> to have a calm final discussion about why we would NOT >>>> do such an apparently sensible thing! 
>>>> >>>> -Doug >>>> >>>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130315/65e3b904/attachment.html From dl at cs.oswego.edu Fri Mar 15 04:31:26 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 15 Mar 2013 07:31:26 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> Message-ID: <5143068E.8040900@cs.oswego.edu> On 03/15/13 06:26, Joe Bowbeer wrote: > Doug, > > I think your point that Optional and non-Optional forms of reduce are already > provided is significant. > > I noticed that your proposed versions of findFirst and findAny have a Predicate > argument, but the Optional forms do not: > > T findFirst(Predicate predicate, T ifNone); > > Why is this? It's in the spirit of proposing a minimal change. The predicate form suffices for all Optional-avoiding search stuff. To reduce impact by another 50%, it would suffice to ONLY include the "any" form. T findAny(Predicate predicate, T ifNone); -Doug From brian.goetz at oracle.com Fri Mar 15 06:46:48 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 15 Mar 2013 09:46:48 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5143068E.8040900@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> <5143068E.8040900@cs.oswego.edu> Message-ID: <51432648.9020902@oracle.com> Wouldn't the minimal change NOT have a predicate, to match the existing form of findFirst? Optional findFirst() T findFirst(T orElse) On 3/15/2013 7:31 AM, Doug Lea wrote: > On 03/15/13 06:26, Joe Bowbeer wrote: >> Doug, >> >> I think your point that Optional and non-Optional forms of reduce are >> already >> provided is significant. >> >> I noticed that your proposed versions of findFirst and findAny have a >> Predicate >> argument, but the Optional forms do not: >> >> T findFirst(Predicate predicate, T ifNone); >> >> Why is this? > > > It's in the spirit of proposing a minimal change. The predicate > form suffices for all Optional-avoiding search stuff. To reduce > impact by another 50%, it would suffice to ONLY include the "any" form. > T findAny(Predicate predicate, T ifNone); > > -Doug > > > > > From dl at cs.oswego.edu Fri Mar 15 08:04:28 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 15 Mar 2013 11:04:28 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <51432648.9020902@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> <5143068E.8040900@cs.oswego.edu> <51432648.9020902@oracle.com> Message-ID: <5143387C.6090709@cs.oswego.edu> On 03/15/13 09:46, Brian Goetz wrote: > Wouldn't the minimal change NOT have a predicate, to match the existing form of > findFirst? > > Optional findFirst() > T findFirst(T orElse) > Yes and no. The only way to get non-optional-bearing result for search would otherwise be s.filter(pred).findAny(), which entails buffering of stuff you will throw away. 
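To make that shape concrete, here is a minimal, self-contained sketch of a find-any with a predicate and a default value; the helper name, the Iterable-based signature, and the sample data are assumptions for illustration only, not the proposed Stream implementation:

    import java.util.Arrays;
    import java.util.List;
    import java.util.function.Predicate;

    public class FindAnyDemo {
        // Hypothetical helper showing the proposed shape: return some element matching
        // the predicate, or ifNone when nothing matches -- no Optional involved, and it
        // short-circuits instead of filtering and buffering the whole source.
        static <T> T findAny(Iterable<? extends T> source, Predicate<? super T> predicate, T ifNone) {
            for (T t : source) {
                if (predicate.test(t)) {
                    return t;              // stop at the first match
                }
            }
            return ifNone;
        }

        public static void main(String[] args) {
            List<String> words = Arrays.asList("alpha", "beta", "gamma");
            System.out.println(findAny(words, w -> w.startsWith("g"), "<none>"));  // gamma
            System.out.println(findAny(words, w -> w.startsWith("z"), "<none>"));  // <none>
        }
    }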
This is also the reason only adding why findAny(pred) (not findDirst) is defensible: the alternative is of most interest to the sort of person who want to avoid that Optional too. -Doug > > On 3/15/2013 7:31 AM, Doug Lea wrote: >> On 03/15/13 06:26, Joe Bowbeer wrote: >>> Doug, >>> >>> I think your point that Optional and non-Optional forms of reduce are >>> already >>> provided is significant. >>> >>> I noticed that your proposed versions of findFirst and findAny have a >>> Predicate >>> argument, but the Optional forms do not: >>> >>> T findFirst(Predicate predicate, T ifNone); >>> >>> Why is this? >> >> >> It's in the spirit of proposing a minimal change. The predicate >> form suffices for all Optional-avoiding search stuff. To reduce >> impact by another 50%, it would suffice to ONLY include the "any" form. >> T findAny(Predicate predicate, T ifNone); >> >> -Doug >> >> >> >> >> > From joe.bowbeer at gmail.com Fri Mar 15 11:56:18 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 15 Mar 2013 11:56:18 -0700 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5143387C.6090709@cs.oswego.edu> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> <5143068E.8040900@cs.oswego.edu> <51432648.9020902@oracle.com> <5143387C.6090709@cs.oswego.edu> Message-ID: It wasn't obvious to me until recently that it is the short-circuiting behavior of the 'find' methods that is hardest to derive. Without short-circuiting, findFirst forms could be derived as follows, but they will fail on infinite streams: T findFirst(T ifNone) { return reduce(ifNone, (l, r) -> (l != ifNone) ? l : r); } T findFirst(Predicate predicate, T ifNone) { return reduce(ifNone, (l, r) -> (l != ifNone || !predicate.test(r)) ? l : r); } I'm not sure why, but I'm liking the idea of only adding findAny(predicate, ifNone). --Joe On Fri, Mar 15, 2013 at 8:04 AM, Doug Lea
wrote: > On 03/15/13 09:46, Brian Goetz wrote: > >> Wouldn't the minimal change NOT have a predicate, to match the existing >> form of >> findFirst? >> >> Optional findFirst() >> T findFirst(T orElse) >> >> > Yes and no. The only way to get non-optional-bearing > result for search would otherwise be s.filter(pred).findAny(), > which entails buffering of stuff you will throw away. > > This is also the reason only adding why findAny(pred) (not findDirst) > is defensible: the alternative is of most interest to the sort of > person who want to avoid that Optional too. > > -Doug > > > > > >> On 3/15/2013 7:31 AM, Doug Lea wrote: >> >>> On 03/15/13 06:26, Joe Bowbeer wrote: >>> >>>> Doug, >>>> >>>> I think your point that Optional and non-Optional forms of reduce are >>>> already >>>> provided is significant. >>>> >>>> I noticed that your proposed versions of findFirst and findAny have a >>>> Predicate >>>> argument, but the Optional forms do not: >>>> >>>> T findFirst(Predicate predicate, T ifNone); >>>> >>>> Why is this? >>>> >>> >>> >>> It's in the spirit of proposing a minimal change. The predicate >>> form suffices for all Optional-avoiding search stuff. To reduce >>> impact by another 50%, it would suffice to ONLY include the "any" form. >>> T findAny(Predicate predicate, T ifNone); >>> >>> -Doug >>> >>> >>> >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130315/2e5dd89f/attachment-0001.html From brian.goetz at oracle.com Fri Mar 15 12:06:36 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 15 Mar 2013 15:06:36 -0400 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> <5143068E.8040900@cs.oswego.edu> <51432648.9020902@oracle.com> <5143387C.6090709@cs.oswego.edu> Message-ID: <5143713C.3070306@oracle.com> Even ignoring infinite streams, it would still perform poorly on large streams if generating the stream elements (or the upstream operations) have any cost whatsoever. The short-circuiting ops (find, matchXxx, and limit) force traversal to use tryAdvance instead of forEach, to be more responsive. There really are two orthogonal considerations with find: - Should find() take a predicate, or not? - How do we specify what to return if you find nothing? It seems to me that people gravitate towards the predicate version because of the name "find", but if it were just called "first", they might feel differently. In any case, I find the predicate version a distraction: - You can always specify a predicate using an upstream filter(), which is not that much less efficient, just one more pipeline stage. Specifying the predicate in find is just a fusing optimization, and not one that saves a lot of extra work. - For people who just want the first, having to specify a "no-op" predicate is annoying. Which pushes us back to two versions, a predicate-version and a predicate-less version -- and I don't think the predicate version carries its weight. However, maybe the name is just wrong. The second is the Optional vs default. We're not revisiting the "whither Optional" discussion here; what we're discussing here is whether it is better to have both, for both findXxx and various reduce. 
Optional findFirst() T findFirstOrElse(T sentinal) The argument in favor of both is that the latter is more efficient, without running into the "null might mean nothing or might mean that the first element was null" problem. Though I suspect a lot of people would use null as the sentinel and then shoot themselves in the foot anyway. On 3/15/2013 2:56 PM, Joe Bowbeer wrote: > It wasn't obvious to me until recently that it is the short-circuiting > behavior of the 'find' methods that is hardest to derive. > > Without short-circuiting, findFirst forms could be derived as follows, > but they will fail on infinite streams: > > T findFirst(T ifNone) { > return reduce(ifNone, (l, r) -> (l != ifNone) ? l : r); > } > > T findFirst(Predicate predicate, T ifNone) { > return reduce(ifNone, (l, r) -> (l != ifNone || !predicate.test(r)) ? > l : r); > } > > I'm not sure why, but I'm liking the idea of only adding > findAny(predicate, ifNone). > > --Joe > > > On Fri, Mar 15, 2013 at 8:04 AM, Doug Lea
> wrote: > > On 03/15/13 09:46, Brian Goetz wrote: > > Wouldn't the minimal change NOT have a predicate, to match the > existing form of > findFirst? > > Optional findFirst() > T findFirst(T orElse) > > > Yes and no. The only way to get non-optional-bearing > result for search would otherwise be s.filter(pred).findAny(), > which entails buffering of stuff you will throw away. > > This is also the reason only adding why findAny(pred) (not findDirst) > is defensible: the alternative is of most interest to the sort of > person who want to avoid that Optional too. > > -Doug > > > > > > On 3/15/2013 7:31 AM, Doug Lea wrote: > > On 03/15/13 06:26, Joe Bowbeer wrote: > > Doug, > > I think your point that Optional and non-Optional forms > of reduce are > already > provided is significant. > > I noticed that your proposed versions of findFirst and > findAny have a > Predicate > argument, but the Optional forms do not: > > T findFirst(Predicate predicate, T ifNone); > > Why is this? > > > > It's in the spirit of proposing a minimal change. The predicate > form suffices for all Optional-avoiding search stuff. To reduce > impact by another 50%, it would suffice to ONLY include the > "any" form. > T findAny(Predicate predicate, T ifNone); > > -Doug > > > > > > > > From joe.bowbeer at gmail.com Fri Mar 15 14:09:04 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 15 Mar 2013 14:09:04 -0700 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <5143713C.3070306@oracle.com> References: <513710CC.3010903@univ-mlv.fr> <5137320B.60001@cs.oswego.edu> <513D159A.7050008@oracle.com> <514203AE.3070501@oracle.com> <51425EE4.5010301@cs.oswego.edu> <5143068E.8040900@cs.oswego.edu> <51432648.9020902@oracle.com> <5143387C.6090709@cs.oswego.edu> <5143713C.3070306@oracle.com> Message-ID: I'm OK with these two forms. I would use findFirst() if I wanted to throw an exception if empty -- or execute an 'else' clause. I would use findFirstOrElse(ifEmpty) if I wanted to substitute a different value. On Fri, Mar 15, 2013 at 12:06 PM, Brian Goetz wrote: > Even ignoring infinite streams, it would still perform poorly on large > streams if generating the stream elements (or the upstream operations) have > any cost whatsoever. > > The short-circuiting ops (find, matchXxx, and limit) force traversal to > use tryAdvance instead of forEach, to be more responsive. > > There really are two orthogonal considerations with find: > > - Should find() take a predicate, or not? > - How do we specify what to return if you find nothing? > > It seems to me that people gravitate towards the predicate version because > of the name "find", but if it were just called "first", they might feel > differently. In any case, I find the predicate version a distraction: > - You can always specify a predicate using an upstream filter(), which is > not that much less efficient, just one more pipeline stage. Specifying the > predicate in find is just a fusing optimization, and not one that saves a > lot of extra work. > - For people who just want the first, having to specify a "no-op" > predicate is annoying. Which pushes us back to two versions, a > predicate-version and a predicate-less version -- and I don't think the > predicate version carries its weight. However, maybe the name is just > wrong. > > The second is the Optional vs default. We're not revisiting the "whither > Optional" discussion here; what we're discussing here is whether it is > better to have both, for both findXxx and various reduce. 
> > Optional findFirst() > T findFirstOrElse(T sentinal) > > The argument in favor of both is that the latter is more efficient, > without running into the "null might mean nothing or might mean that the > first element was null" problem. Though I suspect a lot of people would > use null as the sentinel and then shoot themselves in the foot anyway. > > > > > > On 3/15/2013 2:56 PM, Joe Bowbeer wrote: > >> It wasn't obvious to me until recently that it is the short-circuiting >> behavior of the 'find' methods that is hardest to derive. >> >> Without short-circuiting, findFirst forms could be derived as follows, >> but they will fail on infinite streams: >> >> T findFirst(T ifNone) { >> return reduce(ifNone, (l, r) -> (l != ifNone) ? l : r); >> } >> >> T findFirst(Predicate predicate, T ifNone) { >> return reduce(ifNone, (l, r) -> (l != ifNone || !predicate.test(r)) ? >> l : r); >> } >> >> I'm not sure why, but I'm liking the idea of only adding >> findAny(predicate, ifNone). >> >> --Joe >> >> >> On Fri, Mar 15, 2013 at 8:04 AM, Doug Lea
> > wrote: >> >> On 03/15/13 09:46, Brian Goetz wrote: >> >> Wouldn't the minimal change NOT have a predicate, to match the >> existing form of >> findFirst? >> >> Optional findFirst() >> T findFirst(T orElse) >> >> >> Yes and no. The only way to get non-optional-bearing >> result for search would otherwise be s.filter(pred).findAny(), >> which entails buffering of stuff you will throw away. >> >> This is also the reason only adding why findAny(pred) (not findDirst) >> is defensible: the alternative is of most interest to the sort of >> person who want to avoid that Optional too. >> >> -Doug >> >> >> >> >> >> On 3/15/2013 7:31 AM, Doug Lea wrote: >> >> On 03/15/13 06:26, Joe Bowbeer wrote: >> >> Doug, >> >> I think your point that Optional and non-Optional forms >> of reduce are >> already >> provided is significant. >> >> I noticed that your proposed versions of findFirst and >> findAny have a >> Predicate >> argument, but the Optional forms do not: >> >> T findFirst(Predicate predicate, T ifNone); >> >> Why is this? >> >> >> >> It's in the spirit of proposing a minimal change. The >> predicate >> form suffices for all Optional-avoiding search stuff. To >> reduce >> impact by another 50%, it would suffice to ONLY include the >> "any" form. >> T findAny(Predicate predicate, T ifNone); >> >> -Doug >> >> >> >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130315/37655ef5/attachment.html From brian.goetz at oracle.com Sat Mar 16 09:22:42 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 16 Mar 2013 12:22:42 -0400 Subject: Collectors -- finally! Message-ID: <51449C52.9050700@oracle.com> I believe I have the last word on Collector. Recall the overriding goal of this effort is to support *composibility*, so that "intermediate collecting stages" like groupBy / partition / mapping can be combined with other collections/reductions to let the user easily mix and match, rather than providing limited ad-hoc reductions like "groupBy". In retrospect, the central tension in the API, which was causing in various versions API bloat and API complexity, was the fact that we were treating cascaded functional reduction and cascaded mutable reduction differently, when both are really 95% the same thing. In the first version, it meant 16 forms of grouping{By,Reduce}, and half of those were due to reduction. In the version from last week, this trouble came out as overloading the GroupingCollector as being both a Collector and a factory for more complex Collectors. The answer, I believe, is to "detune" the Collector abstraction so that it can model either a functional or a mutable reduction. This slightly increases the pain of implementing a Collector (but not really), and slightly-more-than-slightly increases the pain of implementing reduction atop a Collector -- but that's good because the only true client of Collector is the Streams implementation, and that's where the pain belongs. By moving the pain to the core framework implementation, the user code gets simpler, there are fewer exposed concepts, and the explosion of combinations becomes more manageable. 
Here's Collector now:

    public interface Collector<T, R> {
        Supplier<R> resultSupplier();
        BiFunction<R, T, R> accumulator();
        BinaryOperator<R> combiner();

        default boolean isConcurrent() { return false; }
        default boolean isStable() { return false; }
    }

API-wise, what's changed is that accumulator returns a BiFunction, not a BiConsumer, raising the possibility that the accumulation operation could change the container. This opens the doors to some more interesting Collector implementations, and makes it more parallel with the combiner, which we turned into a BiFunction in the first round of Collector. The other new method is "isStable" (better name invited), which is merely an indication that this collector will act as an "old style" mutable Collector, which opens the doors to some optimizations in the concurrent implementation. (Ignore for now, it's purely an optimization.)

Spec-wise, it gets more complicated because there are more things a Collector can do. But again, that's mostly our problem. What this means is that we can now (finally) define a Collector for reduction:

    Collector<T, T> reducing(BinaryOperator<T>)

Which means that half the forms (reduce, map-reduce) of the grouping and partitioning combinators go away, and instead just fold into the "cascaded collector" form. Which leaves us with the following grouping forms:

    groupingBy(classifier)
    groupingBy(classifier, mapCtor)
    groupingBy(classifier, downstreamCollector)
    groupingBy(classifier, mapCtor, downstreamCollector)

along with groupingByConcurrent versions of each. This is still a few versions, but it should be clear enough how they differ. The "max sale by salesman" example now becomes:

    Map<Seller, Integer> m =
        txns.collect(groupingBy(Txn::seller,
                                mapping(Txn::amount,
                                        reducing(Integer::max))));

From the previous version, the intermediate types GroupingCollector/PartitionCollector go away, as does the unfortunate type fudgery with the map constructors. This is basically like the original version, but with half the groupingBy forms replaced with a single reducing() form.

The Collectors inventory now stands at:
 - toList()
 - toSet()
 - toCollection(ctor)
 - toStringBuilder()
 - toStringJoiner(sep)
 - to{Int,Long,Double}Statistics
 - toMap(mappingFn)
 - toMap(mappingFn, mapCtor, mergeFn)
 - toConcurrentMap(mappingFn)
 - toConcurrentMap(mappingFn, mapCtor, mergeFn)
 - mapping(fn, collector)   // plus primitive specializations
 - reducing(BinaryOperator) // plus primitive specializations
 - groupingBy(classifier)
 - groupingBy(classifier, mapCtor)
 - groupingBy(classifier, downstreamCollector)
 - groupingBy(classifier, mapCtor, downstreamCollector)
 - groupingByConcurrent(classifier)
 - groupingByConcurrent(classifier, mapCtor)
 - groupingByConcurrent(classifier, downstreamCollector)
 - groupingByConcurrent(classifier, mapCtor, downstreamCollector)
 - partitioningBy(predicate)

The more flexible Collector API gives us new opportunities, too. For example, toList used to use exclusively ArrayList.
But this version is more memory efficient: public static Collector> toList() { BiFunction, T, List> accumulator = (list, t) -> { int s = list.size(); if (s == 0) return Collections.singletonList(t); else if (s == 1) { List newList = new ArrayList<>(); newList.add(list.get(0)); newList.add(t); return newList; } else { list.add(t); return list; } }; BinaryOperator> combiner = (left, right) -> { if (left.size() > 1) { left.addAll(right); return left; } else { List newList = new ArrayList<>(left.size() + right.size()); newList.addAll(left); newList.addAll(right); return newList; } }; return new CollectorImpl<>(Collections::emptyList, accumulator, combiner, false, false); } From brian.goetz at oracle.com Mon Mar 18 07:27:51 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 18 Mar 2013 10:27:51 -0400 Subject: Survey on map/flatMap disambiguation In-Reply-To: <5142032F.9070601@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> Message-ID: <51472467.6090502@oracle.com> It seems there were some second thoughts on this after the survey. I'll wait another 24 hours for discussion to take root, otherwise I'll push the change. On 3/14/2013 1:04 PM, Brian Goetz wrote: > The survey is closed, results are here: > > https://www.surveymonkey.com/sr.aspx?sm=9UyN8RRvMX8BnpTdd4rYgDlXU9uUVALNDjNn_2fY2e9_2fo_3d > > > The sense of the EG was strongly in favor of disambiguating both map and > flatMap; several argued that they liked the "less magic" aspect of it, > and the explicitness of where we go from reference to primitive streams > and back. > > This does create a possibility for performance bugs, where users do: > > stuff.map(Foo::size).reduce(0, Integer::sum) > > instead of > > stuff.mapToInt(Foo::size).reduce(0, Integer::sum) > > Both will now compile, but the former will be boxed and the latter won't > be. The previous status quo saved users from themselves in this case. > > Will make the following changes: > > Stream.map -> {map,mapTo{Int,Long,Double}} > Stream.flatMap -> {flatMap,flatMapTo{Int,Long,Double}} > {Int,Long,Double}Stream.map -> map,mapToObj > > > On 3/10/2013 7:51 PM, Brian Goetz wrote: >> I've posted a survey for the EG at: >> >> https://www.surveymonkey.com/s/NT5DW7G >> >> where people can express their opinion on the issue of flatMap >> disambiguation (see thread entitled "flatMap ambiguity"). >> >> The password has been communicated directly to the EG; contact me if you >> didn't get it. >> >> Usual survey rules: enter your name with your response, all results will >> be made public after the survey closes. I'll set a closing time of 6PM >> PT Wednesday of this week. From joe.bowbeer at gmail.com Tue Mar 19 07:10:28 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 19 Mar 2013 07:10:28 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: <51472467.6090502@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> Message-ID: The proposed changes, for review, are: Stream.map -> {map,mapTo{Int,Long,Double}} Stream.flatMap -> {flatMap,flatMapTo{Int,Long,**Double}} {Int,Long,Double}Stream.map -> map,mapToObj I think there are several points worth discussing: 1. Performance gotchas? 
The performance gotcha that Brian mentioned in his followup: This does create a possibility for performance bugs, where users do: > stuff.map(Foo::size).reduce(0, Integer::sum) > instead of > stuff.mapToInt(Foo::size).**reduce(0, Integer::sum) > Both will now compile, but the former will be boxed and the latter won't > be. The previous status quo saved users from themselves in this case. Does this change anyone's mind? 2. Consistency. What about other similar methods? My main concern is that this change will (or should?) cascade to other methods, and, in the end, a patch that need only be applied to flatMap ripples through the entire API. In offline correspondence, Brian writes: The "could use the wrong map" case is ugly, but then again so is having to > cast if you really wanted the boxed version. Over time we've been moving > in the direction of "less overloading when lambdas are involved", and the > move to more explicit names makes sense there (maybe there are others too > we should consider?) In hindsight, this ought to have been obvious (type > inference and same-arity-overloading always fight with each other.) Are there other cases that we should consider? If so, I suggest we consider them now before extending this merely to the map case. 3. Breaks map/reduce functional feel? The other concern that I stated previously is that this mars the familiar map/reduce functional feel, without helping enough with the usability or readability. Either way, an IDE will be indispensable. Joe On Mon, Mar 18, 2013 at 7:27 AM, Brian Goetz wrote: > It seems there were some second thoughts on this after the survey. I'll > wait another 24 hours for discussion to take root, otherwise I'll push the > change. > > > On 3/14/2013 1:04 PM, Brian Goetz wrote: > >> The survey is closed, results are here: >> >> https://www.surveymonkey.com/**sr.aspx?sm=** >> 9UyN8RRvMX8BnpTdd4rYgDlXU9uUVA**LNDjNn_2fY2e9_2fo_3d >> >> >> The sense of the EG was strongly in favor of disambiguating both map and >> flatMap; several argued that they liked the "less magic" aspect of it, >> and the explicitness of where we go from reference to primitive streams >> and back. >> >> This does create a possibility for performance bugs, where users do: >> >> stuff.map(Foo::size).reduce(0, Integer::sum) >> >> instead of >> >> stuff.mapToInt(Foo::size).**reduce(0, Integer::sum) >> >> Both will now compile, but the former will be boxed and the latter won't >> be. The previous status quo saved users from themselves in this case. >> >> Will make the following changes: >> >> Stream.map -> {map,mapTo{Int,Long,Double}} >> Stream.flatMap -> {flatMap,flatMapTo{Int,Long,**Double}} >> {Int,Long,Double}Stream.map -> map,mapToObj >> >> >> On 3/10/2013 7:51 PM, Brian Goetz wrote: >> >>> I've posted a survey for the EG at: >>> >>> https://www.surveymonkey.com/**s/NT5DW7G >>> >>> where people can express their opinion on the issue of flatMap >>> disambiguation (see thread entitled "flatMap ambiguity"). >>> >>> The password has been communicated directly to the EG; contact me if you >>> didn't get it. >>> >>> Usual survey rules: enter your name with your response, all results will >>> be made public after the survey closes. I'll set a closing time of 6PM >>> PT Wednesday of this week. >>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130319/50afe8ba/attachment.html From brian.goetz at oracle.com Tue Mar 19 08:22:55 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 19 Mar 2013 11:22:55 -0400 Subject: Survey on map/flatMap disambiguation In-Reply-To: References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> Message-ID: <514882CF.8000308@oracle.com> Thanks for writing up your concerns all in one place. > 1. Performance gotchas? Of these, I think this is the worst of the concerns. However, I also think its not as bad as it sounds. When a user hits ctrl-space in the IDE, they'll see (close to each other) all the mapToXxx forms, which has actually a lot of educational value. [1] Secondly, while boxed performance is definitely much worse than non-boxed, for small streams (and most collections are small), it might not make a difference anyway. I've definitely had the experience many many times of discovering "egregious" performance "bugs" like this that turned out to have no effect on actual business-relevant performance metrics, because all the cost was in the { XML parsing, database access, network latency, crypto, etc }. For those cases where it does make a difference, profiling will disclose this immediately. So its a gotcha, but not a disaster. So I think its a gotcha but nothing so bad that this makes the difference for me. > 2. Consistency. What about other similar methods? > > My main concern is that this change will (or should?) cascade to other > methods, and, in the end, a patch that need only be applied to flatMap > ripples through the entire API. I did a quick look and didn't see any other obvious examples of this pattern, where we've overloaded methods for all the X-to-Y stream conversions. Did you find any ripples I missed? > 3. Breaks map/reduce functional feel? > > The other concern that I stated previously is that this mars the > familiar map/reduce functional feel, without helping enough with the > usability or readability. Either way, an IDE will be indispensable. This is obviously subjective but for me it felt OK. [1] Educating people that there are multiple stream shapes, whose methods are similar but not exactly the same, is important. Calling all the methods "map" may make people believe that there's a sum() method on Stream, and be surprised when there is not. But .mapToInt(...).sum() makes it more obvious what is going on here, which is arguably a plus. From forax at univ-mlv.fr Tue Mar 19 08:38:27 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 19 Mar 2013 16:38:27 +0100 Subject: Survey on map/flatMap disambiguation In-Reply-To: <514882CF.8000308@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> <514882CF.8000308@oracle.com> Message-ID: <51488673.5070001@univ-mlv.fr> On 03/19/2013 04:22 PM, Brian Goetz wrote: > Thanks for writing up your concerns all in one place. > >> 1. Performance gotchas? > > Of these, I think this is the worst of the concerns. However, I also > think its not as bad as it sounds. When a user hits ctrl-space in the > IDE, they'll see (close to each other) all the mapToXxx forms, which > has actually a lot of educational value. [1] > > Secondly, while boxed performance is definitely much worse than > non-boxed, for small streams (and most collections are small), it > might not make a difference anyway. 
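For example, the boxed and unboxed pipelines differ by only one call; a small self-contained sketch (using String::length as a stand-in for Foo::size, and assuming the mapToInt/sum names land as proposed):

    import java.util.Arrays;
    import java.util.List;

    public class BoxingGotcha {
        public static void main(String[] args) {
            List<String> words = Arrays.asList("map", "flatMap", "mapToInt");

            // Boxed: map() produces a Stream<Integer>, so every length is wrapped
            // in an Integer before the reduce sees it.
            int boxed = words.stream()
                             .map(String::length)
                             .reduce(0, Integer::sum);

            // Unboxed: mapToInt() switches to an IntStream, so the whole pipeline
            // stays in the int domain.
            int unboxed = words.stream()
                               .mapToInt(String::length)
                               .sum();

            System.out.println(boxed + " " + unboxed);   // 18 18
        }
    }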
I've definitely had the > experience many many times of discovering "egregious" performance > "bugs" like this that turned out to have no effect on actual > business-relevant performance metrics, because all the cost was in the > { XML parsing, database access, network latency, crypto, etc }. For > those cases where it does make a difference, profiling will disclose > this immediately. So its a gotcha, but not a disaster. > > So I think its a gotcha but nothing so bad that this makes the > difference for me. so we should remove the specialized streams because they not worth their weight. R?mi From brian.goetz at oracle.com Tue Mar 19 10:45:44 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 19 Mar 2013 13:45:44 -0400 Subject: Fwd: hg: lambda/lambda/jdk: Eliminate Collector.Of{Int,Long,Double}; make {Int,Long,Double}Statistics top-level classes; add .statistics() methods to {Int,Long,DOuble}Stream; add Collectors.to{ILD}Statistics(mapper); add Collectors.reduce(mapper, reducer); eliminate Collectors.mapper forms; more efficient implementation of {ILD}Stream.average In-Reply-To: <20130318172822.8F1A048217@hg.openjdk.java.net> References: <20130318172822.8F1A048217@hg.openjdk.java.net> Message-ID: <5148A448.9040709@oracle.com> The changeset referenced below moves the XxxStatistics classes to top-level in java.util.stream for now, but we should decide where they go as they are useful beyond just streams. I also reverted the use of Optional here, because there is a way to distinguish between the return of the default value and "no data" -- count() == 0. We have specialized versions for each of the components (count, sum, average, min, max) for those for whom computing all the statistics is too slow. I added statistics() as a method to the primitive streams as well (simple implementation: collect(XxxStats::new, XxxStats::add, XxxStats::combine)). One thing missing from this basic set of summary statistics is "sum of squares", which is needed for all sorts of statistical calculations. -------- Original Message -------- Subject: hg: lambda/lambda/jdk: Eliminate Collector.Of{Int,Long,Double}; make {Int,Long,Double}Statistics top-level classes; add .statistics() methods to {Int,Long,DOuble}Stream; add Collectors.to{ILD}Statistics(mapper); add Collectors.reduce(mapper, reducer); eliminate Collectors.mapper forms; more efficient implementation of {ILD}Stream.average Date: Mon, 18 Mar 2013 17:27:47 +0000 From: brian.goetz at oracle.com To: lambda-dev at openjdk.java.net Changeset: 1a6b75a82fe0 Author: briangoetz Date: 2013-03-18 13:06 -0400 URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/1a6b75a82fe0 From brian.goetz at oracle.com Wed Mar 20 12:25:00 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 20 Mar 2013 15:25:00 -0400 Subject: hg: lambda/lambda/jdk: Eliminate Collector.Of{Int,Long,Double}; make {Int,Long,Double}Statistics top-level classes; add .statistics() methods to {Int,Long,DOuble}Stream; add Collectors.to{ILD}Statistics(mapper); add Collectors.reduce(mapper, reducer); eliminate Collectors.mapper forms; more efficient implementation of {ILD}Stream.average In-Reply-To: References: <20130318172822.8F1A048217@hg.openjdk.java.net> <5148A448.9040709@oracle.com> Message-ID: <514A0D0C.9000805@oracle.com> > I think the "statistics" method names and the Statistics class names are > out of place at the top level. > > The main problem is that statistics is too vague. It could refer to > almost anything, and so I find it distracting. 
Joe and I had an offline discussion and we both like "SummaryStatistics" as a name, which is common with ParallelArray. It better captures the summary statistics we're gathering: count/sum/min/max. The question is open about where to put it. I am open to suggestions. It has no dependency on anything in j.u.stream. I am on the fence about sum of squares. In most of the times I've needed to gather more than one summary statistic about a population, I eventually wanted sample variance too. Of course, there's a slippery slope -- geometric mean, median, mode, t-statistics. But I think we can make a compelling enough argument for stopping either right before sum(sq) or right after. From dl at cs.oswego.edu Wed Mar 20 12:40:08 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 20 Mar 2013 15:40:08 -0400 Subject: hg: lambda/lambda/jdk: ... ge In-Reply-To: <514A0D0C.9000805@oracle.com> References: <20130318172822.8F1A048217@hg.openjdk.java.net> <5148A448.9040709@oracle.com> <514A0D0C.9000805@oracle.com> Message-ID: <514A1098.2070207@cs.oswego.edu> On 03/20/13 15:25, Brian Goetz wrote: > I am on the fence about sum of squares. In most of the times I've needed to > gather more than one summary statistic about a population, I eventually wanted > sample variance too. Of course, there's a slippery slope -- geometric mean, > median, mode, t-statistics. But I think we can make a compelling enough > argument for stopping either right before sum(sq) or right after. > The only complaint I ever got about parallelArray version was not computing sum-of-squares. So there's at least one vote out there for it. -Doug From paul.sandoz at oracle.com Thu Mar 21 03:38:32 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 21 Mar 2013 11:38:32 +0100 Subject: Creating streams from ranges, functions and suppliers Message-ID: <2D2CE5F2-2325-4DED-8B16-31011C339C78@oracle.com> Hi, In the lambda repo there are bunch of ways to create streams from non-collection sources: - Primitive ranges - Infinite streams whose elements are the result of repeatedly applying a function - Infinite streams whose elements are generated from a supplier The infinite streams are intended to be used in conjunction with limit(), substream() or short-circuiting terminal operations. Thoughts? Paul. -- Ranges: public static IntStream intRange(int start, int end) { public static IntStream intRange(int start, int end, int step) { public static LongStream longRange(long start, final long end) { public static LongStream longRange(long start, final long end, final long step) { public static DoubleStream doubleRange(double start, double end) { public static DoubleStream doubleRange(double start, double end, double step) { A stream created from doubleRange will have a maximum element count of Long.MAX_VALUE. The implementation of doubleRange is equivalent to: long size = (long) Math.ceil((start - end) / step); DoubleStream ds = Streams.longStream(0, size).doubles().map(i -> start + step * i); By providing a method we can ensure developers will do the right thing in terms of splitting (require consistent values on traversal), check for edge numerical cases e.g. elements are not all distinct, and be performant. There are no ints() or longs() methods but these could be useful when used in conjunction with limit() or substream(). Implementation-wise they are trivial: intRange(0, Integer.MAX_VALUE) I think they could be a useful idiom (see below on the generate method). 
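As a concrete illustration of the doubleRange expansion sketched above, here is a minimal, stream-free version (the helper name and the plain-array return type are assumptions for illustration only). Deriving element i directly as start + step * i is what lets any sub-range be computed independently, so splitting produces consistent values:

    public class DoubleRangeSketch {
        // Hypothetical stand-alone equivalent of doubleRange(start, end, step):
        // each element is computed from its index, so no rounding error accumulates
        // and any slice of the range can be produced on its own.
        static double[] doubleRange(double start, double end, double step) {
            long size = (long) Math.ceil((end - start) / step);
            double[] out = new double[(int) size];
            for (int i = 0; i < size; i++) {
                out[i] = start + step * i;
            }
            return out;
        }

        public static void main(String[] args) {
            for (double d : doubleRange(0.0, 1.0, 0.25)) {
                System.out.print(d + " ");   // 0.0 0.25 0.5 0.75
            }
        }
    }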
doubles() may also be useful but perhaps should be restricted to a max size of 2^53 + 1 to ensure integer values are precisely represented. -- Infinite streams whose elements are the result of repeatedly applying a function: public static Stream iterate(final T seed, final UnaryOperator f) { public static IntStream iterateInt(final int seed, final IntUnaryOperator f) { public static LongStream iterateLong(final long seed, final LongUnaryOperator f) { public static DoubleStream iterateDouble(final double seed, final DoubleUnaryOperator f) { Essentially create streams of (seed, f(seed), f(f(seed)), ...). This is a common functional idiom. The stream that is created has an encounter order and permits limited parallelism since an iterator is used to repeatedly apply the function to the previous result. -- Infinite streams whose elements are generated from a supplier: public static Stream generate(Supplier s) { public static IntStream generateInt(IntSupplier s) { public static LongStream generateLong(LongSupplier s) { public static DoubleStream generateDouble(DoubleSupplier s) { The use in java.util.Random is a good example: public IntStream ints() { return Streams.generateInt(this::nextInt); } Currently the generators are iterator based and thus only permits limited parallelism, and there are guarantees that the nth element encountered corresponds to the nth call of Supplier.get(). We could change the implementation to be equivalent to: longs().map(e -> s.get()) The stream would not have an encounter order, and enables balanced parallel computation (and therefore better resource utilization for reduction operations). For parallel streams the supplier would be invoked concurrently. The stream would no longer be known to be infinite, but would be "not known to be finite" from the perspective of the caller e.g. calling forEach my take more time and do more work than one expects unless one limits the number of elements to a known finite size. The existing functionality could be achieved using iterate: iterate(s.get(), i- > s.get()); So i think we should change generate as described. From brian.goetz at oracle.com Thu Mar 21 07:19:37 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 21 Mar 2013 10:19:37 -0400 Subject: Creating streams from ranges, functions and suppliers In-Reply-To: <2D2CE5F2-2325-4DED-8B16-31011C339C78@oracle.com> References: <2D2CE5F2-2325-4DED-8B16-31011C339C78@oracle.com> Message-ID: <514B16F9.6060801@oracle.com> Some follow-up questions: - Do any of these seem NOT to carry their weight? - Are there any forms that are obviously missing? Specifically, indexed forms (or is range().map() good enough?) - If we have an ints() or longs(), should these be ranges or truly infinite? Or both? On 3/21/2013 6:38 AM, Paul Sandoz wrote: > Hi, > > In the lambda repo there are bunch of ways to create streams from non-collection sources: > > - Primitive ranges > - Infinite streams whose elements are the result of repeatedly applying a function > - Infinite streams whose elements are generated from a supplier > > The infinite streams are intended to be used in conjunction with limit(), substream() or short-circuiting terminal operations. > > Thoughts? > > Paul. 
> > -- > > Ranges: > > public static IntStream intRange(int start, int end) { > public static IntStream intRange(int start, int end, int step) { > > public static LongStream longRange(long start, final long end) { > public static LongStream longRange(long start, final long end, final long step) { > > public static DoubleStream doubleRange(double start, double end) { > public static DoubleStream doubleRange(double start, double end, double step) { > > > A stream created from doubleRange will have a maximum element count of Long.MAX_VALUE. The implementation of doubleRange is equivalent to: > > long size = (long) Math.ceil((start - end) / step); > DoubleStream ds = Streams.longStream(0, size).doubles().map(i -> start + step * i); > > By providing a method we can ensure developers will do the right thing in terms of splitting (require consistent values on traversal), check for edge numerical cases e.g. elements are not all distinct, and be performant. > > > There are no ints() or longs() methods but these could be useful when used in conjunction with limit() or substream(). Implementation-wise they are trivial: > > intRange(0, Integer.MAX_VALUE) > > I think they could be a useful idiom (see below on the generate method). > > doubles() may also be useful but perhaps should be restricted to a max size of 2^53 + 1 to ensure integer values are precisely represented. > > -- > > Infinite streams whose elements are the result of repeatedly applying a function: > > public static Stream iterate(final T seed, final UnaryOperator f) { > public static IntStream iterateInt(final int seed, final IntUnaryOperator f) { > public static LongStream iterateLong(final long seed, final LongUnaryOperator f) { > public static DoubleStream iterateDouble(final double seed, final DoubleUnaryOperator f) { > > > Essentially create streams of (seed, f(seed), f(f(seed)), ...). This is a common functional idiom. > > The stream that is created has an encounter order and permits limited parallelism since an iterator is used to repeatedly apply the function to the previous result. > > -- > > Infinite streams whose elements are generated from a supplier: > > public static Stream generate(Supplier s) { > public static IntStream generateInt(IntSupplier s) { > public static LongStream generateLong(LongSupplier s) { > public static DoubleStream generateDouble(DoubleSupplier s) { > > The use in java.util.Random is a good example: > > public IntStream ints() { > return Streams.generateInt(this::nextInt); > } > > Currently the generators are iterator based and thus only permits limited parallelism, and there are guarantees that the nth element encountered corresponds to the nth call of Supplier.get(). > > We could change the implementation to be equivalent to: > > longs().map(e -> s.get()) > > The stream would not have an encounter order, and enables balanced parallel computation (and therefore better resource utilization for reduction operations). For parallel streams the supplier would be invoked concurrently. > > The stream would no longer be known to be infinite, but would be "not known to be finite" from the perspective of the caller e.g. calling forEach my take more time and do more work than one expects unless one limits the number of elements to a known finite size. > > The existing functionality could be achieved using iterate: > > iterate(s.get(), i- > s.get()); > > So i think we should change generate as described. 
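A small sketch of the two generator shapes quoted above, bounded with limit() since both streams are infinite. Stream.iterate and Stream.generate are the names these forms eventually took; the seed, the doubling function, and the Random supplier are only examples.

    import java.util.Random;
    import java.util.stream.Stream;

    class InfiniteStreamSketch {
        public static void main(String[] args) {
            // iterate: seed, f(seed), f(f(seed)), ...; each element depends on the
            // previous one, so the stream has an encounter order.
            Stream.iterate(1L, x -> x * 2).limit(10).forEach(System.out::println);

            // generate: each element is an independent Supplier.get() call, so no
            // particular order needs to be preserved.
            Random rnd = new Random();
            Stream.generate(rnd::nextInt).limit(5).forEach(System.out::println);
        }
    }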
> From brian.goetz at oracle.com Thu Mar 21 09:10:55 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 21 Mar 2013 12:10:55 -0400 Subject: hg: lambda/lambda/jdk: Eliminate Collector.Of{Int,Long,Double}; make {Int,Long,Double}Statistics top-level classes; add .statistics() methods to {Int,Long,DOuble}Stream; add Collectors.to{ILD}Statistics(mapper); add Collectors.reduce(mapper, reducer); eliminate Collectors.mapper forms; more efficient implementation of {ILD}Stream.average In-Reply-To: <514A0D0C.9000805@oracle.com> References: <20130318172822.8F1A048217@hg.openjdk.java.net> <5148A448.9040709@oracle.com> <514A0D0C.9000805@oracle.com> Message-ID: <514B310F.90709@oracle.com> > Joe and I had an offline discussion and we both like "SummaryStatistics" > as a name, which is common with ParallelArray. It better captures the > summary statistics we're gathering: count/sum/min/max. > > The question is open about where to put it. I am open to suggestions. > It has no dependency on anything in j.u.stream. I think java.util seems a better home. > I am on the fence about sum of squares. After offline discussion with Doug, I tipped over the fence towards "do it." From brian.goetz at oracle.com Thu Mar 21 12:47:49 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 21 Mar 2013 15:47:49 -0400 Subject: Simplifying sequential() / parallel() Message-ID: <514B63E5.8090101@oracle.com> Doug and I have been revisiting sequential() and parallel(). I think there's a nice simplification here. The original motivation for sequential() was because we originally had .into(collection), and many collections were not thread-safe so we needed a means to bring the computation back to the current thread. The .sequential() method did that, but it also brought back a constraint of encounter ordering with it, because if people did: stuff.parallel().map(...).sequential().into(new ArrayList<>()); not respecting encounter order would violate the principle of least astonishment. So the original motivation for sequential() was "bring the computation back to the current thread, in order." This was doable, but has a high price -- a full barrier where we buffer the contents of the entire stream before doing any forEach'ing. Most of the time, sequential() only appears right before the terminal operation. But, once implemented, there was no reason to constrain this to appear right before the forEach/into, so we didn't. Once we discovered a need for .parallel(), it made sense it be the dual of .sequential() -- fully unconstrained. And again, the implementation wasn't *that* bad -- better than .sequential(). But again, the most desirable position for .parallel() is right after the source. Then we killed into() and replaced it with reduction, which is a much smarter way of managing ordering. Eliminating half the justification for .sequential(). As far as I can tell, the remaining use cases for .sequential() are just modifiers to forEach to constrain it, in order, to the current thread. As in: ints().parallel().filter(i -> isPrime(i)) .sequential().forEach(System.out::println) Which could be replaced by .forEachSequentialAndOrderedInCurrentThread(), with a suitably better name. Which could further be simplified to ditch the "in current thread" part by doing some locking in the implementation, which brings us to .forEachOrdered(action). Which nicely complements .collectUnordered, and the two actually stand better with their duals present (reduce is by default ordered; forEach is by default unordered.) 
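A concrete picture of the forEachOrdered idea just described, in the form it eventually took: both pipelines below are parallel, but only the second promises to invoke the action one element at a time, in encounter order.

    import java.util.stream.IntStream;

    class ForEachOrderedSketch {
        public static void main(String[] args) {
            // Plain forEach on a parallel stream: elements may appear in any order,
            // printed from whatever worker thread happens to process them.
            IntStream.range(0, 10).parallel().forEach(System.out::println);

            // forEachOrdered: still computed in parallel upstream, but the action
            // itself is invoked in encounter order, one element at a time.
            IntStream.range(0, 10).parallel().forEachOrdered(System.out::println);
        }
    }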
The "put it anywhere" behavior of .parallel was completely bootstrapped on the "put it anywhere" nature of .sequential; we never really set out to support transitions in the API. So, pulling the rug out from under the house of cards, I think we can fall back to: 1. Modify semantics of .sequential and .parallel to apply globally to the entire pipeline. This works because pipelines are fully lazy anyway, so we don't commit to seq-ness/par-ness until we hit the terminal op. So they are functional versions of "set the seq/par bit in the source". And that simplifies the specification of seq/par down to a single property of the entire pipeline -- much easier to spec. 2. Add .forEachOrdered. For sequential streams, this is just .forEach. For par streams, we use a lock to avoid concurrent invocation of the lambda, and loosen up the current behavior from "full barrier" to "partial barrier", so that when the next chunk is available, we can start working right away. This is easy to accomplish using the existing AbstractTask machinery. Before we go there, does anyone have use cases for .sequential() / .parallel() that *don't* put the parallel right after the source, or the sequential right before a forEach? From forax at univ-mlv.fr Thu Mar 21 13:02:53 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 21 Mar 2013 21:02:53 +0100 Subject: Simplifying sequential() / parallel() In-Reply-To: <514B63E5.8090101@oracle.com> References: <514B63E5.8090101@oracle.com> Message-ID: <514B676D.7050902@univ-mlv.fr> On 03/21/2013 08:47 PM, Brian Goetz wrote: > Doug and I have been revisiting sequential() and parallel(). I think > there's a nice simplification here. > > The original motivation for sequential() was because we originally had > .into(collection), and many collections were not thread-safe so we > needed a means to bring the computation back to the current thread. > The .sequential() method did that, but it also brought back a > constraint of encounter ordering with it, because if people did: > > stuff.parallel().map(...).sequential().into(new ArrayList<>()); > > not respecting encounter order would violate the principle of least > astonishment. > > So the original motivation for sequential() was "bring the computation > back to the current thread, in order." This was doable, but has a > high price -- a full barrier where we buffer the contents of the > entire stream before doing any forEach'ing. > > Most of the time, sequential() only appears right before the terminal > operation. But, once implemented, there was no reason to constrain > this to appear right before the forEach/into, so we didn't. > > Once we discovered a need for .parallel(), it made sense it be the > dual of .sequential() -- fully unconstrained. And again, the > implementation wasn't *that* bad -- better than .sequential(). But > again, the most desirable position for .parallel() is right after the > source. > > Then we killed into() and replaced it with reduction, which is a much > smarter way of managing ordering. Eliminating half the justification > for .sequential(). > > As far as I can tell, the remaining use cases for .sequential() are > just modifiers to forEach to constrain it, in order, to the current > thread. > > As in: > ints().parallel().filter(i -> isPrime(i)) > .sequential().forEach(System.out::println) > > Which could be replaced by > .forEachSequentialAndOrderedInCurrentThread(), with a suitably better > name. 
Which could further be simplified to ditch the "in current > thread" part by doing some locking in the implementation, which brings > us to .forEachOrdered(action). Which nicely complements > .collectUnordered, and the two actually stand better with their duals > present (reduce is by default ordered; forEach is by default unordered.) > > The "put it anywhere" behavior of .parallel was completely > bootstrapped on the "put it anywhere" nature of .sequential; we never > really set out to support transitions in the API. > > So, pulling the rug out from under the house of cards, I think we can > fall back to: > > 1. Modify semantics of .sequential and .parallel to apply globally to > the entire pipeline. This works because pipelines are fully lazy > anyway, so we don't commit to seq-ness/par-ness until we hit the > terminal op. So they are functional versions of "set the seq/par bit > in the source". And that simplifies the specification of seq/par down > to a single property of the entire pipeline -- much easier to spec. > > 2. Add .forEachOrdered. For sequential streams, this is just > .forEach. For par streams, we use a lock to avoid concurrent > invocation of the lambda, and loosen up the current behavior from > "full barrier" to "partial barrier", so that when the next chunk is > available, we can start working right away. This is easy to > accomplish using the existing AbstractTask machinery. > > > Before we go there, does anyone have use cases for .sequential() / > .parallel() that *don't* put the parallel right after the source, or > the sequential right before a forEach? Supporting stateful mappers or filters require to be able to use sequential() just before them if the start of the stream is parallel. We may just not supporting them. R?mi From brian.goetz at oracle.com Thu Mar 21 13:21:26 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 21 Mar 2013 16:21:26 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: <514B676D.7050902@univ-mlv.fr> References: <514B63E5.8090101@oracle.com> <514B676D.7050902@univ-mlv.fr> Message-ID: <514B6BC6.4070604@oracle.com> >> Before we go there, does anyone have use cases for .sequential() / >> .parallel() that *don't* put the parallel right after the source, or >> the sequential right before a forEach? > > Supporting stateful mappers or filters require to be able to use > sequential() > just before them if the start of the stream is parallel. > We may just not supporting them. Good point. Even given that, I'm still OK with the simplification; I do not really want to cater to stateful lambdas, that's always been explicitly outside of the design center for this library. In fact, I've been working yesterday on the spec for statefulness and non-interference, and I'm pretty comfortable just saying "lambdas passed to stream ops should not be stateful, if they are, this may result in wrong or non-deterministic outcomes." Does anyone feel otherwise? From tim at peierls.net Thu Mar 21 13:34:24 2013 From: tim at peierls.net (Tim Peierls) Date: Thu, 21 Mar 2013 16:34:24 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: <514B6BC6.4070604@oracle.com> References: <514B63E5.8090101@oracle.com> <514B676D.7050902@univ-mlv.fr> <514B6BC6.4070604@oracle.com> Message-ID: On Thu, Mar 21, 2013 at 4:21 PM, Brian Goetz wrote: > I'm pretty comfortable just saying "lambdas passed to stream ops should >>> not be stateful, if they are, this may result in wrong or non-deterministic >>> outcomes." >> >> > Does anyone feel otherwise? 
> Not me. I'd even be comfortable with something stronger: "Passing stateful lambdas to stream ops will give you a rash." --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130321/2879d779/attachment.html From brian.goetz at oracle.com Thu Mar 21 14:13:06 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 21 Mar 2013 17:13:06 -0400 Subject: Stateful lambdas (was: Simplifying sequential() / parallel()) In-Reply-To: References: <514B63E5.8090101@oracle.com> <514B676D.7050902@univ-mlv.fr> <514B6BC6.4070604@oracle.com> Message-ID: <514B77E2.5060109@oracle.com> Here's what I've currently got in the package info for non-interference and statefulness: *

 * <h2>Non-interference</h2>
 *
 * The {@code java.util.stream} package enables you to execute possibly-parallel
 * bulk-data operations over a variety of data sources, including even non-thread-safe
 * collections such as {@code ArrayList}. This is possible only if we can
 * prevent interference with the data source during the execution of a
 * stream pipeline. (Execution begins when the terminal operation is invoked, and ends
 * when the terminal operation completes.) For most data sources, preventing interference
 * means ensuring that the data source is not modified at all during the execution
 * of the stream pipeline. (Some data sources, such as concurrent collections, are
 * specifically designed to handle concurrent modification, in which case their
 * {@code Spliterator} will report the {@code CONCURRENT} characteristic.)
 *
 * <p>Accordingly, lambdas passed to stream methods should never modify the stream's data
 * source. A lambda (or other object implementing the appropriate functional interface)
 * is said to interfere with the data source if it modifies, or causes to be modified,
 * the stream's data source. The need for non-interference applies to all pipelines, not
 * just parallel ones. Unless the stream source is concurrent, modifying a stream's data
 * source during execution of a stream pipeline can cause exceptions, incorrect answers,
 * or nonconformant results.
 *
 * <p>Further results may be nondeterministic or incorrect if the lambdas passed to
 * stream operations are stateful. A stateful lambda (or other object implementing the
 * appropriate functional interface) is one whose result depends on any state which might
 * change during the execution of the stream pipeline. An example of a stateful lambda is:
 *
 * <pre>
 *     Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
 *     stream.map(e -> { if (seen.add(e)) return 0; else return e; })...
 * </pre>
 *
 * Stream pipelines with stateful lambdas may produce nondeterministic or incorrect results.

And @param doc for map arguments is:

 * @param mapper a
non-interfering, stateless * function to be applied to each element On 3/21/2013 4:34 PM, Tim Peierls wrote: > On Thu, Mar 21, 2013 at 4:21 PM, Brian Goetz > wrote: > > I'm pretty comfortable just saying "lambdas passed to stream > ops should not be stateful, if they are, this may result in > wrong or non-deterministic outcomes." > > > Does anyone feel otherwise? > > > Not me. I'd even be comfortable with something stronger: "Passing > stateful lambdas to stream ops will give you a rash." > > --tim > From joe.bowbeer at gmail.com Thu Mar 21 18:57:37 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 21 Mar 2013 18:57:37 -0700 Subject: Simplifying sequential() / parallel() In-Reply-To: <514B63E5.8090101@oracle.com> References: <514B63E5.8090101@oracle.com> Message-ID: I'm traveling now and won't be able to respond promptly but this topic has been raised a couple of times already. Feel free to copy and paste my response from previous discussions:) Rephrasing, I'm OK with non-interference but I object to banning stateful in sequential ops. I think there should be a one-one correspondence between any for loop and a sequential forEach. Can you compare your restrictions with those in Scala and Groovy? Scala in particular, because it is more strictly defined, and I'm pretty sure I've combined stateful expressions with functional forms in Scala, to good effect. (One of the benefits of being multi-paradigmatic?) In addition, I'm wary of the new form of forEach. If anything, I'd like its name to be simpler, e.g., each, not longer. Joe On Mar 21, 2013 3:48 PM, "Brian Goetz" wrote: > Doug and I have been revisiting sequential() and parallel(). I think > there's a nice simplification here. > > The original motivation for sequential() was because we originally had > .into(collection), and many collections were not thread-safe so we needed a > means to bring the computation back to the current thread. The > .sequential() method did that, but it also brought back a constraint of > encounter ordering with it, because if people did: > > stuff.parallel().map(...).**sequential().into(new ArrayList<>()); > > not respecting encounter order would violate the principle of least > astonishment. > > So the original motivation for sequential() was "bring the computation > back to the current thread, in order." This was doable, but has a high > price -- a full barrier where we buffer the contents of the entire stream > before doing any forEach'ing. > > Most of the time, sequential() only appears right before the terminal > operation. But, once implemented, there was no reason to constrain this to > appear right before the forEach/into, so we didn't. > > Once we discovered a need for .parallel(), it made sense it be the dual of > .sequential() -- fully unconstrained. And again, the implementation wasn't > *that* bad -- better than .sequential(). But again, the most desirable > position for .parallel() is right after the source. > > Then we killed into() and replaced it with reduction, which is a much > smarter way of managing ordering. Eliminating half the justification for > .sequential(). > > As far as I can tell, the remaining use cases for .sequential() are just > modifiers to forEach to constrain it, in order, to the current thread. > > As in: > ints().parallel().filter(i -> isPrime(i)) > .sequential().forEach(System.**out::println) > > Which could be replaced by .**forEachSequentialAndOrderedInC**urrentThread(), > with a suitably better name. 
Which could further be simplified to ditch > the "in current thread" part by doing some locking in the implementation, > which brings us to .forEachOrdered(action). Which nicely complements > .collectUnordered, and the two actually stand better with their duals > present (reduce is by default ordered; forEach is by default unordered.) > > The "put it anywhere" behavior of .parallel was completely bootstrapped on > the "put it anywhere" nature of .sequential; we never really set out to > support transitions in the API. > > So, pulling the rug out from under the house of cards, I think we can fall > back to: > > 1. Modify semantics of .sequential and .parallel to apply globally to the > entire pipeline. This works because pipelines are fully lazy anyway, so we > don't commit to seq-ness/par-ness until we hit the terminal op. So they > are functional versions of "set the seq/par bit in the source". And that > simplifies the specification of seq/par down to a single property of the > entire pipeline -- much easier to spec. > > 2. Add .forEachOrdered. For sequential streams, this is just .forEach. > For par streams, we use a lock to avoid concurrent invocation of the > lambda, and loosen up the current behavior from "full barrier" to "partial > barrier", so that when the next chunk is available, we can start working > right away. This is easy to accomplish using the existing AbstractTask > machinery. > > > Before we go there, does anyone have use cases for .sequential() / > .parallel() that *don't* put the parallel right after the source, or the > sequential right before a forEach? > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130321/0821d0f1/attachment-0001.html From brian.goetz at oracle.com Fri Mar 22 06:33:18 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 09:33:18 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> Message-ID: <514C5D9E.9060500@oracle.com> The problem with stateful lambdas is that, unless one block of code has control over the entire pipeline, it is an accident waiting to happen. Let's say you receive a stream as a parameter: void foo(Stream s) { ... } and you want to do something that requires a stateful mapper: void foo(Stream s) { s.map(... stateful ...)... } That's a bug already. Because you don't know that the stream you were passed in is sequential. But I doubt that people will remember, even most of the time, they need to do: s.sequential().map(... stateful ...)... instead. Won't happen. Stateful lambdas introduce the need for non-modular reasoning about stream pipelines (who created this? who will consume this? in what state was it created?). And, it has all the same self-deception problems as thread-safety. People convince themselves "I don't need to think about synchronization because no one will ever use this object concurrently." So, while I sympathize with the desire to let people say "I know that this entire stream pipeline has been carefully controlled such as to not have statefulness distort its results", I think in reality, this will quickly turn into "statefulness is OK" in most people's minds. With the attended inevitable foot-shooting. On 3/21/2013 9:57 PM, Joe Bowbeer wrote: > I'm traveling now and won't be able to respond promptly but this topic > has been raised a couple of times already. 
Feel free to copy and paste > my response from previous discussions:) > > Rephrasing, I'm OK with non-interference but I object to banning > stateful in sequential ops. > > I think there should be a one-one correspondence between any for loop > and a sequential forEach. > > Can you compare your restrictions with those in Scala and Groovy? Scala > in particular, because it is more strictly defined, and I'm pretty sure > I've combined stateful expressions with functional forms in Scala, to > good effect. (One of the benefits of being multi-paradigmatic?) > > In addition, I'm wary of the new form of forEach. If anything, I'd like > its name to be simpler, e.g., each, not longer. > > Joe > > On Mar 21, 2013 3:48 PM, "Brian Goetz" > wrote: > > Doug and I have been revisiting sequential() and parallel(). I > think there's a nice simplification here. > > The original motivation for sequential() was because we originally > had .into(collection), and many collections were not thread-safe so > we needed a means to bring the computation back to the current > thread. The .sequential() method did that, but it also brought back > a constraint of encounter ordering with it, because if people did: > > stuff.parallel().map(...).__sequential().into(new ArrayList<>()); > > not respecting encounter order would violate the principle of least > astonishment. > > So the original motivation for sequential() was "bring the > computation back to the current thread, in order." This was doable, > but has a high price -- a full barrier where we buffer the contents > of the entire stream before doing any forEach'ing. > > Most of the time, sequential() only appears right before the > terminal operation. But, once implemented, there was no reason to > constrain this to appear right before the forEach/into, so we didn't. > > Once we discovered a need for .parallel(), it made sense it be the > dual of .sequential() -- fully unconstrained. And again, the > implementation wasn't *that* bad -- better than .sequential(). But > again, the most desirable position for .parallel() is right after > the source. > > Then we killed into() and replaced it with reduction, which is a > much smarter way of managing ordering. Eliminating half the > justification for .sequential(). > > As far as I can tell, the remaining use cases for .sequential() are > just modifiers to forEach to constrain it, in order, to the current > thread. > > As in: > ints().parallel().filter(i -> isPrime(i)) > .sequential().forEach(System.__out::println) > > Which could be replaced by > .__forEachSequentialAndOrderedInC__urrentThread(), with a suitably > better name. Which could further be simplified to ditch the "in > current thread" part by doing some locking in the implementation, > which brings us to .forEachOrdered(action). Which nicely > complements .collectUnordered, and the two actually stand better > with their duals present (reduce is by default ordered; forEach is > by default unordered.) > > The "put it anywhere" behavior of .parallel was completely > bootstrapped on the "put it anywhere" nature of .sequential; we > never really set out to support transitions in the API. > > So, pulling the rug out from under the house of cards, I think we > can fall back to: > > 1. Modify semantics of .sequential and .parallel to apply globally > to the entire pipeline. This works because pipelines are fully lazy > anyway, so we don't commit to seq-ness/par-ness until we hit the > terminal op. So they are functional versions of "set the seq/par > bit in the source". 
And that simplifies the specification of > seq/par down to a single property of the entire pipeline -- much > easier to spec. > > 2. Add .forEachOrdered. For sequential streams, this is just > .forEach. For par streams, we use a lock to avoid concurrent > invocation of the lambda, and loosen up the current behavior from > "full barrier" to "partial barrier", so that when the next chunk is > available, we can start working right away. This is easy to > accomplish using the existing AbstractTask machinery. > > > Before we go there, does anyone have use cases for .sequential() / > .parallel() that *don't* put the parallel right after the source, or > the sequential right before a forEach? > > From forax at univ-mlv.fr Fri Mar 22 06:43:29 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 22 Mar 2013 14:43:29 +0100 Subject: Simplifying sequential() / parallel() In-Reply-To: <514C5D9E.9060500@oracle.com> References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> Message-ID: <514C6001.4070709@univ-mlv.fr> On 03/22/2013 02:33 PM, Brian Goetz wrote: > The problem with stateful lambdas is that, unless one block of code > has control over the entire pipeline, it is an accident waiting to > happen. > > Let's say you receive a stream as a parameter: > > void foo(Stream s) { ... } > > and you want to do something that requires a stateful mapper: > > void foo(Stream s) { > s.map(... stateful ...)... > } > > That's a bug already. Because you don't know that the stream you were > passed in is sequential. But I doubt that people will remember, even > most of the time, they need to do: > > s.sequential().map(... stateful ...)... > > instead. Won't happen. > > Stateful lambdas introduce the need for non-modular reasoning about > stream pipelines (who created this? who will consume this? in what > state was it created?). And, it has all the same self-deception > problems as thread-safety. People convince themselves "I don't need > to think about synchronization because no one will ever use this > object concurrently." > > So, while I sympathize with the desire to let people say "I know that > this entire stream pipeline has been carefully controlled such as to > not have statefulness distort its results", I think in reality, this > will quickly turn into "statefulness is OK" in most people's minds. > With the attended inevitable foot-shooting. Yes, recognizing what is stateful and what is stateless is enough (says the guy the "concurrency in Java" teacher in me :) +1 for 'each', foreach is already used for the enhanced for loop "for(:)", in that case, forEachOrdered() should be eachOrdered(). R?mi > > On 3/21/2013 9:57 PM, Joe Bowbeer wrote: >> I'm traveling now and won't be able to respond promptly but this topic >> has been raised a couple of times already. Feel free to copy and paste >> my response from previous discussions:) >> >> Rephrasing, I'm OK with non-interference but I object to banning >> stateful in sequential ops. >> >> I think there should be a one-one correspondence between any for loop >> and a sequential forEach. >> >> Can you compare your restrictions with those in Scala and Groovy? Scala >> in particular, because it is more strictly defined, and I'm pretty sure >> I've combined stateful expressions with functional forms in Scala, to >> good effect. (One of the benefits of being multi-paradigmatic?) >> >> In addition, I'm wary of the new form of forEach. If anything, I'd like >> its name to be simpler, e.g., each, not longer. 
>> >> Joe >> >> On Mar 21, 2013 3:48 PM, "Brian Goetz" > > wrote: >> >> Doug and I have been revisiting sequential() and parallel(). I >> think there's a nice simplification here. >> >> The original motivation for sequential() was because we originally >> had .into(collection), and many collections were not thread-safe so >> we needed a means to bring the computation back to the current >> thread. The .sequential() method did that, but it also brought back >> a constraint of encounter ordering with it, because if people did: >> >> stuff.parallel().map(...).__sequential().into(new >> ArrayList<>()); >> >> not respecting encounter order would violate the principle of least >> astonishment. >> >> So the original motivation for sequential() was "bring the >> computation back to the current thread, in order." This was doable, >> but has a high price -- a full barrier where we buffer the contents >> of the entire stream before doing any forEach'ing. >> >> Most of the time, sequential() only appears right before the >> terminal operation. But, once implemented, there was no reason to >> constrain this to appear right before the forEach/into, so we >> didn't. >> >> Once we discovered a need for .parallel(), it made sense it be the >> dual of .sequential() -- fully unconstrained. And again, the >> implementation wasn't *that* bad -- better than .sequential(). But >> again, the most desirable position for .parallel() is right after >> the source. >> >> Then we killed into() and replaced it with reduction, which is a >> much smarter way of managing ordering. Eliminating half the >> justification for .sequential(). >> >> As far as I can tell, the remaining use cases for .sequential() are >> just modifiers to forEach to constrain it, in order, to the current >> thread. >> >> As in: >> ints().parallel().filter(i -> isPrime(i)) >> .sequential().forEach(System.__out::println) >> >> Which could be replaced by >> .__forEachSequentialAndOrderedInC__urrentThread(), with a suitably >> better name. Which could further be simplified to ditch the "in >> current thread" part by doing some locking in the implementation, >> which brings us to .forEachOrdered(action). Which nicely >> complements .collectUnordered, and the two actually stand better >> with their duals present (reduce is by default ordered; forEach is >> by default unordered.) >> >> The "put it anywhere" behavior of .parallel was completely >> bootstrapped on the "put it anywhere" nature of .sequential; we >> never really set out to support transitions in the API. >> >> So, pulling the rug out from under the house of cards, I think we >> can fall back to: >> >> 1. Modify semantics of .sequential and .parallel to apply globally >> to the entire pipeline. This works because pipelines are fully lazy >> anyway, so we don't commit to seq-ness/par-ness until we hit the >> terminal op. So they are functional versions of "set the seq/par >> bit in the source". And that simplifies the specification of >> seq/par down to a single property of the entire pipeline -- much >> easier to spec. >> >> 2. Add .forEachOrdered. For sequential streams, this is just >> .forEach. For par streams, we use a lock to avoid concurrent >> invocation of the lambda, and loosen up the current behavior from >> "full barrier" to "partial barrier", so that when the next chunk is >> available, we can start working right away. This is easy to >> accomplish using the existing AbstractTask machinery. 
>> >> >> Before we go there, does anyone have use cases for .sequential() / >> .parallel() that *don't* put the parallel right after the source, or >> the sequential right before a forEach? >> >> From joe.bowbeer at gmail.com Fri Mar 22 07:07:44 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 22 Mar 2013 07:07:44 -0700 Subject: Simplifying sequential() / parallel() In-Reply-To: <514C5D9E.9060500@oracle.com> References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> Message-ID: Stateful programming has its issues but that ship has already sailed (in Java). The programs where these new expressions will live are full of state... With the introduction of streams, programmers and refactoring tools will be introducing the cool new expressions into existing code. (forEach is the groovy guy's for loop, right?) I don't want to create danger zones in the code where these transformations are accidents waiting to happen. Also think of the code maintainers trying to determine, as they are enhancing and debugging the code, where they are allowed to add state. Before, the existence of parallel() created a danger zone, but sequential() restored safety. That's an easy rule to understand. BTW, what are the rules in Scala and groovy? Joe On Mar 22, 2013 9:33 AM, "Brian Goetz" wrote: > The problem with stateful lambdas is that, unless one block of code has > control over the entire pipeline, it is an accident waiting to happen. > > Let's say you receive a stream as a parameter: > > void foo(Stream s) { ... } > > and you want to do something that requires a stateful mapper: > > void foo(Stream s) { > s.map(... stateful ...)... > } > > That's a bug already. Because you don't know that the stream you were > passed in is sequential. But I doubt that people will remember, even most > of the time, they need to do: > > s.sequential().map(... stateful ...)... > > instead. Won't happen. > > Stateful lambdas introduce the need for non-modular reasoning about stream > pipelines (who created this? who will consume this? in what state was it > created?). And, it has all the same self-deception problems as > thread-safety. People convince themselves "I don't need to think about > synchronization because no one will ever use this object concurrently." > > So, while I sympathize with the desire to let people say "I know that this > entire stream pipeline has been carefully controlled such as to not have > statefulness distort its results", I think in reality, this will quickly > turn into "statefulness is OK" in most people's minds. With the attended > inevitable foot-shooting. > > On 3/21/2013 9:57 PM, Joe Bowbeer wrote: > >> I'm traveling now and won't be able to respond promptly but this topic >> has been raised a couple of times already. Feel free to copy and paste >> my response from previous discussions:) >> >> Rephrasing, I'm OK with non-interference but I object to banning >> stateful in sequential ops. >> >> I think there should be a one-one correspondence between any for loop >> and a sequential forEach. >> >> Can you compare your restrictions with those in Scala and Groovy? Scala >> in particular, because it is more strictly defined, and I'm pretty sure >> I've combined stateful expressions with functional forms in Scala, to >> good effect. (One of the benefits of being multi-paradigmatic?) >> >> In addition, I'm wary of the new form of forEach. If anything, I'd like >> its name to be simpler, e.g., each, not longer. 
>> >> Joe >> >> On Mar 21, 2013 3:48 PM, "Brian Goetz" > > wrote: >> >> Doug and I have been revisiting sequential() and parallel(). I >> think there's a nice simplification here. >> >> The original motivation for sequential() was because we originally >> had .into(collection), and many collections were not thread-safe so >> we needed a means to bring the computation back to the current >> thread. The .sequential() method did that, but it also brought back >> a constraint of encounter ordering with it, because if people did: >> >> stuff.parallel().map(...).__**sequential().into(new >> ArrayList<>()); >> >> not respecting encounter order would violate the principle of least >> astonishment. >> >> So the original motivation for sequential() was "bring the >> computation back to the current thread, in order." This was doable, >> but has a high price -- a full barrier where we buffer the contents >> of the entire stream before doing any forEach'ing. >> >> Most of the time, sequential() only appears right before the >> terminal operation. But, once implemented, there was no reason to >> constrain this to appear right before the forEach/into, so we didn't. >> >> Once we discovered a need for .parallel(), it made sense it be the >> dual of .sequential() -- fully unconstrained. And again, the >> implementation wasn't *that* bad -- better than .sequential(). But >> again, the most desirable position for .parallel() is right after >> the source. >> >> Then we killed into() and replaced it with reduction, which is a >> much smarter way of managing ordering. Eliminating half the >> justification for .sequential(). >> >> As far as I can tell, the remaining use cases for .sequential() are >> just modifiers to forEach to constrain it, in order, to the current >> thread. >> >> As in: >> ints().parallel().filter(i -> isPrime(i)) >> .sequential().forEach(System._**_out::println) >> >> Which could be replaced by >> .__**forEachSequentialAndOrderedInC**__urrentThread(), with a >> suitably >> better name. Which could further be simplified to ditch the "in >> current thread" part by doing some locking in the implementation, >> which brings us to .forEachOrdered(action). Which nicely >> complements .collectUnordered, and the two actually stand better >> with their duals present (reduce is by default ordered; forEach is >> by default unordered.) >> >> The "put it anywhere" behavior of .parallel was completely >> bootstrapped on the "put it anywhere" nature of .sequential; we >> never really set out to support transitions in the API. >> >> So, pulling the rug out from under the house of cards, I think we >> can fall back to: >> >> 1. Modify semantics of .sequential and .parallel to apply globally >> to the entire pipeline. This works because pipelines are fully lazy >> anyway, so we don't commit to seq-ness/par-ness until we hit the >> terminal op. So they are functional versions of "set the seq/par >> bit in the source". And that simplifies the specification of >> seq/par down to a single property of the entire pipeline -- much >> easier to spec. >> >> 2. Add .forEachOrdered. For sequential streams, this is just >> .forEach. For par streams, we use a lock to avoid concurrent >> invocation of the lambda, and loosen up the current behavior from >> "full barrier" to "partial barrier", so that when the next chunk is >> available, we can start working right away. This is easy to >> accomplish using the existing AbstractTask machinery. 
>> >> >> Before we go there, does anyone have use cases for .sequential() / >> .parallel() that *don't* put the parallel right after the source, or >> the sequential right before a forEach? >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130322/06537e7b/attachment-0001.html From brian.goetz at oracle.com Fri Mar 22 07:41:45 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 10:41:45 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> Message-ID: <514C6DA9.9050703@oracle.com> > Stateful programming has its issues but that ship has already sailed (in > Java). While that's unquestionably true, I think it is also unnecessarily defeatist. A tremendous amount of effort has gone into the design of this API to make statefulness less attractive because there's an easier way to do it without statefulness. > I don't want to create danger zones in the code where these > transformations are accidents waiting to happen. Also think of the code > maintainers trying to determine, as they are enhancing and debugging the > code, where they are allowed to add state. > > Before, the existence of parallel() created a danger zone, but > sequential() restored safety. That's an easy rule to understand. The newly proposed rule does the same. The only problem is what happens when responsibility for a pipeline is divided across code regions. Which I'm arguing is always problematic with statefulness lambdas regardless of model. From forax at univ-mlv.fr Fri Mar 22 07:53:02 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 22 Mar 2013 15:53:02 +0100 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> Message-ID: <514C704E.4080306@univ-mlv.fr> On 03/22/2013 03:07 PM, Joe Bowbeer wrote: > > Stateful programming has its issues but that ship has already sailed > (in Java). > > The programs where these new expressions will live are full of state... > > With the introduction of streams, programmers and refactoring tools > will be introducing the cool new expressions into existing code. > (forEach is the groovy guy's for loop, right?) > > I don't want to create danger zones in the code where these > transformations are accidents waiting to happen. Also think of the > code maintainers trying to determine, as they are enhancing and > debugging the code, where they are allowed to add state. > > Before, the existence of parallel() created a danger zone, but > sequential() restored safety. That's an easy rule to understand. > > BTW, what are the rules in Scala and groovy? > > Joe > for Groovy, gpars the groovy parallel library doesn't allow stateful closure, you have too use special constructs like Agent for that. http://gpars.codehaus.org/ R?mi > On Mar 22, 2013 9:33 AM, "Brian Goetz" > wrote: > > The problem with stateful lambdas is that, unless one block of > code has control over the entire pipeline, it is an accident > waiting to happen. > > Let's say you receive a stream as a parameter: > > void foo(Stream s) { ... } > > and you want to do something that requires a stateful mapper: > > void foo(Stream s) { > s.map(... stateful ...)... > } > > That's a bug already. Because you don't know that the stream you > were passed in is sequential. 
But I doubt that people will > remember, even most of the time, they need to do: > > s.sequential().map(... stateful ...)... > > instead. Won't happen. > > Stateful lambdas introduce the need for non-modular reasoning > about stream pipelines (who created this? who will consume this? > in what state was it created?). And, it has all the same > self-deception problems as thread-safety. People convince > themselves "I don't need to think about synchronization because no > one will ever use this object concurrently." > > So, while I sympathize with the desire to let people say "I know > that this entire stream pipeline has been carefully controlled > such as to not have statefulness distort its results", I think in > reality, this will quickly turn into "statefulness is OK" in most > people's minds. With the attended inevitable foot-shooting. > > On 3/21/2013 9:57 PM, Joe Bowbeer wrote: > > I'm traveling now and won't be able to respond promptly but > this topic > has been raised a couple of times already. Feel free to copy > and paste > my response from previous discussions:) > > Rephrasing, I'm OK with non-interference but I object to banning > stateful in sequential ops. > > I think there should be a one-one correspondence between any > for loop > and a sequential forEach. > > Can you compare your restrictions with those in Scala and > Groovy? Scala > in particular, because it is more strictly defined, and I'm > pretty sure > I've combined stateful expressions with functional forms in > Scala, to > good effect. (One of the benefits of being multi-paradigmatic?) > > In addition, I'm wary of the new form of forEach. If anything, > I'd like > its name to be simpler, e.g., each, not longer. > > Joe > > On Mar 21, 2013 3:48 PM, "Brian Goetz" > >> wrote: > > Doug and I have been revisiting sequential() and > parallel(). I > think there's a nice simplification here. > > The original motivation for sequential() was because we > originally > had .into(collection), and many collections were not > thread-safe so > we needed a means to bring the computation back to the current > thread. The .sequential() method did that, but it also > brought back > a constraint of encounter ordering with it, because if > people did: > > stuff.parallel().map(...).__sequential().into(new > ArrayList<>()); > > not respecting encounter order would violate the principle > of least > astonishment. > > So the original motivation for sequential() was "bring the > computation back to the current thread, in order." This > was doable, > but has a high price -- a full barrier where we buffer the > contents > of the entire stream before doing any forEach'ing. > > Most of the time, sequential() only appears right before the > terminal operation. But, once implemented, there was no > reason to > constrain this to appear right before the forEach/into, so > we didn't. > > Once we discovered a need for .parallel(), it made sense > it be the > dual of .sequential() -- fully unconstrained. And again, the > implementation wasn't *that* bad -- better than > .sequential(). But > again, the most desirable position for .parallel() is > right after > the source. > > Then we killed into() and replaced it with reduction, > which is a > much smarter way of managing ordering. Eliminating half the > justification for .sequential(). > > As far as I can tell, the remaining use cases for > .sequential() are > just modifiers to forEach to constrain it, in order, to > the current > thread. 
> > As in: > ints().parallel().filter(i -> isPrime(i)) > .sequential().forEach(System.__out::println) > > Which could be replaced by > .__forEachSequentialAndOrderedInC__urrentThread(), with a > suitably > better name. Which could further be simplified to ditch > the "in > current thread" part by doing some locking in the > implementation, > which brings us to .forEachOrdered(action). Which nicely > complements .collectUnordered, and the two actually stand > better > with their duals present (reduce is by default ordered; > forEach is > by default unordered.) > > The "put it anywhere" behavior of .parallel was completely > bootstrapped on the "put it anywhere" nature of > .sequential; we > never really set out to support transitions in the API. > > So, pulling the rug out from under the house of cards, I > think we > can fall back to: > > 1. Modify semantics of .sequential and .parallel to apply > globally > to the entire pipeline. This works because pipelines are > fully lazy > anyway, so we don't commit to seq-ness/par-ness until we > hit the > terminal op. So they are functional versions of "set the > seq/par > bit in the source". And that simplifies the specification of > seq/par down to a single property of the entire pipeline > -- much > easier to spec. > > 2. Add .forEachOrdered. For sequential streams, this is just > .forEach. For par streams, we use a lock to avoid concurrent > invocation of the lambda, and loosen up the current > behavior from > "full barrier" to "partial barrier", so that when the next > chunk is > available, we can start working right away. This is easy to > accomplish using the existing AbstractTask machinery. > > > Before we go there, does anyone have use cases for > .sequential() / > .parallel() that *don't* put the parallel right after the > source, or > the sequential right before a forEach? > > From joe.bowbeer at gmail.com Fri Mar 22 08:09:56 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 22 Mar 2013 08:09:56 -0700 Subject: Simplifying sequential() / parallel() In-Reply-To: <514C704E.4080306@univ-mlv.fr> References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C704E.4080306@univ-mlv.fr> Message-ID: I'm just asking about the rules for each() in groovy. gpars is special purpose, akin to adding parallel() to ones code. On Mar 22, 2013 10:57 AM, "Remi Forax" wrote: > On 03/22/2013 03:07 PM, Joe Bowbeer wrote: > >> >> Stateful programming has its issues but that ship has already sailed (in >> Java). >> >> The programs where these new expressions will live are full of state... >> >> With the introduction of streams, programmers and refactoring tools will >> be introducing the cool new expressions into existing code. (forEach is the >> groovy guy's for loop, right?) >> >> I don't want to create danger zones in the code where these >> transformations are accidents waiting to happen. Also think of the code >> maintainers trying to determine, as they are enhancing and debugging the >> code, where they are allowed to add state. >> >> Before, the existence of parallel() created a danger zone, but >> sequential() restored safety. That's an easy rule to understand. >> >> BTW, what are the rules in Scala and groovy? >> >> Joe >> >> > for Groovy, gpars the groovy parallel library doesn't allow stateful > closure, > you have too use special constructs like Agent for that. 
> > http://gpars.codehaus.org/ > > R?mi > > On Mar 22, 2013 9:33 AM, "Brian Goetz" > brian.goetz at oracle.com**>> wrote: >> >> The problem with stateful lambdas is that, unless one block of >> code has control over the entire pipeline, it is an accident >> waiting to happen. >> >> Let's say you receive a stream as a parameter: >> >> void foo(Stream s) { ... } >> >> and you want to do something that requires a stateful mapper: >> >> void foo(Stream s) { >> s.map(... stateful ...)... >> } >> >> That's a bug already. Because you don't know that the stream you >> were passed in is sequential. But I doubt that people will >> remember, even most of the time, they need to do: >> >> s.sequential().map(... stateful ...)... >> >> instead. Won't happen. >> >> Stateful lambdas introduce the need for non-modular reasoning >> about stream pipelines (who created this? who will consume this? >> in what state was it created?). And, it has all the same >> self-deception problems as thread-safety. People convince >> themselves "I don't need to think about synchronization because no >> one will ever use this object concurrently." >> >> So, while I sympathize with the desire to let people say "I know >> that this entire stream pipeline has been carefully controlled >> such as to not have statefulness distort its results", I think in >> reality, this will quickly turn into "statefulness is OK" in most >> people's minds. With the attended inevitable foot-shooting. >> >> On 3/21/2013 9:57 PM, Joe Bowbeer wrote: >> >> I'm traveling now and won't be able to respond promptly but >> this topic >> has been raised a couple of times already. Feel free to copy >> and paste >> my response from previous discussions:) >> >> Rephrasing, I'm OK with non-interference but I object to banning >> stateful in sequential ops. >> >> I think there should be a one-one correspondence between any >> for loop >> and a sequential forEach. >> >> Can you compare your restrictions with those in Scala and >> Groovy? Scala >> in particular, because it is more strictly defined, and I'm >> pretty sure >> I've combined stateful expressions with functional forms in >> Scala, to >> good effect. (One of the benefits of being multi-paradigmatic?) >> >> In addition, I'm wary of the new form of forEach. If anything, >> I'd like >> its name to be simpler, e.g., each, not longer. >> >> Joe >> >> On Mar 21, 2013 3:48 PM, "Brian Goetz" > >> > >> wrote: >> >> Doug and I have been revisiting sequential() and >> parallel(). I >> think there's a nice simplification here. >> >> The original motivation for sequential() was because we >> originally >> had .into(collection), and many collections were not >> thread-safe so >> we needed a means to bring the computation back to the current >> thread. The .sequential() method did that, but it also >> brought back >> a constraint of encounter ordering with it, because if >> people did: >> >> stuff.parallel().map(...).__**sequential().into(new >> ArrayList<>()); >> >> not respecting encounter order would violate the principle >> of least >> astonishment. >> >> So the original motivation for sequential() was "bring the >> computation back to the current thread, in order." This >> was doable, >> but has a high price -- a full barrier where we buffer the >> contents >> of the entire stream before doing any forEach'ing. >> >> Most of the time, sequential() only appears right before the >> terminal operation. 
But, once implemented, there was no >> reason to >> constrain this to appear right before the forEach/into, so >> we didn't. >> >> Once we discovered a need for .parallel(), it made sense >> it be the >> dual of .sequential() -- fully unconstrained. And again, the >> implementation wasn't *that* bad -- better than >> .sequential(). But >> again, the most desirable position for .parallel() is >> right after >> the source. >> >> Then we killed into() and replaced it with reduction, >> which is a >> much smarter way of managing ordering. Eliminating half the >> justification for .sequential(). >> >> As far as I can tell, the remaining use cases for >> .sequential() are >> just modifiers to forEach to constrain it, in order, to >> the current >> thread. >> >> As in: >> ints().parallel().filter(i -> isPrime(i)) >> .sequential().forEach(System._**_out::println) >> >> Which could be replaced by >> .__**forEachSequentialAndOrderedInC**__urrentThread(), with a >> suitably >> better name. Which could further be simplified to ditch >> the "in >> current thread" part by doing some locking in the >> implementation, >> which brings us to .forEachOrdered(action). Which nicely >> complements .collectUnordered, and the two actually stand >> better >> with their duals present (reduce is by default ordered; >> forEach is >> by default unordered.) >> >> The "put it anywhere" behavior of .parallel was completely >> bootstrapped on the "put it anywhere" nature of >> .sequential; we >> never really set out to support transitions in the API. >> >> So, pulling the rug out from under the house of cards, I >> think we >> can fall back to: >> >> 1. Modify semantics of .sequential and .parallel to apply >> globally >> to the entire pipeline. This works because pipelines are >> fully lazy >> anyway, so we don't commit to seq-ness/par-ness until we >> hit the >> terminal op. So they are functional versions of "set the >> seq/par >> bit in the source". And that simplifies the specification of >> seq/par down to a single property of the entire pipeline >> -- much >> easier to spec. >> >> 2. Add .forEachOrdered. For sequential streams, this is just >> .forEach. For par streams, we use a lock to avoid concurrent >> invocation of the lambda, and loosen up the current >> behavior from >> "full barrier" to "partial barrier", so that when the next >> chunk is >> available, we can start working right away. This is easy to >> accomplish using the existing AbstractTask machinery. >> >> >> Before we go there, does anyone have use cases for >> .sequential() / >> .parallel() that *don't* put the parallel right after the >> source, or >> the sequential right before a forEach? >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130322/ce7e290c/attachment-0001.html From tim at peierls.net Fri Mar 22 08:48:58 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 22 Mar 2013 11:48:58 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: <514B63E5.8090101@oracle.com> References: <514B63E5.8090101@oracle.com> Message-ID: On Thu, Mar 21, 2013 at 3:47 PM, Brian Goetz wrote: > Before we go there, does anyone have use cases for .sequential() / > .parallel() that *don't* put the parallel right after the source, or the > sequential right before a forEach? 
> [Assuming stateful mappers are banned:] Even if someone *does* come up with such a use case, e.g., (using current semantics) s.parallel().x().sequential().y().terminal() isn't it always possible to fake this with a custom collector that combines y() and terminal()? (Or other mildly painful tricks.) --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130322/a463ace2/attachment.html From dl at cs.oswego.edu Fri Mar 22 08:56:21 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 22 Mar 2013 11:56:21 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> Message-ID: <514C7F25.1090609@cs.oswego.edu> On 03/22/13 10:07, Joe Bowbeer wrote: > Stateful programming has its issues but that ship has already sailed (in Java). > Although it is worth bearing in mind that most stream functionality wrt Collections exploits the fact that operations within traversals are already known to avoid some of the worst unexpected side-effects -- mutating a collection while you are traversing. Which normally leads to ConcurrentModificationException for iterators. A variant of this is preserved when applicable in Spliterator implementations. People learn quickly to avoid them. (That's the subject of some of the specs Paul Sandoz has been adding, which can't be nailed down very well in general because they are quality-of-implementation issues, but he is trying anyway :-) Anyway, as the chief advocate for cool mutative algorithmics in this group, I'm still in favor of saying they don't belong in any stream op not called forEach. -Doug From brian.goetz at oracle.com Fri Mar 22 09:59:08 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 12:59:08 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: <514C7F25.1090609@cs.oswego.edu> References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C7F25.1090609@cs.oswego.edu> Message-ID: <514C8DDC.70303@oracle.com> To add to this: while it may be possible for careful people to get away with mutative tricks in flatMappers, I am not inclined to distort the API at all to cater to that. We've done so much work to enable things to be expressible in a safe functional style, it would be a shame to make a U-turn and re-embrace statefulness. We all know people will cheat, regardless of warnings. But the moral hazard of catering to such cheating is huge, as it only leads to more and more dangerous cheating. The primary reason to ban mutable capture was not simply "mutation is bad", but that the 99% use case for mutable capture -- accumulators -- had a better, safer, clearer, more parallelizable solution in reduce. So I am all for making streams work best when you follow the "no stateful lambdas" rule, and have it *merely work* if you can ensure, that for all time now and in the future, that you control the entire pipeline and will never try to use parallelism. But I'm not even sure we want to commit to saying that much. On 3/22/2013 11:56 AM, Doug Lea wrote: > On 03/22/13 10:07, Joe Bowbeer wrote: >> Stateful programming has its issues but that ship has already sailed >> (in Java). >> > > Although it is worth bearing in mind that most stream functionality > wrt Collections exploits the fact that operations within > traversals are already known to avoid some of the worst unexpected > side-effects -- mutating a collection while you are traversing. 
> Which normally leads to ConcurrentModificationException for > iterators. A variant of this is preserved when applicable > in Spliterator implementations. People learn quickly to avoid them. > (That's the subject of some of the specs Paul Sandoz has been > adding, which can't be nailed down very well in general because > they are quality-of-implementation issues, but he is trying anyway :-) > > Anyway, as the chief advocate for cool mutative algorithmics > in this group, I'm still in favor of saying they don't belong > in any stream op not called forEach. > > -Doug > > From brian.goetz at oracle.com Fri Mar 22 11:14:05 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 14:14:05 -0400 Subject: Survey on map/flatMap disambiguation In-Reply-To: <514882CF.8000308@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> <514882CF.8000308@oracle.com> Message-ID: <514C9F6D.8090902@oracle.com> There was no further discussion on this thread, so given that there was already a poll in favor, and little followup afterwards, I'm inclined to push this change. On 3/19/2013 11:22 AM, Brian Goetz wrote: > Thanks for writing up your concerns all in one place. > >> 1. Performance gotchas? > > Of these, I think this is the worst of the concerns. However, I also > think its not as bad as it sounds. When a user hits ctrl-space in the > IDE, they'll see (close to each other) all the mapToXxx forms, which has > actually a lot of educational value. [1] > > Secondly, while boxed performance is definitely much worse than > non-boxed, for small streams (and most collections are small), it might > not make a difference anyway. I've definitely had the experience many > many times of discovering "egregious" performance "bugs" like this that > turned out to have no effect on actual business-relevant performance > metrics, because all the cost was in the { XML parsing, database access, > network latency, crypto, etc }. For those cases where it does make a > difference, profiling will disclose this immediately. So its a gotcha, > but not a disaster. > > So I think its a gotcha but nothing so bad that this makes the > difference for me. > >> 2. Consistency. What about other similar methods? >> >> My main concern is that this change will (or should?) cascade to other >> methods, and, in the end, a patch that need only be applied to flatMap >> ripples through the entire API. > > I did a quick look and didn't see any other obvious examples of this > pattern, where we've overloaded methods for all the X-to-Y stream > conversions. Did you find any ripples I missed? > >> 3. Breaks map/reduce functional feel? >> >> The other concern that I stated previously is that this mars the >> familiar map/reduce functional feel, without helping enough with the >> usability or readability. Either way, an IDE will be indispensable. > > This is obviously subjective but for me it felt OK. > > > > [1] Educating people that there are multiple stream shapes, whose > methods are similar but not exactly the same, is important. Calling all > the methods "map" may make people believe that there's a sum() method on > Stream, and be surprised when there is not. But .mapToInt(...).sum() > makes it more obvious what is going on here, which is arguably a plus. 
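To make the footnote's point concrete, here is a minimal sketch of the two shapes being compared. It is illustrative only and assumes the mapToInt naming under discussion plus a sum() on the int-specialized stream, as in the eventual public API; exact method names in the lambda builds of this period were still settling.

    import java.util.Arrays;
    import java.util.List;

    public class MapToIntExample {
        public static void main(String[] args) {
            List<String> words = Arrays.asList("alpha", "beta", "gamma");

            // Boxed form: map to Stream<Integer>, then spell out the reduction,
            // because the object stream has no sum().
            int boxedTotal = words.stream()
                                  .map(String::length)
                                  .reduce(0, (a, b) -> a + b);

            // Primitive form: mapToInt switches to the int-specialized stream,
            // which does have sum() -- and avoids boxing along the way.
            int total = words.stream()
                             .mapToInt(String::length)
                             .sum();

            System.out.println(boxedTotal + " " + total);
        }
    }

The call chain also makes the shape change visible at the call site, which is the educational point being argued above.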
> From joe.bowbeer at gmail.com Fri Mar 22 11:26:47 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 22 Mar 2013 11:26:47 -0700 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C7F25.1090609@cs.oswego.edu> Message-ID: Doug writes >don't belong in any stream op not called forEach I'm with you there. Will we be able to advertise that one can easily rewrite any 'for' loop using each()? This is one of those useful talking points in the introductory articles: See, these new features aren't completely alien. You can take any for loop and transform it like so... If so, then forEach is an apt, intuitive name. Otherwise, some distance is needed. Joe On Mar 22, 2013 11:57 AM, "Doug Lea"
wrote: On 03/22/13 10:07, Joe Bowbeer wrote: > Stateful programming has its issues but that ship has already sailed (in > Java). > > Although it is worth bearing in mind that most stream functionality wrt Collections exploits the fact that operations within traversals are already known to avoid some of the worst unexpected side-effects -- mutating a collection while you are traversing. Which normally leads to ConcurrentModificationExceptio**n for iterators. A variant of this is preserved when applicable in Spliterator implementations. People learn quickly to avoid them. (That's the subject of some of the specs Paul Sandoz has been adding, which can't be nailed down very well in general because they are quality-of-implementation issues, but he is trying anyway :-) Anyway, as the chief advocate for cool mutative algorithmics in this group, I'm still in favor of saying they don't belong in any stream op not called forEach. -Doug -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130322/6e40a36f/attachment.html From brian.goetz at oracle.com Fri Mar 22 11:33:16 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 14:33:16 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C7F25.1090609@cs.oswego.edu> Message-ID: <514CA3EC.5040305@oracle.com> for-loops that mutate uplevel locals or use nonlocal control flow will not be so transformable. (Supporting this was one of the goals of BGGA.) Transforming an existing for-loop, and then discovering that your stream source is controlled by other code that has decided to go parallel on you without you realizing it, will cause trouble. So the constraints on "when can I use statefulness in my lambdas" is pretty much as messy as "when is it safe to mutate fields of objects". (This problem was big enough it needed a whole book.) On 3/22/2013 2:26 PM, Joe Bowbeer wrote: > Doug writes > > >don't belong in any stream op not called forEach > > I'm with you there. > > Will we be able to advertise that one can easily rewrite any 'for' loop > using each()? > > This is one of those useful talking points in the introductory articles: > See, these new features aren't completely alien. You can take any for > loop and transform it like so... If so, then forEach is an apt, > intuitive name. Otherwise, some distance is needed. > > Joe > > On Mar 22, 2013 11:57 AM, "Doug Lea"
> wrote: > > On 03/22/13 10:07, Joe Bowbeer wrote: > > Stateful programming has its issues but that ship has already > sailed (in Java). > > > Although it is worth bearing in mind that most stream functionality > wrt Collections exploits the fact that operations within > traversals are already known to avoid some of the worst unexpected > side-effects -- mutating a collection while you are traversing. > Which normally leads to ConcurrentModificationExceptio__n for > iterators. A variant of this is preserved when applicable > in Spliterator implementations. People learn quickly to avoid them. > (That's the subject of some of the specs Paul Sandoz has been > adding, which can't be nailed down very well in general because > they are quality-of-implementation issues, but he is trying anyway :-) > > Anyway, as the chief advocate for cool mutative algorithmics > in this group, I'm still in favor of saying they don't belong > in any stream op not called forEach. > > -Doug > > From forax at univ-mlv.fr Fri Mar 22 13:39:00 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 22 Mar 2013 21:39:00 +0100 Subject: Survey on map/flatMap disambiguation In-Reply-To: <514C9F6D.8090902@oracle.com> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> <514882CF.8000308@oracle.com> <514C9F6D.8090902@oracle.com> Message-ID: <514CC164.9040600@univ-mlv.fr> On 03/22/2013 07:14 PM, Brian Goetz wrote: > There was no further discussion on this thread, so given that there > was already a poll in favor, and little followup afterwards, I'm > inclined to push this change. As Joe said, filter/map/reduce are common concepts, we should keep them simple. flatMap doesn't fall in that category so adding several overloads to help the compiler is just something necessary not something which is a design that we should apply on map. please, we should try to keep the common operation simple if we can. R?mi > > > On 3/19/2013 11:22 AM, Brian Goetz wrote: >> Thanks for writing up your concerns all in one place. >> >>> 1. Performance gotchas? >> >> Of these, I think this is the worst of the concerns. However, I also >> think its not as bad as it sounds. When a user hits ctrl-space in the >> IDE, they'll see (close to each other) all the mapToXxx forms, which has >> actually a lot of educational value. [1] >> >> Secondly, while boxed performance is definitely much worse than >> non-boxed, for small streams (and most collections are small), it might >> not make a difference anyway. I've definitely had the experience many >> many times of discovering "egregious" performance "bugs" like this that >> turned out to have no effect on actual business-relevant performance >> metrics, because all the cost was in the { XML parsing, database access, >> network latency, crypto, etc }. For those cases where it does make a >> difference, profiling will disclose this immediately. So its a gotcha, >> but not a disaster. >> >> So I think its a gotcha but nothing so bad that this makes the >> difference for me. >> >>> 2. Consistency. What about other similar methods? >>> >>> My main concern is that this change will (or should?) cascade to other >>> methods, and, in the end, a patch that need only be applied to flatMap >>> ripples through the entire API. >> >> I did a quick look and didn't see any other obvious examples of this >> pattern, where we've overloaded methods for all the X-to-Y stream >> conversions. Did you find any ripples I missed? >> >>> 3. Breaks map/reduce functional feel? 
>>> >>> The other concern that I stated previously is that this mars the >>> familiar map/reduce functional feel, without helping enough with the >>> usability or readability. Either way, an IDE will be indispensable. >> >> This is obviously subjective but for me it felt OK. >> >> >> >> [1] Educating people that there are multiple stream shapes, whose >> methods are similar but not exactly the same, is important. Calling all >> the methods "map" may make people believe that there's a sum() method on >> Stream, and be surprised when there is not. But .mapToInt(...).sum() >> makes it more obvious what is going on here, which is arguably a plus. >> From brian.goetz at oracle.com Fri Mar 22 15:00:15 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 18:00:15 -0400 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics Message-ID: <514CD46F.9020508@oracle.com> I've posted a survey at: https://www.surveymonkey.com/s/5VTLT26 To do an API and spec review for the classes Int/Long/DoubleSummaryStatistics. If you have comments, please provide them in the SurveyMonkey form. Usual password. From brian.goetz at oracle.com Fri Mar 22 15:14:56 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 18:14:56 -0400 Subject: API and spec review for Stream Message-ID: <514CD7E0.6030102@oracle.com> I have posted a survey at: https://www.surveymonkey.com/s/59CTHS8 This is a hopefully-final review of the API and preliminary review of the specification for the single class Stream. Docs are linked from the survey. Usual password. Any and all constructive comments welcome. It is known that the specs are incomplete; what is here is a start. Suggestions for improvement are welcome. From joe.bowbeer at gmail.com Fri Mar 22 16:21:48 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 22 Mar 2013 16:21:48 -0700 Subject: Simplifying sequential() / parallel() In-Reply-To: <514CA3EC.5040305@oracle.com> References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C7F25.1090609@cs.oswego.edu> <514CA3EC.5040305@oracle.com> Message-ID: One of the design principles for this API is that parallel transforms will not be automatic. There is a parallel() method for that. This hasn't changed, right? I maintain strict control over which parts of my code are serial and which are parallel. I believe this is the best approach for productivity and maintainability. Just because someone could insert a parallel() somewhere upstream and thereby create a mess doesn't matter to me. There are lots of ways to break code and this is not one of the ways that I am defending against. It seems to me that the ground rules are changing, driven by some latent aspects of the implementation. But I'll have to see if/how these changes affect my sample code before I can respond. Joe On Mar 22, 2013 2:33 PM, "Brian Goetz" wrote: > for-loops that mutate uplevel locals or use nonlocal control flow will not > be so transformable. (Supporting this was one of the goals of BGGA.) > > Transforming an existing for-loop, and then discovering that your stream > source is controlled by other code that has decided to go parallel on you > without you realizing it, will cause trouble. > > So the constraints on "when can I use statefulness in my lambdas" is > pretty much as messy as "when is it safe to mutate fields of objects". > (This problem was big enough it needed a whole book.) 
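A small illustrative sketch of the point quoted above (not code from the thread): the mechanical for-loop rewrite captures a mutable list in the lambda, which is only safe while the whole pipeline stays sequential, whereas the collector form carries no such state. Collectors.toList is used as in the eventual public API.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;

    public class ForLoopRewriteSketch {
        public static void main(String[] args) {
            List<String> words = Arrays.asList("a", "bb", "ccc");

            // The loop being "transformed": mutates an uplevel local as it goes.
            List<String> loopResult = new ArrayList<>();
            for (String w : words) {
                if (w.length() > 1) {
                    loopResult.add(w);
                }
            }

            // Mechanical rewrite: the lambda now mutates a captured list.
            // Harmless on a sequential stream; an accident waiting to happen
            // if the stream handed to this code ever turns out to be parallel.
            List<String> statefulResult = new ArrayList<>();
            words.stream()
                 .filter(w -> w.length() > 1)
                 .forEach(statefulResult::add);

            // Stateless rewrite: let a collector do the accumulating.
            List<String> collected = words.stream()
                                          .filter(w -> w.length() > 1)
                                          .collect(Collectors.toList());

            System.out.println(loopResult + " " + statefulResult + " " + collected);
        }
    }
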
> > On 3/22/2013 2:26 PM, Joe Bowbeer wrote: > >> Doug writes >> >> >don't belong in any stream op not called forEach >> >> I'm with you there. >> >> Will we be able to advertise that one can easily rewrite any 'for' loop >> using each()? >> >> This is one of those useful talking points in the introductory articles: >> See, these new features aren't completely alien. You can take any for >> loop and transform it like so... If so, then forEach is an apt, >> intuitive name. Otherwise, some distance is needed. >> >> Joe >> >> On Mar 22, 2013 11:57 AM, "Doug Lea"
> > wrote: >> >> On 03/22/13 10:07, Joe Bowbeer wrote: >> >> Stateful programming has its issues but that ship has already >> sailed (in Java). >> >> >> Although it is worth bearing in mind that most stream functionality >> wrt Collections exploits the fact that operations within >> traversals are already known to avoid some of the worst unexpected >> side-effects -- mutating a collection while you are traversing. >> Which normally leads to ConcurrentModificationExceptio**__n for >> iterators. A variant of this is preserved when applicable >> in Spliterator implementations. People learn quickly to avoid them. >> (That's the subject of some of the specs Paul Sandoz has been >> adding, which can't be nailed down very well in general because >> they are quality-of-implementation issues, but he is trying anyway :-) >> >> Anyway, as the chief advocate for cool mutative algorithmics >> in this group, I'm still in favor of saying they don't belong >> in any stream op not called forEach. >> >> -Doug >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130322/79a722c9/attachment.html From brian.goetz at oracle.com Fri Mar 22 16:29:00 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 22 Mar 2013 19:29:00 -0400 Subject: Simplifying sequential() / parallel() In-Reply-To: References: <514B63E5.8090101@oracle.com> <514C5D9E.9060500@oracle.com> <514C7F25.1090609@cs.oswego.edu> <514CA3EC.5040305@oracle.com> Message-ID: <514CE93C.30005@oracle.com> > One of the design principles for this API is that parallel transforms > will not be automatic. There is a parallel() method for that. This > hasn't changed, right? Correct. Parallelism is always explicit. > It seems to me that the ground rules are changing, driven by some latent > aspects of the implementation. But I'll have to see if/how these changes > affect my sample code before I can respond. No, it has nothing to do with the implementation. It was identified that complexities in the model -- which were no longer as important because of other past simplifications -- could be removed, making the user model and specification simpler. As often happens, this also makes the implementation simpler and more performant, but that's a secondary benefit. The reality was that the existing sequential() / parallel() were overly general, complicated, and did not admit efficient implementation. From spullara at gmail.com Fri Mar 22 18:44:06 2013 From: spullara at gmail.com (Sam Pullara) Date: Fri, 22 Mar 2013 18:44:06 -0700 Subject: Survey on map/flatMap disambiguation In-Reply-To: <514CC164.9040600@univ-mlv.fr> References: <513D1C6D.40906@oracle.com> <5142032F.9070601@oracle.com> <51472467.6090502@oracle.com> <514882CF.8000308@oracle.com> <514C9F6D.8090902@oracle.com> <514CC164.9040600@univ-mlv.fr> Message-ID: <25BFE6C0-BBE4-4895-8973-B78F16067A31@gmail.com> I'm for putting in the mapToInt etc. Anything to stop type inference from slowing me down. Most of the issues I run into when building things are mismatches between various overloads. Sam On Mar 22, 2013, at 1:39 PM, Remi Forax wrote: > On 03/22/2013 07:14 PM, Brian Goetz wrote: >> There was no further discussion on this thread, so given that there was already a poll in favor, and little followup afterwards, I'm inclined to push this change. > > As Joe said, filter/map/reduce are common concepts, > we should keep them simple. 
> > flatMap doesn't fall in that category so adding several overloads to help the compiler > is just something necessary not something which is a design that we should apply > on map. > > please, we should try to keep the common operation simple if we can. > > R?mi > >> >> >> On 3/19/2013 11:22 AM, Brian Goetz wrote: >>> Thanks for writing up your concerns all in one place. >>> >>>> 1. Performance gotchas? >>> >>> Of these, I think this is the worst of the concerns. However, I also >>> think its not as bad as it sounds. When a user hits ctrl-space in the >>> IDE, they'll see (close to each other) all the mapToXxx forms, which has >>> actually a lot of educational value. [1] >>> >>> Secondly, while boxed performance is definitely much worse than >>> non-boxed, for small streams (and most collections are small), it might >>> not make a difference anyway. I've definitely had the experience many >>> many times of discovering "egregious" performance "bugs" like this that >>> turned out to have no effect on actual business-relevant performance >>> metrics, because all the cost was in the { XML parsing, database access, >>> network latency, crypto, etc }. For those cases where it does make a >>> difference, profiling will disclose this immediately. So its a gotcha, >>> but not a disaster. >>> >>> So I think its a gotcha but nothing so bad that this makes the >>> difference for me. >>> >>>> 2. Consistency. What about other similar methods? >>>> >>>> My main concern is that this change will (or should?) cascade to other >>>> methods, and, in the end, a patch that need only be applied to flatMap >>>> ripples through the entire API. >>> >>> I did a quick look and didn't see any other obvious examples of this >>> pattern, where we've overloaded methods for all the X-to-Y stream >>> conversions. Did you find any ripples I missed? >>> >>>> 3. Breaks map/reduce functional feel? >>>> >>>> The other concern that I stated previously is that this mars the >>>> familiar map/reduce functional feel, without helping enough with the >>>> usability or readability. Either way, an IDE will be indispensable. >>> >>> This is obviously subjective but for me it felt OK. >>> >>> >>> >>> [1] Educating people that there are multiple stream shapes, whose >>> methods are similar but not exactly the same, is important. Calling all >>> the methods "map" may make people believe that there's a sum() method on >>> Stream, and be surprised when there is not. But .mapToInt(...).sum() >>> makes it more obvious what is going on here, which is arguably a plus. >>> > From spullara at gmail.com Sun Mar 24 12:32:15 2013 From: spullara at gmail.com (Sam Pullara) Date: Sun, 24 Mar 2013 12:32:15 -0700 Subject: Iterable/Iterator.stream() Message-ID: I was working with Brian on seeing how limit/substream functionality[1] might be implemented and he suggested conversion to Iterator was the right way to go about it. I had thought about that solution but didn't find any obvious way to take an iterator and turn it into a stream. It turns out it is in there, you just need to first convert the iterator to a spliterator and then convert the spliterator to a stream. So this brings me to revisit the whether we should have these hanging off one of Iterable/Iterator directly or both. 
My suggestion is to at least have it on Iterator so you can move cleanly between the two worlds and it would also be easily discoverable rather than having to do: Streams.stream(Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED)) Sam [1] https://github.com/spullara/java-future-jdk8/blob/master/src/main/java/spullara/util/Limiter.java From forax at univ-mlv.fr Sun Mar 24 14:44:42 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 24 Mar 2013 22:44:42 +0100 Subject: Iterable/Iterator.stream() In-Reply-To: References: Message-ID: <514F73CA.1060003@univ-mlv.fr> On 03/24/2013 08:32 PM, Sam Pullara wrote: > I was working with Brian on seeing how limit/substream > functionality[1] might be implemented and he suggested conversion to > Iterator was the right way to go about it. I had thought about that > solution but didn't find any obvious way to take an iterator and turn > it into a stream. It turns out it is in there, you just need to first > convert the iterator to a spliterator and then convert the spliterator > to a stream. So this brings me to revisit the whether we should have > these hanging off one of Iterable/Iterator directly or both. > > My suggestion is to at least have it on Iterator so you can move > cleanly between the two worlds and it would also be easily > discoverable rather than having to do: > > Streams.stream(Spliterators.spliteratorUnknownSize(iterator, > Spliterator.ORDERED)) > > Sam > > [1] https://github.com/spullara/java-future-jdk8/blob/master/src/main/java/spullara/util/Limiter.java It's not really a good sign of the heath of the API. it seems that for a lot of problems there is no way to write a good Spliterator directly, i.e. it's better to write an Iterator and let the JDK create the Spliterator around it, R?mi From brian.goetz at oracle.com Sun Mar 24 14:51:49 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 24 Mar 2013 17:51:49 -0400 Subject: Iterable/Iterator.stream() In-Reply-To: <514F73CA.1060003@univ-mlv.fr> References: <514F73CA.1060003@univ-mlv.fr> Message-ID: I think Sam's point was that there are plenty of library classes that give you an Iterator but don't let you necessarily write your own spliterator. So all you can do is call stream(spliteratorUnknownSize(iterator)). Sam is suggesting that we define Iterator.stream() to do that for you. I would like to keep the stream() and spliterator() methods as being for library writers / advanced users. On Mar 24, 2013, at 5:44 PM, Remi Forax wrote: > On 03/24/2013 08:32 PM, Sam Pullara wrote: >> I was working with Brian on seeing how limit/substream >> functionality[1] might be implemented and he suggested conversion to >> Iterator was the right way to go about it. I had thought about that >> solution but didn't find any obvious way to take an iterator and turn >> it into a stream. It turns out it is in there, you just need to first >> convert the iterator to a spliterator and then convert the spliterator >> to a stream. So this brings me to revisit the whether we should have >> these hanging off one of Iterable/Iterator directly or both. >> >> My suggestion is to at least have it on Iterator so you can move >> cleanly between the two worlds and it would also be easily >> discoverable rather than having to do: >> >> Streams.stream(Spliterators.spliteratorUnknownSize(iterator, >> Spliterator.ORDERED)) >> >> Sam >> >> [1] https://github.com/spullara/java-future-jdk8/blob/master/src/main/java/spullara/util/Limiter.java > > It's not really a good sign of the heath of the API. 
> > it seems that for a lot of problems there is no way to write a good Spliterator directly, > i.e. it's better to write an Iterator and let the JDK create the Spliterator around it, > > R?mi > From forax at univ-mlv.fr Sun Mar 24 15:53:54 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 24 Mar 2013 23:53:54 +0100 Subject: Iterable/Iterator.stream() In-Reply-To: References: <514F73CA.1060003@univ-mlv.fr> Message-ID: <514F8402.1060909@univ-mlv.fr> On 03/24/2013 10:51 PM, Brian Goetz wrote: > I think Sam's point was that there are plenty of library classes that give you an Iterator but don't let you necessarily write your own spliterator. So all you can do is call stream(spliteratorUnknownSize(iterator)). Sam is suggesting that we define Iterator.stream() to do that for you. The problem is how to pass the Spliterator flags. > > I would like to keep the stream() and spliterator() methods as being for library writers / advanced users. Given that writing a Spliterator is easier than writing an Iterator, I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :) R?mi > > On Mar 24, 2013, at 5:44 PM, Remi Forax wrote: > >> On 03/24/2013 08:32 PM, Sam Pullara wrote: >>> I was working with Brian on seeing how limit/substream >>> functionality[1] might be implemented and he suggested conversion to >>> Iterator was the right way to go about it. I had thought about that >>> solution but didn't find any obvious way to take an iterator and turn >>> it into a stream. It turns out it is in there, you just need to first >>> convert the iterator to a spliterator and then convert the spliterator >>> to a stream. So this brings me to revisit the whether we should have >>> these hanging off one of Iterable/Iterator directly or both. >>> >>> My suggestion is to at least have it on Iterator so you can move >>> cleanly between the two worlds and it would also be easily >>> discoverable rather than having to do: >>> >>> Streams.stream(Spliterators.spliteratorUnknownSize(iterator, >>> Spliterator.ORDERED)) >>> >>> Sam >>> >>> [1] https://github.com/spullara/java-future-jdk8/blob/master/src/main/java/spullara/util/Limiter.java >> It's not really a good sign of the heath of the API. >> >> it seems that for a lot of problems there is no way to write a good Spliterator directly, >> i.e. it's better to write an Iterator and let the JDK create the Spliterator around it, >> >> R?mi >> From brian.goetz at oracle.com Sun Mar 24 16:04:30 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 24 Mar 2013 19:04:30 -0400 Subject: Iterable/Iterator.stream() In-Reply-To: <514F8402.1060909@univ-mlv.fr> References: <514F73CA.1060003@univ-mlv.fr> <514F8402.1060909@univ-mlv.fr> Message-ID: <3644BE54-1B34-4D4E-B91F-CF7D3B7748C8@oracle.com> > Given that writing a Spliterator is easier than writing an Iterator, > I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :) You're missing the point, though. There are zillions of classes out there that *already* hand you an Iterator. And many of them are not spliterator-ready. 
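For readers following the thread, a self-contained sketch of the conversion in question. The helper below is a hypothetical stand-in for the Iterator.stream() default Sam proposes; StreamSupport.stream is used here as the public entry point, while the lambda builds referenced above expose the same thing as Streams.stream.

    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.Spliterator;
    import java.util.Spliterators;
    import java.util.stream.Stream;
    import java.util.stream.StreamSupport;

    public class IteratorToStreamSketch {
        // What a hypothetical Iterator.stream() default might delegate to:
        // wrap the iterator as an ORDERED spliterator of unknown size, then
        // build a sequential stream over it.
        static <T> Stream<T> stream(Iterator<T> iterator) {
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED),
                    false); // false = sequential
        }

        public static void main(String[] args) {
            Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
            long count = stream(it).filter(s -> !"b".equals(s)).count();
            System.out.println(count); // 2
        }
    }
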
From forax at univ-mlv.fr Sun Mar 24 16:34:47 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 25 Mar 2013 00:34:47 +0100 Subject: Iterable/Iterator.stream() In-Reply-To: <3644BE54-1B34-4D4E-B91F-CF7D3B7748C8@oracle.com> References: <514F73CA.1060003@univ-mlv.fr> <514F8402.1060909@univ-mlv.fr> <3644BE54-1B34-4D4E-B91F-CF7D3B7748C8@oracle.com> Message-ID: <514F8D97.8030207@univ-mlv.fr> On 03/25/2013 12:04 AM, Brian Goetz wrote: >> Given that writing a Spliterator is easier than writing an Iterator, >> I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :) > You're missing the point, though. There are zillions of classes out there that *already* hand you an Iterator. And many of them are not spliterator-ready. > yes, yes, but for Sam or Ben Evans (on lambda-dev), there were no Iterator available but they end up to write a new one. R?mi From Donald.Raab at gs.com Sun Mar 24 23:24:28 2013 From: Donald.Raab at gs.com (Raab, Donald) Date: Mon, 25 Mar 2013 02:24:28 -0400 Subject: GS Collections 3.0 Released Message-ID: <6712820CB52CFB4D842561213A77C05404C9708CE5@GSCMAMP09EX.firmwide.corp.gs.com> Hi All, We recently released GS Collections 3.0, which now contains a rich set of primitive containers including list, set, bag, map, stack and interval. We added support for all primitive types. High level release notes, binary/source download, JavaDoc and JDiff reports are available in GitHub here: https://github.com/goldmansachs/gs-collections/wiki/3.0-Release-Notes I've also updated our kata solutions to use the new primitive container support. Some usage examples can be seen in the following exercises: https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise5Test.java https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise6Test.java https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise7Test.java https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise9Test.java Thanks, Don -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130325/af20a29f/attachment.html From dl at cs.oswego.edu Mon Mar 25 16:30:17 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 25 Mar 2013 19:30:17 -0400 Subject: sorting and stability Message-ID: <5150DE09.3020505@cs.oswego.edu> Prodded by a lambda-dev post, I've been looking more carefully into stability guarantees for Stream.sorted(). It struck me that if the stream is ORDERED, then people have a reasonable expectation that the sort will be stable. Else not. Agreed? Algorithmically, the cost of this might on average be zero. Even though it does add cost in general, it can be made immeasurably small when there are no duplicate keys, and as fast as any other strategy when there are duplicates. I've done some preliminary versions of this that look pretty good. Barely-working hacks are already faster than the existing sorts in there now (ArraysParallelSortHelpers) that were based on my ParallelArray versions. But serious versions are not very close to being ready. Do we want to solidify specs anyway? 
-Doug From joe.bowbeer at gmail.com Mon Mar 25 18:31:51 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 25 Mar 2013 21:31:51 -0400 Subject: sorting and stability In-Reply-To: <5150DE09.3020505@cs.oswego.edu> References: <5150DE09.3020505@cs.oswego.edu> Message-ID: Sequential streams should be stable in all cases? This is the principle of least surprise. On Mar 25, 2013 7:30 PM, "Doug Lea"
wrote: > > Prodded by a lambda-dev post, I've been looking more > carefully into stability guarantees for Stream.sorted(). > > It struck me that if the stream is ORDERED, then people > have a reasonable expectation that the sort will be stable. > Else not. > > Agreed? > > Algorithmically, the cost of this might on average be zero. > Even though it does add cost in general, it can be made > immeasurably small when there are no duplicate keys, and > as fast as any other strategy when there are duplicates. > > I've done some preliminary versions of this that look > pretty good. Barely-working hacks are already faster than the > existing sorts in there now (ArraysParallelSortHelpers) > that were based on my ParallelArray versions. > But serious versions are not very close to being ready. > > Do we want to solidify specs anyway? > > -Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130325/4c9f45ab/attachment.html From tim at peierls.net Tue Mar 26 07:26:19 2013 From: tim at peierls.net (Tim Peierls) Date: Tue, 26 Mar 2013 10:26:19 -0400 Subject: sorting and stability In-Reply-To: <5150DE09.3020505@cs.oswego.edu> References: <5150DE09.3020505@cs.oswego.edu> Message-ID: On Mon, Mar 25, 2013 at 7:30 PM, Doug Lea
wrote: > Prodded by a lambda-dev post, I've been looking more > carefully into stability guarantees for Stream.sorted(). > > It struck me that if the stream is ORDERED, then people > have a reasonable expectation that the sort will be stable. > Else not. > > Agreed? > Yes. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130326/14a8aacd/attachment.html From mike.duigou at oracle.com Tue Mar 26 23:00:59 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Tue, 26 Mar 2013 23:00:59 -0700 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> Message-ID: I've pushed another update that has putIfAbsent treat null values as absent. This allows putIfAbsent to be restored in a couple of the defaults. This feels like the right thing to do though some null-lover will eventually be unhappy that his null value got trampled. The remove() and replace() methods still distinguish between absent keys and present keys with null values. I think this is appropriate. The alternative to treating null values as absent is to add additional code to distinguish between absent and nulls values everywhere. I attempted this in http://hg.openjdk.java.net/lambda/lambda/jdk/rev/8561d74a9e8f It wasn't much more successful because the value mappers use null for signalling. The current specification and behaviour seems about as good as can be achieved while still tolerating nulls and without introducing Optional. Mike On Mar 25 2013, at 16:41 , Mike Duigou wrote: > My intention with the first cut of the unit test was to avoid changing the method specification while having consistent behaviour across all of the Map implementations. It's certainly possible I made mistakes and changed the contract of the defaults in ways I didn't intend. > > I think perhaps that changing putIfAbsent to allow replacement of null values would help a lot to making the operations more consistent. I am also going to look at Sam's suggestion of using replace() rather than put() in a few places. Using replace() almost ensures though that we'll introduce more retry loop implementations. This may be fine though. > > Mike > > > On Mar 25 2013, at 14:57 , Peter Levart wrote: > >> Hi Mike, >> >> I see default Map.computeIfAbsent has been chosen to not be atomic and rather support null values more naturally. That's ok if JDK ConcurrentMap implementations provide atomic overrides. Other 3rd party implementations will follow soon. >> >> But I have doubts about default Map.compute(). On one hand it contains the usual disclaimer: >> >> 900 *

The default implementation makes no guarantees about >> 901 * synchronization or atomicity properties of this method or the >> 902 * application of the remapping function. Any class overriding >> 903 * this method must specify its concurrency properties. In >> 904 * particular, all implementations of subinterface {@link >> 905 * java.util.concurrent.ConcurrentMap} must document whether the >> 906 * function is applied exactly once atomically. Any class that >> 907 * permits null values must document whether and how this method >> 908 * distinguishes absence from null mappings. >> >> ...but on the other hand it tries to be smart: >> >> 924 * In concurrent contexts, the default implementation may retry >> 925 * these steps when multiple threads attempt updates. >> >> Now this last sentence may indicate that a ConcurrentMap implementation that does not override compute() might safely be used with default Map.compute() and be atomic. It was atomic until putIfAbsent() was replaced with plain put() to correct the "existing null value" behavior. The retry-loop is only needed when the optimistic operation is not successful. But put() is not optimistic. It always "succeeds". And retrying after the put() only makes things worse: "Oh, somebody was quicker than me and I have just overwritten his value - never mind, I'll try to make some more damage in next loop..." >> >> If the damage was done already, then there's no point in repeating the loop. Further, what's the point in using optimistic operations: replace(key, oldValue, newValue) and remove(key, oldValue) with retry-loop on one hand and then just stomping over with put() on the other hand. If the default Map.compute() method is declared non-atomic, then plain put(key, newValue) instead of replace(key, oldValue, newValue) and remove(key) instead of remove(key, oldValue) could be used and no retry-loop... >> >> The same goes for default Map.merge(). >> >> Regards, Peter >> >> P.S. What do you think about changing the specification of putIfAbsent to always overwrite null values? It could make all these things simpler and more consistent. And it would not break anything. >> >> >> On 03/25/2013 08:21 PM, mike.duigou at oracle.com wrote: >>> Changeset: c8d40b7e6de3 >>> Author: mduigou >>> Date: 2013-03-20 20:32 -0700 >>> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/c8d40b7e6de3 >>> >>> bug fixes and unit test for Map Defaults >>> >>> ! src/share/classes/java/util/HashMap.java >>> ! src/share/classes/java/util/Map.java >>> + test/java/util/Map/Defaults.java >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130326/06c52bec/attachment.html From joe.bowbeer at gmail.com Tue Mar 26 23:40:32 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 27 Mar 2013 02:40:32 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> Message-ID: I was not expecting Map to define a default impl. for what was originally a ConcurrentMap method, btw. If it does, however, and if Map supports null - as all the base Collections do - then shouldn't all the default method implementations support null as well? (Support is the correct term here, as the Collections spec makes no mention of tolerate.) In Map there is a distinction between !contains(key) and get(key)==null. 
I think putIfAbsent is even defined in terms of contains() and not get(). How can we justify not fully supporting null in some Map methods? I can't come up with a rationale or a loophole to exploit. On Mar 27, 2013 12:01 AM, "Mike Duigou" wrote: > I've pushed another update that has putIfAbsent treat null values as > absent. This allows putIfAbsent to be restored in a couple of the defaults. > This feels like the right thing to do though some null-lover will > eventually be unhappy that his null value got trampled. The remove() and > replace() methods still distinguish between absent keys and present keys > with null values. I think this is appropriate. > > The alternative to treating null values as absent is to add additional > code to distinguish between absent and nulls values everywhere. I attempted > this in > > http://hg.openjdk.java.net/lambda/lambda/jdk/rev/8561d74a9e8f > > It wasn't much more successful because the value mappers use null for > signalling. > > The current specification and behaviour seems about as good as can be > achieved while still tolerating nulls and without introducing Optional. > > Mike > > On Mar 25 2013, at 16:41 , Mike Duigou wrote: > > My intention with the first cut of the unit test was to avoid changing the > method specification while having consistent behaviour across all of the > Map implementations. It's certainly possible I made mistakes and changed > the contract of the defaults in ways I didn't intend. > > I think perhaps that changing putIfAbsent to allow replacement of null > values would help a lot to making the operations more consistent. I am also > going to look at Sam's suggestion of using replace() rather than put() in a > few places. Using replace() almost ensures though that we'll introduce more > retry loop implementations. This may be fine though. > > Mike > > > On Mar 25 2013, at 14:57 , Peter Levart wrote: > > Hi Mike, > > I see default Map.computeIfAbsent has been chosen to not be atomic and > rather support null values more naturally. That's ok if JDK ConcurrentMap > implementations provide atomic overrides. Other 3rd party implementations > will follow soon. > > But I have doubts about default Map.compute(). On one hand it contains > the usual disclaimer: > > 900 *

The default implementation makes no guarantees about > 901 * synchronization or atomicity properties of this method or > the > 902 * application of the remapping function. Any class > overriding > 903 * this method must specify its concurrency properties. In > 904 * particular, all implementations of subinterface {@link > 905 * java.util.concurrent.ConcurrentMap} must document whether > the > 906 * function is applied exactly once atomically. Any class > that > 907 * permits null values must document whether and how this > method > 908 * distinguishes absence from null mappings. > > ...but on the other hand it tries to be smart: > > 924 * In concurrent contexts, the default implementation may > retry > 925 * these steps when multiple threads attempt updates. > > Now this last sentence may indicate that a ConcurrentMap implementation > that does not override compute() might safely be used with default > Map.compute() and be atomic. It was atomic until putIfAbsent() was replaced > with plain put() to correct the "existing null value" behavior. The > retry-loop is only needed when the optimistic operation is not successful. > But put() is not optimistic. It always "succeeds". And retrying after the > put() only makes things worse: "Oh, somebody was quicker than me and I have > just overwritten his value - never mind, I'll try to make some more damage > in next loop..." > > If the damage was done already, then there's no point in repeating the > loop. Further, what's the point in using optimistic operations: > replace(key, oldValue, newValue) and remove(key, oldValue) with retry-loop > on one hand and then just stomping over with put() on the other hand. If > the default Map.compute() method is declared non-atomic, then plain > put(key, newValue) instead of replace(key, oldValue, newValue) and > remove(key) instead of remove(key, oldValue) could be used and no > retry-loop... > > The same goes for default Map.merge(). > > Regards, Peter > > P.S. What do you think about changing the specification of putIfAbsent to > always overwrite null values? It could make all these things simpler and > more consistent. And it would not break anything. > > > On 03/25/2013 08:21 PM, mike.duigou at oracle.com wrote: > > Changeset: c8d40b7e6de3 > Author: mduigou > Date: 2013-03-20 20:32 -0700 > URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/c8d40b7e6de3 > > bug fixes and unit test for Map Defaults > > ! src/share/classes/java/util/HashMap.java > ! src/share/classes/java/util/Map.java > + test/java/util/Map/Defaults.java > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130327/33644566/attachment.html From joe.bowbeer at gmail.com Wed Mar 27 08:45:57 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 27 Mar 2013 11:45:57 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: <51529EE1.1080003@gmail.com> References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> <51529EE1.1080003@gmail.com> Message-ID: putIfAbsent was not designed to be compatible with null values. It was designed for the null-incompatible ConcurrentMap interface. I don't think a null-compatible implementation of CM is possible, BTW. So the prospects for putIfAbsent in Map are limited from the start. In your analysis you assume that the user cares about the return value (which is where the problem arises) but might this not be the case? 
Joe On Mar 27, 2013 1:25 AM, "Peter Levart" wrote: > On 03/27/2013 07:40 AM, Joe Bowbeer wrote: > > I was not expecting Map to define a default impl. for what was originally > a ConcurrentMap method, btw. > > If it does, however, and if Map supports null - as all the base > Collections do - then shouldn't all the default method implementations > support null as well? (Support is the correct term here, as the Collections > spec makes no mention of tolerate.) > > > The new definition still supports null, but differently. > > In Map there is a distinction between !contains(key) and get(key)==null. > I think putIfAbsent is even defined in terms of contains() and not get(). > > > It is defined, yes, but the part about null values "wasn't exploited in > reality", since all ConcurrentMap implementations in JDK don't support null > values and I haven't yet seen one that does. Have you? The debate is about > whether it is appropriate to *change* the definition of putIfAbsent. I > think it is. Because now, that putIfAbsent is being defined for plain Maps > which we know do support null values, the question about nulls becomes > relevant. Which is the most appropriate definition of putIfAbsent for Maps > that do support null values (the only constraint being that method > signature should not change)? Current definition (as defined until now) is > dubious, since when putIfAbsent returns null, the user does not know > whether null has already been mapped and was therefore not replaced with > new value or there was no mapping yet and new value has been entered into > the Map. This non-determinism is not making such putIfAbsent usable at all. > With new definition, the null return is more deterministic. You still don't > know whether there was already a mapping to null present in the map or > there was no mapping, but you are certain that new value landed in the Map. > I think this is more appropriate. In either way, if one wants to use > putIfAbsent, he should not put null values in the Map. > > Regards, Peter > > How can we justify not fully supporting null in some Map methods? > > I can't come up with a rationale or a loophole to exploit. > On Mar 27, 2013 12:01 AM, "Mike Duigou" wrote: > >> I've pushed another update that has putIfAbsent treat null values as >> absent. This allows putIfAbsent to be restored in a couple of the defaults. >> This feels like the right thing to do though some null-lover will >> eventually be unhappy that his null value got trampled. The remove() and >> replace() methods still distinguish between absent keys and present keys >> with null values. I think this is appropriate. >> >> The alternative to treating null values as absent is to add additional >> code to distinguish between absent and nulls values everywhere. I attempted >> this in >> >> http://hg.openjdk.java.net/lambda/lambda/jdk/rev/8561d74a9e8f >> >> It wasn't much more successful because the value mappers use null for >> signalling. >> >> The current specification and behaviour seems about as good as can be >> achieved while still tolerating nulls and without introducing Optional. >> >> Mike >> >> On Mar 25 2013, at 16:41 , Mike Duigou wrote: >> >> My intention with the first cut of the unit test was to avoid changing >> the method specification while having consistent behaviour across all of >> the Map implementations. It's certainly possible I made mistakes and >> changed the contract of the defaults in ways I didn't intend. 
>> >> I think perhaps that changing putIfAbsent to allow replacement of null >> values would help a lot to making the operations more consistent. I am also >> going to look at Sam's suggestion of using replace() rather than put() in a >> few places. Using replace() almost ensures though that we'll introduce more >> retry loop implementations. This may be fine though. >> >> Mike >> >> >> On Mar 25 2013, at 14:57 , Peter Levart wrote: >> >> Hi Mike, >> >> I see default Map.computeIfAbsent has been chosen to not be atomic and >> rather support null values more naturally. That's ok if JDK ConcurrentMap >> implementations provide atomic overrides. Other 3rd party implementations >> will follow soon. >> >> But I have doubts about default Map.compute(). On one hand it contains >> the usual disclaimer: >> >> 900 *

The default implementation makes no guarantees about >> 901 * synchronization or atomicity properties of this method >> or the >> 902 * application of the remapping function. Any class >> overriding >> 903 * this method must specify its concurrency properties. In >> 904 * particular, all implementations of subinterface {@link >> 905 * java.util.concurrent.ConcurrentMap} must document >> whether the >> 906 * function is applied exactly once atomically. Any class >> that >> 907 * permits null values must document whether and how this >> method >> 908 * distinguishes absence from null mappings. >> >> ...but on the other hand it tries to be smart: >> >> 924 * In concurrent contexts, the default implementation may >> retry >> 925 * these steps when multiple threads attempt updates. >> >> Now this last sentence may indicate that a ConcurrentMap implementation >> that does not override compute() might safely be used with default >> Map.compute() and be atomic. It was atomic until putIfAbsent() was replaced >> with plain put() to correct the "existing null value" behavior. The >> retry-loop is only needed when the optimistic operation is not successful. >> But put() is not optimistic. It always "succeeds". And retrying after the >> put() only makes things worse: "Oh, somebody was quicker than me and I have >> just overwritten his value - never mind, I'll try to make some more damage >> in next loop..." >> >> If the damage was done already, then there's no point in repeating the >> loop. Further, what's the point in using optimistic operations: >> replace(key, oldValue, newValue) and remove(key, oldValue) with retry-loop >> on one hand and then just stomping over with put() on the other hand. If >> the default Map.compute() method is declared non-atomic, then plain >> put(key, newValue) instead of replace(key, oldValue, newValue) and >> remove(key) instead of remove(key, oldValue) could be used and no >> retry-loop... >> >> The same goes for default Map.merge(). >> >> Regards, Peter >> >> P.S. What do you think about changing the specification of putIfAbsent to >> always overwrite null values? It could make all these things simpler and >> more consistent. And it would not break anything. >> >> >> On 03/25/2013 08:21 PM, mike.duigou at oracle.com wrote: >> >> Changeset: c8d40b7e6de3 >> Author: mduigou >> Date: 2013-03-20 20:32 -0700 >> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/c8d40b7e6de3 >> >> bug fixes and unit test for Map Defaults >> >> ! src/share/classes/java/util/HashMap.java >> ! src/share/classes/java/util/Map.java >> + test/java/util/Map/Defaults.java >> >> >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130327/26cfadfb/attachment-0001.html From dl at cs.oswego.edu Wed Mar 27 09:05:15 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 27 Mar 2013 12:05:15 -0400 Subject: sorting and stability In-Reply-To: <5150DE09.3020505@cs.oswego.edu> References: <5150DE09.3020505@cs.oswego.edu> Message-ID: <515318BB.2030805@cs.oswego.edu> On 03/25/13 19:30, Doug Lea wrote: > It struck me that if the stream is ORDERED, then people > have a reasonable expectation that the sort will be stable. > Else not. > > Agreed? Apparently everyone does. > > Algorithmically, the cost of this might on average be zero. 
> Even though it does add cost in general, it can be made > immeasurably small when there are no duplicate keys, and > as fast as any other strategy when there are duplicates. This still holds for versions ready enough to tentatively commit. Paul will help integrate into lambda repo within a few days. A few other notes on implementation: * While I was at it, I tested split thresholds more carefully. Unsurprisingly (in retrospect), you need values big enough to outweigh memory contention across tasks, which (because of cardmarks) is bigger than you'd like -- it is now set to 8K. * The previous versions required temp workspace arrays as large as source array, even if only a portion was being sorted. Now they don't. * The new versions use CountedCompleters, which makes them stall much less on GC etc. * The new versions bypass par sort entirely if size < threshold, which bypasses needless workspace array allocation. * I ran these all through a modified version of the jtreg "Sorting" test that replaces "sort" with "parallelSort" everywhere (plus uses larger array sizes to test), that includes stability checks. Someone might want to add something like this to jtreg. -Doug From dl at cs.oswego.edu Wed Mar 27 09:35:12 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 27 Mar 2013 12:35:12 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> <51529EE1.1080003@gmail.com> Message-ID: <51531FC0.5080604@cs.oswego.edu> On 03/27/13 11:45, Joe Bowbeer wrote: > putIfAbsent was not designed to be compatible with null values. It was designed > for the null-incompatible ConcurrentMap interface. I don't think a > null-compatible implementation of CM is possible, BTW. > This is what everyone mis-remembers. Somehow I let someone talk me into not strictly banishing nulls in ConcurrentMap interface specs, even though some methods are not very useful if they are allowed, and no actual implementation I know allows them. The defaults for the new stream-collect-friendly Map methods preserve this state of affairs. -Doug From joe.bowbeer at gmail.com Wed Mar 27 21:51:57 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 28 Mar 2013 00:51:57 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: <51531FC0.5080604@cs.oswego.edu> References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> <51529EE1.1080003@gmail.com> <51531FC0.5080604@cs.oswego.edu> Message-ID: The way I'm interpreting this is: 1. ConcurrentMap effectively prohibits nulls (feel free to make analogies with effective thread safety;) 2. Map is not. It accommodates nulls. I'm not happy with this current middle ground taken by Map's putIfAbsent. I think it either needs to (1) accommodate nulls to the extent that it can, according to the letter of the current definition, which leaves the return value useless, or (2) we should redefine putIfAbsent, removing the !contains part and substituting get==null. As it stands, I can't figure out what it does without legacy information and implementation details, which is not satisfying. On Mar 27, 2013 10:35 AM, "Doug Lea"

wrote: > On 03/27/13 11:45, Joe Bowbeer wrote: > >> putIfAbsent was not designed to be compatible with null values. It was >> designed >> for the null-incompatible ConcurrentMap interface. I don't think a >> null-compatible implementation of CM is possible, BTW. >> >> > This is what everyone mis-remembers. Somehow I let someone talk > me into not strictly banishing nulls in ConcurrentMap interface > specs, even though some methods are not very useful if they are > allowed, and no actual implementation I know allows them. > > The defaults for the new stream-collect-friendly Map methods > preserve this state of affairs. > > -Doug > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/f3b9d272/attachment.html From dl at cs.oswego.edu Thu Mar 28 04:22:20 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 28 Mar 2013 07:22:20 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> <51529EE1.1080003@gmail.com> <51531FC0.5080604@cs.oswego.edu> Message-ID: <515427EC.2030707@cs.oswego.edu> On 03/28/13 00:51, Joe Bowbeer wrote: > The way I'm interpreting this is: > > 1. ConcurrentMap effectively prohibits nulls (feel free to make analogies with > effective thread safety;) No, it has exactly the same issues as the added Map methods. If you use it with a null-value-accepting Map, then it does something that meets spec, but that something might not be what you have in mind. Backing up: the reason these are added to Map is that without them, streams can't reasonably implement map-related collect operations. Would you rather we kill them? And it is not just Streams; no one else can write them either without knowing exactly which concrete Map class they have. This is one of the reasons I ranted so much last year about why banishing nulls from streams was simplest path to API sanity. That is, if we can't banish them from Collections/Maps, we could at least do so as stream sources/destinations. Given that we didn't do this, there's a continual set of subsidiary issues like this one. -Doug From joe.bowbeer at gmail.com Thu Mar 28 07:15:18 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 28 Mar 2013 10:15:18 -0400 Subject: hg: lambda/lambda/jdk: bug fixes and unit test for Map Defaults In-Reply-To: <515427EC.2030707@cs.oswego.edu> References: <20130325192220.C64A0483B3@hg.openjdk.java.net> <5150C836.6060204@gmail.com> <51529EE1.1080003@gmail.com> <51531FC0.5080604@cs.oswego.edu> <515427EC.2030707@cs.oswego.edu> Message-ID: What about #2 which lists two options? I think we should choose one or the other option. We shouldn't choose the middle ground without changing the description, which would remove the middle ground. On Mar 28, 2013 5:22 AM, "Doug Lea"
wrote: > On 03/28/13 00:51, Joe Bowbeer wrote: > >> The way I'm interpreting this is: >> >> 1. ConcurrentMap effectively prohibits nulls (feel free to make analogies >> with >> effective thread safety;) >> > > No, it has exactly the same issues as the added Map methods. > If you use it with a null-value-accepting Map, then it > does something that meets spec, but that something might not > be what you have in mind. > > Backing up: the reason these are added to Map is that without > them, streams can't reasonably implement map-related collect > operations. Would you rather we kill them? And it is > not just Streams; no one else can write them either without > knowing exactly which concrete Map class they have. > > This is one of the reasons I ranted so much last year about > why banishing nulls from streams was simplest path to > API sanity. That is, if we can't banish them from > Collections/Maps, we could at least do so as stream > sources/destinations. Given that we didn't do this, > there's a continual set of subsidiary issues like this > one. > > -Doug > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/36883c7d/attachment.html From paul.sandoz at oracle.com Thu Mar 28 08:59:52 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 28 Mar 2013 16:59:52 +0100 Subject: RFR JDK-8010096 : Initial java.util.Spliterator putback Message-ID: <09A8DF98-6FF6-452E-8150-E86D9113E580@oracle.com> Hi, http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8010096 Webrev: http://cr.openjdk.java.net/~psandoz/lambda/spliterator/jdk-8010096/webrev/ Spec diff: http://cr.openjdk.java.net/~psandoz/lambda/spliterator/jdk-8010096/specdiff/overview-summary.html Relevant JavaDoc generated from lambda repo (required for viewing @apiNote, @implSpec, @implNote declarations): http://cr.openjdk.java.net/~psandoz/lambda/spliterator/jdk-8010096/api/java/ Note: some of the JavaDoc generated from the lambda repo may contain additional methods or specification that is relevant to the stream framework. -- To enable bulk operations, in parallel, on a data source it is required that such a source efficiently partition itself into smaller parts, where parts are processed concurrently, and each part is traversed sequentially. A new data interface is required that defines the contract for efficient partitioning and traversal. Collections need to implement that new data interface so the JSR-335 stream library can support bulk operations on common collection implementations, such as List/Set and arrays, in addition to other forms of data source e.g. third party collections. java.util.Spliterator, and primitive specializations of, is the key data interface that enables partitioning and traversal. JSR-335 java.util.stream.Stream instances will be constructed from spliterators. Collection is extended to define the default method spliterator(). Implementations are expected to override this method. The default implementation utilizes the collection's iterator and size to provide a spliterator implementation that permits limited parallism. Default overriding implementations are also provided for List, Set and SortedSet that support additional properties associated with those collections. Together with Spliterator some implementations are provided for creating Spliterator instances from data sources that are iterators and arrays. 
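For illustration only -- a rough sketch, not the proposed API surface; the wrapper type below is made up for this example, and the Spliterators factory shape is assumed from the webrev -- a source that exposes just iterator() and size() can pick up a spliterator with limited parallelism along these lines:

    import java.util.Iterator;
    import java.util.List;
    import java.util.Spliterator;
    import java.util.Spliterators;

    // Hypothetical example type, not part of the patch.
    class IteratorBackedBag<T> {
        private final List<T> elements;

        IteratorBackedBag(List<T> elements) { this.elements = elements; }

        public Spliterator<T> spliterator() {
            Iterator<T> it = elements.iterator();
            // SIZED: the element count is known up front. Splits hand out
            // batches pulled from the iterator rather than partitioning the
            // backing store, which is why the parallelism is limited.
            return Spliterators.spliterator(it, elements.size(), Spliterator.SIZED);
        }
    }

The size estimate and the characteristics bits have to be supplied by the caller; the wrapping spliterator cannot discover them from the iterator alone.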
Note: Optimal spliterator implementations for many collection implementations in java.util and java.util.concurrent are present in the lambda repository and will duly make their way into TL after this webrev has been reviewed and pushed. -- Once this webrev is reviewed and pushed then the Stream API can follow, and then the fun really begins :-) The class com.sun.tools.jdi/EventSetImpl has been modified but is likely to change when spliterator implementations are pushed. This class cannot decide if it wants to be a List or a Set!! The compiler cannot work out if the default implementation of spliterator() for List or Set should apply. When the concrete collection implementations of spliterator arrive this class will not need to implement spliterator() and will inherit the implementation from ArrayList (implementations on classes always win over default implementations on interfaces). Unfortunately, building TL with this patch applied requires one to also apply: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fbcafacf92ef, which I suspect was lost in the wash and should have been pushed to TL a while ago. Will follow up with Robert on that one. A JPRT of TL with both patches applied did not show any abnormal test failures. Paul. From paul.sandoz at oracle.com Thu Mar 28 09:12:25 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 28 Mar 2013 17:12:25 +0100 Subject: sorting and stability In-Reply-To: <515318BB.2030805@cs.oswego.edu> References: <5150DE09.3020505@cs.oswego.edu> <515318BB.2030805@cs.oswego.edu> Message-ID: On Mar 27, 2013, at 5:05 PM, Doug Lea
wrote: >> Algorithmically, the cost of this might on average be zero. >> Even though it does add cost in general, it can be made >> immeasurably small when there are no duplicate keys, and >> as fast as any other strategy when there are duplicates. > > This still holds for versions ready enough to tentatively > commit. Paul will help integrate into lambda repo within > a few days. > Tis now integrated. Paul. From josh at bloch.us Thu Mar 28 09:39:17 2013 From: josh at bloch.us (Joshua Bloch) Date: Thu, 28 Mar 2013 09:39:17 -0700 Subject: sorting and stability In-Reply-To: References: <5150DE09.3020505@cs.oswego.edu> <515318BB.2030805@cs.oswego.edu> Message-ID: Good. I like stable sorts. Josh On Thu, Mar 28, 2013 at 9:12 AM, Paul Sandoz wrote: > On Mar 27, 2013, at 5:05 PM, Doug Lea
wrote: > >> Algorithmically, the cost of this might on average be zero. > >> Even though it does add cost in general, it can be made > >> immeasurably small when there are no duplicate keys, and > >> as fast as any other strategy when there are duplicates. > > > > This still holds for versions ready enough to tentatively > > commit. Paul will help integrate into lambda repo within > > a few days. > > > > Tis now integrated. > > Paul. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/6293804b/attachment.html From tim at peierls.net Thu Mar 28 10:18:51 2013 From: tim at peierls.net (Tim Peierls) Date: Thu, 28 Mar 2013 13:18:51 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) Message-ID: I can't find a discussion of why Spliterator flags are ints rather than enum. The only thing coming close is this months-old update from Brian: *Sept 25, 2012 - Oct 24, 2012* > *... > **Stream flags improvements (Paul). *Added an "encounter order" flag. > Define flags with an enum. Make flags into declarative properties of Ops, > where an op can declare that they preserve, inject, or clear a given flag, > and move responsibility for flag computation into AbstractPipeline. Pass > flags to wrap{Sink,Iterator} so they can act on the current set of flags. But that was in another country, and besides, the Ops are dead. Is there anything more recent about this? I'm prepared to hear that it's for performance reasons, but there should at least be a record of the process through which this was determined. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/f2ddbabf/attachment-0001.html From kevinb at google.com Thu Mar 28 11:23:45 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 28 Mar 2013 11:23:45 -0700 Subject: RFR : JDK-8001642 : Add Optional, OptionalDouble, OptionalInt, OptionalLong In-Reply-To: <513710CC.3010903@univ-mlv.fr> References: <513710CC.3010903@univ-mlv.fr> Message-ID: I do NOT wish to restart this discussion; I just noticed a falsehood that was never exposed: On Wed, Mar 6, 2013 at 1:47 AM, Remi Forax wrote: Google's Guava, which is a popular library, defines a class named Optional, > but allow to store null unlike the current proposed implementation, this > will generate a lot of confusions and frustrations. > Guava's Optional *cannot* be used to hold null. So this particular concern is not a concern at all. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/abd703e7/attachment.html From dl at cs.oswego.edu Thu Mar 28 11:45:44 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 28 Mar 2013 14:45:44 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: Message-ID: <51548FD8.7030103@cs.oswego.edu> On 03/28/13 13:18, Tim Peierls wrote: > I can't find a discussion of why Spliterator flags are ints rather than enum. We started out with enums on (my) initial Spliterator side vs control flags internal to streams. The we had to somehow mesh these to work together. On the stream side, you need to set and unset various bits across stages. 
Clearly you can't do that to someone's EnumSet -- they will not expect you to modify it, but enforcing this makes it both unwieldy and sleaze-inducing (we'd have to grab underlying representation from EnumSet). Another way of saying this is that we needed an efficient propagate-by-value small-N bit set mechanism, and the only candidate was the traditional one. This amounts to the same reason that nio "interest" flags are done the same way. -Doug From josh at bloch.us Thu Mar 28 11:52:29 2013 From: josh at bloch.us (Joshua Bloch) Date: Thu, 28 Mar 2013 11:52:29 -0700 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: <51548FD8.7030103@cs.oswego.edu> References: <51548FD8.7030103@cs.oswego.edu> Message-ID: Doug, I don't get it. You can set and unset flags on your own EnumSet. Why isn't that sufficient? Josh On Thu, Mar 28, 2013 at 11:45 AM, Doug Lea
wrote: > On 03/28/13 13:18, Tim Peierls wrote: > >> I can't find a discussion of why Spliterator flags are ints rather than >> enum. >> > > We started out with enums on (my) initial Spliterator side vs > control flags internal to streams. The we had to somehow > mesh these to work together. On the stream side, you need > to set and unset various bits across stages. Clearly > you can't do that to someone's EnumSet -- they will not expect > you to modify it, but enforcing this makes it both unwieldy and > sleaze-inducing (we'd have to grab underlying representation from > EnumSet). > > Another way of saying this is that we needed an efficient > propagate-by-value small-N bit set mechanism, and the only candidate > was the traditional one. This amounts to the same reason > that nio "interest" flags are done the same way. > > -Doug > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/6135df8b/attachment.html From dl at cs.oswego.edu Thu Mar 28 12:06:17 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Thu, 28 Mar 2013 15:06:17 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> Message-ID: <515494A9.30600@cs.oswego.edu> On 03/28/13 14:52, Joshua Bloch wrote: > Doug, > > I don't get it. You can set and unset flags on your own EnumSet. Why isn't that > sufficient? There are a lot of problems. First, even though most spliterators will return the same set of characteristics each time, you can't just create one static one: class MySpliterator { ... static final EnumSet cs = EnumSet.of(...); EnumSet characteristics() return cs; } } ... because you cannot risk that no one will modify. Second, when inside streams, you'd have to create a new EnumSet Object across each stage, that somehow secretly extends the public Characteristics with non-public internal control flags. Which means either some slow conversion table or grabbing private EnumSet internals. So it is both slow and painful. I tried to make it less so, knowing that people sometimes react hostilely to plain bit sets. But I'm sure that the current scheme is better than all I tried. (Ditto for Brian Goetz and Paul Sandoz). In fact, I think the current scheme is sorta nice in an absolute sense. -Doug > > > Josh > > On Thu, Mar 28, 2013 at 11:45 AM, Doug Lea
> wrote: > > On 03/28/13 13:18, Tim Peierls wrote: > > I can't find a discussion of why Spliterator flags are ints rather than > enum. > > > We started out with enums on (my) initial Spliterator side vs > control flags internal to streams. The we had to somehow > mesh these to work together. On the stream side, you need > to set and unset various bits across stages. Clearly > you can't do that to someone's EnumSet -- they will not expect > you to modify it, but enforcing this makes it both unwieldy and > sleaze-inducing (we'd have to grab underlying representation from > EnumSet). > > Another way of saying this is that we needed an efficient > propagate-by-value small-N bit set mechanism, and the only candidate > was the traditional one. This amounts to the same reason > that nio "interest" flags are done the same way. > > -Doug > > > > > > From josh at bloch.us Thu Mar 28 12:14:54 2013 From: josh at bloch.us (Joshua Bloch) Date: Thu, 28 Mar 2013 12:14:54 -0700 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: <515494A9.30600@cs.oswego.edu> References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> Message-ID: Doug, On Thu, Mar 28, 2013 at 12:06 PM, Doug Lea
wrote: > On 03/28/13 14:52, Joshua Bloch wrote: > >> Doug, >> >> I don't get it. You can set and unset flags on your own EnumSet. Why >> isn't that >> sufficient? >> > > There are a lot of problems. First, even > though most spliterators will return the same set of > characteristics each time, you can't just create one static one: > > class MySpliterator { ... > static final EnumSet cs = EnumSet.of(...); > EnumSet characteristics() return cs; } > } > > ... because you cannot risk that no one will modify. > Sounds like a perfect opportunity to put in immutableEnumSet, which is trivial to implement and generally useful. Alternatively, don't share, and see if the performance it good enough. (I suspect it will be.) Second, when inside streams, you'd have to create a new EnumSet > Object across each stage, that somehow secretly extends the > public Characteristics with non-public internal control flags. > Which means either some slow conversion table or grabbing > private EnumSet internals. > Or having two EnumSets: one public, consisting of public constants, and one private, consisting of private constants. Again, doesn't sound like a big deal. So it is both slow and painful. You haven't convinced me of either (yet). Did you measure the performance? > I tried to make it less so, > knowing that people sometimes react hostilely to plain bit > sets. But I'm sure that the current scheme is better than all > I tried. (Ditto for Brian Goetz and Paul Sandoz). > In fact, I think the current scheme is sorta nice in > an absolute sense. Could be. I haven't seen it. That said, I find that bit fields are usually not a good idea in the post-enum age. Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/b9eeab6b/attachment.html From mike.duigou at oracle.com Thu Mar 28 18:37:03 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Thu, 28 Mar 2013 18:37:03 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: <514CD46F.9020508@oracle.com> References: <514CD46F.9020508@oracle.com> Message-ID: <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> I've responded to the survey feedback and updated the implementations with additional Javadoc. One comment which was not addressed was whether getAverage() should throw a zero division ArithmeticException if no values had been recorded. I believe the current default of returning 0.0 is reasonable and it is convenient to not have to check the catch the exception. It's also in line with the defaults we provide for sum, sumOfSquares, min, and max. For any of these defaults users can check the count themselves and choose to substitute their own default. double average = summary.getCount() != 0 ? summary.getAverage() : Double.NAN; I did introduce an ArithmeticException to IntSummaryStatistics.getSum() if the sum cannot be expressed as an int. Remember that the sum is internally maintained in a long and there is a long accessor, getAsLong(). Mike On Mar 22 2013, at 15:00 , Brian Goetz wrote: > I've posted a survey at: > https://www.surveymonkey.com/s/5VTLT26 > > To do an API and spec review for the classes Int/Long/DoubleSummaryStatistics. If you have comments, please provide them in the SurveyMonkey form. Usual password. 
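A minimal sketch of the caller-side default described in the message above, assuming the class shape in the current lambda repo (names may still shift; note the constant is spelled Double.NaN):

    import java.util.IntSummaryStatistics;
    import java.util.stream.IntStream;

    class AverageDefaultExample {
        public static void main(String[] args) {
            IntSummaryStatistics stats = new IntSummaryStatistics();
            IntStream.of(3, 1, 4, 1, 5).forEach(stats::accept);

            // Substitute a caller-chosen default rather than relying on
            // getAverage() returning 0.0 for an empty summary.
            double average = stats.getCount() != 0 ? stats.getAverage()
                                                   : Double.NaN;
            System.out.println(average);
        }
    }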
> From joe.bowbeer at gmail.com Thu Mar 28 18:55:14 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 28 Mar 2013 21:55:14 -0400 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: Why not have is.getSum return long? Does getSum of a long stream throw an exception? On Mar 28, 2013 7:37 PM, "Mike Duigou" wrote: > I've responded to the survey feedback and updated the implementations with > additional Javadoc. > > One comment which was not addressed was whether getAverage() should throw > a zero division ArithmeticException if no values had been recorded. I > believe the current default of returning 0.0 is reasonable and it is > convenient to not have to check the catch the exception. It's also in line > with the defaults we provide for sum, sumOfSquares, min, and max. For any > of these defaults users can check the count themselves and choose to > substitute their own default. double average = summary.getCount() != 0 ? > summary.getAverage() : Double.NAN; > > I did introduce an ArithmeticException to IntSummaryStatistics.getSum() if > the sum cannot be expressed as an int. Remember that the sum is internally > maintained in a long and there is a long accessor, getAsLong(). > > Mike > > On Mar 22 2013, at 15:00 , Brian Goetz wrote: > > > I've posted a survey at: > > https://www.surveymonkey.com/s/5VTLT26 > > > > To do an API and spec review for the classes > Int/Long/DoubleSummaryStatistics. If you have comments, please provide > them in the SurveyMonkey form. Usual password. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130328/f8c959c7/attachment.html From paul.sandoz at oracle.com Fri Mar 29 07:45:16 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 29 Mar 2013 15:45:16 +0100 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> Message-ID: <252DC5A1-0B61-433C-A50F-5BD8AD544D2C@oracle.com> On Mar 29, 2013, at 5:39 AM, Paul Benedict wrote: > I think the use of EnumSet in a public API is superior to bit flags. > Worrying about the number of bytes here is not important since they will > all end up being garbage collected when the stream processing ends. > I worry. We need to reduce the fixed costs of setting up the pipeline so using streams are not unduly expensive compared to a for loop using an equivalent lambda. Fixed costs are especially noticeable for small data sizes e.g. < 10. We are currently working on refactoring the internals to further reduce fixed costs. Paul. From dl at cs.oswego.edu Fri Mar 29 07:53:44 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 29 Mar 2013 10:53:44 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> Message-ID: <5155AAF8.1050103@cs.oswego.edu> On 03/28/13 15:14, Joshua Bloch wrote: > Sounds like a perfect opportunity to put in immutableEnumSet, which is trivial > to implement and generally useful. Alternatively, don't share, and see if the > performance it good enough. (I suspect it will be.) 
Did you think that I of all people would participate in a performance-related decision without measuring performance? :-) Imagine that you had to create a new EnumSet object every time you invoked arrayList.iterator(). You'd surely consider alternatives. But really, the painfulness quotient is equally important. We'd need to create immutableEnumSet class, and another class that can arbitrarily extend the Spliterator's enums with other control flags, all for the sake of arriving at an API that seems less clear and less easy to use than what we have. -Doug From tim at peierls.net Fri Mar 29 08:25:54 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 29 Mar 2013 11:25:54 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: <5155AAF8.1050103@cs.oswego.edu> References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> Message-ID: On Fri, Mar 29, 2013 at 10:53 AM, Doug Lea
wrote: > But really, the painfulness quotient is equally important. > We'd need to create immutableEnumSet class, and another class > that can arbitrarily extend the Spliterator's enums with > other control flags, all for the sake of arriving at an API > that seems less clear and less easy to use than what we have. That API doesn't exist, so it's not really fair to say that it seems less clear and easy to use. As far as I can see in the common discussions, no one has seriously explored any alternatives. The presence of such flags in a Java 8 API would (and should) raise a lot of eyebrows, because it goes against what people have been told for well over a decade. If it's adopted as is, there had better be a good explanation for doc readers of why alternatives were rejected. "We were comfortable with int flags and nothing else significantly better suggested itself" won't cut it. "We know int flags aren't great for an API, but we tried very hard to find better alternatives, to no avail" would (if it were true). --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/dd830142/attachment-0001.html From brian.goetz at oracle.com Fri Mar 29 08:40:39 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 29 Mar 2013 08:40:39 -0700 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> Message-ID: Let's not lose sight of the fact that this is not a "for everyone" class like ArrayList or Pattern. This is low-level machinery for supporting parallel operations. If we've done our job correctly, the vast majority of users will never see Spliterator. Imagine this were part of the FJ implementation (spliterators exist in 1:1 correspondence with FJTasks.) Would you be making this same argument if this were ForkJoinTask? On Mar 29, 2013, at 8:25 AM, Tim Peierls wrote: > On Fri, Mar 29, 2013 at 10:53 AM, Doug Lea
wrote: > But really, the painfulness quotient is equally important. > We'd need to create immutableEnumSet class, and another class > that can arbitrarily extend the Spliterator's enums with > other control flags, all for the sake of arriving at an API > that seems less clear and less easy to use than what we have. > > That API doesn't exist, so it's not really fair to say that it seems less clear and easy to use. As far as I can see in the common discussions, no one has seriously explored any alternatives. > > The presence of such flags in a Java 8 API would (and should) raise a lot of eyebrows, because it goes against what people have been told for well over a decade. If it's adopted as is, there had better be a good explanation for doc readers of why alternatives were rejected. "We were comfortable with int flags and nothing else significantly better suggested itself" won't cut it. "We know int flags aren't great for an API, but we tried very hard to find better alternatives, to no avail" would (if it were true). > > --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/5c044280/attachment.html From tim at peierls.net Fri Mar 29 09:05:40 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 29 Mar 2013 12:05:40 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> Message-ID: Yes, I'd make the same argument for ForkJoinTask, but don't (in turn) lose sight of what that argument is. I'm *not* saying "int flags always bad; enums always good". I *am* saying that the docs should take care to explain why a popularly disavowed practice is being tolerated, even in an API that only a few people will have to use. And that the explanation should be honest and not a fancy paraphrase of "It's good enough for an expert-level API, so we didn't bother to look seriously at alternatives." I'm not even trying to argue that the popular thinking is correct, just that it is popular enough to make it urgent that design decisions that run counter to it need to be explained carefully. --tim On Fri, Mar 29, 2013 at 11:40 AM, Brian Goetz wrote: > Let's not lose sight of the fact that this is not a "for everyone" class > like ArrayList or Pattern. This is low-level machinery for supporting > parallel operations. If we've done our job correctly, the vast majority of > users will never see Spliterator. Imagine this were part of the FJ > implementation (spliterators exist in 1:1 correspondence with FJTasks.) > Would you be making this same argument if this were ForkJoinTask? > > > On Mar 29, 2013, at 8:25 AM, Tim Peierls wrote: > > On Fri, Mar 29, 2013 at 10:53 AM, Doug Lea
wrote: > >> But really, the painfulness quotient is equally important. >> We'd need to create immutableEnumSet class, and another class >> that can arbitrarily extend the Spliterator's enums with >> other control flags, all for the sake of arriving at an API >> that seems less clear and less easy to use than what we have. > > > That API doesn't exist, so it's not really fair to say that it seems less > clear and easy to use. As far as I can see in the common discussions, no > one has seriously explored any alternatives. > > The presence of such flags in a Java 8 API would (and should) raise a lot > of eyebrows, because it goes against what people have been told for well > over a decade. If it's adopted as is, there had better be a good > explanation for doc readers of why alternatives were rejected. "We were > comfortable with int flags and nothing else significantly better suggested > itself" won't cut it. "We know int flags aren't great for an API, but we > tried very hard to find better alternatives, to no avail" would (if it were > true). > > --tim > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/35176ebf/attachment.html From dl at cs.oswego.edu Fri Mar 29 09:54:26 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 29 Mar 2013 12:54:26 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> Message-ID: <5155C742.5000300@cs.oswego.edu> On 03/29/13 12:05, Tim Peierls wrote: > I'm not even trying to argue that the popular thinking is correct, just that it > is popular enough to make it urgent that design decisions that run counter to it > need to be explained carefully. OK, but I'm not sure what to say here? "You might have expected these to be defined in terms of enums and EnumSets, but they weren't because we must ensure value-based propagation with additional masking and unmasking across stream stages, and there is no existing type to support this and we decided no to create one." Suggestions for something less inappropriate for inclusion in javadocs would be welcome. Unless we find some wording that rises above the particulars, this seems to me more like a FAQ issue than something to put in javadocs though. -Doug > > --tim > > > > On Fri, Mar 29, 2013 at 11:40 AM, Brian Goetz > wrote: > > Let's not lose sight of the fact that this is not a "for everyone" class > like ArrayList or Pattern. This is low-level machinery for supporting > parallel operations. If we've done our job correctly, the vast majority of > users will never see Spliterator. Imagine this were part of the FJ > implementation (spliterators exist in 1:1 correspondence with FJTasks.) > Would you be making this same argument if this were ForkJoinTask? > > > On Mar 29, 2013, at 8:25 AM, Tim Peierls wrote: > >> On Fri, Mar 29, 2013 at 10:53 AM, Doug Lea
> > wrote: >> >> But really, the painfulness quotient is equally important. >> We'd need to create immutableEnumSet class, and another class >> that can arbitrarily extend the Spliterator's enums with >> other control flags, all for the sake of arriving at an API >> that seems less clear and less easy to use than what we have. >> >> >> That API doesn't exist, so it's not really fair to say that it seems less >> clear and easy to use. As far as I can see in the common discussions, no >> one has seriously explored any alternatives. >> >> The presence of such flags in a Java 8 API would (and should) raise a lot >> of eyebrows, because it goes against what people have been told for well >> over a decade. If it's adopted as is, there had better be a good >> explanation for doc readers of why alternatives were rejected. "We were >> comfortable with int flags and nothing else significantly better suggested >> itself" won't cut it. "We know int flags aren't great for an API, but we >> tried very hard to find better alternatives, to no avail" would (if it >> were true). >> >> --tim > > From tim at peierls.net Fri Mar 29 10:02:21 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 29 Mar 2013 13:02:21 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: <5155C742.5000300@cs.oswego.edu> References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> <5155C742.5000300@cs.oswego.edu> Message-ID: On Fri, Mar 29, 2013 at 12:54 PM, Doug Lea
wrote: > OK, but I'm not sure what to say here? > > "You might have expected these to be defined in terms of enums and > EnumSets, > but they weren't because we must ensure value-based propagation with > additional masking and unmasking across stream stages, and there is > no existing type to support this and we decided no to create one." > It's a start. I particularly like the part up to the first comma. :-) > Suggestions for something less inappropriate for inclusion > in javadocs would be welcome. Unless we find some wording that rises > above the particulars, this seems to me more like a FAQ issue than > something to put in javadocs though. If the full text can't be put in javadocs, then at least there should be a link in the javadocs to some FAQ page. "Wondering why this isn't expressed using enums and EnumSet? " --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/4a90973b/attachment.html From dl at cs.oswego.edu Fri Mar 29 10:15:35 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 29 Mar 2013 13:15:35 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> <5155C742.5000300@cs.oswego.edu> Message-ID: <5155CC37.7060105@cs.oswego.edu> On 03/29/13 13:02, Tim Peierls wrote: > On Fri, Mar 29, 2013 at 12:54 PM, Doug Lea
> wrote: > > OK, but I'm not sure what to say here? > > "You might have expected these to be defined in terms of enums and EnumSets, > but they weren't because we must ensure value-based propagation with > additional masking and unmasking across stream stages, and there is > no existing type to support this and we decided no to create one." > > > It's a start. I particularly like the part up to the first comma. :-) > Your turn! (Or anyone else's) I can't think of anything appropriate. So I'm hoping someone else can. Compare the description of interest ops in SelectionKey http://docs.oracle.com/javase/7/docs/api/java/nio/channels/SelectionKey.html -Doug From mike.duigou at oracle.com Fri Mar 29 10:46:43 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 29 Mar 2013 10:46:43 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: <71F05557-E60D-4A4D-8760-66D6EB0297C2@oracle.com> On Mar 28 2013, at 18:55 , Joe Bowbeer wrote: > Why not have is.getSum return long? > There is a getAsLong() version which does return a long. For many cases it's known that the result will not overflow and it makes sense for an Int focused class to provide a result of the same size. > Does getSum of a long stream throw an exception? > No. None of the implementations currently detect overflow. If overflow was encountered it would have to be thrown from accept() rather than from getSum(). Worth the cost to implement? I am not sure. Mike > On Mar 28, 2013 7:37 PM, "Mike Duigou" wrote: > I've responded to the survey feedback and updated the implementations with additional Javadoc. > > One comment which was not addressed was whether getAverage() should throw a zero division ArithmeticException if no values had been recorded. I believe the current default of returning 0.0 is reasonable and it is convenient to not have to check the catch the exception. It's also in line with the defaults we provide for sum, sumOfSquares, min, and max. For any of these defaults users can check the count themselves and choose to substitute their own default. double average = summary.getCount() != 0 ? summary.getAverage() : Double.NAN; > > I did introduce an ArithmeticException to IntSummaryStatistics.getSum() if the sum cannot be expressed as an int. Remember that the sum is internally maintained in a long and there is a long accessor, getAsLong(). > > Mike > > On Mar 22 2013, at 15:00 , Brian Goetz wrote: > > > I've posted a survey at: > > https://www.surveymonkey.com/s/5VTLT26 > > > > To do an API and spec review for the classes Int/Long/DoubleSummaryStatistics. If you have comments, please provide them in the SurveyMonkey form. Usual password. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/0c57969a/attachment-0001.html From kevinb at google.com Fri Mar 29 11:01:49 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 29 Mar 2013 11:01:49 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: On Thu, Mar 28, 2013 at 6:37 PM, Mike Duigou wrote: > I've responded to the survey feedback and updated the implementations with > additional Javadoc. 
> > One comment which was not addressed was whether getAverage() should throw > a zero division ArithmeticException if no values had been recorded. I > believe the current default of returning 0.0 is reasonable and it is > convenient to not have to check the catch the exception. It's also in line > with the defaults we provide for sum, sumOfSquares, min, and max. I think I've said this before, but I believe this is extremely wrong. sum and sumOfSquares have a well-defined and obvious identity. min, max and average are entirely meaningless when applied to zero values. What would you think of a language where 1 / 0 returned 0? How can we claim this is any different? I believe no one will ever curse your name for throwing the exception. Also, while I'm here... Exposing sumOfSquares() does not permit users to safely calculate variance, which I believe makes it fairly useless and even dangerous: "The failure of Cauchy's fundamental inequality is another important example of the breakdown of traditional algebra in the presence of floating point arithmetic...Novice programmers who calculate the standard deviation of some observations by using the textbook formula [formula for the standard deviation in terms of the sum of squares] often find themselves taking the square root of a negative number!" (Knuth AoCP vol 2, section 4.2.2) Final nit: what's the consistent rule for when exactly this "get" prefix is used? -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/5136b97b/attachment.html From tim at peierls.net Fri Mar 29 12:01:41 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 29 Mar 2013 15:01:41 -0400 Subject: Spliterator flags as enum (was Initial java.util.Spliterator putback) In-Reply-To: <5155CC37.7060105@cs.oswego.edu> References: <51548FD8.7030103@cs.oswego.edu> <515494A9.30600@cs.oswego.edu> <5155AAF8.1050103@cs.oswego.edu> <5155C742.5000300@cs.oswego.edu> <5155CC37.7060105@cs.oswego.edu> Message-ID: On Fri, Mar 29, 2013 at 1:15 PM, Doug Lea
wrote: > >> "You might have expected these to be defined in terms of enums and >> EnumSets, >> but they weren't because we must ensure value-based propagation with >> additional masking and unmasking across stream stages, and there is >> no existing type to support this and we decided no to create one." >> >> It's a start. I particularly like the part up to the first comma. :-) >> > > Your turn! (Or anyone else's) > *sigh* If I had the time to do anything other than throw stones, I'd put it into a counter-proposal. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/8b5dcd90/attachment.html From mike.duigou at oracle.com Fri Mar 29 12:44:53 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 29 Mar 2013 12:44:53 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: On Mar 29 2013, at 11:01 , Kevin Bourrillion wrote: > On Thu, Mar 28, 2013 at 6:37 PM, Mike Duigou wrote: > I've responded to the survey feedback and updated the implementations with additional Javadoc. > > One comment which was not addressed was whether getAverage() should throw a zero division ArithmeticException if no values had been recorded. I believe the current default of returning 0.0 is reasonable and it is convenient to not have to check the catch the exception. It's also in line with the defaults we provide for sum, sumOfSquares, min, and max. > > I think I've said this before, but I believe this is extremely wrong. sum and sumOfSquares have a well-defined and obvious identity. min, max and average are entirely meaningless when applied to zero values. What would you think of a language where 1 / 0 returned 0? How can we claim this is any different? Maybe it isn't. The goal was presumably to avoid requiring people to routinely check for the exception to apply a default. > I believe no one will ever curse your name for throwing the exception. Worrying about the curses of Java programmers is something which does keep me up at night. You must also work on libraries.... > Also, while I'm here... > > Exposing sumOfSquares() does not permit users to safely calculate variance, which I believe makes it fairly useless and even dangerous: > > "The failure of Cauchy's fundamental inequality is another important example of the breakdown of traditional algebra in the presence of floating point arithmetic...Novice programmers who calculate the standard deviation of some observations by using the textbook formula [formula for the standard deviation in terms of the sum of squares] often find themselves taking the square root of a negative number!" (Knuth AoCP vol 2, section 4.2.2) I'm definitely not an expert in this area. > Final nit: what's the consistent rule for when exactly this "get" prefix is used? It's generally used to retrieve state with little or no computation or conversion required to produce the result. "as" or "to" imply conversion. Unadorned names usually imply more computation. Across the JDK it's easy to find counter examples to every rule unfortunately. Mike -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/9a20d361/attachment.html From josh at bloch.us Fri Mar 29 14:28:41 2013 From: josh at bloch.us (Joshua Bloch) Date: Fri, 29 Mar 2013 14:28:41 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: Mike, On Fri, Mar 29, 2013 at 12:44 PM, Mike Duigou wrote: > > On Mar 29 2013, at 11:01 , Kevin Bourrillion wrote: > > Final nit: what's the consistent rule for when exactly this "get" prefix > is used? > > > It's generally used to retrieve state with little or no computation or > conversion required to produce the result. "as" or "to" imply conversion. > Unadorned names usually imply more computation. Across the JDK it's easy to > find counter examples to every rule unfortunately. > Do you have any evidence that this is a rule across the JDK (i.e., that it holds in a significant majority of all cases)? I haven't done a methodical study, but it doesn't match my previous assumptions, which are contained in *Effective Java*. Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/8950eea1/attachment.html From kevinb at google.com Fri Mar 29 14:42:45 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 29 Mar 2013 14:42:45 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: I should point out: by "consistent guidelines for ", I surely have no illusion of finding such a thing spanning the whole JDK or spanning back before at the earliest 1.5. Only in the sense of "this is how we do things now in the areas where we, y'know, JSR-166-types work." I'll admit to simply *disliking* the get- prefix as being so much chaff, except where it's paired with set-, but more than that I just want the consistency. On Fri, Mar 29, 2013 at 2:28 PM, Joshua Bloch wrote: > Mike, > > On Fri, Mar 29, 2013 at 12:44 PM, Mike Duigou wrote: > >> >> On Mar 29 2013, at 11:01 , Kevin Bourrillion wrote: >> >> Final nit: what's the consistent rule for when exactly this "get" >> prefix is used? >> >> >> It's generally used to retrieve state with little or no computation or >> conversion required to produce the result. "as" or "to" imply conversion. >> Unadorned names usually imply more computation. Across the JDK it's easy to >> find counter examples to every rule unfortunately. >> > > Do you have any evidence that this is a rule across the JDK (i.e., that it > holds in a significant majority of all cases)? I haven't done a methodical > study, but it doesn't match my previous assumptions, which are contained in > *Effective Java*. > > Josh > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/88ce9ad3/attachment.html From brian.goetz at oracle.com Fri Mar 29 15:16:39 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 29 Mar 2013 15:16:39 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: <1E7C3B20-8B4A-4782-BD59-B82ACD7AF4DB@oracle.com> > Also, while I'm here... > > Exposing sumOfSquares() does not permit users to safely calculate variance, which I believe makes it fairly useless and even dangerous: > > "The failure of Cauchy's fundamental inequality is another important example of the breakdown of traditional algebra in the presence of floating point arithmetic...Novice programmers who calculate the standard deviation of some observations by using the textbook formula [formula for the standard deviation in terms of the sum of squares] often find themselves taking the square root of a negative number!" (Knuth AoCP vol 2, section 4.2.2) Thanks for raising this issue again -- I'd meant to respond earlier. I ran this by our numerics guys. Basically, the problem is that for floating point numbers, since squaring makes small numbers smaller and big numbers bigger, summing squares in the obvious way risks the usual problem with adding numbers of grossly differing magnitudes. So while the naive factoring of population/sample variance allows you to compute them from sum(x) and sum(x^2), the latter is potentially numerically challenged. (Note that this problem doesn't exist for int/long, assuming a long is big enough to compute sum(x^2) without overflow.) Still, I am not sure we do users a favor by leaving this out. Many of them are likely to simply extend DoubleSummaryStatistics to calculate sum(x^2) anyway. And the only other alternative is horrible; stream the data into a collection and make two passes on it, one for mean and one for variance. That's at least 3x as expensive, if you can fit the whole thing in memory in the first place. The Knuth section you cite also offers a means to calculate variance more effectively in a single pass using a recurrence relation based on Kahan summation. So I think the winning move is to provide a better implementation of sumsq than either of the naive implementations above, one that uses real numerics fu. (We intend to provide a better implementation of summation for DoubleSummaryStatistics as well, based on Kahan.) Of course the crappy implementation that is in there now is less than ideal. From mike.duigou at oracle.com Fri Mar 29 15:59:13 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Fri, 29 Mar 2013 15:59:13 -0700 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> Message-ID: On Mar 29 2013, at 14:28 , Joshua Bloch wrote: > Mike, > > On Fri, Mar 29, 2013 at 12:44 PM, Mike Duigou wrote: > > On Mar 29 2013, at 11:01 , Kevin Bourrillion wrote: > >> Final nit: what's the consistent rule for when exactly this "get" prefix is used? > > It's generally used to retrieve state with little or no computation or conversion required to produce the result. "as" or "to" imply conversion. Unadorned names usually imply more computation. Across the JDK it's easy to find counter examples to every rule unfortunately. 
> > Do you have any evidence that this is a rule across the JDK (i.e., that it holds in a significant majority of all cases)? Empirical evidence, no. A "get" or "put" method that looked like an accessor but actually does significant computation wouldn't pass the sniff test for me though. Similarly an "as" method that wasn't a conversion or returning a view would make me uncomfortable. Same for "to" method that wasn't a conversion. > I haven't done a methodical study, but it doesn't match my previous assumptions, which are contained in Effective Java. Presumably you are referring to Item 38. It seems that there is no more consensus now about get/put than when you wrote that. My opinion .... consistency where possible and reasonable but absent of consensus then any other factors which improve clarity, encourage correct usage or increase convenience should be considered. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130329/fc0a7e7e/attachment.html From joe.bowbeer at gmail.com Sat Mar 30 07:30:02 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Sat, 30 Mar 2013 10:30:02 -0400 Subject: Spec and API review for {Int,Long,Double}SummaryStatistics In-Reply-To: <71F05557-E60D-4A4D-8760-66D6EB0297C2@oracle.com> References: <514CD46F.9020508@oracle.com> <1BC38610-51E9-4A69-A1E7-192880618E5F@oracle.com> <71F05557-E60D-4A4D-8760-66D6EB0297C2@oracle.com> Message-ID: Are there any similar APIs to use for guidance here? This inconsistency regarding exceptions seems weird to me. 1. I would rather that all getSum methods throw exceptions or none do. A runtime unchecked exception in a numeric get accessor seems dangerous. So my strong preference is that neither do. 2. ints.getSum returning long seems like the natural signature. In the cases where the sum is known to be int, I'd make this the user's responsibility. Similar to size of files and stream, and various others. (I'd cast to int in those cases and add an assertion.) I suggest: 1. Remove ints.getSumAsLong 2. ints.getSum returns long Joe On Mar 29, 2013 11:46 AM, "Mike Duigou" wrote: > > On Mar 28 2013, at 18:55 , Joe Bowbeer wrote: > > Why not have is.getSum return long? > > There is a getAsLong() version which does return a long. For many cases > it's known that the result will not overflow and it makes sense for an Int > focused class to provide a result of the same size. > > Does getSum of a long stream throw an exception? > > No. None of the implementations currently detect overflow. If overflow was > encountered it would have to be thrown from accept() rather than from > getSum(). > > Worth the cost to implement? I am not sure. > > Mike > > On Mar 28, 2013 7:37 PM, "Mike Duigou" wrote: > >> I've responded to the survey feedback and updated the implementations >> with additional Javadoc. >> >> One comment which was not addressed was whether getAverage() should throw >> a zero division ArithmeticException if no values had been recorded. I >> believe the current default of returning 0.0 is reasonable and it is >> convenient to not have to check the catch the exception. It's also in line >> with the defaults we provide for sum, sumOfSquares, min, and max. For any >> of these defaults users can check the count themselves and choose to >> substitute their own default. double average = summary.getCount() != 0 ? 
>> summary.getAverage() : Double.NAN; >> >> I did introduce an ArithmeticException to IntSummaryStatistics.getSum() >> if the sum cannot be expressed as an int. Remember that the sum is >> internally maintained in a long and there is a long accessor, getAsLong(). >> >> Mike >> >> On Mar 22 2013, at 15:00 , Brian Goetz wrote: >> >> > I've posted a survey at: >> > https://www.surveymonkey.com/s/5VTLT26 >> > >> > To do an API and spec review for the classes >> Int/Long/DoubleSummaryStatistics. If you have comments, please provide >> them in the SurveyMonkey form. Usual password. >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130330/c5dc6a9b/attachment.html
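For reference on the variance point raised earlier in the thread: a hypothetical sketch of the single-pass recurrence Brian mentions (Knuth AoCP vol 2, section 4.2.2, often credited to Welford), which avoids the catastrophic cancellation of the naive sum-of-squares formula. This is not the JDK implementation, just an illustration of the technique:

    // Accumulates the mean and the sum of squared deviations in one pass.
    class RunningVariance {
        private long count;
        private double mean;
        private double m2; // sum of squared deviations from the running mean

        void accept(double x) {
            count++;
            double delta = x - mean;
            mean += delta / count;
            m2 += delta * (x - mean); // note: uses the updated mean
        }

        double sampleVariance() {
            if (count < 2) throw new ArithmeticException("need at least two values");
            return m2 / (count - 1);
        }
    }

Because m2 accumulates only non-negative terms, taking a square root for the standard deviation never sees a negative argument, which is exactly the failure mode the Knuth quote warns about.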