From brian.goetz at oracle.com  Mon Sep 10 10:57:31 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Sep 2012 13:57:31 -0400
Subject: Updated SotL/L documents for Iteration 2
Message-ID: <504E2A0B.4040005@oracle.com>

Re-sending copies of documents sent last week.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120910/9e75a0c9/sotc2-intro-0001.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120910/9e75a0c9/sotc2-changes-0001.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120910/9e75a0c9/sotc2-impl-0001.html

From forax at univ-mlv.fr  Mon Sep 10 11:09:29 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 10 Sep 2012 20:09:29 +0200
Subject: Updated SotL/L documents for Iteration 2
In-Reply-To: <504E2A0B.4040005@oracle.com>
References: <504E2A0B.4040005@oracle.com>
Message-ID: <504E2CD9.3070905@univ-mlv.fr>

On 09/10/2012 07:57 PM, Brian Goetz wrote:
> Re-sending copies of documents sent last week.
>

I may be wrong, but there is no mention of primitive specializations.
Is it something that is still in flux or not ?

Rémi

From brian.goetz at oracle.com  Mon Sep 10 12:25:26 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Sep 2012 15:25:26 -0400
Subject: Updated SotL/L documents for Iteration 2
In-Reply-To: <504E2CD9.3070905@univ-mlv.fr>
References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr>
Message-ID: <504E3EA6.50005@oracle.com>

That's something we need to discuss.

In the past, we outlined a few strategies for getting to primitive
support:

  - Primitive specialization of streams (e.g., IntStream) with
    overloads like map(IntMapper) -> yields IntStream.
    We'd probably just do Int/Long/Double.
  - VM magic to make boxing costs go away (and give ponies to all the
    children of the world)
  - Fuse common operations to expose primitive opportunities, like a
    fused mapReduce(IntMapper, IntOperator)
  - Specialized version of the above, such as sumBy(IntMapper)
  - Box elimination in libraries (theoretically possible, but probably
    only if pipelines can be precompiled)

I'm leaning towards preferring the first, because:
  - It doesn't rely on magic
  - It lets us expose methods like sum(), max(), min(), and sort() on
    {Int,Long,Double}Stream -- this is a huge plus
  - It lets us bring more numerics firepower to bear on things like
    DoubleStream.sum()

Obviously the increased API surface area is a minus.

On 9/10/2012 2:09 PM, Remi Forax wrote:
> On 09/10/2012 07:57 PM, Brian Goetz wrote:
>> Re-sending copies of documents sent last week.
>>
>
> I may be wrong, but there is no mention of primitive specializations.
> Is it something that is still in flux or not ?
>
> Rémi
>
>

From david.holmes at oracle.com  Mon Sep 10 20:51:58 2012
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 11 Sep 2012 13:51:58 +1000
Subject: Updated SotL/L documents for Iteration 2
In-Reply-To: <504E2A0B.4040005@oracle.com>
References: <504E2A0B.4040005@oracle.com>
Message-ID: <504EB55E.5090304@oracle.com>

Hi Brian,

On 11/09/2012 3:57 AM, Brian Goetz wrote:
> Re-sending copies of documents sent last week.

I'm very much a lay-person in terms of being a consumer of these APIs,
and from that perspective I think we are telling a good story here. So
well done to all for getting things to this stage.

I have two queries on the parallel aspects of this:

1. There's no mention (as yet) of any hooks into the underlying parallel
implementations, i.e., controlling what FJPool to use. Is the thinking
that we will go with the "default" FJPool concept as currently outlined
in ForkJoinUtils?

2.
It seems to me that if arrays can be converted to stream() and
parallel() then we no longer need the old-style
ForkJoinUtils.parallelSort API? (Though I'm unsure where all the
implementation code would live if we don't have it.)

David

From brian.goetz at oracle.com  Tue Sep 11 08:00:45 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Sep 2012 11:00:45 -0400
Subject: Updated SotL/L documents for Iteration 2
In-Reply-To: <504EB55E.5090304@oracle.com>
References: <504E2A0B.4040005@oracle.com> <504EB55E.5090304@oracle.com>
Message-ID: <504F521D.3050405@oracle.com>

> I have two queries on the parallel aspects of this:
>
> 1. There's no mention (as yet) of any hooks into the underlying parallel
> implementations i.e controlling what FJPool to use. Is the thinking that
> we will go with the "default" FJPool concept as currently outlined in
> ForkJoinUtils?

Yes. The current plan is to dump these into the default FJP. To the
extent we want to support an alternate pool, decomposition hints,
resource limits, etc, we can overload parallel() to take arguments about
where/how parallel computations are done. Our strawman hypothesis is
YAGNI, so we're putting that out and seeing if our hypothesis is right.

> 2. It seems to me that if arrays can be converted to stream() and
> parallel() then we no longer need the old-style
> ForkJoinUtils.parallelSort API? (Though I'm unsure where all the
> implementation code would live if we don't have it.)

They're different. The Arrays.sort operations are explicitly in-place
operations; stream operations will often (though not always) have more
overhead. Our approach for in-place updates is to add a small number of
methods to Collection (e.g., removeIf(Predicate)) and Arrays.
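
A minimal sketch of the in-place versus stream distinction drawn in the
reply above. It assumes the API names that later shipped in Java 8
(Arrays.parallelSort, Arrays.stream, Collection.removeIf), which postdate
this thread; the draft under discussion still used
ForkJoinUtils.parallelSort. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of any API discussed in the thread.
final class InPlaceVersusStream {

    /** Sorts the caller's array in place; no second array is handed back. */
    static void sortInPlace(int[] data) {
        Arrays.parallelSort(data);
    }

    /** Leaves the input untouched and returns a newly allocated sorted copy. */
    static int[] sortedCopy(int[] data) {
        return Arrays.stream(data).parallel().sorted().toArray();
    }

    /** In-place bulk update on a collection: drops the negative elements. */
    static void dropNegatives(List<Integer> values) {
        values.removeIf(v -> v < 0);
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 2};
        int[] copy = sortedCopy(a);      // a is still unsorted here
        System.out.println(Arrays.toString(a) + " " + Arrays.toString(copy));

        sortInPlace(a);                  // now a itself is sorted
        System.out.println(Arrays.toString(a));

        List<Integer> values = new ArrayList<>(Arrays.asList(4, -1, 7, -3));
        dropNegatives(values);           // the list is modified, not copied
        System.out.println(values);
    }
}
```

The stream variant pays for a fresh result array, which is the kind of
extra overhead the reply mentions; the in-place variants mutate their
argument and return nothing.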

From Donald.Raab at gs.com  Wed Sep 12 21:55:01 2012
From: Donald.Raab at gs.com (Raab, Donald)
Date: Thu, 13 Sep 2012 00:55:01 -0400
Subject: Updated SotL/L documents for Iteration 2
In-Reply-To: <504E3EA6.50005@oracle.com>
References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr>
	<504E3EA6.50005@oracle.com>
Message-ID: <6712820CB52CFB4D842561213A77C054039E643627@GSCMAMP09EX.firmwide.corp.gs.com>

We released GS Collections 2.0 a few weeks ago on GitHub and it includes
an approach to primitives similar to what you describe below Brian.

https://github.com/goldmansachs/gs-collections/tree/master/collections-api/src/main/java/com/gs/collections/api

You'll find IntIterable, LongIterable, FloatIterable and DoubleIterable
in this package. You get to these iterables through the LazyIterable api
by calling collect{Int|Long|Float|Double} with the appropriate
{Int|Long|Float|Double}Function.

https://github.com/goldmansachs/gs-collections/blob/master/collections-api/src/main/java/com/gs/collections/api/LazyIterable.java#L103

We also added short-cuts for sumOf{Int|Long|Float|Double} on RichIterable.

https://github.com/goldmansachs/gs-collections/blob/master/collections-api/src/main/java/com/gs/collections/api/RichIterable.java#L695

We've updated the GS Collections Kata to use the new initial primitive
support available in GSC 2.0 to illustrate some of the benefits. You can
see some of the differences in results leveraging Java 8 w/ GSC 2.0
primitive support below.

Current using sumOfDouble:
https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/main/java/com/gs/collections/kata/Customer.java#L59

Previous using primitive injectInto:
https://github.com/goldmansachs/gs-collections-kata/blob/63478be0d0b95d72d66adf17d6391107363dd332/src/main/java/com/gs/collections/kata/Customer.java#L59

Current using collectDouble:
https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise6Test.java
https://github.com/goldmansachs/gs-collections-kata/blob/solutions-java8/src/test/java/com/gs/collections/kata/Exercise7Test.java

Previous using boxed Doubles:
https://github.com/goldmansachs/gs-collections-kata/blob/63478be0d0b95d72d66adf17d6391107363dd332/src/test/java/com/gs/collections/kata/Exercise6Test.java
https://github.com/goldmansachs/gs-collections-kata/blob/63478be0d0b95d72d66adf17d6391107363dd332/src/test/java/com/gs/collections/kata/Exercise7Test.java

We are working on adding primitive Lists, Sets, Bags, Stacks, Maps to
round out the primitive support in GS Collections. We're also deciding
how much additional API we would like to support on the current
primitive Iterables. We should be adding select, reject, detect soon and
will be deciding what other API we would like to carry over from
RichIterable. The total surface area for a Java collections library with
support for both object and primitive collections with a rich fluent API
is certainly non-trivial.

> -----Original Message-----
> From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda-
> libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz
> Sent: Monday, September 10, 2012 3:25 PM
> To: Remi Forax
> Cc: lambda-libs-spec-experts at openjdk.java.net
> Subject: Re: Updated SotL/L documents for Iteration 2
>
> That's something we need to discuss.
>
> In the past, we outlined a few strategies for getting to primitive
> support:
>
>   - Primitive specialization of streams (e.g., IntStream) with
> overloads like map(IntMapper) -> yields IntStream. We'd probably just
> do Int/Long/Double.
>   - VM magic to make boxing costs go away (and give ponies to all the
> children of the world)
>   - Fuse common operations to expose primitive opportunities, like a
> fused mapReduce(IntMapper, IntOperator)
>   - Specialized version of the above, such as sumBy(IntMapper)
>   - Box elimination in libraries (theoretically possible, but probably
> only if pipelines can be precompiled)
>
> I'm leaning towards preferring the first, because:
>   - It doesn't rely on magic
>   - It lets us expose methods like sum(), max(), min(), and sort() on
> {Int,Long,Double}Stream -- this is a huge plus
>   - It lets us bring more numerics firepower to bear on things like
> DoubleStream.sum()
>
> Obviously the increased API surface area is a minus.
>
> On 9/10/2012 2:09 PM, Remi Forax wrote:
> > On 09/10/2012 07:57 PM, Brian Goetz wrote:
> >> Re-sending copies of documents sent last week.
> >>
> >
> > I may be wrong, but there is no mention of primitive specializations.
> > Is it something that is still in flux or not ?
> >
> > Rémi
> >
> >

From brian.goetz at oracle.com  Thu Sep 13 16:14:01 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Sep 2012 19:14:01 -0400
Subject: Iteration2 branch merged into default
Message-ID: <505268B9.8090904@oracle.com>

We've merged the temporary "it2-bootstrap" branch into the default
branch (and closed the it2 branch) in the OpenJDK lambda/lambda
repository.

From dl at cs.oswego.edu  Fri Sep 14 04:28:00 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 14 Sep 2012 07:28:00 -0400
Subject: Numeric
Message-ID: <505314C0.2010506@cs.oswego.edu>

Picking this up from a set of exchanges on old lambda-lib list,
and first recycling initial rationale:

The main reason for primitive specializations for bulk ops
is to avoid boxing inside inner loops. Unless optimized away
(which in practice is almost always too hard for compilers/JITs),
not only is it slow to begin with, it doesn't get much faster
under parallelism because it spews garbage and disrupts
memory locality.

The inner loop case is the most glaring problem, but the same
issues arise during any combination/reduction/collection steps
among a set of subtasks. As in
   r = combine(leftTask.join(), rightTask.join());
As well as client access of results, as in:
   r = p.invoke(...)

Depending on all sorts of things, these cases can be as
numerous and problematic as the inner-loop case. Wishing
that they weren't as problematic isn't a good solution.

Long-term, we need primitive specialization of generics.
But shorter term, there's an intermediate solution that
reduces both boxing and combinatorial explosions of special
forms. The tradeoff is to accept some virtualness,
plus a bit of tedium on the part of components implementing it:

/**
  * Interface defining access methods for classes that provide numeric
  * results. A {@code Numeric} is not itself an instance of {@link
  * java.lang.Number}, but provides numeric methods to access its
  * primary result or property; normally via the method corresponding
  * to the listed {@code PreferredType} parameter. However, any
  * implementation of this interface must define the non-preferred
  * methods as well (typically by casting the results of the
  * preferred form).
  */
public interface Numeric<PreferredType> {
     long getLong();
     int getInt();
     short getShort();
     byte getByte();
     double getDouble();
     float getFloat();
}

So for example, someone could define
   class MyFutureDouble implements Future<Double>, Numeric<Double>;
which could then be used as
   f = new MyFutureDouble();
   // .. async process f ...
   double d = f.getDouble();

This is a slightly odd interface because the PreferredType parameter
exists just to communicate the preferred extraction method to client
programmers. Looking ahead though, it might also serve as a heuristic
guide for future efforts on automated generics specialization.

The required tedium is that any class implementing Numeric will need
to boringly implement five of the methods in terms of the one version
corresponding to the preferred type. For example, class MyFutureDouble
would need:
   public double getDouble() { return computeResult(); }
   public long getLong() { return (long)getDouble(); }
   public int getInt() { return (int)getDouble(); }
   public float getFloat() { return (float)getDouble(); }
   public short getShort() { return (short)getDouble(); }
   public byte getByte() { return (byte)getDouble(); }

It would be possible but probably not worthwhile to create
six subinterfaces that provide defaults of these forms.

Adopting this has a surprisingly widespread impact on the form
of some lambdaized/parallelized APIs (mainly internal ones), so
it would be a good idea to decide on it soon. For example, I could
use this now in methods like CHM.Parallel().reduceValuesToLong.

To be maximally useful, this should probably go into
java.lang, alongside Number. (Otherwise, I'd just
stick it in java.util.concurrent for our needs and be done with it.)

It would be even nicer if there were a way to do something similar
for arguments to function-like classes in addition to their results,
but no solution along these lines applies.
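
For illustration only (this is not code from the original post): a sketch
of one of the six sub-interfaces mentioned above, written with the
default-method syntax that later shipped in Java 8. DoubleNumeric and
FixedDouble are hypothetical names, and the unbounded type parameter on
Numeric is an assumption; only the six accessor methods come from the
proposal.

```java
// Numeric as proposed in the post; the unbounded type parameter is an assumption.
interface Numeric<PreferredType> {
    long getLong();
    int getInt();
    short getShort();
    byte getByte();
    double getDouble();
    float getFloat();
}

/** Hypothetical sub-interface: double is the preferred form, the rest are casts. */
interface DoubleNumeric extends Numeric<Double> {
    @Override default long getLong()   { return (long) getDouble(); }
    @Override default int getInt()     { return (int) getDouble(); }
    @Override default short getShort() { return (short) getDouble(); }
    @Override default byte getByte()   { return (byte) getDouble(); }
    @Override default float getFloat() { return (float) getDouble(); }
}

/** Hypothetical implementor: only the preferred accessor has to be written. */
final class FixedDouble implements DoubleNumeric {
    private final double value;

    FixedDouble(double value) { this.value = value; }

    @Override public double getDouble() { return value; }
}

final class NumericDemo {
    public static void main(String[] args) {
        Numeric<Double> n = new FixedDouble(42.9);
        // The non-preferred accessors narrow the double with ordinary Java casts.
        System.out.println(n.getDouble() + " " + n.getInt() + " " + n.getLong());
    }
}
```

With such a sub-interface an implementor writes one method instead of six,
at the cost of five more interface types in the API, which is presumably
the trade-off judged "probably not worthwhile" above.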

-Doug

From dl at cs.oswego.edu  Fri Sep 14 04:54:24 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 14 Sep 2012 07:54:24 -0400
Subject: Streams
In-Reply-To: <5052D932.3090808@univ-mlv.fr>
References: <5052D932.3090808@univ-mlv.fr>
Message-ID: <50531AF0.2090804@cs.oswego.edu>

(CCing lambda-libs-spec-experts, which seems better for follow-ups).

On 09/14/12 03:13, Remi Forax wrote:
> On 09/14/2012 07:23 AM, Sam Pullara wrote:
>> Here are some issues that I find with the current Streams implementation:
>>
>> 1) Collection is not a Stream. This means that whenever you use
>> collections and want to use some of the great new features you have to
>> first convert it to a stream:
>>
>> list.stream().map(l -> parseInt(l)).into(new ArrayList<>())
>>
>> I find this unnecessary. Collection could implement Stream and the
>> default methods could do the conversion for me, probably by calling
>> .stream() as above.
>
> I fully agree, it will be convenient to have such delegation mechanism.

There's some tension between convenience and sanity here :-)
I'll let Brian walk through the decision space...

>
> Is there a document somewhere that explains the pros and cons of using Optional?
> Currently I see it as a way to add a level of indirection with no real benefit,

I don't think there is a document, but there has been a lot of discussion
about it here and there over the years. I think they mainly amount to
two technical problems, plus at least one style/usage issue:

1. Some collections allow null elements, which means that you cannot
unambiguously use null in its otherwise only reasonable sense of "there's
nothing there".

2. If/when some of these APIs are extended to primitives, there is
no value to return in the case of nothing there. The alternative
to Optional is to return boxed types, which some people would prefer
not to do.

3. Some people like the idea of using Optional to allow more fluent APIs.
As in
   x = s.findFirst().or(valueIfEmpty)
vs
   if ((x = s.findFirst()) == null) x = valueIfEmpty;
Some people are happy to create an object for the sake of
being able to do this. Although sometimes less happy when they
realize that Optionalism then starts propagating through their
designs, leading to Set<Optional<T>>'s and so on.

It's hard to win here. One of the many reasons that we
are supplying parallel bulk ops for ConcurrentHashMap that
fall outside the main stream framework is that issue #1 does
not arise (CHM disallows null keys and values), which
streamlines many design issues.

>> 3) I really, really believe that we should have something around
>> Future like I have prototyped here:

Right. FutureValue/Promise/whatever is on our todo list for JDK8
(see JEP 155). One of the many little snags that have held it back
so far is getting past related issues like Numeric (see my post
on this).

-Doug

From forax at univ-mlv.fr  Fri Sep 14 05:30:19 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 14 Sep 2012 14:30:19 +0200
Subject: Streams
In-Reply-To: <50531AF0.2090804@cs.oswego.edu>
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
Message-ID: <5053235B.30707@univ-mlv.fr>

On 09/14/2012 01:54 PM, Doug Lea wrote:
> (CCing lambda-libs-spec-experts, which seems better for follow-ups).
>
> On 09/14/12 03:13, Remi Forax wrote:
>> On 09/14/2012 07:23 AM, Sam Pullara wrote:
>>> Here are some issues that I find with the current Streams
>>> implementation:
>>>
>>> 1) Collection is not a Stream. This means that whenever you use
>>> collections and want to use some of the great new features you have to
>>> first convert it to a stream:
>>>
>>> list.stream().map(l -> parseInt(l)).into(new ArrayList<>())
>>>
>>> I find this unnecessary. Collection could implement Stream and the
>>> default methods could do the conversion for me, probably by calling
>>> .stream() as above.
>>
>> I fully agree, it will be convenient to have such delegation mechanism.
>
> There's some tension between convenience and sanity here :-)
> I'll let Brian walk through the decision space...
>
>>
>> Is there a document somewhere that explains the pros and cons of using
>> Optional?
>> Currently I see it as a way to add a level of indirection with no
>> real benefit,
>
> I don't think there is a document, but there has been a lot of discussion
> about it here and there over the years. I think they mainly amount to
> two technical problems, plus at least one style/usage issue:
>
> 1. Some collections allow null elements, which means that you cannot
> unambiguously use null in its otherwise only reasonable sense of "there's
> nothing there".

It's better to never store null, and to use the null object pattern if
you really want to store the null value.

>
> 2. If/when some of these APIs are extended to primitives, there is
> no value to return in the case of nothing there. The alternative
> to Optional is to return boxed types, which some people would prefer
> not to do.

You can ask the user for a default value; see below.

>
> 3. Some people like the idea of using Optional to allow more fluent APIs.
> As in
>   x = s.findFirst().or(valueIfEmpty)
> vs
>   if ((x = s.findFirst()) == null) x = valueIfEmpty;
> Some people are happy to create an object for the sake of
> being able to do this. Although sometimes less happy when they
> realize that Optionalism then starts propagating through their
> designs, leading to Set<Optional<T>>'s and so on.
>
> It's hard to win here. One of the many reasons that we
> are supplying parallel bulk ops for ConcurrentHashMap that
> fall outside the main stream framework is that issue #1 does
> not arise (CHM disallows null keys and values), which
> streamlines many design issues.

There is in my opinion a better design,
   x = s.findFirst(valueIfEmpty);

The default value is a parameter of the current function
(see https://github.com/forax/lambda-perf/blob/master/lambda-perf/src/perf/Stream.java
if you want to see how it can be implemented).
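
A minimal sketch contrasting the two result styles discussed above. The
helper names and the Iterable-based signatures are assumptions made for
illustration; they are not the API of the linked prototype nor of the
draft Stream interface. Optional here is java.util.Optional as it later
shipped in Java 8, where the fallback method is called orElse rather
than or.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class for illustration only.
final class FindFirstStyles {

    /** Default-value style: the fallback is an argument, so nothing is wrapped. */
    static <T> T findFirst(Iterable<T> source, T valueIfEmpty) {
        for (T element : source) {
            return element;
        }
        return valueIfEmpty;
    }

    /**
     * Optional style: emptiness is carried in the return type.
     * Optional.of rejects null, so this variant assumes non-null elements.
     */
    static <T> Optional<T> findFirst(Iterable<T> source) {
        for (T element : source) {
            return Optional.of(element);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        List<String> names = Arrays.asList("a", "b");

        // Default-value style: one call, no wrapper object.
        System.out.println(findFirst(names, "none"));
        System.out.println(findFirst(empty, "none"));

        // Optional style: the same fallback, expressed on the wrapper.
        System.out.println(findFirst(names).orElse("none"));
        System.out.println(findFirst(empty).isPresent());
    }
}
```

The default-value form avoids allocating a wrapper, while the Optional
form keeps "nothing found" distinguishable from any real element.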

>
>>> 3) I really, really believe that we should have something around
>>> Future like I have prototyped here:
>
> Right. FutureValue/Promise/whatever is on our todo list for JDK8
> (see JEP 155). One of the many little snags that have held it back
> so far is getting past related issues like Numeric (see my post
> on this).
>
> -Doug
>

Rémi

From dl at cs.oswego.edu  Fri Sep 14 05:36:35 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 14 Sep 2012 08:36:35 -0400
Subject: Streams
In-Reply-To: <5053235B.30707@univ-mlv.fr>
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
	<5053235B.30707@univ-mlv.fr>
Message-ID: <505324D3.4070204@cs.oswego.edu>

On 09/14/12 08:30, Remi Forax wrote:

> There is in my opinion a better design,
>    x = s.findFirst(valueIfEmpty);
>
> the default value is a parameter of the current function

Yes, sorry not to have listed this option. I like it too.
Some people hate it so much that I don't think it will
be adopted though :-) At least when applied in cases like:
   x. = max(Long.MIN_VALUE);

-Doug

From dl at cs.oswego.edu  Fri Sep 14 05:40:18 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 14 Sep 2012 08:40:18 -0400
Subject: Primitive specialization and arrays
Message-ID: <505325B2.1070303@cs.oswego.edu>

(Finally getting a chance to post about some of the library issues
that have been building up...)

The parallel aspects of Brian's Stream-based APIs are targeted
to a different audience than those aiming to fully parallelize
aggregate operations. Which is fine. My CHM.Parallel APIs address one
of these audiences -- those doing Hadoop/etc-like processing on
possibly-"live" key-value pairs. The other is of course
Array-based processing, that was prototyped
long ago as extra166y.ParallelArray.

Parallel array operations extend those that you can/would do under
Stream APIs in part because of indexing -- operations can proceed with
higher parallelism so long as the right args/results are in the right
indices.
(They are less parallel than CHM for some operations though, since
CHM fully arbitrates contention per-key/entry rather than relying
on execution control to guarantee exclusion/quiescence.)

Also, there is much more demand for parallel ops on both CHM and
arrays to include in-place updates. This is in part because they are
often very large, and so you can't afford to pretend that pure
functional programming is the only path to happy software :-)
But also because some operations, like sorting, are most naturally
done in-place anyway.

It would not be hard to re-introduce one or more classes that sit
"aside" the Stream framework in the same way that CHM.Parallel does --
trading off poorer usability for more extensive functionality. So, no
explicit fluency/streaminess etc, but still amenable to special-case
translation to/from it. But there are a bunch of added issues, including:

1. Unlike CHM, we really do need specialized Double, Long, Int versions,
because operations on primitive elements in arrays are
extremely common. There are not
already plain sequential forms of primitive specializations.
If we think they are needed (and for consistency, also the ref version),
there would be either 8 classes, or 4 classes, each with par/seq views.
Possible names:
   Par only: ParallelArray, ParallelDoubleArray, ParallelLongArray,
     ParallelIntArray
   Plus seq: SequentialArray, SequentialDoubleArray, SequentialLongArray,
     SequentialIntArray
   Or both, where each has methods par()/seq():
     BulkArray, BulkDoubleArray, BulkLongArray, BulkIntArray
Just to pick something, I'll use the first choice below.
The proposed Numeric interface will reduce some of the sprawl
with these, but they will still be plenty sprawly (in part because
they require a surprisingly large number of function type interfaces.)

2.
Essentially all of the parallel ops on arrays not covered
already via Stream-ized ArrayList etc are those that not only
don't focus on the List/Collection-like aspects, they preclude
them -- in particular no size changes. So it might make sense to
(1) treat an ArrayList as one kind of "builder" for ParallelArray,
adding method ArrayList.toParallelArray() or static
ParallelArray.from(ArrayList) that does a handoff causing the
ArrayList to act empty after handoff. (2) Either support
ParallelArray.asImmutableList() to enable basic read-only collections
stuff on them, or just implement (the non-interface :-)
ImmutableList directly. (3) For the specialized primitive versions,
some other TBD "build phase" support might be warranted. Or
just have a plain hand-off constructor for arrays created
in any way people want to do.

3. Concurrency considerations force CHM to use "null means nothing
there" conventions (which turn out to simplify many other
issues/constructions). This also applies to arrays of
references (as in a[i] == null), but not arrays of primitives
or mixed-mode operations. All of the ways of dealing
with this are objectionable to some part of the target audience,
but we'll need to settle on one.

4. Parallel operations on sub-arrays are also common, but as
slices (origin <= index < fence), not as sublists
(0 <= index < size(), offset from parent). This can
either be done by supporting overloaded versions of forEach,
map, reduce, replace etc methods taking origin,fence, or
introducing a Slice interface. Neither way is always
better; we'll need to pick one.

5. If you have array-based classes supporting bulk ops, is there
any reason to support any other collection-like forms (List, Set?)
Or any Stream-like forms? Not clear.

My own indecision about some of these issues has kept me from
working all this out (which is not all that hard -- I've already
done it once with ParallelArray).
And further kept me from even
proposing this path at all for JDK (as opposed to some non-JDK
add-on package). Any thoughts would be welcome.

-Doug

From tim at peierls.net  Fri Sep 14 05:51:18 2012
From: tim at peierls.net (Tim Peierls)
Date: Fri, 14 Sep 2012 08:51:18 -0400
Subject: Streams
In-Reply-To: <5053235B.30707@univ-mlv.fr>
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
	<5053235B.30707@univ-mlv.fr>
Message-ID:

On Fri, Sep 14, 2012 at 8:30 AM, Remi Forax wrote:

> It's hard to win here. One of the many reasons that we
>> are supplying parallel bulk ops for ConcurrentHashMap that
>> fall outside the main stream framework is that issue #1 does
>> not arise (CHM disallows null keys and values), which
>> streamlines many design issues.
>>
>
> There is in my opinion a better design,
>   x = s.findFirst(valueIfEmpty);
>

The trouble with that is that I often want to do something different
depending on whether I found anything, and it is not always easy (or even
possible) to find a distinct value to use as a sentinel. Rather than write
ad hoc things like this:

Result result = filteredResults.findFirst(NO_RESULT);
if (result != NO_RESULT) ... use result ...

I prefer being able to write:

Optional<Result> result = filteredResults.findFirst();
if (result.isPresent()) ... use result.get() ...

Note that this preference is less about fluency than preventing user
errors. It's all too easy to omit the test in the first snippet above, but
if you try to use result directly in the second snippet without testing it,
you'll get a compile error.

I am sensitive to the risk of API pollution by folks who become
over-enamored of Optional, as Doug hints (e.g., Set<Optional<T>>), but the
rewards outweigh the risks, IMHO.

--tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120914/7dd0af08/attachment-0001.html

From david.lloyd at redhat.com  Fri Sep 14 06:15:59 2012
From: david.lloyd at redhat.com (David M. Lloyd)
Date: Fri, 14 Sep 2012 08:15:59 -0500
Subject: Streams
In-Reply-To:
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
	<5053235B.30707@univ-mlv.fr>
Message-ID: <50532E0F.2010700@redhat.com>

On 09/14/2012 07:51 AM, Tim Peierls wrote:
> On Fri, Sep 14, 2012 at 8:30 AM, Remi Forax
> wrote:
>
>     It's hard to win here. One of the many reasons that we
>     are supplying parallel bulk ops for ConcurrentHashMap that
>     fall outside the main stream framework is that issue #1 does
>     not arise (CHM disallows null keys and values), which
>     streamlines many design issues.
>
>
>     There is in my opinion a better design,
>     x = s.findFirst(valueIfEmpty);
>
>
> The trouble with that is that I often want to do something different
> depending on whether I found anything, and it is not always easy (or
> even possible) to find a distinct value to use as a sentinel. Rather
> than write ad hoc things like this:
>
> Result result = filteredResults.findFirst(NO_RESULT);
> if (result != NO_RESULT) ... use result ...
>
> I prefer being able to write:
>
> Optional<Result> result = filteredResults.findFirst();
> if (result.isPresent()) ... use result.get() ...
>
> Note that this preference is less about fluency than preventing user
> errors. It's all too easy to omit the test in the first snippet above,
> but if you try to use result directly in the second snippet without
> testing it, you'll get a compile error.
>
> I am sensitive to the risk of API pollution by folks who become
> over-enamored of Optional, as Doug hints (e.g., Set<Optional<T>>), but
> the rewards outweigh the risks, IMHO.

I don't know - to me Optional is Pair's brother. Both are useful, in
their way, but both have potential to massively stink up code.
I don't really believe the benefits are worth it - I mean the best
improvement we get isn't an objective "it's faster" or "it allows more
optimal code paths", it's purely a style thing and it does have a cost.
I don't like it; I think it's going to result in things like:

Map>>>

or worse.

-- 
- DML

From tim at peierls.net  Fri Sep 14 06:46:44 2012
From: tim at peierls.net (Tim Peierls)
Date: Fri, 14 Sep 2012 09:46:44 -0400
Subject: Streams
In-Reply-To: <50532E0F.2010700@redhat.com>
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
	<5053235B.30707@univ-mlv.fr> <50532E0F.2010700@redhat.com>
Message-ID:

On Fri, Sep 14, 2012 at 9:15 AM, David M. Lloyd wrote:

> I don't like it; I think it's going to result in things like:
>
> Map>>>
>
> or worse.

Only if you really work hard at obfuscating your code.

I've been using a version of Optional for about a year, and the only time I
had reason to use Optional as a type parameter was Callable<Optional<T>>,
which conveys exactly what I mean: "Might have a result when it returns."

There's very little incentive (and a pretty daunting disincentive, in fact)
to do anything with an Optional besides test for the presence of a value
and get that value from it.

Optional has (or should have) convenience methods to get default values in
the event that you do *not* want a different code path:

Result result = filteredResults.findFirst().or(defaultResult);

In such cases, you don't even mention the Optional type explicitly.

Calling it a matter of style is misleading: It helps prevent user errors,
and that's a very desirable property. The experts here might be disciplined
enough not to need such help, but we can't assume that everyone using these
libraries will be that disciplined.

One of the things that appealed to me about Java early on was the sense
that if I could get the code to compile, it would just work.
My experiences have fallen short of that ideal over the years, of course,
but with Optional I've had recent moments where it has come gratifyingly
close.

--tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120914/28f7b03a/attachment.html

From dl at cs.oswego.edu  Fri Sep 14 06:53:30 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Fri, 14 Sep 2012 09:53:30 -0400
Subject: Streams
In-Reply-To: <50532E0F.2010700@redhat.com>
References: <5052D932.3090808@univ-mlv.fr> <50531AF0.2090804@cs.oswego.edu>
	<5053235B.30707@univ-mlv.fr> <50532E0F.2010700@redhat.com>
Message-ID: <505336DA.6090506@cs.oswego.edu>

On 09/14/12 09:15, David M. Lloyd wrote:
> it's purely a
> style thing and it does have a cost. I don't like it; I think it's going to
> result in things like:
>
> Map>>>
>

You'd think there would be some nice compromise here of
offering multiple versions of only a few methods so that
people could avoid the propagation effects when they want/need to.
But when trying this out for CHM, I ended up thinking that the
only consistent design points are all-optional vs all-null.
So for example, CHM does not even support "filter" -- instead
you can supply indicator functions (as in: (x) -> pred(x) ? x : null)
in mappings and the nulls will be ignored.

But it may be worth breaking some consistency for the sake of
usability in supplying a few such choices in Stream API.
In particular, findAny

   Optional<T> findAny();
   T findAny(T ifNone);

-Doug

From brian.goetz at oracle.com  Fri Sep 14 07:52:56 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Sep 2012 10:52:56 -0400
Subject: Streams -- the "bun" problem
In-Reply-To:
References:
Message-ID: <505344C8.9050203@oracle.com>

[ moving to right list ]

On 9/14/2012 1:23 AM, Sam Pullara wrote:
> Here are some issues that I find with the current Streams implementation:
>
> 1) Collection is not a Stream.
This means that whenever you use > collections and want to use some of the great new features you have to > first convert it to a stream: > > list.stream().map(l -> parseInt(l)).into(new ArrayList<>()) > > I find this unnecessary. Collection could implement Stream and the > default methods could do the conversion for me, probably by calling > .stream() as above. I sympathize mightily! This is where we started, and we were there for a while, and (with great resistance on my part, as Doug will attest) reluctantly moved away. Let me walk through the reasoning here. First, I'll observe you don't really want Collection to be a Stream, any more than you want Collection to implement Iterator. What you want is for Collection to have bulk operations like filter, map, etc, without having to write an extra stream() call. The "poster child" for this library work all along was: int sumOfBlueWeights = foos.filter(e -> e.color == BLUE) .map(Foo::getWeight) .sum(); The libraries design exercise then largely became one of "what are the right types for the intermediate results." Option 1 was that Collection.filter() should return a Collection. This is pretty natural, but not what we wanted; the performance overhead of filling intermediate collections just to be an input into the next stage was too much. So we didn't explore this one very far. The next option, which was the subject of Iteration 1, was to put these extension methods on Iterable, and use Iterable as our proxy for "has bulk operations". From a "where do we put the methods" perspective, this seemed pretty clean, and consistent with what some other collection frameworks have done. But the Iteration 1 approach had warts too. By glomming onto Iterable, things got uncomfortable for bulk sources that were not backed by repeatably-iterable collections, like IO; we wanted to be able to write higher-level methods in the IO classes like "Reader.lines()". But describing that as an Iterable was a stretch.
Worse, we found that the Iteration 1 approach of "Iterable as bulk primitive" was just confusing to people. We got constant questions about "how do I know if the collection is in lazy or eager mode", which didn't make sense, but was evidence that we were pushing on people's mental models in uncomfortable ways. People commonly made the mistake of iterating the stream twice, once to get a count, unaware (a) that iterating might not be repeatable for the reasons described above, and (b) of the potential performance cost if upstream operations like filter/map were expensive. In the end, it seemed that the choice of Iterable as host for these methods was more one of convenience than of sensibility. The next iteration started with the choice that there should be an entity called Stream, which is like an Iterator -- the values flow by, and when they're consumed, they're gone. People understand this already; it's a very basic computer science concept. Iteration 2 started with these interfaces: interface StreamOps { // naming problematic StreamOps filter(Predicate); StreamOps map(Mapper); T reduce(T base, BinaryOperator); ... } We then had Stream and Streamable: interface Stream extends StreamOps { Stream filter(Predicate); // covariant override ... } interface Streamable extends StreamOps { Stream stream(); Stream filter(Predicate p) default { return stream().filter(p); } ... } with Collection implementing Streamable. This seemed a huge improvement over Iteration 1, having the convenient way of expressing what you want, and bringing clarity to the model at the same time. One cost was an explosion of interfaces -- so much so that people couldn't see the forest for the trees. We've since done a lot of work pruning / merging the interfaces (at various costs -- a subject for separate messages), so at some point it *might* be practical to bring back the Streamable interface.
(It looks like a small overhead now in isolation, but multiply that times the number of stream shapes, which currently is { scalar, key-value } but primitives are coming, and the interaction with other interfaces was pretty significant, as Doug can attest.) The problem you are describing is what we call the "bun" problem; if a user wants to map the values under f from c1 to c2, under the current API they have to say c1.stream().map(...).into(c2); which is two "bun" operations for one "meat" operation, and seems unnecessarily caloric. We resisted really hard introducing the bun. Here are the reasons we ultimately relented and "went bun" (which was a painful decision): 1. Reducing conflict surface area. We can add methods to Collection now, but there are people (Hi Don) who have actually implemented their own collections, and they've added many of the same kinds of methods we are adding. If we add an overload that is incompatible, they're screwed; their class is rendered permanently uncompilable. Now, you can't make an omelette without breaking some eggs, but we can reduce the potential conflict surface area. Adding one or two methods to Collection (stream, parallel) is less potential conflict than adding thirty. Names like sort() are the most problematic, since they are short, have no parameters with which to disambiguate, and Java is hostile to overloading on return type. 2. User model confusion. Collection has existing methods like "removeAll(Collection)", which perform in-place mutation. If we added a filter(Predicate) method alongside it which did not perform in-place mutation but instead produced a stream, this would be pretty confusing. Mixing the mutative and functional methods together in one bag might be OK for those who have a strong sense of "these are the old methods, and these are the new methods", but we want Collection to hold together more consistently.
Moving these to Stream restored consistency; a Collection can be turned into a Stream, and a Stream has these operations. (As a middle ground, we could consider bringing the *eager* Stream methods (reduce, groupBy) to Collection, since they don't have this property.) 3. Lazy vs eager. We've prioritized adding new lazy filter/map methods over adding eager versions of the same, but I don't think the probability is zero that we might at some point want an eager filter method on Collection. So, for these reasons and others, we relented and "went bun", at least for the time being. On 9/14/2012 3:13 AM, Remi Forax wrote: > I fully agree, it will be convenient to have such delegation mechanism. I think this is the key -- it is a convenience. I agree it is convenient. The question is, how much distortion of the model of "what is a Collection" are we willing to bear for this convenience? On 9/14/2012 7:54 AM, Doug Lea wrote: > There's some tension between convenience and sanity here :-) So that's where my sanity went... > 2) Optional should implement more of the Stream API like flatMap and > some others. [ Will address these in a separate message ] From brian.goetz at oracle.com Fri Sep 14 08:14:03 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 14 Sep 2012 11:14:03 -0400 Subject: Optional (was: Streams) In-Reply-To: References: Message-ID: <505349BB.5050903@oracle.com> On 9/14/2012 1:23 AM, Sam Pullara wrote: > 2) Optional should implement more of the Stream API like flatMap and > some others. This is a reasonable option. Currently, the Optional class is a strawman, focusing on the methods needed for dealing with absence and omitting these convenience methods. Happy to consider this. > I also hate that with Optional.none() often needs to be witnessed and > not a big fan of using a constructor for Optional: This is a problem not with Optional, but with type inference. In Java 7 we do very limited type inference of generic methods in nested method call contexts.
This should improve. Remi wrote: > Is there a document somewhere that explain the pro and cons of using Optional ? The pros are simple: there are eager stream methods that may return nothing. (Lazy methods like filter can return an empty stream.) The "obvious" way to deal with this is to return null. But this has multiple disadvantages: - If null is a valid value, you're hosed just like when map.get(key) returns null; you can't tell the difference between "no mapping" and "mapping with value=null". - People forget to do the null check, and get NPEs. Explicit optional doesn't have this risk; the type system saves you from yourself. - For primitive-valued streams, we can't even coopt null here. - Optional provides a means of doing the null check in a fluent manner. Compare: T result = collection.stream() .filter(...) .map(...) .findFirst(); if (result == null) { throw new NoSuchFooException(); } return result; and return collection.stream() .filter(...) .map(...) .findFirst() .getOrThrow(() -> new NoSuchFooException()); Many people like to harp on the performance issues here, but I think those are red herrings. If you look at the use of Optional in this API, it shows up in exactly one situation: at the end of a bulk operation that might yield no results. There is no List> anywhere, there is no O(n) Optional-boxing anywhere. It is a small O(1) overhead which gives a significant improvement in safety and expressiveness. This tradeoff is totally worth it. (It may even be that the VM can eliminate the Optional box someday via box elimination, at which point the performance cost is zero.) David wrote: > I don't know - to me Optional is Pair's brother. Both are > useful, in their way, but both have potential to massively stink up > code. I don't really believe the benefits are worth it - I mean the > best improvement we get isn't an objective "it's faster" or "it > allows more optimal code paths", it's purely a style thing and it > does have a cost. 
I don't like it; I think it's going to result in > things like: > > Map>>> Yes, if you invent fire people will burn themselves. But, I'm with Tim on this one. In the cases where we've used it, it is just the right thing, and food tastes better and is safer when cooked. Doug wrote: > But it may be worth breaking some consistency for > the sake of usability in supplying a few such choices > in Stream API. In particular, findAny > Optional findAny(); > T findAny(T ifNone); These are certainly easy enough to implement, and might carry their weight if "use a default value" were the dominant fallback action. Is it? Or is throwing just as common / more common? Tim and Sam, what's your experience here? From brian.goetz at oracle.com Fri Sep 14 09:27:00 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 14 Sep 2012 12:27:00 -0400 Subject: Numeric In-Reply-To: <505314C0.2010506@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> Message-ID: <50535AD4.5070701@oracle.com> > Long-term, we need primitive specialization of generics. Yes. If we start on this now (which we have), the earliest it could realistically be here is Java 10. > But shorter > term, there's an intermediate solution that reduces both boxing and > combinatorial explosions of special forms. The tradeoff is to accept > some virtualness, plus a bit of tedium on the part of components > implementing it: > > /** > * Interface defining access methods for classes that provide numeric > * results. A {@code Numeric} is not itself an instance of {@link > * java.lang.Number}, but provides numeric methods to access its > * primary result or property; normally via the method corresponding > * to the listed {@code PreferredType} parameter. However, any > * implementation of this interface must define the non-preferred > * methods as well (typically by casting the results of the > * preferred form). 
> */ > public interface Numeric { > long getLong(); > int getInt(); > short getShort(); > byte getByte(); > double getDouble(); > float getFloat(); > } Along with, probably: PreferredType getBoxed(); > This is a slightly odd interface because the PreferredType parameter > exists just to communicate the preferred extraction method to > client programmers. Not entirely. In some of the streams classes, for example, we do this: intf MapIterator extends Iterator> so that we can return a MapIterator as a covariant override of something that returns Iterator. But, once you know you have a MapIterator, you can avoid calling the next() method which gets you a boxed Mapping, and instead call nextKey and friends which are unboxed. But "dumb" code that wants to treat it as Iterator will be able to do so. Having the getBoxed() method lets you do the same thing here. The benefit is mostly for the client, but it also lets you wedge these things through some existing return paths. > Looking ahead though, it might also serve as a > heuristic guide for future efforts on automated generics specialization. > > The required tedium is that any class implementing Numeric > will need to boringly implement five of the methods in terms > of the one version corresponding to the preferred type. For > example, class MyFutureDouble would need: > public double getDouble() { return computeResult(); } > public long getLong() { return (long)getDouble(); } > public int getInt() { return (int)getDouble(); } > public float getFloat() { return (float)getDouble(); } > public short getShort() { return (short)getDouble(); } > public byte getByte() { return (byte)getDouble(); } > > It would be possible but probably not worthwhile to create > six subinterfaces that provide defaults of these forms. Why not worthwhile? 
We've already stumbled over this pattern a few times, where you have N methods and typically N-1 are implemented in terms of one, so you create subinterfaces with defaults and then you've got a SAM type: NumericViaDouble n = () -> 3.14d; > Adopting this has a surprisingly widespread impact on the form of some > lambdaized/parallelized APIs (mainly internal ones), so it would be a > good idea to decide on it soon. For example, I could use this now > in methods like CHM.Parallel().reduceValuesToLong. Could you characterize the impact here? I think this would be really useful to understand. > To be maximally useful, this should probably go into java.lang, > alongside Number. (Otherwise, I'd just stick it in java.util.concurrent > for our needs and be done with it.) Yeah, that worked out great with TimeUnit :( > It would be even nicer if there were a way to do something similar for > arguments to function-like classes in addition to their results, but > no solution along these lines applies. See the approach we've taken for Sink. It's ugly. From tim at peierls.net Fri Sep 14 09:27:58 2012 From: tim at peierls.net (Tim Peierls) Date: Fri, 14 Sep 2012 12:27:58 -0400 Subject: Streams -- the "bun" problem In-Reply-To: <505344C8.9050203@oracle.com> References: <505344C8.9050203@oracle.com> Message-ID: On Fri, Sep 14, 2012 at 10:52 AM, Brian Goetz wrote: > Here are the reasons we ultimately relented and "went bun" (which was a > painful decision): > > 1. Reducing conflict surface area. > 2. User model confusion. > 3. Lazy vs eager. > I'm a bun man, myself, always have been, and I buy all three of these. I'd even add reason 2a: pedagogical expedience. --tim
From brian.goetz at oracle.com Fri Sep 14 13:56:21 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 14 Sep 2012 16:56:21 -0400 Subject: Stream operations -- current set Message-ID: <505399F5.80600@oracle.com> Here's the current set of stream operations. Intermediate / Lazy (Stateless) ------------------------------- Stream filter(Predicate predicate); Stream map(Mapper mapper); Stream flatMap(FlatMapper mapper); Stream tee(Block block); MapStream mapped(Mapper mapper); Of these, the only one where there is some controversy is over the signature of flatMap, where the mapper takes a lambda into which the results are pushed. Some people prefer something like flatMap(t -> Collection) or flatMap(t -> T[]) but I think these are mostly value-destroying. If you don't already have an array or Collection lying around, it's a lot more code/work to construct one, and then it's more work to iterate it. And if you do have a Collection lying around, you can just do: flatMap((b, t) -> findResult(t).forEach(b)) and so having the extra overload doesn't help you much. The existing signature seems a better "primitive". Intermediate / Lazy (Stateful) ------------------------------ Stream uniqueElements(); Stream sorted(Comparator comparator); Stream cumulate(BinaryOperator operator); Stream sequential(); Of these, we might want to add a sorted() which assumes natural ordering and takes no Comparator, and throws CCE if the elements are not Comparable (just like new TreeMap() does.) We might also want a version of cumulate that takes an explicit base, not just to deal with the "stream is empty" case (since that's easy with an intermediate operation), but so that you can resume an existing cumulation.
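To make the flatMap argument above concrete, here is a minimal sketch of a push-style mapper, assuming hypothetical Block and FlatMapper shapes (the generic signatures in this archive were lost, so the type parameters and the method names apply and flatMapInto are guesses, not the strawman API). The flatMap helper here is eager and list-based purely for illustration; a real stream operation would be lazy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlatMapSketch {
    /** Hypothetical stand-in for Block: something that consumes values. */
    public interface Block<T> { void apply(T t); }

    /** Hypothetical push-style mapper: pushes zero or more results into a sink. */
    public interface FlatMapper<T, R> { void flatMapInto(Block<? super R> sink, T element); }

    /** Eager illustration only: runs the mapper over every element and collects what it pushes. */
    public static <T, R> List<R> flatMap(Iterable<T> source, FlatMapper<T, R> mapper) {
        List<R> out = new ArrayList<>();
        Block<R> sink = out::add;          // the sink simply appends to the result list
        for (T t : source) {
            mapper.flatMapInto(sink, t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "c");

        // No intermediate collection: results are pushed straight into the sink.
        List<String> pushed = flatMap(lines, (sink, line) -> {
            for (String w : line.split(" ")) sink.apply(w);
        });

        // If a collection is already lying around, forEach adapts it in one expression.
        List<String> adapted = flatMap(lines,
                (sink, line) -> Arrays.asList(line.split(" ")).forEach(sink::apply));

        System.out.println(pushed);   // [a, b, c]
        System.out.println(adapted);  // [a, b, c]
    }
}
```

The second call is the case described above: a mapper that already has a collection needs only a forEach to fit the push-style signature, which is why a collection-returning overload adds little.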
Terminal / Eager ---------------- void forEach(Block block); > A into(A target); Object[] toArray(); Map> groupBy(Mapper classifier); Map reduceBy(Mapper classifier, Factory baseFactory, Combiner reducer); T reduce(T base, BinaryOperator op); Optional reduce(BinaryOperator op); U fold(Factory baseFactory, Combiner reducer, BinaryOperator combiner); boolean anyMatch(Predicate predicate); boolean allMatch(Predicate predicate); boolean noneMatch(Predicate predicate); Optional findFirst(); Optional findAny(); Of these, there are a lot more options. For toArray, we might want to do interface ArrayFactory { T[] make(int size); } and have T[] toArray(ArrayFactory) (the two existing versions of toArray in Collection both stink; the no-arg one returns Object[], and the array-taking one uses reflection to instantiate the array). Lambdas buy us out of that (we might even consider treating Foo[]::new as a syntax for array constructor refs). The most controversial signature here is groupBy, because it is the only place in the Streams API that is tied to Collections. The rationale is: you really can't implement groupBy without having an internal Map anyway, so why not just return that rather than making the user create a MapStream (which has an internal Map) and then dump the elements into a real Map with into()? But that leaves us tied to Collections, where I'd rather not be. Don has suggested a multi-valued version of groupBy: Map> groupByMulti(FlatMapper classifier); which is easy to implement and makes sense to me. The reduceBy method is one of my favorites. (Not sure if we have the signature quite right yet, it probably needs multiple versions.) It is a combination of group-by and reduce-values.
So if you want to compute the highest score by person: Map bestScoresByPerson = scores.reduceBy(s -> s.getName(), () -> 0, (sc, s) -> max(sc, s.getScore())); The fold() method could use a better name, but it is a generalized parallel fold where the intermediate result could be mutable or immutable, and there are interesting use cases in both domains. There are a few others in the maybe-should-have list, including limit/skip/slice. But I'd like to nail down the details of the must-haves first. From brian.goetz at oracle.com Fri Sep 14 14:10:10 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 14 Sep 2012 17:10:10 -0400 Subject: Numeric In-Reply-To: <505314C0.2010506@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> Message-ID: <50539D32.10105@oracle.com> > Long-term, we need primitive specialization of generics. But shorter > term, there's an intermediate solution that reduces both boxing and > combinatorial explosions of special forms. The tradeoff is to accept > some virtualness, plus a bit of tedium on the part of components > implementing it: > > /** > * Interface defining access methods for classes that provide numeric > * results. A {@code Numeric} is not itself an instance of {@link > * java.lang.Number}, but provides numeric methods to access its > * primary result or property; normally via the method corresponding > * to the listed {@code PreferredType} parameter. However, any > * implementation of this interface must define the non-preferred > * methods as well (typically by casting the results of the > * preferred form). > */ > public interface Numeric { > long getLong(); > int getInt(); > short getShort(); > byte getByte(); > double getDouble(); > float getFloat(); > } These guys also need companion classes that are mutable, so that reductions can have O(1) boxing costs instead of O(n).
class IntMutableNumeric implements Numeric { private int value; public setInt(int value) { this.value = value; } // obvious getter implementations } This would be really useful in the groupBy+reduce case: Map> highScoresByName = scores.foldBy(s -> s.getName(), () -> new IntMutableNumeric(), (bx, s) -> bx.setValue(Math.max(bx.getValue, s))); The knowledge that the values are mutable does not escape from the initializing computation. From forax at univ-mlv.fr Fri Sep 14 14:42:02 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 14 Sep 2012 23:42:02 +0200 Subject: Numeric In-Reply-To: <50539D32.10105@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> Message-ID: <5053A4AA.5000900@univ-mlv.fr> On 09/14/2012 11:10 PM, Brian Goetz wrote: >> Long-term, we need primitive specialization of generics. But shorter >> term, there's an intermediate solution that reduces both boxing and >> combinatorial explosions of special forms. The tradeoff is to accept >> some virtualness, plus a bit of tedium on the part of components >> implementing it: >> >> /** >> * Interface defining access methods for classes that provide numeric >> * results. A {@code Numeric} is not itself an instance of {@link >> * java.lang.Number}, but provides numeric methods to access its >> * primary result or property; normally via the method corresponding >> * to the listed {@code PreferredType} parameter. However, any >> * implementation of this interface must define the non-preferred >> * methods as well (typically by casting the results of the >> * preferred form). >> */ >> public interface Numeric { >> long getLong(); >> int getInt(); >> short getShort(); >> byte getByte(); >> double getDouble(); >> float getFloat(); >> } > > These guys also need companion classes that are mutable, so that > reductions can have O(1) boxing costs instead of O(n). 
> > class IntMutableNumeric implements Numeric { > private int value; > > public setInt(int value) { this.value = value; } > > // obvious getter implementations > } > > This would be really useful in the groupBy+reduce case: > > Map> highScoresByName = > scores.foldBy(s -> s.getName(), > () -> new IntMutableNumeric(), > (bx, s) -> bx.setValue(Math.max(bx.getValue, s))); I suppose it's more something like this: Map> highScoresByName = scores.foldBy(s -> s.getName(), () -> new IntMutableNumeric(), (bx, s) -> bx.setValue(Math.max(bx.getValue(), s.length()))); > > The knowledge that the values are mutable does not escape from the > initializing computation. > There are two problems. The first one is how to exchange values between the threads in fork/join if we have a lambda. Here, creating an object that stores the lambda to apply, plus an Object and a long to store the result (object or primitive), is enough. I think it's an error to try to specify the fork/join internal operation as a lambda that returns a box of values; it's easier to specify it as an inner class. The second problem, Brian's one, can be solved by specializing reduce and fold for int/long/double instead of using a mutable box that will be hard to use in a parallel world. About the primitive specialization, we can easily specialize the eager operations like fold or reduce. Specializing Stream or StreamMap is far harder because it requires either specialization to be done by the VM or tagged values (here, Jim Laskey's array of tagged values could be great, but anyway, we don't have enough time to play with that idea before the release of lambdas). Rémi From forax at univ-mlv.fr Fri Sep 14 14:45:45 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 14 Sep 2012 23:45:45 +0200 Subject: Stream operations -- current set In-Reply-To: <505399F5.80600@oracle.com> References: <505399F5.80600@oracle.com> Message-ID: <5053A589.6090001@univ-mlv.fr> On 09/14/2012 10:56 PM, Brian Goetz wrote: > Here's the current set of stream operations.
> > Terminal / Eager > ---------------- > > void forEach(Block block); > > > A into(A target); > > Object[] toArray(); > > Map> groupBy(Mapper > classifier); > > Map reduceBy(Mapper classifier, > Factory baseFactory, > Combiner reducer); > > T reduce(T base, BinaryOperator op); > Optional reduce(BinaryOperator op); > > U fold(Factory baseFactory, > Combiner reducer, > BinaryOperator combiner); > > boolean anyMatch(Predicate predicate); > boolean allMatch(Predicate predicate); > boolean noneMatch(Predicate predicate); > > Optional findFirst(); > Optional findAny(); There is a coherency problem: there are two versions of reduce, one with a base and one with Optional, but findFirst/findAny have only one version. Rémi From forax at univ-mlv.fr Fri Sep 14 15:14:16 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 15 Sep 2012 00:14:16 +0200 Subject: Optional In-Reply-To: <505349BB.5050903@oracle.com> References: <505349BB.5050903@oracle.com> Message-ID: <5053AC38.7070109@univ-mlv.fr> On 09/14/2012 05:14 PM, Brian Goetz wrote: > On 9/14/2012 1:23 AM, Sam Pullara wrote: > >> 2) Optional should implement more of the Stream API like flatMap and >> some others. > > This is a reasonable option. Currently, the Optional class is a > strawman, focusing on the methods needed for dealing with absence and > omitting these convenience methods. Happy to consider this. If Optional implements Stream or StreamOps then you will see List> and the like, because you allow people to delay the check to know whether the optional value exists or not. > >> I also hate that with Optional.none() often needs to be witnessed and >> not a big fan of using a constructor for Optional: > > This is a problem not with Optional, but with type inference. In Java > 7 we do very limited type inference of generic methods in nested > method call contexts. This should improve. > > > Remi wrote: >> Is there a document somewhere that explain the pro and cons of using >> Optional ?
> > The pros are simple: there are eager stream methods that may return > nothing. (Lazy methods like filter can return an empty stream.) The > "obvious" way to deal with this is to return null. But this has > multiple disadvantages: > - If null is a valid value, you're hosed just like when map.get(key) > returns null; you can't tell the difference between "no mapping" and > "mapping with value=null". Yes, it's not a perfect solution, but maybe people will stop putting null in collections or streams. > - People forget to do the null check, and get NPEs. Explicit > optional doesn't have this risk; the type system saves you from yourself. In that case, @Nullable/@NonNull are better than Optional because they are pure types without any runtime overhead and can be integrated into the type system a la Kotlin. > - For primitive-valued streams, we can't even coopt null here. You cannot use Optional of T either. > - Optional provides a means of doing the null check in a fluent > manner. Compare: > > T result = collection.stream() > .filter(...) > .map(...) > .findFirst(); > if (result == null) { > throw new NoSuchFooException(); > } > return result; > > and > > return collection.stream() > .filter(...) > .map(...) > .findFirst() > .getOrThrow(() -> new NoSuchFooException()); You also have to compare to: return collection.filter(...) .map(...) .findFirstOr(() -> new NoSuchFooException()); > > Many people like to harp on the performance issues here, but I think > those are red herrings. If you look at the use of Optional in this > API, it shows up in exactly one situation: at the end of a bulk > operation that might yield no results. There is no List> > anywhere, there is no O(n) Optional-boxing anywhere. It is a small > O(1) overhead which gives a significant improvement in safety and > expressiveness. This tradeoff is totally worth it. > > (It may even be that the VM can eliminate the Optional box someday via > box elimination, at which point the performance cost is zero.)
My main concern is that theoretically the VM should remove allocation of Optional because of escape analysis, but practically it will not occur because escape analysis works if you can inline the caller of findFirst() with the code of findFirst() and because the code of findFirst() is too big to be duplicated, the allocation is not removed in practice. Brian and Tim, I agree with you that introducing Optional for findFirst/findAny is harmless, So if you can guarantee me that I will never have to fix perf bugs in a program that use Optional as return value of a hashmap or a cache, I will vote for Optional*. R?mi * otherwise I propose to rename it as PandoraBox because it's a special kind of boxing :) > > > David wrote: > >> I don't know - to me Optional is Pair's brother. Both are >> useful, in their way, but both have potential to massively stink up >> code. I don't really believe the benefits are worth it - I mean the >> best improvement we get isn't an objective "it's faster" or "it >> allows more optimal code paths", it's purely a style thing and it >> does have a cost. I don't like it; I think it's going to result in >> things like: >> >> Map>>> > > Yes, if you invent fire people will burn themselves. But, I'm with > Tim on this one. In the cases where we've used it, it is just the > right thing, and food tastes better and is safer when cooked. > > > Doug wrote: > >> But it may be worth breaking some consistency for >> the sake of usability in supplying a few such choices >> in Stream API. In particular, findAny >> Optional findAny(); >> T findAny(T ifNone); > > These are certainly easy enough to implement, and might carry their > weight if "use a default value" were the dominant fallback action. Is > it? Or is throwing just as common / more common? Tim and Sam, what's > your experience here? 
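The fallback styles debated in this thread (Optional with a default, Optional with a throw, and a default-taking overload in the spirit of findAny(T ifNone)) can be compared side by side. The sketch below uses the java.util.Optional and Stream API as it later shipped in Java 8, where the methods are named orElse and orElseThrow rather than the strawman's or and getOrThrow; the helper findFirstLongerThan and its three-argument overload are hypothetical names invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;

public class OptionalFallbackSketch {
    /** Optional-returning search: "no result" is visible in the return type. */
    public static Optional<String> findFirstLongerThan(List<String> words, int n) {
        return words.stream().filter(w -> w.length() > n).findFirst();
    }

    /** Hypothetical default-taking variant: no Optional is allocated, the caller supplies the fallback. */
    public static String findFirstLongerThan(List<String> words, int n, String ifNone) {
        for (String w : words) {
            if (w.length() > n) return w;
        }
        return ifNone;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("bun", "meat", "stream");

        // Style 1: Optional plus a default value.
        System.out.println(findFirstLongerThan(words, 3).orElse("none"));    // meat

        // Style 2: default-taking overload, no Optional in sight.
        System.out.println(findFirstLongerThan(words, 10, "none"));          // none

        // Style 3: Optional plus a throw when nothing matches.
        try {
            findFirstLongerThan(words, 10)
                    .orElseThrow(() -> new NoSuchElementException("no match"));
        } catch (NoSuchElementException e) {
            System.out.println("threw: " + e.getMessage());                  // threw: no match
        }
    }
}
```

Styles 1 and 3 share one Optional-returning method, while style 2 needs its own overload per fallback policy; that is the API-surface side of the tradeoff, set against the single allocation the overload avoids.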
From dl at cs.oswego.edu Sat Sep 15 04:14:29 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 15 Sep 2012 07:14:29 -0400 Subject: Optional In-Reply-To: <5053AC38.7070109@univ-mlv.fr> References: <505349BB.5050903@oracle.com> <5053AC38.7070109@univ-mlv.fr> Message-ID: <50546315.2070403@cs.oswego.edu> On 09/14/12 18:14, Remi Forax wrote: > Brian and Tim, I agree with you that introducing Optional for findFirst/findAny > is harmless, > So if you can guarantee me that I will never have to fix perf bugs in a program > that use Optional as return value of a hashmap or a cache, I will vote for > Optional*. This hits one of the basic design issues of core JDK libraries: Core java.{lang,util} components are used vastly more than any others. So any construction that *could* generate a time/space problem is sure to do so, leading to lots of downstream time/cost to someday try to address. This case seems easier than most because it can be addressed simply by adding a couple of methods: Most people can afford to waste an object to get nicer and less error-prone usages. Some can't. Giving people a choice now will spare much agony later. A random sampling of precedents: How bad could it be to create a new WeakRef in each call to WeakHashMap.get()? Why not pre-allocate segments of ConcurrentHashMaps -- who would ever create 1 million mostly empty ConcurrentHashMaps in their program? Why not track per-reader holds with a new mutable thread-local counter just for the sake of throwing more precise exceptions in ReentrantReadWriteLock? (The first two were later addressed. The last one seems unfixable.) 
-Doug From dl at cs.oswego.edu Sat Sep 15 06:23:12 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 15 Sep 2012 09:23:12 -0400 Subject: Numeric (and accumulators) In-Reply-To: <50539D32.10105@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> Message-ID: <50548140.1060506@cs.oswego.edu> On 09/14/12 17:10, Brian Goetz wrote: > These guys also need companion classes that are mutable, Like for example Atomic{Integer,Long}? :-) It has always been a little weird that JDK provides immutable ones (Number), and atomically mutable ones (AtomicX), but not plain mutable ones. I think the main reason is that the ugly singleton array alternative has always been available. But the same combinatorics-reducing forces that lead to Numeric also lead to finally supplying these as well. We defined Atomic{Integer,Long} to extend Number (thus not needing a Numeric interface). We could similarly add plain MutableInteger and MutableLong (plus MutableDouble and even AtomicDouble, despite the concurrency oddities of AtomicDouble (bitwise "==" used in CAS and floating-point "==" are not the same for double)). But there's a much better move here -- take the opportunity to provide standardized forms of structured accumulators. We can define classes that are updatable only via a lambda supplied in the constructor. This not only nicely matches the semantics of reductions and other bulk-op combinations, but lends itself to encapsulation such that concurrency style/semantics can be chosen via a factory.
Here's what it might look like for "Long":

    public abstract class LongAccumulator implements Numeric {
        // Nested decl until function names straightened out
        public static interface LongByLongToLong { long apply(long a, long b); }

        // factories for standard implementations
        public static LongAccumulator sequentialAccumulator
            (long initialValue, LongByLongToLong updateFunction);
        public static LongAccumulator lockedAccumulator
            (long initialValue, LongByLongToLong updateFunction);
        public static LongAccumulator optimisticAccumulator
            (long initialValue, LongByLongToLong updateFunction);

        public void update(long x); // cannot return value
        public long get();
        public void reset(long initialValue);
        public long getThenReset(long initialValue);

        // plus implement Numeric....
        public long getLong();
        public int getInt();
        public short getShort();
        public byte getByte();
        public double getDouble();
        public float getFloat();
        public Number getNumber();

        // More factories for convenience/efficiency
        public static LongAccumulator sequentialSum();
        public static LongAccumulator lockedSum();
        public static LongAccumulator optimisticSum();
        public static LongAccumulator sequentialMinimum();
        public static LongAccumulator lockedMinimum();
        public static LongAccumulator optimisticMinimum();
        public static LongAccumulator sequentialMaximum();
        public static LongAccumulator lockedMaximum();
        public static LongAccumulator optimisticMaximum();

        protected LongAccumulator(); // extensibility hook
    }

(As is often the case, there seems to be no perfect naming convention for factory methods. Suggestions for improvement would be welcome.)

The "optimistic" versions have the property that the update function could be invoked more than once (on CAS failure). Rather than being purely CAS-based, though, these should be a refactoring of the jsr166e LongAdder etc classes, to build in better scalability.
They turn out to be very useful in cases where you need to do combinations that cannot easily be arranged as structured parallel reductions -- they add some per-update overhead (CAS vs plain write) but self-adjust space to reduce contention among threads to near-optimal levels (at the expense of dynamically adding a non-trivial amount of space overhead, but only when needed -- basically they save you from having to create multiple reduction targets yourself). In some testing I've done, using them as in:

    for each element x in parallel { adder.update(x); }

is only around 25% slower than clever reduction schemes, even on machines with lots of cores and updates. This overhead is worth avoiding when possible, but the option is worth providing for the cases where people don't know of an appropriate reduction.

The "protected" no-op ctor still allows people to add further variants like plain CAS-based "atomic".

Additionally, we'd need a non-numeric generics-based version. A pure interface one would be easy, but I think that even here, an abstract class version works out better. With interfaces, you'd be tempted to make LongAccumulator etc extend it, which hits enough overload/override snags to be unworkable. Additionally, it wouldn't support such convenient use of pre-supplied forms. So:

    public abstract class Accumulator {
        // Nested decl until function names straightened out
        public static interface BiFun { T apply(A a, B b); }

        public void update(T x);
        public T get();
        public void reset(T initialValue);
        public T getThenReset(T initialValue);

        public static Accumulator sequentialAccumulator
            (T initialValue, BiFun updateFunction);
        public static Accumulator lockedAccumulator
            (T initialValue, BiFun updateFunction);
        public static Accumulator optimisticAccumulator
            (T initialValue, BiFun updateFunction);

        protected Accumulator(); // extensibility hook
    }

All of these might live in package java.util.functions?
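
To make the three concurrency styles of the LongAccumulator factories proposed earlier in this message concrete, here is a minimal sketch. It rests on several assumptions: the class name LongAccumulatorSketch is hypothetical, java.util.function.LongBinaryOperator stands in for the nested LongByLongToLong interface, and the "optimistic" form is a plain CAS retry loop rather than the striped LongAdder-style refactoring suggested above.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongBinaryOperator;

// Hypothetical sketch of the proposed accumulator API; not the jsr166e code.
abstract class LongAccumulatorSketch {
    public abstract void update(long x);          // deliberately returns nothing
    public abstract long get();
    public abstract void reset(long initialValue);

    // Single-threaded use only: plain field, no synchronization.
    static LongAccumulatorSketch sequentialAccumulator(long initialValue, LongBinaryOperator f) {
        return new LongAccumulatorSketch() {
            private long value = initialValue;
            public void update(long x) { value = f.applyAsLong(value, x); }
            public long get() { return value; }
            public void reset(long v) { value = v; }
        };
    }

    // Thread-safe via a lock: f runs exactly once per update.
    static LongAccumulatorSketch lockedAccumulator(long initialValue, LongBinaryOperator f) {
        return new LongAccumulatorSketch() {
            private long value = initialValue;
            public synchronized void update(long x) { value = f.applyAsLong(value, x); }
            public synchronized long get() { return value; }
            public synchronized void reset(long v) { value = v; }
        };
    }

    // Thread-safe via CAS retry: f may run more than once per update on contention.
    static LongAccumulatorSketch optimisticAccumulator(long initialValue, LongBinaryOperator f) {
        return new LongAccumulatorSketch() {
            private final AtomicLong value = new AtomicLong(initialValue);
            public void update(long x) {
                long cur;
                do {
                    cur = value.get();
                } while (!value.compareAndSet(cur, f.applyAsLong(cur, x)));
            }
            public long get() { return value.get(); }
            public void reset(long v) { value.set(v); }
        };
    }
}
```

Because the optimistic form may re-run the update function, it is only appropriate for side-effect-free functions such as sum, min, or max.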
-Doug From dl at cs.oswego.edu Sat Sep 15 07:24:43 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 15 Sep 2012 10:24:43 -0400 Subject: Stream operations -- current set In-Reply-To: <505399F5.80600@oracle.com> References: <505399F5.80600@oracle.com> Message-ID: <50548FAB.5070602@cs.oswego.edu> On 09/14/12 16:56, Brian Goetz wrote: > Here's the current set of stream operations. > ... > Stream flatMap(FlatMapper mapper); > Of these, the only one where there is some controversy is over the signature of > flatMap, where the mapper takes a lambda into which the results are pushed. > Some people prefer something like > > flatMap(t -> Collection) > or > flatMap(t -> T[]) To further rub in how central the "little" issues of optional/null, (as well as numerics) are in all this, note that flatMap is just a special form of mapReduce(x->coll, addAll), which can be implemented so as to require a basis/default policy only if there is nothing there, so could do one of: (1) return null (2) accept an empty-collection generator as basis/defaultValue arg (3) return Optional (4) factor into a special flatMapper interface that absorbs the problem (as Brian chose; in CHM, I support unified forms of map+Reduce explicitly, which leverages intrinsic null policy to naturally use option #1, so method flatMap does not even appear. > Intermediate / Lazy (Stateful) > ------------------------------ > > Stream uniqueElements(); > > Stream sorted(Comparator comparator); > > Stream cumulate(BinaryOperator operator); > > Stream sequential(); (Capsule summary of many, um, discussions between Brian and me: I hate all of these. But not enough to act hatefully about them :-) > Map> groupBy(Mapper classifier); > > Map reduceBy(Mapper classifier, > Factory baseFactory, > Combiner reducer); > > > The most controversial signature here is groupBy, because it is the only place > in the Streams API that is tied to Collections. So why is this in Streams rather than in Maps? 
> Don has suggested a multi-valued version of groupBy:
>
> Map> groupByMulti(FlatMapper classifier);
>
> which is easy to implement and makes sense to me.

The main argument against this is that at least in parallel designs, it is vastly better to reduce the nested value collection while it is being generated. There are surely cases where circumstances don't let you do this, but it's a little uncomfortable to support a method that you hope that people only rarely use.

>
> The reduceBy method is one of my favorites. (Not sure if we have the signature
> quite right yet, it probably needs multiple versions.) It is a combination of
> group-by and reduce-values. So if you want to compute the highest score by person:
>
> Map bestScoresByPerson =
>     scores.reduceBy(s -> s.getName(),
>                     () -> 0,
>                     (sc, s) -> max(sc, s.getScore()));
>

(Right. Better support for constructions like this was one of the reasons I expanded lambda-accepting methods in CHM a few months ago.)

-Doug

From dl at cs.oswego.edu Sun Sep 16 06:14:18 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 16 Sep 2012 09:14:18 -0400
Subject: Numeric
In-Reply-To: <505314C0.2010506@cs.oswego.edu>
References: <505314C0.2010506@cs.oswego.edu>
Message-ID: <5055D0AA.60807@cs.oswego.edu>

One further note on defining the Numeric interface. Suppose you have a reduction-style operation with a basis (initial or default value) argument:

    Numeric reduce(SomeType basis, BiFun<...> combiner);

"SomeType" cannot be any Number class because Number is not itself Numeric. The best way out is to retrofit Number to implement Numeric. The interface-ness is basically why Numeric is needed to begin with. Doing this would not be strictly necessary, but all the other alternatives (like creating new classes that DO claim to implement Numeric) are worse.
-Doug From dl at cs.oswego.edu Sun Sep 16 06:52:13 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 16 Sep 2012 09:52:13 -0400 Subject: Numeric (and accumulators) In-Reply-To: <50548140.1060506@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> Message-ID: <5055D98D.1010509@cs.oswego.edu> On 09/15/12 09:23, Doug Lea wrote: > > But there's a much better move here -- take the opportunity to provide > standardized forms of structured accumulators. ... > > All of these might live in package java.util.functions? > After fleshing out a bit, I'm now thinking that a variant of this would be better in j.u.c as a refactoring of jsr166e.LongAdder etc. The availability of non-thread-safe forms as well would then just be an opportunistic byproduct. One reason for integrating into j.u.c is that keyed versions of these forms would need to generalize what is now LongAdderTable (http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/LongAdderTable.html) Reminder: LongAdderTable supports creation of frequency maps, histograms etc in parallel in a much, much better way (less overhead, less GC, more scalability) than multimap-like constructions. This is another application of reduce-while-building designs; a parallel mutative analog of flatMap. (Note that the design advantage holds primarily for parallel/concurrent applications. It matters much less in sequential code where locality and multiple pass set-up/tear-down have less impact.) The current LongAdderTable exploits CHM's thread-safe lambda-accepting methods that make most methods short and simple. The two most commonly used methods are: public void add(K key, long x) { map.computeIfAbsent(key, createLongAdderFunc).add(x); } public long sum(K key) { LongAdder a = map.get(key); return a == null ? 
0L : a.sum(); } These could easily be generalized for arbitrary accumulators, as in: public void update(K key, long x) { map.computeIfAbsent(key, accCreationFunc).update(x); } public long accumulation(K key) { LongAccumulator a = map.get(key); return a == null ? basisValue : a.get(); } But the prospects for further generalizing to cover all forms of {int, long, double, T} accumulators are not so nice: Four boringly similar classes, each of which share the same problem as the original of not itself being a Map, but just a thin veneer to simplify some common usages. The ones other than LongAdderTable don't get a very high score on the "does it pull its weight" JDK inclusion scale. So what would it take to avoid defining these classes at all? If all the common usage constructions could be written as expression-y 1-liners on CHM itself, then these would be more appropriate as javadoc examples than classes. There are only a few such methods, so it seems worth a try. One place to start is to finally add a (reduction-argument-style) default-returning version of get to CHM: getValueOrDefault(K key, T valueIfAbsent); This could be used to save a line or two here and there when using a straight CHM. For example, accumulation via: f = map.getValueOrDefault(k, EMPTY_ACCUMULATOR).get(); Even nicer might be: mapValueFor(K key, Fun f, U ifAbsent); Unfortunately, this wouldn't help a lot in the main target use cases of collecting simple counts etc -- these usages go beyond what we can do to evade boxing and incorporate numerics. And it doesn't help at all with method reset() or getThenReset(). But considering that update() and get() are by far the most common operations, maybe we can get by with adding some javadoc (to CHM) along the lines of: /** * Example: Maps from keys to Accumulators can be used to create Histograms * (frequency maps) and related tables. 
* * To increment a count, installing if not already present: * map.computeIfAbsent(key, () -> LongAccumulator.optimisticSum()).update(1L); * * To determine the sum for a key, it is convenient to create a * singleton static EMPTY_ACCUMULATOR, enabling * long sum = map.getValueOrDefault(key, EMPTY_ACCUMULATOR).get(); */ Any thoughts on this? -Doug From brian.goetz at oracle.com Sun Sep 16 09:33:41 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 16 Sep 2012 12:33:41 -0400 Subject: Numeric (and accumulators) In-Reply-To: <5055D98D.1010509@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <5055D98D.1010509@cs.oswego.edu> Message-ID: <5055FF65.4010006@oracle.com> >> But there's a much better move here -- take the opportunity to provide >> standardized forms of structured accumulators. ... >> >> All of these might live in package java.util.functions? > > After fleshing out a bit, I'm now thinking that a variant of > this would be better in j.u.c as a refactoring of > jsr166e.LongAdder etc. The availability of non-thread-safe forms > as well would then just be an opportunistic byproduct. Yes, I had a note to write up why I thought this was a better strategy, but you beat me to it. My intent is that j.u.f is really just for SAM types and their minimal helper methods (e.g., Predicate.and). From andrey.breslav at jetbrains.com Sun Sep 16 12:11:53 2012 From: andrey.breslav at jetbrains.com (Andrey Breslav) Date: Sun, 16 Sep 2012 21:11:53 +0200 Subject: Updated SotL/L documents for Iteration 2 In-Reply-To: <504E3EA6.50005@oracle.com> References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr> <504E3EA6.50005@oracle.com> Message-ID: > In the past, we outlined a few strategies for getting to primitive support: > > - Primitive specialization of streams (e.g., IntStream) with overloads like map(IntMapper) -> yields IntStream. We'd probably just do Int/Long/Double. 
> - VM magic to make boxing costs go away (and give ponies to all the children of the world) Is this really an option, considering the deadlines etc? If we by any chance could have fixnums, the world would be so much nicer... -- Andrey Breslav http://jetbrains.com Develop with pleasure! From dl at cs.oswego.edu Sun Sep 16 12:41:10 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 16 Sep 2012 15:41:10 -0400 Subject: Updated SotL/L documents for Iteration 2 In-Reply-To: References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr> <504E3EA6.50005@oracle.com> Message-ID: <50562B56.7070605@cs.oswego.edu> On 09/16/12 15:11, Andrey Breslav wrote: >> - VM magic to make boxing costs go away (and give ponies >> to all the children of the world) > Is this really an option, considering the deadlines etc? Didn't you get your pony yet? -Doug From brian.goetz at oracle.com Sun Sep 16 14:26:57 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 16 Sep 2012 17:26:57 -0400 Subject: Updated SotL/L documents for Iteration 2 In-Reply-To: References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr> <504E3EA6.50005@oracle.com> Message-ID: <50564421.7000504@oracle.com> >> - VM magic to make boxing costs go away (and give ponies to all the children of the world) > Is this really an option, considering the deadlines etc? If we by any chance could have fixnums, the world would be so much nicer... No, not a realistic option in the timeframe we have, sadly. Included for completeness and optimism only. 
From joe.darcy at oracle.com Sun Sep 16 22:45:50 2012 From: joe.darcy at oracle.com (Joe Darcy) Date: Sun, 16 Sep 2012 22:45:50 -0700 Subject: Numeric In-Reply-To: <505314C0.2010506@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> Message-ID: <5056B90E.6040205@oracle.com> On 9/14/2012 4:28 AM, Doug Lea wrote: > > Picking this up from a set of exchanges on old > lambda-lib list, and first recycling initial rationale: > > The main reason for primitive specializations for bulk ops > is to avoid boxing inside inner loops. Unless optimized away > (which in practice is almost always too hard for compilers/JITs), > not only is it slow to begin with, it doesn't get much > faster under parallelism because it spews garbage and > disrupts memory locality. > > The inner loop case is the most glaring problem, but the same > issues arise during any combination/reduction/collection > steps among a set of subtasks. As in > r = combine(leftTask.join(), rightTask.join()); > As well as client access of results, as in: > r = p.invoke(...) > Depending on all sorts of things, these cases can be > as numerous and problematic as the inner-loop case. > Wishing that they weren't as problematic isn't a good solution. > > Long-term, we need primitive specialization of generics. But shorter > term, there's an intermediate solution that reduces both boxing and > combinatorial explosions of special forms. The tradeoff is to accept > some virtualness, plus a bit of tedium on the part of components > implementing it: > > /** > * Interface defining access methods for classes that provide numeric > * results. A {@code Numeric} is not itself an instance of {@link > * java.lang.Number}, but provides numeric methods to access its > * primary result or property; normally via the method corresponding > * to the listed {@code PreferredType} parameter. 
However, any > * implementation of this interface must define the non-preferred > * methods as well (typically by casting the results of the > * preferred form). > */ > public interface Numeric { > long getLong(); > int getInt(); > short getShort(); > byte getByte(); > double getDouble(); > float getFloat(); > } > > So for example, someone could define > class MyFutureDouble implements Future, Numeric; > which could then be used as > f = new MyFutureDouble(); > // .. async process f ... > double d = f.getDouble(); > > This is a slightly odd interface because the PreferredType parameter > exists just to communicate the preferred extraction method to > client programmers. Looking ahead though, it might also serve as a > heuristic guide for future efforts on automated generics specialization. > > The required tedium is that any class implementing Numeric > will need to boringly implement five of the methods in terms > of the one version corresponding to the preferred type. For One quick observation on this API, the bound "PreferredType extends Number" does not limit the preferred value to one of the boxed primitives java.lang.{Integer, Long, Double, ...} since even in the JDK there are other classes which extend java.lang.Number, for example BigDecimal and AtomicInteger. The Number type as it stands basically means "is convertible to a primitive" and capturing that capability would have been more appropriately modeled as an interface than an abstract class. Unfortunately, I don't see a way to model "A Number subclass limited to these six choices" in the type system without adding a new type for that purpose: public abstract class NumericBox extends Number { protected NumericBox() {...} // Don't allow instantiation outside of java.lang } and changing the wrapper classes in java.lang to extend NumericBox (probably with a better name) rather than Number. 
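
A small sketch of the point about the bound, under two assumptions: the type parameter on Numeric is reconstructed here from the quoted "PreferredType extends Number" (the archive stripped it from the declaration), and BigDecimalResult is a hypothetical class. It compiles even though BigDecimal is not one of the boxed primitives:

```java
import java.math.BigDecimal;

// Assumed reconstruction of the proposed interface, with the bound as quoted.
interface Numeric<PreferredType extends Number> {
    long getLong();
    int getInt();
    short getShort();
    byte getByte();
    double getDouble();
    float getFloat();
}

// Hypothetical implementor: the bound accepts any Number subclass,
// so nothing stops BigDecimal from being the "preferred" type.
class BigDecimalResult implements Numeric<BigDecimal> {
    private final BigDecimal value;

    BigDecimalResult(BigDecimal value) { this.value = value; }

    // All six accessors delegate to Number's narrowing conversions.
    public long getLong() { return value.longValue(); }
    public int getInt() { return value.intValue(); }
    public short getShort() { return value.shortValue(); }
    public byte getByte() { return value.byteValue(); }
    public double getDouble() { return value.doubleValue(); }
    public float getFloat() { return value.floatValue(); }
}
```

Here none of the six accessors corresponds to the preferred type, which is exactly the case the bound fails to rule out.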
-Joe

From dl at cs.oswego.edu Mon Sep 17 05:59:56 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Mon, 17 Sep 2012 08:59:56 -0400
Subject: Numeric
In-Reply-To: <5056B90E.6040205@oracle.com>
References: <505314C0.2010506@cs.oswego.edu> <5056B90E.6040205@oracle.com>
Message-ID: <50571ECC.7090000@cs.oswego.edu>

On 09/17/12 01:45, Joe Darcy wrote:
> One quick observation on this API, the bound "PreferredType extends Number" does
> not limit the preferred value to one of the boxed primitives java.lang.{Integer,
> Long, Double, ...}

Yes. The main downside is that Numeric is not perfect for someday guiding/automating future primitive specializations. The addition of method getNumber (a variant of one of Brian's suggestions) helps a bit...

    public interface Numeric<PreferredType extends Number> {
        long getLong();
        int getInt();
        short getShort();
        byte getByte();
        double getDouble();
        float getFloat();
        PreferredType getNumber();
    }

... because then it becomes easier for compilers/VMs to possibly handle specially only those with a PreferredType corresponding to boxed types. But in any case, people can use Numeric in cases where no primitive specialization is even possible, so at best it is a guide.

Still, I think that the main question here is whether, in the absence of language/VM supported specialization, the benefits of using Numeric to reduce interface/class/method combinatorics (as well as to reduce boxing) are worthwhile.

It's too bad that it is only a partial solution. It would be nicer to find a scheme such that for each relevant collection, stream, etc you needed only a "plain" and a "Numeric" version. (Although looking further ahead, introducing compound value/tuple types may complicate this.)
-Doug From dl at cs.oswego.edu Mon Sep 17 06:23:37 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 17 Sep 2012 09:23:37 -0400 Subject: Numeric (and accumulators) In-Reply-To: <5055FF65.4010006@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <5055D98D.1010509@cs.oswego.edu> <5055FF65.4010006@oracle.com> Message-ID: <50572459.30204@cs.oswego.edu> On 09/16/12 12:33, Brian Goetz wrote: >>> But there's a much better move here -- take the opportunity to provide >>> standardized forms of structured accumulators. ... >> After fleshing out a bit, I'm now thinking that a variant of >> this would be better in j.u.c as a refactoring of >> jsr166e.LongAdder etc. The availability of non-thread-safe forms >> as well would then just be an opportunistic byproduct. > > Yes, I had a note to write up why I thought this was a better strategy, but you > beat me to it. > OK, I'll move further discussion to concurrency-interest list, after deciding whether to go with the multiple implementation styles I proposed versus just generalizing a bit some of the current jsr166e dynamically striped classes to accept lambdas (i.e., most likely keeping LongAdder, but replacing the Double and min/max versions with lambda-ized forms.) In some ways this would be a blown opportunity to try to standardize usages: the notion of an Accumulator is central to reductions of any form. But this might be one of the cases where "notion" doesn't match up well enough with "JDK class" to be worth doing. 
-Doug From andrey.breslav at jetbrains.com Mon Sep 17 06:28:02 2012 From: andrey.breslav at jetbrains.com (Andrey Breslav) Date: Mon, 17 Sep 2012 17:28:02 +0400 Subject: Updated SotL/L documents for Iteration 2 In-Reply-To: <50562B56.7070605@cs.oswego.edu> References: <504E2A0B.4040005@oracle.com> <504E2CD9.3070905@univ-mlv.fr> <504E3EA6.50005@oracle.com> <50562B56.7070605@cs.oswego.edu> Message-ID: <4C8E5924-2C8D-4144-9246-B89BA52C4A1D@jetbrains.com> > Didn't you get your pony yet? > > -Doug > On behalf of my pony, I would like to express one concern that, I realize, may sound too idealistic, but... We have an option not to specialize new methods for primitive types now and implement fixnums, in a future version of the JDK (I realize that having fixnums ever is unlikely, but I'd think that having interface evolution support would be even more unlikely, yet here we are). If we specialize now, we are stuck with a cluttered API forever, or until fixnums are done AND a new collection framework is rolled out. Now, the question is what do we prefer: - Cluttered API until two unlikely events (fixnums and new collections) happen, - Performance penalty until one of those events (fixnums) happens? -- Andrey Breslav JetBrains, Inc. http://www.jetbrains.com "Develop with pleasure!" -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120917/35eb6ed7/attachment.html From brian.goetz at oracle.com Mon Sep 17 08:08:44 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 17 Sep 2012 11:08:44 -0400 Subject: Nulls (was: Stream operations -- current set) In-Reply-To: <50548FAB.5070602@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> Message-ID: <50573CFC.6070607@oracle.com> On 9/15/2012 10:24 AM, Doug Lea wrote: > To further rub in how central the "little" issues of optional/null, > (as well as numerics) are in all this, note that flatMap is just a > special form of mapReduce(x->coll, addAll) and filter() is just a special form of flatMap > , which can be > implemented so as to require a basis/default policy > only if there is nothing there, so could do one of: > (1) return null (2) accept an empty-collection generator as > basis/defaultValue arg (3) return Optional (4) factor into > a special flatMapper interface that absorbs the problem > (as Brian chose; in CHM, I support unified forms of map+Reduce > explicitly, which leverages intrinsic null policy to naturally use > option #1, so method flatMap does not even appear. We keep circling round the question of what to do about null values in streams. Are they supported? Banned? Grumblingly tolerated? As Doug points out, for concurrent collections, the sensible way to interpret null is "there's nothing there right now." But, given that the streams API is built around an assumption of non-interference, we don't have to worry about "right now" as much; the contents of the stream source should remain constant during the course of the calculation. If we'd bitten the bullet and not allowed nulls as elements in Collections from the beginning, we'd have less of a problem now. So, looking at only the sequential case right now, what are the realistic options for handling nulls in streams? 
From brian.goetz at oracle.com Mon Sep 17 08:29:29 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 17 Sep 2012 11:29:29 -0400 Subject: Stream operations -- current set In-Reply-To: <50548FAB.5070602@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> Message-ID: <505741D9.9030308@oracle.com> >> Intermediate / Lazy (Stateful) >> ------------------------------ >> >> Stream uniqueElements(); >> >> Stream sorted(Comparator comparator); >> >> Stream cumulate(BinaryOperator operator); >> >> Stream sequential(); > > (Capsule summary of many, um, discussions between Brian and me: > I hate all of these. But not enough to act hatefully about them :-) To fill this in some more: The first three are inconvenient to implement in parallel because they are stateful and require nonlocal processing, which puts significant constraints on parallel implementations. The stateless intermediate operations all have the nice property of being homomorphisms under concatenation: f(a::b) = f(a) :: f(b); the stateful ones do not. The statelessness property is a tremendously useful one for parallelization / GPUization / etc. The argument in favor of these ops is that they're useful, and because we don't commit to an execution strategy until we see the end of the pipeline anyway, we can determine if the pipeline is entirely of the "nicer" (stateless) kind anyway and optimize accordingly in cases where the stateful ops are not used. The last one (sequential) deserves discussion on its own. The idea is that streams begin life as either sequential or parallel streams, but there are use cases where we want to use parallelism for the "head" part of a pipeline but then constrain the "tail" part to act sequentially in encounter order. For example, imagine an expensive filter function f. 
There's a lot of parallelism to be gained by applying the filter function in parallel, but we may not be willing in all cases to lose the encounter order of the elements at the source.

Some examples (using the current API):

    // Pure sequential, preserves encounter order
    c.filter(f).forEach(g)

    // Parallel; no constraint on order of arrival at g
    c.parallel().filter(f).forEach(g)

    // Parallel filtering; results arrive at g in encounter order
    c.parallel().filter(f).sequential().forEach(g)

All of the above are useful. The question is, how do we specify the last constraint? Currently, we implement this as a stateful intermediate/lazy operation, which is a no-op in sequential pipelines. The parallel implementation does the usual decomposition, computes a result at each leaf, and then builds a conc-tree for the results to minimize copying.

There are other ways to model this as well; for example, as a fluently propagated constraint. This is harder to implement but might play better with, say, parallel operations on infinite streams (which make sense if the terminal operation is something like findFirst.)

This question is also related to the question of "how does a client programmatically get access to the stream contents." Given that under the Iteration2 model a Stream is more like an Iterator, it should be possible to ask for the results sequentially and lazily. There is a findFirst operation now; we haven't defined what happens when findFirst is called repeatedly. It is also easy to provide an Iterator-bearing method that has the desired effect in the sequential case. But, what should happen here in the parallel case? Should there be a Spliterator-bearing method? Should we have a concept of "restartable" parallel computation where we can ask for more results and this may spur incremental calculation?
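
The second and third cases above can be tried against the java.util.stream API that later shipped in Java 8, with one caveat: in that API sequential() switches the execution mode of the whole pipeline rather than marking a boundary, so the sketch below uses the terminal forEachOrdered as the closest shipped equivalent of "parallel head, encounter-ordered tail". The class name and the stand-in predicate F are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

class EncounterOrderDemo {
    // Stand-in for the "expensive filter function f" (an assumption for illustration).
    static final Predicate<Integer> F = x -> x % 3 == 0;

    // Parallel filtering; results are delivered to the sink in encounter order.
    // forEachOrdered runs the action one element at a time, so a plain ArrayList is safe.
    static List<Integer> parallelFilterOrdered(List<Integer> c) {
        List<Integer> out = new ArrayList<>();
        c.parallelStream().filter(F).forEachOrdered(out::add);
        return out;
    }

    // Parallel filtering; no constraint on arrival order, so the sink must be thread-safe.
    static List<Integer> parallelFilterUnordered(List<Integer> c) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        c.parallelStream().filter(F).forEach(out::add);
        return out;
    }
}
```

Both methods see the same elements; only the first guarantees that they reach the sink in source order.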
>> Map> groupBy(Mapper classifier);
>>
>> Map reduceBy(Mapper classifier,
>>              Factory baseFactory,
>>              Combiner reducer);
>>
>> The most controversial signature here is groupBy, because it is the only place
>> in the Streams API that is tied to Collections.
>
> So why is this in Streams rather than in Maps?

Because while groupBy *produces* a Map, it operates on scalar/linear streams of values. The map structure is induced by the classifier function.

>> Don has suggested a multi-valued version of groupBy:
>>
>> Map> groupByMulti(FlatMapper> extends U> classifier);
>>
>> which is easy to implement and makes sense to me.
>
> The main argument against this is that at least in parallel designs,
> it is vastly better to reduce the nested value collection while
> it is being generated.

This is totally true (and hence the comments about the reduceBy method below), but there are times when you are not going to reduce at all. (If these cases didn't exist, we wouldn't need groupBy at all; it is slightly regrettable that we have it because people will use it wrong instead of reduceBy.) But if we are going to have groupBy, Don has made a strong case for groupByMulti (which is "no harder" than groupBy to implement, and addresses some use cases that are hard to do with groupBy.)

> There are surely cases where circumstances
> don't let you do this, but it's a little uncomfortable to
> support a method that you hope that people only rarely use.

Yeah, that's the regrettable part. Perhaps the answer here is to give reduceBy a better name and groupBy a worse name.
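
The difference between the two styles can be sketched with the Collectors that later shipped in Java 8, which is an assumption on my part since neither groupBy nor reduceBy as discussed here exists in that form. ScoreDemo and Score are hypothetical names; groupingBy plays the role of groupBy, and the three-argument toMap plays the role of reduceBy for the "highest score by person" case.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ScoreDemo {
    // Hypothetical element type for the "highest score by person" example.
    static final class Score {
        final String name;
        final int score;
        Score(String name, int score) { this.name = name; this.score = score; }
        String getName() { return name; }
        int getScore() { return score; }
    }

    // groupBy style: materializes every per-key collection before any reduction.
    static Map<String, List<Score>> grouped(List<Score> scores) {
        return scores.stream().collect(Collectors.groupingBy(Score::getName));
    }

    // reduceBy style: merges each value into its key's running maximum as it arrives,
    // so no intermediate per-key collection is built.
    static Map<String, Integer> bestScores(List<Score> scores) {
        return scores.stream()
                     .collect(Collectors.toMap(Score::getName, Score::getScore, Integer::max));
    }
}
```

The second form is the one to reach for whenever the grouped values are only going to be reduced afterwards.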
From brian.goetz at oracle.com Mon Sep 17 08:37:05 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 17 Sep 2012 11:37:05 -0400 Subject: Numeric (and accumulators) In-Reply-To: <50548140.1060506@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> Message-ID: <505743A1.7060601@oracle.com> >> These guys also need companion classes that are mutable, > > Like for example Atomic{Integer,Long}? :-) So, writing out the complete set of candidate numeric types that have been proposed (we can prune them later): - New interface BoxedPrimitive extends Number (Joe) - New interface Numeric (Doug) - Mutable implementation classes of Numeric (Brian) - Complete the set of atomic implementation classes (Doug) - Accumulator classes (Doug) I'll just say that we don't need to get caught up in premature either/or; if all of these are worth having, we can have them. > public abstract class LongAccumulator implements Numeric { I like these. I believe that they all assume commutative combination functions, not simply associative (what are sometimes called abelian monoids, rather than ordinary monoids)? > In some testing I've done using them; as in: > for each element x in parallel { adder.update(x); } > is only around 25% slower than clever reduction schemes even > on machines with lots of cores and updates. This is worth > avoiding when possible, but the option is worth providing > for the cases where people don't know of an appropriate reduction. Can you clarify what you mean by "clever reduction scheme"? > All of these might live in package java.util.functions? Trying to avoid putting (nontrivial) implementations in j.u.f. 
From brian.goetz at oracle.com Mon Sep 17 08:40:01 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 17 Sep 2012 11:40:01 -0400 Subject: Numeric (and accumulators) In-Reply-To: <5055D98D.1010509@cs.oswego.edu> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <5055D98D.1010509@cs.oswego.edu> Message-ID: <50574451.3020701@oracle.com> > One reason for integrating into j.u.c is that keyed versions of > these forms would need to generalize what is now LongAdderTable How much better (pick your metrics) is the keyed form of LAT than a CHM of Accumulators? > /** > * Example: Maps from keys to Accumulators can be used to create > Histograms > * (frequency maps) and related tables. > * > * To increment a count, installing if not already present: > * map.computeIfAbsent(key, () -> > LongAccumulator.optimisticSum()).update(1L); > * > * To determine the sum for a key, it is convenient to create a > * singleton static EMPTY_ACCUMULATOR, enabling > * long sum = map.getValueOrDefault(key, EMPTY_ACCUMULATOR).get(); > */ > > Any thoughts on this? Like it. Does that mean we don't need LAT at all? From joe.darcy at oracle.com Mon Sep 17 08:59:22 2012 From: joe.darcy at oracle.com (Joe Darcy) Date: Mon, 17 Sep 2012 08:59:22 -0700 Subject: Numeric (and accumulators) In-Reply-To: <505743A1.7060601@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <505743A1.7060601@oracle.com> Message-ID: <505748DA.1030007@oracle.com> On 9/17/2012 8:37 AM, Brian Goetz wrote: >>> These guys also need companion classes that are mutable, >> >> Like for example Atomic{Integer,Long}? 
:-) > > So, writing out the complete set of candidate numeric types that have > been proposed (we can prune them later): > > - New interface BoxedPrimitive extends Number (Joe) > - New interface Numeric (Doug) > - Mutable implementation classes of Numeric (Brian) > - Complete the set of atomic implementation classes (Doug) > - Accumulator classes (Doug) > > I'll just say that we don't need to get caught up in premature > either/or; if all of these are worth having, we can have them. > >> public abstract class LongAccumulator implements Numeric { > > I like these. I believe that they all assume commutative combination > functions, not simply associative (what are sometimes called abelian > monoids, rather than ordinary monoids)? For a "DoubleAccumulator", I'll just note that a good implementation will need to maintain some internal state. To have a reasonably good chance of getting a robustly accurate double sum, extra work is needed over just "sum += d[i++]." One pretty simply approach is called compensated summation. People have also looked at sorting the input, etc. -Joe From dl at cs.oswego.edu Tue Sep 18 04:46:28 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Tue, 18 Sep 2012 07:46:28 -0400 Subject: Numeric (and accumulators) In-Reply-To: <50574451.3020701@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <5055D98D.1010509@cs.oswego.edu> <50574451.3020701@oracle.com> Message-ID: <50585F14.5000202@cs.oswego.edu> On 09/17/12 11:40, Brian Goetz wrote: >> One reason for integrating into j.u.c is that keyed versions of >> these forms would need to generalize what is now LongAdderTable > > How much better (pick your metrics) is the keyed form of LAT than a CHM of > Accumulators? No better at all performance-wise. LAT exists/existed because the implementations of update/get using CHM+LongAdder were initially too messy and non-obvious. 
But by adding various CHM methods over the past year, they are getting close enough to "easy" that LAT could go away as a class and turn into a javadoc (and/or tutorial) example. I think that planning for it to go away is a good stance -- maybe something making it and similar constructions a bit simpler yet will occur to us. BTW, at some point we should discuss whether/how to support the various CHM additions in other Map classes. > >> /** >> * Example: Maps from keys to Accumulators can be used to create >> Histograms >> * (frequency maps) and related tables. >> * >> * To increment a count, installing if not already present: >> * map.computeIfAbsent(key, () -> >> LongAccumulator.optimisticSum()).update(1L); (Actually, "k ->", not "() ->") >> * >> * To determine the sum for a key, it is convenient to create a >> * singleton static EMPTY_ACCUMULATOR, enabling >> * long sum = map.getValueOrDefault(key, EMPTY_ACCUMULATOR).get(); >> */ >> >> Any thoughts on this? > > Like it. Does that mean we don't need LAT at all? > From dl at cs.oswego.edu Tue Sep 18 05:38:59 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Tue, 18 Sep 2012 08:38:59 -0400 Subject: Numeric (and accumulators) In-Reply-To: <505743A1.7060601@oracle.com> References: <505314C0.2010506@cs.oswego.edu> <50539D32.10105@oracle.com> <50548140.1060506@cs.oswego.edu> <505743A1.7060601@oracle.com> Message-ID: <50586B63.5060007@cs.oswego.edu> On 09/17/12 11:37, Brian Goetz wrote: >> public abstract class LongAccumulator implements Numeric { > > I like these. I believe that they all assume commutative combination functions, > not simply associative (what are sometimes called abelian monoids, rather than > ordinary monoids)? Starting with meta-answer (sorry :-) The impact of commutativity is in part a matter of usage of a function, not just the function. For example, suppose the accumulation were list-append, which is not commutative. 
But further suppose that you don't care about the order of the elements in the result. In which case, maybe you shouldn't have used a List, but instead a Bag (which you can think of as a list in which order doesn't matter). But we have no Bag interface/classes. And on the other side, an Accumulator itself doesn't/can't know if it is being called in a way that guarantees any further properties beyond basic thread-safety consequences of choosing sequential, locked, or optimistic. In particular, the optimistic versions may invoke the combining function multiple times (on CAS failures), so may act surprisingly if the functions are not idempotent. Yet even here there is no absolute requirement. A function that introduces random noise to results using a per-invocation random number is not strictly idempotent but may be completely fine in an Accumulator. These kinds of issues/cases arise a lot. So I don't think you can do more than clearly explain them in javadoc specs. > >> In some testing I've done using them; as in: >> for each element x in parallel { adder.update(x); } >> is only around 25% slower than clever reduction schemes even >> on machines with lots of cores and updates. This is worth >> avoiding when possible, but the option is worth providing >> for the cases where people don't know of an appropriate reduction. > > Can you clarify what you mean by "clever reduction scheme"? > For example, CHM.reduceValuesToLong that uses a tree-like reduction scheme is only about 25% faster in some tests/machines than implementing it as forEach(x -> adder.add(x)). Around 25% is about the best case I've seen, but I haven't seen any worse than 100% (i.e., twice as slow), which is still not terrible given that you still get scalability, so stays half as fast even on a hundred or so cores. So having this scheme available as a backup option for reductions on data structures that resist tree-based reductions seems like a good idea. 
-Doug

From brian.goetz at oracle.com Wed Sep 19 13:39:29 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 19 Sep 2012 16:39:29 -0400
Subject: ArrayFactory SAM type / toArray
Message-ID: <505A2D81.3050306@oracle.com>

In looking at the Collection API, there are two forms of toArray method,
both of which are unfortunate:

  Object[] toArray() -- returns an Object[], not a T[]

  T[] toArray(T[]) -- reflective instantiation

Lambdas offer us a way out of this:

  interface ArrayFactory {
      T[] make(int n);
  }

  interface Collection {
      T[] toArray(ArrayFactory factory) default {
          return toArray(factory.make(size()));
      }
  }

The default is imperfect (though no worse than what clients typically
do), and concrete implementations of Collection can do better.

Given that Stream has a toArray method, my preference would be to expose

  T[] toArray(ArrayFactory)

possibly as the only toArray method.

We might be able to extend the constructor reference syntax to arrays:
Foo[]::new. If not, n -> new Foo[n] works fine.

From forax at univ-mlv.fr Wed Sep 19 13:54:07 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 19 Sep 2012 22:54:07 +0200
Subject: ArrayFactory SAM type / toArray
In-Reply-To: <505A2D81.3050306@oracle.com>
References: <505A2D81.3050306@oracle.com>
Message-ID: <505A30EF.3090201@univ-mlv.fr>

On 09/19/2012 10:39 PM, Brian Goetz wrote:
> In looking at the Collection API, there are two forms of toArray
> method, both of which are unfortunate:
>
> Object[] toArray() -- returns an Object[], not a T[]
>
> T[] toArray(T[]) -- reflective instantiation
>
> Lambdas offer us a way out of this:
>
> interface ArrayFactory {
> T[] make(int n);
> }
>
> interface Collection {
> T[] toArray(ArrayFactory factory) default {
> return toArray(factory.make(size()));
> }
> }
>
> The default is imperfect (though no worse than what clients typically
> do), and concrete implementations of Collection can do better.
> > Given that Stream has a toArray method, my preference would be to expose > > T[] toArray(ArrayFactory) > > possibly as the only toArray method. > > We might be able to extend the constructor reference syntax to arrays: > Foo[]::new. If not, n -> new Foo[n] works fine. > Why passing a lambda that will be called in the body of the toArray ? It's simpler to directly pass the array. R?mi From josh at bloch.us Wed Sep 19 14:01:42 2012 From: josh at bloch.us (Joshua Bloch) Date: Wed, 19 Sep 2012 14:01:42 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A2D81.3050306@oracle.com> References: <505A2D81.3050306@oracle.com> Message-ID: Brian, What problem are you trying solve? Josh On Wed, Sep 19, 2012 at 1:39 PM, Brian Goetz wrote: > In looking at the Collection API, there are two forms of toArray method, > both of which are unfortunate: > > Object[] toArray() -- returns an Object[], not a T[] > > T[] toArray(T[]) -- reflective instantiation > > Lambdas offer us a way out of this: > > interface ArrayFactory { > T[] make(int n); > } > > interface Collection { > T[] toArray(ArrayFactory factory) default { > return toArray(factory.make(size()); > } > } > > The default is imperfect (though no worse than what clients typically do), > and concrete implementations of Collection can do better. > > Given that Stream has a toArray method, my preference would be to expose > > T[] toArray(ArrayFactory) > > possibly as the only toArray method. > > We might be able to extend the constructor reference syntax to arrays: > Foo[]::new. If not, n -> new Foo[n] works fine. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120919/197f4b4a/attachment.html From david.lloyd at redhat.com Wed Sep 19 14:09:24 2012 From: david.lloyd at redhat.com (David M. 
Lloyd) Date: Wed, 19 Sep 2012 16:09:24 -0500 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A2D81.3050306@oracle.com> References: <505A2D81.3050306@oracle.com> Message-ID: <505A3484.5030604@redhat.com> On 09/19/2012 03:39 PM, Brian Goetz wrote: > In looking at the Collection API, there are two forms of toArray method, > both of which are unfortunate: > > Object[] toArray() -- returns an Object[], not a T[] > > T[] toArray(T[]) -- reflective instantiation > > Lambdas offer us a way out of this: > > interface ArrayFactory { > T[] make(int n); > } > > interface Collection { > T[] toArray(ArrayFactory factory) default { > return toArray(factory.make(size()); > } > } > > The default is imperfect (though no worse than what clients typically > do), and concrete implementations of Collection can do better. > > Given that Stream has a toArray method, my preference would be to expose > > T[] toArray(ArrayFactory) > > possibly as the only toArray method. > > We might be able to extend the constructor reference syntax to arrays: > Foo[]::new. If not, n -> new Foo[n] works fine. Seems like you could as well just pass a Class and not have all these lambdas around (iirc reflective array creation is a hotspot intrinsic so perf shouldn't be an issue). -- - DML From brian.goetz at oracle.com Wed Sep 19 14:15:07 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 19 Sep 2012 17:15:07 -0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A30EF.3090201@univ-mlv.fr> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> Message-ID: <505A35DB.90601@oracle.com> > Why passing a lambda that will be called in the body of the toArray ? > It's simpler to directly pass the array. No, that's what the existing unfortunate toArray(T[]) does. Things that are wrong with it: - If you get the size wrong, it has to reallocate. - If it reallocates, it has to do it reflectively. - It is inherently racy. 
To see the racy part, consider an implementation like SynchronizedList.
If the user does:

  Foo[] array = c.toArray(new Foo[c.size()]);

where c is a synchronized list, we acquire the lock, compute the size,
release the lock, and pass the array into toArray, which will have to
allocate again (reflectively) if the size has changed. Whereas an
implementation of toArray(ArrayFactory) can create the array once at the
proper size while ensuring no concurrent modifications.

So, comparing the status quo toArray(T[]) with the proposed (low quality)
default version:
 - Worst case is identical
 - Best case is better in that allocation is not done reflectively

And a non-crappy overridden implementation can be better still (eliminate
races.)

Arguably the proposed approach also provides a better separation of
concerns; having the client allocate the array for the library seems
questionable. (Arguably it is even better from an API design perspective
to pass a class literal rather than a lambda, but that gets us back into
reflection.)

Josh writes:
> What problem are you trying solve?

I think the above should explain, but in a nutshell the problem is: the
two existing precedents for toArray as done in Collection are both
unfortunate, and I'd rather not propagate them blindly into Streams.
They're probably the best we could have done without lambdas in the
language, but with lambdas, a better alternative arises -- let the
library create the array with a factory provided by the caller. I would
like to offer a better version of toArray for Streams, and possibly
consider retrofitting onto Collection.
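The race argument above can be sketched as follows. This is only an illustration under stated assumptions: LockedBag is a hypothetical class, and java.util.function.IntFunction (from the Java 8 that eventually shipped) stands in for the ArrayFactory interface proposed in this thread. Because the container calls the factory itself, it can read the size and create the array while holding its own lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Stream;

// Hypothetical container, used only to illustrate the factory-based toArray.
class LockedBag<T> {
    private final List<T> items = new ArrayList<>();

    synchronized void add(T item) {
        items.add(item);
    }

    // Size lookup and array creation both happen under the lock, so the
    // array is created once at the right size, non-reflectively.
    synchronized <A> A[] toArray(IntFunction<A[]> factory) {
        return items.toArray(factory.apply(items.size()));
    }

    public static void main(String[] args) {
        LockedBag<String> bag = new LockedBag<>();
        bag.add("x");
        bag.add("y");
        String[] fromBag = bag.toArray(String[]::new);
        // Java 8's Stream.toArray accepts the same kind of factory.
        String[] fromStream = Stream.of("a", "b", "c").toArray(n -> new String[n]);
        System.out.println(fromBag.length + " " + fromStream.length);
    }
}
```

With the status quo signature the caller would write bag.toArray(new String[bag.size()]), and the size could change between the two calls.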
From spullara at gmail.com Wed Sep 19 14:08:06 2012 From: spullara at gmail.com (Sam Pullara) Date: Wed, 19 Sep 2012 14:08:06 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <12994838.27199.1348088698750.JavaMail.mobile-sync@iczb2> References: <505A2D81.3050306@oracle.com> <12994838.27199.1348088698750.JavaMail.mobile-sync@iczb2> Message-ID: <-7580340111016603839@unknownmsgid> Presumably because you don't know the size of the array until the lambda is called if this is on Stream. I think that you should just overload .into with a factory so you could even pre allocate collections. Sam Any errors are bugs in iOS 6 On Sep 19, 2012, at 2:02 PM, Remi Forax wrote: > On 09/19/2012 10:39 PM, Brian Goetz wrote: >> In looking at the Collection API, there are two forms of toArray >> method, both of which are unfortunate: >> >> Object[] toArray() -- returns an Object[], not a T[] >> >> T[] toArray(T[]) -- reflective instantiation >> >> Lambdas offer us a way out of this: >> >> interface ArrayFactory { >> T[] make(int n); >> } >> >> interface Collection { >> T[] toArray(ArrayFactory factory) default { >> return toArray(factory.make(size()); >> } >> } >> >> The default is imperfect (though no worse than what clients typically >> do), and concrete implementations of Collection can do better. >> >> Given that Stream has a toArray method, my preference would be to expose >> >> T[] toArray(ArrayFactory) >> >> possibly as the only toArray method. >> >> We might be able to extend the constructor reference syntax to arrays: >> Foo[]::new. If not, n -> new Foo[n] works fine. > > Why passing a lambda that will be called in the body of the toArray ? > It's simpler to directly pass the array. 
>
> Rémi
>

From forax at univ-mlv.fr Wed Sep 19 14:20:23 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 19 Sep 2012 23:20:23 +0200
Subject: ArrayFactory SAM type / toArray
In-Reply-To: <505A35DB.90601@oracle.com>
References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com>
Message-ID: <505A3717.40601@univ-mlv.fr>

On 09/19/2012 11:15 PM, Brian Goetz wrote:
>> Why passing a lambda that will be called in the body of the toArray ?
>> It's simpler to directly pass the array.
>
> No, that's what the existing unfortunate toArray(T[]) does. Things
> that are wrong with it:
>
> - If you get the size wrong, it has to reallocate.
> - If it reallocates, it has to do it reflectively.
> - It is inherently racy.
>
> To see the racy part, consider an implementation like
> SynchronizedList. If the user does:
>
> Foo[] array = c.toArray(new Foo[c.size()]);
>
> where c is a synchronized list, we acquire the lock, compute the size,
> release the lock, and pass the array into toArray, which will have to
> allocate again (reflectively) if the size has changed. Whereas an
> implementation of toArray(ArrayFactory) can create the array once at
> the proper size while ensuring no concurrent modifications.

ok, the API is slightly better for synchronized collections, I never use
them, and you still need to protect the code against mutations when using
concurrent collections.

>
> So, comparing the status quo toArray(T[]) with the proposed (low
> quality) default version:
> - Worst case is identical
> - Best case is better in that allocation is not done reflectively

as David says, HotSpot has an optimization for that :)

>
> And a non-crappy overridden implementation can be better still
> (eliminate races.)
>
> Arguably the proposed approach also provides a better separation of
> concerns; having the client allocate the array for the library seems
> questionable.
(Arguably it is even better from an API design > perspective to pass a class literal rather than a lambda, but that > gets us back into reflection.) see above. R?mi From brian.goetz at oracle.com Wed Sep 19 14:28:12 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 19 Sep 2012 17:28:12 -0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: <-7580340111016603839@unknownmsgid> References: <505A2D81.3050306@oracle.com> <12994838.27199.1348088698750.JavaMail.mobile-sync@iczb2> <-7580340111016603839@unknownmsgid> Message-ID: <505A38EC.6080009@oracle.com> > Presumably because you don't know the size of the array until the > lambda is called if this is on Stream. Right, in the stream case, you may not know the target size, but the stream implementation very well may. Making the toArray(T[]) even less useful for streams, and a toArray(ArrayFactory) more useful. > I think that you should just overload .into with a factory so you > could even pre allocate collections. But how does that help if you really want an array? (And, if you're talking about overloading into with a factory that produces an array, the only difference is whether you call it into() or toArray()). (BTW, there are some moderately compelling type-inference-limitation reasons to not overload into()). From josh at bloch.us Wed Sep 19 14:48:10 2012 From: josh at bloch.us (Joshua Bloch) Date: Wed, 19 Sep 2012 14:48:10 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A35DB.90601@oracle.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> Message-ID: Brian, I don't find these arguments convincing. There's no race (any more than there is for any bulk operation) as the allocation is done by the object itself. The allocation stuff is pretty much a red herring: most users don't preallocate the array. So it seems to me that using factories here might amount to needless complexity and inconsistency. 
Josh On Wed, Sep 19, 2012 at 2:15 PM, Brian Goetz wrote: > Why passing a lambda that will be called in the body of the toArray ? >> It's simpler to directly pass the array. >> > > No, that's what the existing unfortunate toArray(T[]) does. Things that > are wrong with it: > > - If you get the size wrong, it has to reallocate. > - If it reallocates, it has to do it reflectively. > - It is inherently racy. > > To see the racy part, consider an implementation like SynchronizedList. > If the user does: > > Foo[] array = c.toArray(new Foo[c.size()]); > > where c is a synchronized list, we acquire the lock, compute the size, > release the lock, and pass the array into toArray, which will have to > allocate again (reflectively) if the size has changed. Whereas an > implementation of toArray(ArrayFactory) can create the array once at the > proper size while ensuring no concurrent modifications. > > So, comparing the status quo toArray(T[]) with the proposed (low quality) > default version: > - Worst case is identical > - Best case is better in that allocation is not done reflectively > > And a non-crappy overriden implementation can be better still (eliminate > races.) > > Arguably the proposed approach also provides a better separation of > concerns; having the client allocate the array for the library seems > questionable. (Arguably it is even better from an API design perspective > to pass a class literal rather than a lambda, but that gets us back into > reflection.) > > > Josh writes: > > What problem are you trying solve? >> > > I think the above should explain, but in a nutshell the problem is: the > two existing precedents for toArray as done in Collection are both > unfortunate, and I'd rather not propagate them blindly into Streams. > They're probably the best we could have done without lambdas in the > language, but with lambdas, a better alternative arises -- let the library > create the array with a factory provided by the caller. 
I would like to > offer a better version of toArray for Streams, and possibly consider > retrofitting onto Collection. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120919/d1e0faaf/attachment-0001.html From brian.goetz at oracle.com Wed Sep 19 15:13:43 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 19 Sep 2012 18:13:43 -0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> Message-ID: <505A4397.6060501@oracle.com> > I don't find these arguments convincing. There's no race (any more than > there is for any bulk operation) as the allocation is done by the object > itself. The allocation stuff is pretty much a red herring: most users > don't preallocate the array. So it seems to me that using factories here > might amount to needless complexity and inconsistency. I agree with you that most users don't pre-allocate the array. Which makes the existing form of toArray even more unfortunate! Because then the allocation always involves multiple reflective calls. (Some of which are sometimes optimized by some VMs in some conditions, but none of which are always optimized by all VMs in all conditions.) So the performance will always be worse in the toArray(T[]) formulation. The fundamental problem is that the client knows how best to create the array (the client can say "new Foo" but the library cannot say "new T", and therefore has to fall back to reflection), but the library knows best how big the array should be. This is a classic example of the sort of differences in APIs you get when designing an API with or without closures. The client knows how; the library knows how much; ideally we'd like for the client to pass that knowledge into the library. 
The approximations we got when it is hard to combine these are unfortunate; we can do better now. I'd find David's suggestion of toArray(Class) more compelling (in some sense it is the most "right" in that it doesn't conflate "what" with "how") except I don't buy that the intrinsification of reflective array allocation in some VMs in some compilation modes in some situations makes all the reflective costs go away. We're creating a new API here. All things being equal, we should lean on consistency with existing APIs when we can, but obviously that is just a guideline (someday we're going to have to contend with the fact that an int isn't big enough to store the size of collections.) The existing toArray signatures are the best we could have done at the time (and that was a very different time), but that doesn't mean we shouldn't seek to do any better. Here are what the client callsites might look like in various cases: // status quo Foo[] foos = ...toArray(new Foo[0]); // ugh reflection Foo[] foos = ...toArray(new Foo[xyz.size()]); // ugh ugly and racy // proposed Foo[] foos = ...toArray(n -> new Foo[n]); // David's alternative Foo[] foos = ...toArray(Foo.class); I don't see the "complexity of factories" being a big problem here -- if people can deal with lambdas at all, this is a pretty simple case, and its only a few characters longer than the "new Foo[0]" version. I think the lambda code reads pretty naturally. (Actually I find the "new Foo[0]" the most confusing -- why would I pass in a new empty array?) From josh at bloch.us Wed Sep 19 15:33:42 2012 From: josh at bloch.us (Joshua Bloch) Date: Wed, 19 Sep 2012 15:33:42 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A4397.6060501@oracle.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> Message-ID: Brian, On Wed, Sep 19, 2012 at 3:13 PM, Brian Goetz wrote: > I don't find these arguments convincing. 
There's no race (any more than >> there is for any bulk operation) as the allocation is done by the object >> itself. The allocation stuff is pretty much a red herring: most users >> don't preallocate the array. So it seems to me that using factories here >> might amount to needless complexity and inconsistency. >> > > I agree with you that most users don't pre-allocate the array. Which > makes the existing form of toArray even more unfortunate! Because then the > allocation always involves multiple reflective calls. (Some of which are > sometimes optimized by some VMs in some conditions, but none of which are > always optimized by all VMs in all conditions.) So the performance will > always be worse in the toArray(T[]) formulation. > Performance is typically irrelevant. In the rare cases where it isn't, you preallocate. Warping API for performance is generally a bad idea. > > We're creating a new API here. Yeah, I haven't exactly been keeping up, so I don't know the context. That said, the current toArray APIs are pretty good. I've never heard anyone complain about them. > > Here are what the client callsites might look like in various cases: > > // status quo > Foo[] foos = ...toArray(new Foo[0]); // ugh reflection > I agree that the above is a bit nasty. I think some syntactic sugar, perhaps coupled with a cache of commonly used 0-length arrays might be a good idea. Josh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120919/9176e6cf/attachment.html From sam at sampullara.com Wed Sep 19 15:58:05 2012 From: sam at sampullara.com (Sam Pullara) Date: Wed, 19 Sep 2012 15:58:05 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> Message-ID: I think that we should not only use this proposal but also make it work for .into() so i can pass in a lambda that gets a size if it is available: interface CollectionFactory { T create(int minSize); } Stream.into(CollectionFactory factory) stream.into(n -> new ArrayList(n)) I don't think I have ever NOT created mine to be the right size and type with the current API. Sam On Wed, Sep 19, 2012 at 3:33 PM, Joshua Bloch wrote: > Brian, > > On Wed, Sep 19, 2012 at 3:13 PM, Brian Goetz wrote: >>> >>> I don't find these arguments convincing. There's no race (any more than >>> there is for any bulk operation) as the allocation is done by the object >>> itself. The allocation stuff is pretty much a red herring: most users >>> don't preallocate the array. So it seems to me that using factories here >>> might amount to needless complexity and inconsistency. >> >> >> I agree with you that most users don't pre-allocate the array. Which >> makes the existing form of toArray even more unfortunate! Because then the >> allocation always involves multiple reflective calls. (Some of which are >> sometimes optimized by some VMs in some conditions, but none of which are >> always optimized by all VMs in all conditions.) So the performance will >> always be worse in the toArray(T[]) formulation. > > > Performance is typically irrelevant. In the rare cases where it isn't, you > preallocate. Warping API for performance is generally a bad idea. > >> >> >> We're creating a new API here. 
> > > Yeah, I haven't exactly been keeping up, so I don't know the context. That > said, the current toArray APIs are pretty good. I've never heard anyone > complain about them. > >> >> >> Here are what the client callsites might look like in various cases: >> >> // status quo >> Foo[] foos = ...toArray(new Foo[0]); // ugh reflection > > > I agree that the above is a bit nasty. I think some syntactic sugar, perhaps > coupled with a cache of commonly used 0-length arrays might be a good idea. > > Josh > From david.lloyd at redhat.com Wed Sep 19 16:45:44 2012 From: david.lloyd at redhat.com (David M. Lloyd) Date: Wed, 19 Sep 2012 18:45:44 -0500 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A4397.6060501@oracle.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> Message-ID: <505A5928.4090705@redhat.com> On 09/19/2012 05:13 PM, Brian Goetz wrote: >> I don't find these arguments convincing. There's no race (any more than >> there is for any bulk operation) as the allocation is done by the object >> itself. The allocation stuff is pretty much a red herring: most users >> don't preallocate the array. So it seems to me that using factories here >> might amount to needless complexity and inconsistency. > > I agree with you that most users don't pre-allocate the array. Which > makes the existing form of toArray even more unfortunate! Because then > the allocation always involves multiple reflective calls. (Some of > which are sometimes optimized by some VMs in some conditions, but none > of which are always optimized by all VMs in all conditions.) So the > performance will always be worse in the toArray(T[]) formulation. > > The fundamental problem is that the client knows how best to create the > array (the client can say "new Foo" but the library cannot say "new T", > and therefore has to fall back to reflection), but the library knows > best how big the array should be. 
> > This is a classic example of the sort of differences in APIs you get > when designing an API with or without closures. The client knows how; > the library knows how much; ideally we'd like for the client to pass > that knowledge into the library. The approximations we got when it is > hard to combine these are unfortunate; we can do better now. > > I'd find David's suggestion of toArray(Class) more compelling (in some > sense it is the most "right" in that it doesn't conflate "what" with > "how") except I don't buy that the intrinsification of reflective array > allocation in some VMs in some compilation modes in some situations > makes all the reflective costs go away. It should be: S[] toArray(Class clazz); Likewise if the ArrayFactory deal is the way we go it probably ought to be: S[] toArray(ArrayFactory clazz); ..because it seems to me you might have a wildcarded stream, or a more specific stream type than you intend to capture. Restricting it to the one actual type is, in a way, as bad as returning Object[]. > We're creating a new API here. All things being equal, we should lean > on consistency with existing APIs when we can, but obviously that is > just a guideline (someday we're going to have to contend with the fact > that an int isn't big enough to store the size of collections.) The > existing toArray signatures are the best we could have done at the time > (and that was a very different time), but that doesn't mean we shouldn't > seek to do any better. 
> > Here are what the client callsites might look like in various cases: > > // status quo > Foo[] foos = ...toArray(new Foo[0]); // ugh reflection > Foo[] foos = ...toArray(new Foo[xyz.size()]); // ugh ugly and racy > > // proposed > Foo[] foos = ...toArray(n -> new Foo[n]); > > // David's alternative > Foo[] foos = ...toArray(Foo.class); > > I don't see the "complexity of factories" being a big problem here -- if > people can deal with lambdas at all, this is a pretty simple case, and > its only a few characters longer than the "new Foo[0]" version. I think > the lambda code reads pretty naturally. (Actually I find the "new > Foo[0]" the most confusing -- why would I pass in a new empty array?) > -- - DML From josh at bloch.us Wed Sep 19 17:11:43 2012 From: josh at bloch.us (Joshua Bloch) Date: Wed, 19 Sep 2012 17:11:43 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505A5928.4090705@redhat.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505A5928.4090705@redhat.com> Message-ID: David, On Wed, Sep 19, 2012 at 4:45 PM, David M. Lloyd wrote: > > It should be: > > S[] toArray(Class clazz); > > Actually we considered and rejected that parameterization back in '03 (if memory serves). Sometimes the client knows more about the contents of the array than the compiler does. So, for example, suppose you know that a Collection contains only Integers. Then you might write: private static final Integer[] EMPTY_INTEGER_ARRAY = new Integer[0]; ... Collection c = ... ; ... Integer[] a = c.toArray(EMPTY_INTEGER_ARRAY); The type system can't prove that the call won't result in an ArrayStoreException, but you (the programmer) know that it won't. Perhaps we made the wrong decision, but it was a conscious decision. Josh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120919/47e37a26/attachment.html From Donald.Raab at gs.com Wed Sep 19 19:27:51 2012 From: Donald.Raab at gs.com (Raab, Donald) Date: Wed, 19 Sep 2012 22:27:51 -0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505A5928.4090705@redhat.com> Message-ID: <6712820CB52CFB4D842561213A77C054039E64369C@GSCMAMP09EX.firmwide.corp.gs.com> I like David's approach of passing Class to toArray(). I also didn't mind the approach of passing a factory interface where you could use a lambda. Then when I saw the example code below with a new Integer[0] array in a static variable, I remembered how many times I've written this kind of code over the years with static empty arrays and started thinking. Could we add the following methods to the Class class? T[] emptyArray() T[] newArray(int size) I may be missing something simple in Java's static type system somewhere that makes this not possible, but imagine if we could write the following: Integer.class.emptyArray() or Integer.class.newArray(0) and they returned the same static instance (held in the class of course), which would be possible because the empty array is immutable. This would be an improvement even with the current T[] toArray(T[]) cases because we could forever get rid of spurious empty array creation. Someone please wake me up if this is just an unfortunate pipe dream. From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda-libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Joshua Bloch Sent: Wednesday, September 19, 2012 8:12 PM To: David M. Lloyd Cc: lambda-libs-spec-experts at openjdk.java.net Subject: Re: ArrayFactory SAM type / toArray David, On Wed, Sep 19, 2012 at 4:45 PM, David M. Lloyd wrote: It should be:
S[] toArray(Class clazz); Actually we considered and rejected that parameterization back in '03 (if memory serves). Sometimes the client knows more about the contents of the array than the compiler does. So, for example, suppose you know that a Collection contains only Integers. Then you might write: private static final Integer[] EMPTY_INTEGER_ARRAY = new Integer[0]; ... Collection c = ... ; ... Integer[] a = c.toArray(EMPTY_INTEGER_ARRAY); The type system can't prove that the call won't result in an ArrayStoreException, but you (the programmer) know that it won't. Perhaps we made the wrong decision, but it was a conscious decision. Josh From josh at bloch.us Wed Sep 19 23:49:16 2012 From: josh at bloch.us (Joshua Bloch) Date: Wed, 19 Sep 2012 23:49:16 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <6712820CB52CFB4D842561213A77C054039E64369C@GSCMAMP09EX.firmwide.corp.gs.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505A5928.4090705@redhat.com> <6712820CB52CFB4D842561213A77C054039E64369C@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: Donald, I believe this does work, with one caveat: if the type is generic, these methods will return a generic array, which will generate an unchecked cast warning. Josh On Wed, Sep 19, 2012 at 7:27 PM, Raab, Donald wrote: > I like David's approach of passing Class to toArray(). I also didn't mind > the approach of passing a factory interface where you could use a lambda. > > Then when I saw the example code below with a new Integer[0] array in a > static variable, I remembered how many times I've written this kind of code > over the years with static empty arrays and started thinking. > > Could we add the following methods to the Class class?
> > T[] emptyArray() > T[] newArray(int size) > > I may be missing something simple in Java's static type system somewhere > that makes this not possible, but imagine if we could write the following: > > Integer.class.emptyArray() > > or > > Integer.class.newArray(0) > > and they returned the same static instance (held in the class of course), > which would be possible because the empty array is immutable. This would > be an improvement even with the current T[] toArray(T[]) cases because we > could forever get rid of spurious empty array creation. > > Someone please wake me up if this is just an unfortunate pipe dream. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120919/ad5979e8/attachment.html From forax at univ-mlv.fr Thu Sep 20 01:13:27 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 20 Sep 2012 10:13:27 +0200 Subject: ArrayFactory SAM type / toArray In-Reply-To: References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505A5928.4090705@redhat.com> <6712820CB52CFB4D842561213A77C054039E64369C@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <505AD027.6090709@univ-mlv.fr> On 09/20/2012 08:49 AM, Joshua Bloch wrote: > Donald, > > I believe this does work, with one caveat: if the type is generic, > these methods will return a generic array, which will generate an > unchecked cast warning. No, with the current JLS rules, you can't create a Class of T with a generic T without a warning. List.class doesn't compile and foo = new ArrayList(), foo.getClass() returns a Class (and not a Class>ArrayList>), so you can't create a generic array without having an unsafe cast somewhere. But there is an issue with Class that represents a primitive type, int.class is typed Class, so either emptyArray() should return an Integer[] or it should throw an exception. 
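The primitive wrinkle above can be checked directly. This is a minimal sketch that assumes nothing beyond java.lang.reflect.Array; the names PrimitiveClassSketch, primitiveIntClass and reflectiveEmptyArray are made up for illustration and are not part of any proposal in this thread.

```java
import java.lang.reflect.Array;

final class PrimitiveClassSketch {
    // int.class has the static type Class<Integer>, so this compiles without a cast.
    static Class<Integer> primitiveIntClass() {
        return int.class;
    }

    // Reflectively allocates a zero-length array of the given component type.
    // For a primitive class object the result is a primitive array (e.g. int[]),
    // which is not an Object[] and therefore cannot be returned as T[].
    static Object reflectiveEmptyArray(Class<?> componentType) {
        return Array.newInstance(componentType, 0);
    }
}
```

Calling reflectiveEmptyArray(primitiveIntClass()) gives back an int[] rather than an Integer[], so a Class-based emptyArray() typed as T[] would have to either allocate the boxed array type explicitly or reject primitive class objects.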
> > Josh Rémi > > On Wed, Sep 19, 2012 at 7:27 PM, Raab, Donald > wrote: > > I like David's approach of passing Class to toArray(). I also > didn't mind the approach of passing a factory interface where you > could use a lambda. > > Then when I saw the example code below with a new Integer[0] array > in a static variable, I remembered how many times I've written > this kind of code over the years with static empty arrays and > started thinking. > > Could we add the following methods to the Class class? > > T[] emptyArray() > T[] newArray(int size) > > I may be missing something simple in Java's static type system > somewhere that makes this not possible, but imagine if we could > write the following: > > Integer.class.emptyArray() > > or > > Integer.class.newArray(0) > > and they returned the same static instance (held in the class of > course), which would be possible because the empty array is > immutable. This would be an improvement even with the current T[] > toArray(T[]) cases because we could forever get rid of spurious > empty array creation. > > Someone please wake me up if this is just an unfortunate pipe dream. > From aleksey.shipilev at oracle.com Thu Sep 20 03:25:56 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Thu, 20 Sep 2012 14:25:56 +0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> Message-ID: <505AEF34.2070200@oracle.com> Hi Josh, On 09/20/2012 02:33 AM, Joshua Bloch wrote: > I agree with you that most users don't pre-allocate the array. > Which makes the existing form of toArray even more unfortunate! > Because then the allocation always involves multiple reflective > calls. (Some of which are sometimes optimized by some VMs in some > conditions, but none of which are always optimized by all VMs in all > conditions.)
So the performance will always be worse in the > toArray(T[]) formulation. > > > Performance is typically irrelevant. In the rare cases where it isn't, > you preallocate. Warping API for performance is generally a bad idea. Begging to differ here. Isn't the API the middleground for _both_ user convenience and library performance? Most users choose API for the consistency, but more and more users these days ask themselves from the ground up about the expected performance. In general sense, I would agree this is a premature optimization type of thing; but not after you start to deal with concurrency, excess allocations, and such. Having concise but more-computation-involved API is OK, but having one which messes with memory more than it should is no recipe for good performance. This, by the way, makes me thinking: if ArrayFactory contract is relaxed to say "provides the candidate for storing the values", i.e. dropping the referential equality out of the question, we can then do pre-cached returns from toArray for small collections, and thus completely spare the allocations on some of the code paths. -Aleksey. From aleksey.shipilev at oracle.com Thu Sep 20 03:37:24 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Thu, 20 Sep 2012 14:37:24 +0400 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505AEF34.2070200@oracle.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505AEF34.2070200@oracle.com> Message-ID: <505AF1E4.7090109@oracle.com> On 09/20/2012 02:25 PM, Aleksey Shipilev wrote: > This, by the way, makes me thinking: if ArrayFactory contract is relaxed > to say "provides the candidate for storing the values", i.e. dropping > the referential equality out of the question, we can then do pre-cached > returns from toArray for small collections, and thus completely spare > the allocations on some of the code paths. Haven't been thinking it through. 
This opens up the question about array immutability, and while generally possible in immutable collections, there is still a loophole to change the values in that returned array. Still, it can be a viable choice for returning empty array for empty collection. -Aleksey. From josh at bloch.us Thu Sep 20 08:37:08 2012 From: josh at bloch.us (Joshua Bloch) Date: Thu, 20 Sep 2012 08:37:08 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505AD027.6090709@univ-mlv.fr> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505A5928.4090705@redhat.com> <6712820CB52CFB4D842561213A77C054039E64369C@GSCMAMP09EX.firmwide.corp.gs.com> <505AD027.6090709@univ-mlv.fr> Message-ID: Remi, On Thu, Sep 20, 2012 at 1:13 AM, Remi Forax wrote: > On 09/20/2012 08:49 AM, Joshua Bloch wrote: > >> Donald, >> >> I believe this does work, with one caveat: if the type is generic, these >> methods will return a generic array, which will generate an unchecked cast >> warning. >> > > No, with the current JLS rules, you can't create a Class of T with a > generic T without a warning. > List.class doesn't compile and foo = new ArrayList(), > foo.getClass() returns a Class > (and not a Class>ArrayList>), so you can't create a generic array > without having an unsafe cast somewhere. > Umm... Yeah. That's exactly what I said. It works except that you have to use raw types if the type you're trying to get an empty array of is generic, in which case you'll get unchecked cast warnings. I suspect this won't be a common case, but it won't be vanishingly rare either. > But there is an issue with Class that represents a primitive type, > int.class is typed Class, so either emptyArray() should return an > Integer[] or > it should throw an exception. > Probably a non-issue, since Java doesn't support primitives as type parameters (sadly). Josh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120920/5b6830e7/attachment.html From josh at bloch.us Thu Sep 20 08:47:28 2012 From: josh at bloch.us (Joshua Bloch) Date: Thu, 20 Sep 2012 08:47:28 -0700 Subject: ArrayFactory SAM type / toArray In-Reply-To: <505AEF34.2070200@oracle.com> References: <505A2D81.3050306@oracle.com> <505A30EF.3090201@univ-mlv.fr> <505A35DB.90601@oracle.com> <505A4397.6060501@oracle.com> <505AEF34.2070200@oracle.com> Message-ID: Aleksey, On Thu, Sep 20, 2012 at 3:25 AM, Aleksey Shipilev < aleksey.shipilev at oracle.com> wrote: > > Begging to differ here. Isn't the API the middleground for _both_ user > convenience and library performance? Most users choose API for the > consistency, but more and more users these days ask themselves from the > ground up about the expected performance. > Yeah, I believe we're in complete agreement here. > This, by the way, makes me thinking: if ArrayFactory contract is relaxed > to say "provides the candidate for storing the values", i.e. dropping > the referential equality out of the question, we can then do pre-cached > returns from toArray for small collections, and thus completely spare > the allocations on some of the code paths. > Your subsequent observation (that all zero-length arrays are immutable, and all nonzero-length arrays are mutable, hence not sharable) is correct. That said, Donald's suggestion (Class.emptyArray and Class.newArray(T)) seems like a decent middle ground. It's (in all likelihood) faster than reflection and factory methods, and has smaller "conceptual surface area." It's too bad that it shines a light on two problems in Java's type system: the impedance mismatch between arrays and generics, and between primitives and generics. But those problems will haunt us forever (unless we decide to redo the type system and break compatibility in a major way). They're just a fact of life, given the design that we adopted for generics. 
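Donald's suggestion can be approximated in library code today. The following is a sketch only: Class.emptyArray() does not exist, so a hypothetical static helper EmptyArrays.emptyArray(Class) stands in for it, with a per-class cache built on java.lang.ClassValue (available since Java 7). Rejecting primitive class objects is one of the two options Rémi mentioned and is an assumption here, not a settled design.

```java
import java.lang.reflect.Array;

final class EmptyArrays {
    // One shared zero-length array per component type. Sharing is safe because
    // a zero-length array has no elements that could be modified.
    private static final ClassValue<Object[]> CACHE = new ClassValue<Object[]>() {
        @Override
        protected Object[] computeValue(Class<?> type) {
            return (Object[]) Array.newInstance(type, 0);
        }
    };

    // Hypothetical stand-in for the proposed Class.emptyArray().
    @SuppressWarnings("unchecked")
    static <T> T[] emptyArray(Class<T> type) {
        if (type.isPrimitive()) {
            // Assumption: reject int.class and friends rather than boxing.
            throw new IllegalArgumentException("primitive component type: " + type);
        }
        return (T[]) CACHE.get(type);
    }
}
```

With such a helper, a call like c.toArray(EmptyArrays.emptyArray(Integer.class)) replaces the hand-written static EMPTY_INTEGER_ARRAY constant. The unchecked cast is confined to the helper, which matches the caveat about generic types raised earlier in the thread.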
Josh -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120920/8fdbec39/attachment.html From dl at cs.oswego.edu Fri Sep 21 05:26:07 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 21 Sep 2012 08:26:07 -0400 Subject: Nulls In-Reply-To: <50573CFC.6070607@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> Message-ID: <505C5CDF.5010305@cs.oswego.edu> On 09/17/12 11:08, Brian Goetz wrote: > If we'd bitten the bullet and not allowed nulls as elements in Collections from > the beginning, we'd have less of a problem now. > > So, looking at only the sequential case right now, what are the realistic > options for handling nulls in streams? > Here's a stab at helping to clear this up a bit: Let's separate "dense" and "sparse" aggregates. A sparse aggregate has some way of representing and dealing with elements the could be present, but aren't, by using "null". A dense aggregate only has elements, not potential elements. Most sequence/stream-like things are dense, so null elements make no sense. Most array/map-like things are sparse -- for example, a null returned from an indexed or keyed access (a[i] or map.get(key)) means "the index/key is valid, but there's nothing there". The operations in the current Stream API in general apply only to actual elements, not representations of potential elements (i.e., not null). If you elevate this "in general" to "must", it leads to a choice of either or both of two simple rules: 1. Stream sources should only present actual elements, not nulls. 2. Stream operations should ignore nulls. Rule (1) means for example that the stream source for an ArrayList should not include unoccupied array cells. But the Iterator that it would be based on in current APIs doesn't do this filtering. 
And probably similarly for other classes, including nonJDK-classes that could not be adjusted to obey rule (1). Which, from this line of thinking, pretty much forces rule (2). There are other lines of thinking that lead to different rules. Others should feel free to argue for them. -Doug From forax at univ-mlv.fr Fri Sep 21 05:49:51 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 21 Sep 2012 14:49:51 +0200 Subject: Nulls In-Reply-To: <50573CFC.6070607@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> Message-ID: <505C626F.3080900@univ-mlv.fr> On 09/17/2012 05:08 PM, Brian Goetz wrote: > On 9/15/2012 10:24 AM, Doug Lea wrote: > >> To further rub in how central the "little" issues of optional/null, >> (as well as numerics) are in all this, note that flatMap is just a >> special form of mapReduce(x->coll, addAll) > > and filter() is just a special form of flatMap > >> , which can be >> implemented so as to require a basis/default policy >> only if there is nothing there, so could do one of: >> (1) return null (2) accept an empty-collection generator as >> basis/defaultValue arg (3) return Optional (4) factor into >> a special flatMapper interface that absorbs the problem >> (as Brian chose; in CHM, I support unified forms of map+Reduce >> explicitly, which leverages intrinsic null policy to naturally use >> option #1, so method flatMap does not even appear. > > We keep circling round the question of what to do about null values in > streams. Are they supported? Banned? Grumblingly tolerated? > > As Doug points out, for concurrent collections, the sensible way to > interpret null is "there's nothing there right now." But, given that > the streams API is built around an assumption of non-interference, we > don't have to worry about "right now" as much; the contents of the > stream source should remain constant during the course of the > calculation. 
> > If we'd bitten the bullet and not allowed nulls as elements in > Collections from the beginning, we'd have less of a problem now. > > So, looking at only the sequential case right now, what are the > realistic options for handling nulls in streams? > Support them is the only realistic option, there are too many codes out there that put null in collections, otherwise it will seriously impede the adoption of lambdas. Rémi From kevinb at google.com Fri Sep 21 08:37:21 2012 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 21 Sep 2012 08:37:21 -0700 Subject: Nulls In-Reply-To: <505C626F.3080900@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> Message-ID: On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax wrote: Support them is the only realistic option, there are too many codes out > there that put null in collections, otherwise it will seriously impede the > adoption of lambdas. > You would think so, but take a look at how hard Guava is on nulls, and we pretty much get away with it. There are always plenty of strategies for fixing your code to not need to put nulls into collections, and most of them leave the code better off. I'm not taking a position on the issue, just saying the argument that we *have* to support nulls doesn't hold water with me. So *what* if it "impedes adoption" of lambdas a bit? Pleasing everyone all of the time isn't an option anyway. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/6898ae90/attachment.html From joe.bowbeer at gmail.com Fri Sep 21 09:04:29 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 21 Sep 2012 09:04:29 -0700 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> Message-ID: My position is to support nulls in collections, which leads to supporting nulls in streams. I've never dealt with a popular language that didn't allow nulls -- except for the concurrent flavor of collections in Java, which strikes me as implementation leaking through into design. That said, I don't like nulls, and when I'm using a language like Scala that allows me to eliminate them, I try to do that, and I feel bad when I can't. But I still have nulls in some of my Scala code because it seems like the right thing in those cases. Maybe if I were a better Scala programmer, then I would know of a way to eliminate them in those cases that would seem even better... Btw, I make a distinction with maps and their entries. Maps don't contain null mappings, obviously, but I see some leeway regarding what objects are allowed in the map entries. For example, I would not allow null keys, but would allow null values. I think there's a similar distinction with streams. That is, I would allow nulls in streams, but not null mappings (MapEntry) in MapStreams. Joe On Fri, Sep 21, 2012 at 8:37 AM, Kevin Bourrillion wrote: > On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax wrote: > > Support them is the only realistic option, there are too many codes out >> there that put null in collections, otherwise it will seriously impede the >> adoption of lambdas. >> > > You would think so, but take a look at how hard Guava is on nulls, and we > pretty much get away with it. 
> > There are always plenty of strategies for fixing your code to not need to > put nulls into collections, and most of them leave the code better off. > > I'm not taking a position on the issue, just saying the argument that we * > have* to support nulls doesn't hold water with me. So *what* if it > "impedes adoption" of lambdas a bit? Pleasing everyone all of the time > isn't an option anyway. > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/1059e1db/attachment.html From brian.goetz at oracle.com Fri Sep 21 09:08:18 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 21 Sep 2012 12:08:18 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> Message-ID: <505C90F2.2000003@oracle.com> Should stream ops be permitted to squeeze out nulls when they don't make sense? For example: list...findFirst() returns an Optional. There are good arguments why an Optional should *not* be allowed to contain null. Should findFirst ignore nulls? Throw something if the stream begins with null? On 9/21/2012 12:04 PM, Joe Bowbeer wrote: > My position is to support nulls in collections, which leads to > supporting nulls in streams. > > I've never dealt with a popular language that didn't allow nulls -- > except for the concurrent flavor of collections in Java, which strikes > me as implementation leaking through into design. > > That said, I don't like nulls, and when I'm using a language like Scala > that allows me to eliminate them, I try to do that, and I feel bad when > I can't. But I still have nulls in some of my Scala code because it > seems like the right thing in those cases. Maybe if I were a better > Scala programmer, then I would know of a way to eliminate them in those > cases that would seem even better... 
> > Btw, I make a distinction with maps and their entries. Maps don't > contain null mappings, obviously, but I see some leeway regarding what > objects are allowed in the map entries. For example, I would not allow > null keys, but would allow null values. I think there's a similar > distinction with streams. That is, I would allow nulls in streams, but > not null mappings (MapEntry) in MapStreams. > > Joe > > On Fri, Sep 21, 2012 at 8:37 AM, Kevin Bourrillion wrote: > > On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax wrote: > > Support them is the only realistic option, there are too many > codes out there that put null in collections, otherwise it will > seriously impede the adoption of lambdas. > > > You would think so, but take a look at how hard Guava is on nulls, > and we pretty much get away with it. > > There are always plenty of strategies for fixing your code to not > need to put nulls into collections, and most of them leave the code > better off. > > I'm not taking a position on the issue, just saying the argument > that we /have/ to support nulls doesn't hold water with me. So > /what/ if it "impedes adoption" of lambdas a bit? Pleasing everyone > all of the time isn't an option anyway. > > From forax at univ-mlv.fr Fri Sep 21 09:11:02 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 21 Sep 2012 18:11:02 +0200 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> Message-ID: <505C9196.8020301@univ-mlv.fr> On 09/21/2012 05:37 PM, Kevin Bourrillion wrote: > On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax > wrote: > > Support them is the only realistic option, there are too many > codes out there that put null in collections, otherwise it will > seriously impede the adoption of lambdas. > > > You would think so, but take a look at how hard Guava is on nulls, and > we pretty much get away with it. 
> > There are always plenty of strategies for fixing your code to not need > to put nulls into collections, and most of them leave the code better off. Let's say that streams will not support nulls. My fear is that if a collection has a null in it, it will blow in the middle of the process, far away from where the error lies i.e. when null was added in the collection. To reuse the Josh motto, blow often, blow early, if you don't throw the exception early, at the point where the mistake is made, throwing an exception in the middle of the process will be seen as something annoying instead of as something that helps devs. > > I'm not taking a position on the issue, just saying the argument that > we /have/ to support nulls doesn't hold water with me. So /what/ if it > "impedes adoption" of lambdas a bit? Pleasing everyone all of the time > isn't an option anyway. Rémi From brian.goetz at oracle.com Fri Sep 21 09:25:52 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 21 Sep 2012 12:25:52 -0400 Subject: Nulls In-Reply-To: <505C9196.8020301@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> Message-ID: <505C9510.1090606@oracle.com> I think the "if we don't support it people won't adopt lambda" is way, way overstated. It's a factor to consider, nothing more. But, the reality is that code where there are nulls in collections is still likely to blow *somewhere*. Yes, it would be great if it blew where the null was inserted. But that ship has sailed. What we're discussing now is: - should we check nulls when the element flows into the stream? - should we check nulls at the other end, like forEach/into/toArray? - should we stick our fingers in our ears and say "la la la can't see those nulls la la la"?
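The three options above can be sketched as a toy, eager filter-then-collect pipeline. This is an illustration under stated assumptions, not the streams implementation: NullPolicySketch, NullPolicy and filterToList are hypothetical names, and java.util.function.Predicate is the interface as it later shipped in Java 8 rather than the prototype under discussion here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;

final class NullPolicySketch {
    enum NullPolicy { CHECK_ON_ENTRY, CHECK_ON_EXIT, NO_CHECK }

    // Filters the source into a list, applying the chosen null policy.
    static <T> List<T> filterToList(Iterable<T> source,
                                    Predicate<? super T> keep,
                                    NullPolicy policy) {
        List<T> out = new ArrayList<>();
        for (T element : source) {
            if (policy == NullPolicy.CHECK_ON_ENTRY) {
                // Option 1: fail as soon as a null flows into the pipeline.
                Objects.requireNonNull(element, "null element entering the pipeline");
            }
            if (!keep.test(element)) {
                continue;
            }
            if (policy == NullPolicy.CHECK_ON_EXIT) {
                // Option 2: fail only if a null reaches the terminal operation.
                Objects.requireNonNull(element, "null element reaching the terminal operation");
            }
            // Option 3 (NO_CHECK): nulls pass through untouched.
            out.add(element);
        }
        return out;
    }
}
```

For a source such as Arrays.asList("a", null, "b") with a predicate that drops nulls, CHECK_ON_ENTRY throws while CHECK_ON_EXIT and NO_CHECK both succeed. With a predicate that keeps everything, only NO_CHECK lets the null through to the result.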
On 9/21/2012 12:11 PM, Remi Forax wrote: > On 09/21/2012 05:37 PM, Kevin Bourrillion wrote: >> On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax > > wrote: >> >> Support them is the only realistic option, there are too many >> codes out there that put null in collections, otherwise it will >> seriously impede the adoption of lambdas. >> >> >> You would think so, but take a look at how hard Guava is on nulls, and >> we pretty much get away with it. >> >> There are always plenty of strategies for fixing your code to not need >> to put nulls into collections, and most of them leave the code better >> off. > > Let's say that streams will not support nulls. > My fear is that if a collection has a null in it, it will blow in the > middle of the process, > far away from where the error lies i.e. when null was added in the > collection. > To reuse the Josh motto, blow often, blow early, if you don't throw the > exception early, > at the point where the mistake is made, throwing an exception in the > middle of the process > will be seen as something annoying instead of as something that helps devs. > >> >> I'm not taking a position on the issue, just saying the argument that >> we /have/ to support nulls doesn't hold water with me. So /what/ if it >> "impedes adoption" of lambdas a bit? Pleasing everyone all of the time >> isn't an option anyway. > > Rémi > From joe.bowbeer at gmail.com Fri Sep 21 09:26:43 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 21 Sep 2012 09:26:43 -0700 Subject: Nulls In-Reply-To: <505C90F2.2000003@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C90F2.2000003@oracle.com> Message-ID: It depends on how findFirst is defined. In groovy, find() returns the first element that satisfies groovy truth, and null does not satisfy! But there is also a find(predicate) where the predicate determines what truth is. (Please excuse the groovy lingo.)
If there is this operation: Optional findFirst() then there should also be a version that takes a Predicate: Optional findFirst(Predicate) and then the question can be rephrased: What is the implicit predicate that findFirst() employs? If the predicate ignores null, then I think it should ignore a few other things as well. But the question really boils down to whether Optional can contain null, and I don't think it should. This leads me to question whether there should be another version of findFirst for non-Optional programming. I find that the two styles don't mix well. Most Java programmers are naturally non-Optional but might want to take advantage of Optional. However, I would tend to use a non-Optional findFirst more in Java than not -- because the rest of Java is not Optional-friendly. Joe On Fri, Sep 21, 2012 at 9:08 AM, Brian Goetz wrote: > Should stream ops be permitted to squeeze out nulls when they don't make > sense? > > For example: > > list...findFirst() > > returns an Optional. There are good arguments why an Optional should > *not* be allowed to contain null. Should findFirst ignore nulls? Throw > something if the stream begins with null? > > > On 9/21/2012 12:04 PM, Joe Bowbeer wrote: > >> My position is to support nulls in collections, which leads to >> supporting nulls in streams. >> >> I've never dealt with a popular language that didn't allow nulls -- >> except for the concurrent flavor of collections in Java, which strikes >> me as implementation leaking through into design. >> >> That said, I don't like nulls, and when I'm using a language like Scala >> that allows me to eliminate them, I try to do that, and I feel bad when >> I can't. But I still have nulls in some of my Scala code because it >> seems like the right thing in those cases. Maybe if I were a better >> Scala programmer, then I would know of a way to eliminate them in those >> cases that would seem even better... >> >> Btw, I make a distinction with maps and their entries. 
Maps don't >> contain null mappings, obviously, but I see some leeway regarding what >> objects are allowed in the map entries. For example, I would not allow >> null keys, but would allow null values. I think there's a similar >> distinction with streams. That is, I would allow nulls in streams, but >> not null mappings (MapEntry) in MapStreams. >> >> Joe >> >> On Fri, Sep 21, 2012 at 8:37 AM, Kevin Bourrillion wrote: >> >> On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax wrote: >> >> Support them is the only realistic option, there are too many >> codes out there that put null in collections, otherwise it will >> seriously impede the adoption of lambdas. >> >> >> You would think so, but take a look at how hard Guava is on nulls, >> and we pretty much get away with it. >> >> There are always plenty of strategies for fixing your code to not >> need to put nulls into collections, and most of them leave the code >> better off. >> >> I'm not taking a position on the issue, just saying the argument >> that we /have/ to support nulls doesn't hold water with me. So >> /what/ if it "impedes adoption" of lambdas a bit? Pleasing everyone >> >> all of the time isn't an option anyway. >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/fbf598f3/attachment.html From brian.goetz at oracle.com Fri Sep 21 09:30:54 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 21 Sep 2012 12:30:54 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C90F2.2000003@oracle.com> Message-ID: <505C963E.90006@oracle.com> > If there is this operation: > > Optional findFirst() > > then there should also be a version that takes a Predicate: > > Optional findFirst(Predicate) We started there and beat a rapid U-turn. 
If you have findFirst(Predicate), you end up reinventing the whole stream
protocol, either with overloads of find or with other methods on Optional
or both, because what if you want to filter and then map? Do you do
findFirst(Predicate, Mapper)? There already is filter(Predicate), so
having a findFirst(Predicate) is unnecessary.

If we have filter(Predicate) all the same arguments apply anyway.

> But the question really boils down to whether Optional can contain null,
> and I don't think it should.

I agree, so this is where the strong force meets the heavy object. Do we
try to keep the nulls away, or do we treat this as an illegal stream and
blow when it gets to findFirst, or do we ignore nulls and treat them as
"not there"?

From joe.bowbeer at gmail.com  Fri Sep 21 09:31:58 2012
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Fri, 21 Sep 2012 09:31:58 -0700
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com>
Message-ID:

I may want to revisit my statement that Optional cannot contain null:

> But the question really boils down to whether Optional can contain null,
> and I don't think it should.

Maybe an Optional that contains null is the right way to differentiate a
null (no element) from a null element.

On Fri, Sep 21, 2012 at 9:26 AM, Joe Bowbeer wrote:

> It depends on how findFirst is defined.
>
> In groovy, find() returns the first element that satisfies groovy truth,
> and null does not satisfy! But there is also a find(predicate) where the
> predicate determines what truth is. (Please excuse the groovy lingo.)
>
> If there is this operation:
>
>   Optional findFirst()
>
> then there should also be a version that takes a Predicate:
>
>   Optional findFirst(Predicate)
>
> and then the question can be rephrased: What is the implicit predicate
> that findFirst() employs?
> > If the predicate ignores null, then I think it should ignore a few other > things as well. > > But the question really boils down to whether Optional can contain null, > and I don't think it should. > > This leads me to question whether there should be another version of > findFirst for non-Optional programming. I find that the two styles don't > mix well. Most Java programmers are naturally non-Optional but might want > to take advantage of Optional. However, I would tend to use a non-Optional > findFirst more in Java than not -- because the rest of Java is not > Optional-friendly. > > Joe > > On Fri, Sep 21, 2012 at 9:08 AM, Brian Goetz wrote: > >> Should stream ops be permitted to squeeze out nulls when they don't make >> sense? >> >> For example: >> >> list...findFirst() >> >> returns an Optional. There are good arguments why an Optional should >> *not* be allowed to contain null. Should findFirst ignore nulls? Throw >> something if the stream begins with null? >> >> >> On 9/21/2012 12:04 PM, Joe Bowbeer wrote: >> >>> My position is to support nulls in collections, which leads to >>> supporting nulls in streams. >>> >>> I've never dealt with a popular language that didn't allow nulls -- >>> except for the concurrent flavor of collections in Java, which strikes >>> me as implementation leaking through into design. >>> >>> That said, I don't like nulls, and when I'm using a language like Scala >>> that allows me to eliminate them, I try to do that, and I feel bad when >>> I can't. But I still have nulls in some of my Scala code because it >>> seems like the right thing in those cases. Maybe if I were a better >>> Scala programmer, then I would know of a way to eliminate them in those >>> cases that would seem even better... >>> >>> Btw, I make a distinction with maps and their entries. Maps don't >>> contain null mappings, obviously, but I see some leeway regarding what >>> objects are allowed in the map entries. 
For example, I would not allow >>> null keys, but would allow null values. I think there's a similar >>> distinction with streams. That is, I would allow nulls in streams, but >>> not null mappings (MapEntry) in MapStreams. >>> >>> Joe >>> >>> On Fri, Sep 21, 2012 at 8:37 AM, Kevin Bourrillion wrote: >>> >>> On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax wrote: >>> >>> Support them is the only realistic option, there are too many >>> codes out there that put null in collections, otherwise it will >>> seriously impede the adoption of lambdas. >>> >>> >>> You would think so, but take a look at how hard Guava is on nulls, >>> and we pretty much get away with it. >>> >>> There are always plenty of strategies for fixing your code to not >>> need to put nulls into collections, and most of them leave the code >>> better off. >>> >>> I'm not taking a position on the issue, just saying the argument >>> that we /have/ to support nulls doesn't hold water with me. So >>> /what/ if it "impedes adoption" of lambdas a bit? Pleasing everyone >>> >>> all of the time isn't an option anyway. >>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/b5f1c521/attachment.html From joe.bowbeer at gmail.com Fri Sep 21 09:35:20 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 21 Sep 2012 09:35:20 -0700 Subject: Nulls In-Reply-To: <505C963E.90006@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com> Message-ID: > I agree, so this is where the strong force meets the heavy object. The null penetrates the Optional? 
On Fri, Sep 21, 2012 at 9:30 AM, Brian Goetz wrote: > If there is this operation: >> >> Optional findFirst() >> >> then there should also be a version that takes a Predicate: >> >> Optional findFirst(Predicate) >> > > We started there and beat a rapid U-turn. > > If you have findFirst(Predicate), you end up reinventing the whole stream > protocol either with overloads of find or with other methods on Optional or > both, because what if you want to filter and then map? Do you do > firstFirst(Predicate, Mapper)? There already is filter(Predicate), so > having a findFirst(Predicate) is unnecessary. > > If we have filter(Predicate) all the same arguments apply anyway. > > > But the question really boils down to whether Optional can contain null, >> and I don't think it should. >> > > I agree, so this is where the strong force meets the heavy object. Do we > try to keep the nulls away, or do we treat this as an illegal stream and > blow when it gets to findFirst, or do we ignore nulls and treat them as > "not there"? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/692210bf/attachment-0001.html From brian.goetz at oracle.com Fri Sep 21 09:36:56 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 21 Sep 2012 12:36:56 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com> Message-ID: <505C97A8.2010901@oracle.com> But this leaves us with now three ways of saying "no value" - reference to Optional is null - Optional is empty - Optional is not empty but contains null And this just kicks the NPE can down the street. On 9/21/2012 12:35 PM, Joe Bowbeer wrote: > > > I agree, so this is where the strong force meets the heavy object. > > The null penetrates the Optional? 
> > > On Fri, Sep 21, 2012 at 9:30 AM, Brian Goetz > wrote: > > If there is this operation: > > Optional findFirst() > > then there should also be a version that takes a Predicate: > > Optional findFirst(Predicate) > > > We started there and beat a rapid U-turn. > > If you have findFirst(Predicate), you end up reinventing the whole > stream protocol either with overloads of find or with other methods > on Optional or both, because what if you want to filter and then > map? Do you do firstFirst(Predicate, Mapper)? There already is > filter(Predicate), so having a findFirst(Predicate) is unnecessary. > > If we have filter(Predicate) all the same arguments apply anyway. > > > But the question really boils down to whether Optional can > contain null, > and I don't think it should. > > > I agree, so this is where the strong force meets the heavy object. > Do we try to keep the nulls away, or do we treat this as an > illegal stream and blow when it gets to findFirst, or do we ignore > nulls and treat them as "not there"? > > From forax at univ-mlv.fr Fri Sep 21 09:56:21 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 21 Sep 2012 18:56:21 +0200 Subject: Nulls In-Reply-To: <505C9510.1090606@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> Message-ID: <505C9C35.6050104@univ-mlv.fr> On 09/21/2012 06:25 PM, Brian Goetz wrote: > I think the "if we don't support it people won't adopt lambda" is way, > way overstated. Its a factor to consider, nothing more. > > But, the reality is that code where there are nulls in collections is > still likely to blow *somewhere*. Yes, it would be great if it blew > where the null was inserted. But that ship has sailed. What we're > discussing now is: > - should we check nulls when the element flows into the stream? you will blow in the middle of the pipeline. 
> - should we check nulls at the other end, like forEach/into/toArray? what about filter/map, what is the meaning of null, like in concurrent collection ? (no meaning) > - should we stick our fingers in our ears and say "la la la can't see > those nulls la la la"? I suppose the last sentence, is the same as if a user put null in a collection, just send him back. R?mi > > On 9/21/2012 12:11 PM, Remi Forax wrote: >> On 09/21/2012 05:37 PM, Kevin Bourrillion wrote: >>> On Fri, Sep 21, 2012 at 5:49 AM, Remi Forax >> > wrote: >>> >>> Support them is the only realistic option, there are too many >>> codes out there that put null in collections, otherwise it will >>> seriously impede the adoption of lambdas. >>> >>> >>> You would think so, but take a look at how hard Guava is on nulls, and >>> we pretty much get away with it. >>> >>> There are always plenty of strategies for fixing your code to not need >>> to put nulls into collections, and most of them leave the code better >>> off. >> >> Let's say that streams will not support nulls. >> My fear is that if a collection have a null in it, it will blow in the >> middle of the process, >> far away from where the error lies i.e. when null was added in the >> collection. >> To reuse the Josh moto, blow often, blow early, if you don't throw the >> exception early, >> at the point where the mistake is made, throwing an exception in the >> middle of the process >> will be seen as something annoying instead as something that heps devs. >> >>> >>> I'm not taking a position on the issue, just saying the argument that >>> we /have/ to support nulls doesn't hold water with me. So /what/ if it >>> "impedes adoption" of lambdas a bit? Pleasing everyone all of the time >>> isn't an option anyway. 
>>
>> Rémi
>>

From tim at peierls.net  Fri Sep 21 10:16:46 2012
From: tim at peierls.net (Tim Peierls)
Date: Fri, 21 Sep 2012 13:16:46 -0400
Subject: Nulls
In-Reply-To: <505C97A8.2010901@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com>
 <505C97A8.2010901@oracle.com>
Message-ID:

On Fri, Sep 21, 2012 at 12:36 PM, Brian Goetz wrote:

> But this leaves us with now three ways of saying "no value"
>
>  - reference to Optional is null
>  - Optional is empty
>  - Optional is not empty but contains null
>
> And this just kicks the NPE can down the street.

Which would vitiate the main reason to introduce Optional in the first
place. I like Optional and would hate to see it gutted this way.

I want to eat my cake and have it, too:

1. Outlaw null from collections (and as a value for Optional), blowing up
   noisily if null is encountered.
2. *Maybe* allow a narrow exception to the previous for combined filter/map
   and the like as an efficient shortcut.
3. Use Optional for return values where there might not be a result.
4. Provide non-Optional-returning variants that take a non-null default
   return value to avoid object creation *in some cases*.

That makes things hard for people who like to put nulls in their
collections (e.g., Joe), but things are pretty grim for those people
already: All the really cool collections out there forbid nulls. ;-)

--tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/e12ed484/attachment.html

From brian.goetz at oracle.com  Fri Sep 21 10:19:47 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 21 Sep 2012 13:19:47 -0400
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com>
 <505C97A8.2010901@oracle.com>
Message-ID: <505CA1B3.6060004@oracle.com>

> 4. Provide non-Optional-returning variants that take a non-null default
> return value to avoid object creation *in some cases*.

It's called: iterator().

From tim at peierls.net  Fri Sep 21 10:37:44 2012
From: tim at peierls.net (Tim Peierls)
Date: Fri, 21 Sep 2012 13:37:44 -0400
Subject: Nulls
In-Reply-To: <505CA1B3.6060004@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com>
 <505C97A8.2010901@oracle.com> <505CA1B3.6060004@oracle.com>
Message-ID:

On Fri, Sep 21, 2012 at 1:19 PM, Brian Goetz wrote:

>> 4. Provide non-Optional-returning variants that take a non-null default
>> return value to avoid object creation *in some cases*.
>
> It's called: iterator().

For some things, yes, but in general, good luck explaining that. For
example, I bet more people would grok what's going on in this:

    T maxOr(T defval);
    ...
    Integer m = intStream.maxOr(Integer.MIN_VALUE);
    if (m == Integer.MIN_VALUE) {
        // empty, no max
    } else {
        // do something with m
    }

than in this:

    Iterator<T> max();
    ...
    Iterator<Integer> it = intStream.max();
    if (it.hasNext()) {
        // do something with it.next()
    } else {
        // empty, no max
    }

Actually, more obvious still would be to use Optional:

    Optional<Integer> m = intStream.max();
    if (m.isPresent()) {
        // do something with m.get()
    } else {
        // empty, no max
    }

I only bring up the others because so many people are worried that the
extra Optional object will never ever be optimized away and will slow
everything down.

--tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/486bfd48/attachment.html

From brian.goetz at oracle.com  Fri Sep 21 10:41:12 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 21 Sep 2012 13:41:12 -0400
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com>
 <505C97A8.2010901@oracle.com> <505CA1B3.6060004@oracle.com>
Message-ID: <505CA6B8.70605@oracle.com>

> I only bring up the others because so many people are worried that the
> extra Optional object will never ever be optimized away and will slow
> everything down.

The "extra object" argument here is mostly a red herring. There is so
much work going on to set up the iterators and such, that one more
object *per pipeline* is noise. An object per element, sure, that's a
big deal. But to set up a pipeline you're probably already allocating a
few objects per pipeline stage -- the Op object, an Iterator or a Sink,
etc.

If you're doing a parallel decomposition you're creating log(n) FJtasks
and such. Again, another object at the end is noise.
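
The trade-off between the default-value style and the Optional style above can be sketched in a few lines of present-day Java. This is only an illustration under stated assumptions: `MaxStyles`, `maxOr` and `max` are hypothetical helpers written for this note, and `java.util.Optional` is used as it eventually shipped, which may differ from the draft class under discussion in this thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class, not part of any JDK or draft API.
final class MaxStyles {

    // Default-value style: no wrapper object, but the caller must pick a sentinel.
    static <T> T maxOr(Iterable<T> source, Comparator<? super T> order, T defval) {
        boolean seen = false;
        T best = null;
        for (T t : source) {
            if (!seen || order.compare(t, best) > 0) {
                best = t;
                seen = true;
            }
        }
        return seen ? best : defval;
    }

    // Optional style: one extra object per call, but "empty" is unambiguous.
    // Optional.of rejects null, so a null maximum would fail here with an NPE.
    static <T> Optional<T> max(Iterable<T> source, Comparator<? super T> order) {
        boolean seen = false;
        T best = null;
        for (T t : source) {
            if (!seen || order.compare(t, best) > 0) {
                best = t;
                seen = true;
            }
        }
        return seen ? Optional.of(best) : Optional.<T>empty();
    }

    public static void main(String[] args) {
        Comparator<Integer> natural = Comparator.naturalOrder();
        List<Integer> empty = Collections.emptyList();
        List<Integer> onlyMin = Arrays.asList(Integer.MIN_VALUE);

        // The sentinel cannot tell "empty" apart from "the max really is MIN_VALUE".
        Integer a = maxOr(empty, natural, Integer.MIN_VALUE);
        Integer b = maxOr(onlyMin, natural, Integer.MIN_VALUE);
        System.out.println(a.equals(b));

        // The Optional variant distinguishes the two cases.
        System.out.println(max(empty, natural).isPresent());
        System.out.println(max(onlyMin, natural).isPresent());
    }
}
```

The sentinel variant allocates nothing extra, at the price of an ambiguity the caller has to reason about; the Optional variant pays one object per call, which is the per-pipeline cost described above as noise.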
From joe.bowbeer at gmail.com  Fri Sep 21 11:04:02 2012
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Fri, 21 Sep 2012 11:04:02 -0700
Subject: Nulls
In-Reply-To: <505C9510.1090606@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com>
Message-ID:

On Fri, Sep 21, 2012 at 9:25 AM, Brian Goetz wrote:

> I think the "if we don't support it people won't adopt lambda" is way, way
> overstated. It's a factor to consider, nothing more.
>
> But, the reality is that code where there are nulls in collections is
> still likely to blow *somewhere*. Yes, it would be great if it blew where
> the null was inserted. But that ship has sailed. What we're discussing
> now is:
>  - should we check nulls when the element flows into the stream?
>  - should we check nulls at the other end, like forEach/into/toArray?
>  - should we stick our fingers in our ears and say "la la la can't see
>    those nulls la la la"?

I think there is a strong usability case for supporting nulls. I see it
this way:

Q: How many Java methods return null?
A: Countless many, and there is no type system in place to indicate when
they do or don't.

Q: What will lambda users want to do?
A: Apply their methods and collect the results; perform parallel
reduction, stream them, etc.

We will either need to force these lambda users to create Optional-like
adapters for all of their sources of null in the world, or we will need to
support the most common use cases without throwing exceptions. I don't
like the first option for lambda initiates because it presents them with a
hurdle right away.

I'm undecided whether Optional is of any use in the bigger design effort.
I'm certain they won't be as useful as they are in Scala, because the
absence of Optional doesn't mean anything in Java -- whereas the absence in
Scala can mean no NPE. In any event, Optional should be subservient to the
bigger design effort.

Joe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/cbd81cb6/attachment-0001.html

From forax at univ-mlv.fr  Fri Sep 21 11:17:12 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 21 Sep 2012 20:17:12 +0200
Subject: Nulls
In-Reply-To: <505CA6B8.70605@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu>
 <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr>
 <505C90F2.2000003@oracle.com> <505C963E.90006@oracle.com>
 <505C97A8.2010901@oracle.com> <505CA1B3.6060004@oracle.com>
 <505CA6B8.70605@oracle.com>
Message-ID: <505CAF28.30507@univ-mlv.fr>

On 09/21/2012 07:41 PM, Brian Goetz wrote:
>> I only bring up the others because so many people are worried that the
>> extra Optional object will never ever be optimized away and will slow
>> everything down.
>
> The "extra object" argument here is mostly a red herring. There is so
> much work going on to set up the iterators and such, that one more
> object *per pipeline* is noise. An object per element, sure, that's a
> big deal. But to set up a pipeline you're probably already allocating
> a few objects per pipeline stage -- the Op object, an Iterator or a
> Sink, etc.

Objects per pipeline are noise until you set up the pipeline in a loop.
BTW, you can use the same object for the Iterator and the Op; maybe not a
good design, but it will reduce the number of object creations.

>
> If you're doing a parallel decomposition you're creating log(n)
> FJtasks and such. Again, another object at the end is noise.
> R?mi From tim at peierls.net Fri Sep 21 11:26:23 2012 From: tim at peierls.net (Tim Peierls) Date: Fri, 21 Sep 2012 14:26:23 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> Message-ID: On Fri, Sep 21, 2012 at 2:04 PM, Joe Bowbeer wrote: > On Fri, Sep 21, 2012 at 9:25 AM, Brian Goetz wrote: > >> I think the "if we don't support it people won't adopt lambda" is way, >> way overstated. Its a factor to consider, nothing more. >> > It never occured to me that Optional was being proposed as a sop to lambda users. I've been using Guava's version of Optional for about a year, in Java 6 without any thought of lambdas. > I think there is a strong usability case for supporting nulls. I see it > this way: > > Q: How many Java methods return null? > A: Countless many, and there is no type-system in place to indicate when > they do or don't > > Q: What will lamda users want to do? > A: Apply their methods and collect the results; perform parallel > reduction, stream them, etc. > > We will either need to force these lambda users to create Optional-like > adapters for all of their sources of null in the world, > Seems like a few standard adapters would help a lot. Optional.fromNullable(nullable), Collection.fromNullable(collectionWithNulls), etc. > or we will need to support the most common use cases without throwing > exceptions. I don't like the first option for lambda initiates because it > presents them with a hurdle right away. > Maybe there are some very common usages that should be tolerated, but if everything is tolerated, no one will ever change, and people will continue to write APIs that return null, and users will continue not to test for null, and an opportunity to improve things will have been missed. > I'm undecided whether Optional is of any use in the bigger design effort. 
> I'm certain they won't be as useful as they are in Scala, because the > absence of Optional doesn't mean anything in Java -- whereas the absence in > Scala can mean no NPE. In any event, Optional should be subservient to the > bigger design effort. > I had forgotten that Scala has an Optional. The Optional I use is *not *Scala's Optional. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/37b2f498/attachment.html From david.lloyd at redhat.com Fri Sep 21 11:38:52 2012 From: david.lloyd at redhat.com (David M. Lloyd) Date: Fri, 21 Sep 2012 13:38:52 -0500 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> Message-ID: <505CB43C.7040601@redhat.com> Not taking sides in the specific implementation debate, I just want to state that I don't think it really matters if there are "hurdles" to learning/using lambdas. In fact it'd be better promote "better" practices if possible. We don't have to *sell* this thing; rather it just has to be *good*. I agree with Brian, and would further say: if it's "good" then the right people will use it when they're ready, and it will give them good results. If it's "bad" (but, say, marginally easier to use as a result of the choices that make it "worse") we're just going to make the greater Java world an uglier place. On 09/21/2012 01:04 PM, Joe Bowbeer wrote: > On Fri, Sep 21, 2012 at 9:25 AM, Brian Goetz wrote: > > I think the "if we don't support it people won't adopt lambda" is > way, way overstated. Its a factor to consider, nothing more. > > But, the reality is that code where there are nulls in collections > is still likely to blow *somewhere*. Yes, it would be great if it > blew where the null was inserted. But that ship has sailed. 
What > we're discussing now is: > - should we check nulls when the element flows into the stream? > - should we check nulls at the other end, like forEach/into/toArray? > - should we stick our fingers in our ears and say "la la la can't > see those nulls la la la"? > > > > I think there is a strong usability case for supporting nulls. I see it > this way: > > Q: How many Java methods return null? > A: Countless many, and there is no type-system in place to indicate when > they do or don't > > Q: What will lamda users want to do? > A: Apply their methods and collect the results; perform parallel > reduction, stream them, etc. > > We will either need to force these lambda users to create Optional-like > adapters for all of their sources of null in the world, or we will need > to support the most common use cases without throwing exceptions. I > don't like the first option for lambda initiates because it presents > them with a hurdle right away. > > I'm undecided whether Optional is of any use in the bigger design > effort. I'm certain they won't be as useful as they are in Scala, > because the absence of Optional doesn't mean anything in Java -- whereas > the absence in Scala can mean no NPE. In any event, Optional should be > subservient to the bigger design effort. > > Joe > -- - DML From brian.goetz at oracle.com Fri Sep 21 12:07:29 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 21 Sep 2012 15:07:29 -0400 Subject: Spliterator Message-ID: <505CBAF1.8010203@oracle.com> Various subsets of people have gone around various times on the interface for Spliterator (and by extension, StreamAccessor.) After playing with the implementation a bit, I think we have something decent that balances most of the issues raised. These issues include: - Extent and mechanism for participating in the "when to split" decision; - State constraints (i.e., can you split after iterating?) - API decisions that force runtime costs. 
Doug suggested the following for splitting support:

    long estimateSize()

where the implementation could return MAX_VALUE to mean "I have no
clue". The key constraint is that this estimate (a) never increases and
(b) eventually decreases, so that eventually, if perhaps jerkily, it
will converge to something that is reasonable to compare against a
target chunk size.

Separately, I suggested the following, also for splitting support:

    int getNaturalSplits()

which indicates what the "natural" number of splits is for the data
structure. (For various reasons, it is probably more natural for this
to return one less.) So for a non-splittable data structure this is
zero; for an array that we plan to split in a binary fashion, this is 1;
for a 4-ary tree, this is 3, etc. It is completely advisory and the
client is free to ignore it, but if the client takes it into account, it
will likely get a more balanced computation tree. It also costs almost
nothing to support, and if you always return 1, it degenerates to the
same binary splits we have now.

It is also useful to know if we know the exact size of a split, and
moreover if we know we'll know exact sizes all the way down. This is
true for array sources, for example, and can be used when the pipeline
terminates in a toArray, because then the leaves can write their results
directly into a big shared array in exactly the right spot. So, for
example:

    list.stream().map(...).toArray()

can get by with allocating only a single array for the result,
partitioning the input data set, and each leaf writes the results into
the right place, avoiding copying. For this, we have two methods:

    boolean isExactSplits()

which means that "all my child spliterators will commit to knowing their
size", and

    long getSizeIfKnown() // returns -1 if unknown

where if isExactSplits() is true, getSizeIfKnown is guaranteed to not
return -1.

For simplicity, since both size-bearing methods (exact and estimated)
have an "I don't know" value, they probably should both be -1 rather than
one being -1 and one being MAX_VALUE.

Here's where I currently am on the API:

public interface Spliterator<T> {
    // For split decision support
    int getNaturalSplits();
    int estimateSize() default { return getSizeIfKnown(); }

    // Exact-sizing support
    int getSizeIfKnown() default { return -1; }
    boolean isPredictableSplits() default { return false; }

    // Element access
    Spliterator<T> split();
    Iterator<T> iterator();

    void into(Sink<T> sink) default {
        sink.begin(estimateSize());
        Iterator<T> remaining = iterator();
        while (remaining.hasNext()) {
            sink.accept(remaining.next());
        }
        sink.end();
    }
}

As to state constraints, I am currently thinking:
 - Can't call split() after calling iterator()

Still thinking about:
 - Should sizing methods also become invalid after calling iterator()?
   (This would reduce the need to keep the count accurate.)

(Both of the above are cheap to enforce because they only add code on
the splitting path, not the per-element path.)

As to cost imposition, it may look like having an explicit Iterator
method (instead of extending Iterator) means creating an Iterator, but
what I've found is that in all cases, I can have one class that
implements StreamAccessor, Spliterator, and Iterator, so the Iterator
implementation can just "return this". Also, since we have to write a
flag to indicate "have started iterating", with an iterator() method we
have a natural place to write that once, rather than writing it each
time through hasNext/next. So when you're ready to start iterating, you
usually don't have to create an extra Iterator object.

I think this is a pretty good balance of all the issues we've discussed
so far, and easy enough to implement efficiently.
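
As a rough illustration of the proposal above, here is a minimal array-backed sketch written with the default-method syntax that later shipped in Java 8. Everything in it is an assumption made for this note: `DraftSpliterator`, `ArraySpliterator` and `SpliteratorDemo` are hypothetical names, the `into(Sink)` method is omitted, and `split()` is assumed to hand back the first half while the receiver keeps the second half. It is not `java.util.Spliterator`, whose final shape differs.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical interface mirroring the draft listing (into(Sink) left out).
interface DraftSpliterator<T> {
    int getNaturalSplits();
    default int estimateSize() { return getSizeIfKnown(); }
    default int getSizeIfKnown() { return -1; }            // -1 means "unknown"
    default boolean isPredictableSplits() { return false; }
    DraftSpliterator<T> split();
    Iterator<T> iterator();
}

// One object plays both roles, so iterator() can simply "return this".
final class ArraySpliterator<T> implements DraftSpliterator<T>, Iterator<T> {
    private final T[] array;
    private int lo;               // index of the next element to hand out
    private final int hi;         // exclusive upper bound
    private boolean iterating;    // written once, in iterator()

    ArraySpliterator(T[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    @Override public int getNaturalSplits() { return (hi - lo > 1) ? 1 : 0; }
    @Override public int getSizeIfKnown() { return hi - lo; }
    @Override public boolean isPredictableSplits() { return true; }

    // Assumed semantics: return the first half, keep the second half.
    @Override public DraftSpliterator<T> split() {
        if (iterating) {
            throw new IllegalStateException("split() called after iterator()");
        }
        int mid = lo + (hi - lo) / 2;
        ArraySpliterator<T> prefix = new ArraySpliterator<>(array, lo, mid);
        lo = mid;
        return prefix;
    }

    @Override public Iterator<T> iterator() {
        iterating = true;
        return this;
    }

    @Override public boolean hasNext() { return lo < hi; }

    @Override public T next() {
        if (lo >= hi) {
            throw new NoSuchElementException();
        }
        return array[lo++];
    }
}

final class SpliteratorDemo {
    static int sum(DraftSpliterator<Integer> s) {
        int total = 0;
        Iterator<Integer> it = s.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        Integer[] data = {1, 2, 3, 4, 5};
        ArraySpliterator<Integer> right = new ArraySpliterator<>(data, 0, data.length);
        DraftSpliterator<Integer> left = right.split();
        System.out.println(left.getSizeIfKnown() + " + " + right.getSizeIfKnown());
        System.out.println(sum(left) + sum(right));
    }
}
```

Note that the state check lives only in `split()` and the flag is written only in `iterator()`, so the per-element path (`hasNext`/`next`) carries no extra bookkeeping, which is the cost argument made in the message.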
From joe.bowbeer at gmail.com Fri Sep 21 15:03:33 2012
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Fri, 21 Sep 2012 15:03:33 -0700
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com>
Message-ID:

The Option type reportedly originated in the last millennium in ML, and has been used in Haskell (as Maybe), and several other languages.

null exists in Scala for interoperability with existing Java code, and it is possible, though rare in Scala, to create a Some(null), which I think would be the moral equivalent of Optional(null). The AnyRef type in Scala is essentially Object in Java, and therefore an Option[AnyRef] can hold a Some(null) value.

Btw, I think this snippet from James Iry's blog is relevant:

> For interoperability reasons Scala has a full blown Java-like null. But the
> experience that all the Scala developers I know have is that NPE is mostly
> only a problem when dealing with existing Java libraries because Scala
> programmers and libraries avoid it. Plus, Scala's "Option(foo)" promotes
> foo to either Some(foo) or None depending on whether it's null or not that
> at least enables the map/flatMap style of "safe invoke." It's far less than
> perfect, but about the best that can be done when dealing directly with
> Java.

http://james-iry.blogspot.com/2010/08/why-scalas-and-haskells-types-will-save.html

Joe

On Fri, Sep 21, 2012 at 11:26 AM, Tim Peierls wrote:

> On Fri, Sep 21, 2012 at 2:04 PM, Joe Bowbeer wrote:
>
>> On Fri, Sep 21, 2012 at 9:25 AM, Brian Goetz wrote:
>>
>>> I think the "if we don't support it people won't adopt lambda" is way,
>>> way overstated. It's a factor to consider, nothing more.
>>>
>>
> It never occurred to me that Optional was being proposed as a sop to lambda
> users. I've been using Guava's version of Optional for about a year, in
> Java 6 without any thought of lambdas.
>
>> I think there is a strong usability case for supporting nulls. I see it
>> this way:
>>
>> Q: How many Java methods return null?
>> A: Countless many, and there is no type-system in place to indicate when
>> they do or don't
>>
>> Q: What will lambda users want to do?
>> A: Apply their methods and collect the results; perform parallel
>> reduction, stream them, etc.
>>
>> We will either need to force these lambda users to create Optional-like
>> adapters for all of their sources of null in the world,
>>
>
> Seems like a few standard adapters would help a lot.
> Optional.fromNullable(nullable),
> Collection.fromNullable(collectionWithNulls), etc.
>
>> or we will need to support the most common use cases without throwing
>> exceptions. I don't like the first option for lambda initiates because it
>> presents them with a hurdle right away.
>>
>
> Maybe there are some very common usages that should be tolerated, but if
> everything is tolerated, no one will ever change, and people will continue
> to write APIs that return null, and users will continue not to test for
> null, and an opportunity to improve things will have been missed.
>
>> I'm undecided whether Optional is of any use in the bigger design effort.
>> I'm certain they won't be as useful as they are in Scala, because the
>> absence of Optional doesn't mean anything in Java -- whereas the absence in
>> Scala can mean no NPE. In any event, Optional should be subservient to the
>> bigger design effort.
>>
>
> I had forgotten that Scala has an Optional. The Optional I use is *not* Scala's
> Optional.
>
> --tim
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/56cae6ab/attachment.html From tim at peierls.net Fri Sep 21 15:47:17 2012 From: tim at peierls.net (Tim Peierls) Date: Fri, 21 Sep 2012 18:47:17 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> Message-ID: Yes, I know of these other types. But the Guava Optional javadocs read, in part: This class is not intended as a direct analogue of any existing "option" or "maybe" construct from other programming environments, though it may bear some similarities. It is this class that I have been using for a year and that I think ought to be made a standard class in Java, pretty much as is. It has paid back in spades the extremely small investment needed to learn it. --tim On Fri, Sep 21, 2012 at 6:03 PM, Joe Bowbeer wrote: > The Option type reportedly originated in the last millennium in ML, and > has been used in Haskell (as Maybe), and several other languages. > > null exists in Scala for interoperability with existing Java code, and it > is possible, though rare in Scala, to create a Some(null), which I think > would be the moral equivalent of Optional(null). The AnyRef type in Scala > is essentially Object in Java, and therefore an Option[AnyRef] can hold a > Some(null) value. > > Btw, I think this snippet from James Iry's blog is relevant: > > For interoperability reasons Scala has a full blown Java-like null. But >> the experience that all the Scala developers I know have is that NPE is >> mostly only a problem when dealing with existing Java libraries because >> Scala programmers and libraries avoid it. Plus, Scala's "Option(foo)" >> promotes foo to either Some(foo) or None depending on whether it's null or >> not that at least enables the map/flatMap style of "safe invoke." 
>> It's far less than perfect, but about the best that can be done when dealing
>> directly with Java.
>
> http://james-iry.blogspot.com/2010/08/why-scalas-and-haskells-types-will-save.html
>
> Joe
>
> On Fri, Sep 21, 2012 at 11:26 AM, Tim Peierls wrote:
>
>> On Fri, Sep 21, 2012 at 2:04 PM, Joe Bowbeer wrote:
>>
>>> On Fri, Sep 21, 2012 at 9:25 AM, Brian Goetz wrote:
>>>
>>>> I think the "if we don't support it people won't adopt lambda" is way,
>>>> way overstated. It's a factor to consider, nothing more.
>>>>
>>>
>> It never occurred to me that Optional was being proposed as a sop to
>> lambda users. I've been using Guava's version of Optional for about a year,
>> in Java 6 without any thought of lambdas.
>>
>>> I think there is a strong usability case for supporting nulls. I see it
>>> this way:
>>>
>>> Q: How many Java methods return null?
>>> A: Countless many, and there is no type-system in place to indicate when
>>> they do or don't
>>>
>>> Q: What will lambda users want to do?
>>> A: Apply their methods and collect the results; perform parallel
>>> reduction, stream them, etc.
>>>
>>> We will either need to force these lambda users to create Optional-like
>>> adapters for all of their sources of null in the world,
>>>
>>
>> Seems like a few standard adapters would help a lot.
>> Optional.fromNullable(nullable),
>> Collection.fromNullable(collectionWithNulls), etc.
>>
>>> or we will need to support the most common use cases without throwing
>>> exceptions. I don't like the first option for lambda initiates because it
>>> presents them with a hurdle right away.
>>>
>>
>> Maybe there are some very common usages that should be tolerated, but if
>> everything is tolerated, no one will ever change, and people will continue
>> to write APIs that return null, and users will continue not to test for
>> null, and an opportunity to improve things will have been missed.
>>
>>> I'm undecided whether Optional is of any use in the bigger design
>>> effort. I'm certain they won't be as useful as they are in Scala, because
>>> the absence of Optional doesn't mean anything in Java -- whereas the
>>> absence in Scala can mean no NPE. In any event, Optional should be
>>> subservient to the bigger design effort.
>>>
>>
>> I had forgotten that Scala has an Optional. The Optional I use is *not* Scala's
>> Optional.
>>
>> --tim
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120921/30ff8da9/attachment-0001.html

From dl at cs.oswego.edu Sat Sep 22 05:00:07 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Sat, 22 Sep 2012 08:00:07 -0400
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com>
Message-ID: <505DA847.80207@cs.oswego.edu>

One more push for what I still think is the most defensible (and simplest to explain) set of design rules:

1. Stream operations ignore null elements.
2. Each operation that can return a "nothing there" result has two forms
     Optional op(...)
     T op(..., T defaultValueIfNone);

Notes:

Rule (1) means that stream.forEach(action) must be implemented as
     x = getNextElement(); if (x != null) action.apply(x)
and so on. As I mentioned, it would arguably be better to require that Streams themselves never produce nulls, but this can't be done without losing the ability to rely on iterators for existing collections.

The only argument I know against this rule is that it can have the effect of delaying any consequences of using an element that should have been nonnull but was null due to a programming error.
I'm sympathetic, but I think that burdening Streams with this is misdirected: Early detection of such errors is one reason why (dense) collections themselves shouldn't allow nulls. If people have chosen to use collections allowing nulls, they have already made a choice about this.

Rule (1) removes the ambiguity of an Optional with value null. (And enables the spec for Optional to say that a present optional is never null.)

Rule (2) enables fluent styles without requiring them. This reflects the fact that an "Optional" type is required only in languages that support "value types" that can never be null. This includes Java, but only for primitive types. However, some types (like String) act so much like value types that using this style is appropriate. As an unrelated byproduct, Optional supports a more fluent, expression-y style that some people love so much they cannot otherwise cope, and others want to at least sometimes use. My guess is that once some of the newness of fluency wears off, most people will be in the second group, so will want multiple options.

The other (default arg) method form applies in contexts where null (or here, extended to arbitrary default values) returns have their traditional meanings without forcing an unneeded second level of wrapping. This in part reflects the fact that Optional and boxing are essentially the same idea, and so an optional around a box is just pure wasted overhead that conscientious developers may wish to avoid. (During Java5 development, some people thought that boxing overheads weren't important enough to provide systematic alternatives to. As it turns out, they were very wrong. We can at least profit from the lessons learned here.)

Finally, among the best arguments for these rules is that they apply equally well to value types (primitives, plus any future compound value types). So any API conforming to them has a chance of being specialized in its entirety to, say, streams of doubles.
Although one remaining messy part is that "Optional<double>" (if such a thing were legal) is basically an alias for existing class Double, and there seems to be no reasonable way to force them to be the same nominal type. My Numerics posts address one way to reduce impact, but I still don't see a general backward compatible solution.

-Doug

From tim at peierls.net Sat Sep 22 06:37:21 2012
From: tim at peierls.net (Tim Peierls)
Date: Sat, 22 Sep 2012 09:37:21 -0400
Subject: Nulls
In-Reply-To: <505DA847.80207@cs.oswego.edu>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu>
Message-ID:

On Sat, Sep 22, 2012 at 8:00 AM, Doug Lea
wrote:

> 1. Stream operations ignore null elements.
> 2. Each operation that can return a "nothing there" result has two forms
>      Optional op(...)
>      T op(..., T defaultValueIfNone);
>

Yup, that does feel like the sweet spot. In my perfect world, (1) would never apply and the second form in (2) would turn out not to have been necessary. But that's unrealistic on both counts.

> (And enables the spec for Optional to say that a present
> optional is never null.)
>

Yes, very desirable.

> My guess is that once some of the
> newness of fluency wears off, most people will be in the
> second group, so will want multiple options.
>

Both forms are "fluent", at least the way I think of fluency. But it's not too terrible having both options, and as Doug points out, the second form handles the null case nicely.

--tim

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120922/eee5246d/attachment.html

From brian.goetz at oracle.com Sat Sep 22 08:02:52 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 22 Sep 2012 11:02:52 -0400
Subject: Nulls
In-Reply-To: <505DA847.80207@cs.oswego.edu>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu>
Message-ID: <505DD31C.1090005@oracle.com>

> One more push for what I still think is the most defensible
> (and simplest to explain) set of design rules:
>
> 1. Stream operations ignore null elements.

This interacts in an unfortunate way with a property I've been fighting to preserve -- size-preserving ops. I really like that in

     array.map(...).toArray()

I know the exact size of the target and exactly where to put each mapped element before I start.

So I prefer to interpret your suggestion as "*may* ignore null elements".
So in this case, a reduction can ignore nulls, a find can ignore nulls, but "dense" ops can choose not to.

> 2. Each operation that can return a "nothing there" result has two forms
>      Optional op(...)
>      T op(..., T defaultValueIfNone);

This gets confusing for reduce, since currently we have:

     T reduce(T base, Reducer)      // trying not to upset Doug by
     Optional reduce(Reducer)       // saying BinaryOperator

If we add a

     T reduce(Reducer, T defaultValueIfNone)

the user will forever be confused between the first form and the third.

In both cases, Optional was introduced to cope not with nulls, but with empty inputs, but of course nulls are a pesky neither-here-nor-there corner case.

> Rule (1) removes the ambiguity of an Optional with value null.

+1

> (And enables the spec for Optional to say that a present
> optional is never null.)

+1

> Although one remaining messy part is that "Optional<double>"
> (if such a thing were legal) is basically an alias for
> existing class Double, and there seems to be no reasonable
> way to force them to be the same nominal type. My Numerics posts
> address one way to reduce impact, but I still don't see a
> general backward compatible solution.

Right, OptionalNumeric means we only need one Optional primitive class rather than N.

From tim at peierls.net Sat Sep 22 09:43:21 2012
From: tim at peierls.net (Tim Peierls)
Date: Sat, 22 Sep 2012 12:43:21 -0400
Subject: Nulls
In-Reply-To: <505DD31C.1090005@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com>
Message-ID:

On Sat, Sep 22, 2012 at 11:02 AM, Brian Goetz wrote:

> 1. Stream operations ignore null elements.
>>
>
> This interacts in an unfortunate way with a property I've been fighting to
> preserve -- size-preserving ops. ...
> So I prefer to interpret your suggestion as "*may* ignore null elements".
> So in this case, a reduction can ignore nulls, a find can ignore nulls,
> but "dense" ops can choose not to.

I'd hate to have to put null checks in mapping functions, though. How about just "size-preserving in the absence of nulls"? I think of nulls as a case to be allowed but not encouraged.

> This gets confusing for reduce, since currently we have:
>
>
>      T reduce(T base, Reducer)      // trying not to upset Doug by
>      Optional reduce(Reducer)       // saying BinaryOperator
>
> If we add a
>
>      T reduce(Reducer, T defaultValueIfNone)
>
> the user will forever be confused between the first form and the third.
>

Can we really not find a way to pick just one of the first and third forms? Yes, I realize there is a subtle difference between "base" and "defaultValueIfNone", but I'm having trouble coming up with a realistic way to be bitten by this.

(Shouldn't the default value come first, btw, to allow the remaining args to end with a varargs?)

--tim

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120922/0d4c7eef/attachment.html

From brian.goetz at oracle.com Sat Sep 22 09:47:08 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 22 Sep 2012 12:47:08 -0400
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com>
Message-ID: <505DEB8C.9090207@oracle.com>

> I'd hate to have to put null checks in mapping functions, though.

That just moves the null checks elsewhere. If you don't want null checks in your mapping functions, and we're going to allow nulls in streams, you can add a .filter(e -> e != null) to the top of the chain.
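The filter-first idiom can be sketched as follows. This assumes the java.util.stream API as it later shipped in Java 8, which differs from the draft API in this thread (there, operations such as filter and map were called on the collection directly); the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class NullFilterSketch {
    // Hypothetical helper: the mapping function (String::length) would throw
    // on a null element, so nulls are dropped at the top of the chain.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .filter(e -> e != null)       // the null check lives here, once
                .map(String::length)          // no null check needed in the mapper
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Arrays.asList permits null elements, so it serves as a null-bearing source.
        System.out.println(lengths(Arrays.asList("ab", null, "cde")));
    }
}
```

Note that the filter makes the pipeline no longer size-preserving, which is exactly the trade-off being argued over in this exchange.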
> How
> about just "size-preserving in the absence of nulls"? I think of nulls
> as a case to be allowed but not encouraged.

That undermines a lot of valuable copy-avoidance optimizations.

> Can we really not find a way to pick just one of the first and third
> forms? Yes, I realize there is a subtle difference between "base" and
> "defaultValueIfNone", but I'm having trouble coming up with a realistic
> way to be bitten by this.

Yes, we can have one -- the two effectively mean the same thing. My point was that the general "rule" being proposed only works for one of the two things that currently return Optional.

From dl at cs.oswego.edu Sat Sep 22 09:52:57 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Sat, 22 Sep 2012 12:52:57 -0400
Subject: Nulls
In-Reply-To: <505DD31C.1090005@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com>
Message-ID: <505DECE9.6030401@cs.oswego.edu>

On 09/22/12 11:02, Brian Goetz wrote:
>> One more push for the what I still think is the most defensible
>> (and simplest to explain) set of design rules:
>>
>> 1. Stream operations ignore null elements.
>
> This interacts in an unfortunate way with a property I've been fighting to
> preserve -- size-preserving ops. I really like that in
>
> array.map(...).toArray()
>
> I know the exact size of the target and exactly where to put each mapped element
> before I start.

Dealing with possibly-sparse arrays in the Stream API proper seems to be a stretch anyway. As I always say, ConcurrentHashMaps and Arrays are special enough to have their own extended APIs that let you do non-stream-y stuff not directly accessible via Stream API. ArrayLists are a little problematic in that while most people treat them as dense lists, they can also be used as sparse (partially null) arrays.
I still think that allowing a (Parallel)Array view of an ArrayList is likely the best move.

>
> So I prefer to interpret your suggestion as "*may* ignore null elements". So in
> this case, a reduction can ignore nulls, a find can ignore nulls, but "dense"
> ops can choose not to.

But nothing tells anyone whether they are in a dense case? Which blows the opportunity for having a nice simple explainable rule. (Maybe it is the wrong precedent because of different audience, but we got lots of brownie points in j.u.c for having some design rules that are so consistent that some people don't even know they know them. Like: all blocking methods have try- and timed variants. No null elements in queues. etc.)

>
>> 2. Each operation that can return a "nothing there" result has two forms
>> Optional op(...)
>> T op(..., T defaultValueIfNone);
>
> This gets confusing for reduce, since currently we have:
>
> T reduce(T base, Reducer) // trying not to upset Doug by
> Optional reduce(Reducer) // saying BinaryOperator
>
> If we add a
>
> T reduce(Reducer, T defaultValueIfNone)

Why would you add this method if you already have it? :-) I didn't necessarily mean that the default argument must always be last.

-Doug

From brian.goetz at oracle.com Sat Sep 22 10:04:36 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 22 Sep 2012 13:04:36 -0400
Subject: Nulls
In-Reply-To: <505DECE9.6030401@cs.oswego.edu>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu>
Message-ID: <505DEFA4.9080909@oracle.com>

> Dealing with possibly-sparse arrays in the Stream API
> proper seems to be a stretch anyway.
> As I always say,
> ConcurrentHashMaps and Arrays are special enough to have
> their own extended APIs that let you do non-stream-y stuff
> not directly accessible via Stream API.
> ArrayLists are a little problematic in that while most
> people treat them as dense lists, they can also be used
> as sparse (partially null) arrays. I still think that
> allowing a (Parallel)Array view of an ArrayList is likely
> the best move.
>
>> So I prefer to interpret your suggestion as "*may* ignore null
>> elements". So in
>> this case, a reduction can ignore nulls, a find can ignore nulls, but
>> "dense"
>> ops can choose not to.

I would rather not punish everyone because some idiot puts nulls in a collection and that same idiot filters/maps it with lambdas that can't deal with nulls.

With the exception of the Option-bearing findXxx methods, we could declare streams to be null-oblivious, and the rest of the responsibility for dealing with nulls goes back to the user who used a null-bearing stream source in the first place.

So here's a slightly less simple rule, but one which does less damage:

 - Null values are not treated specially by streams, except by findXxx (still to decide: throw or ignore.)

I kind of prefer throw if we're going to do that, since it's more consistent -- it is merely enforcing an invariant of Optional, and then Streams has nothing to say about nulls at all.

We can (without undermining our copy-avoidance optimizations) add a filterNulls Op, by adding a stream state flag NO_NULLS; collections that already prohibit nulls would have that bit set. So in this case:

     guaranteedNullFreeSource.filterNulls().map(..).toArray()

could still do the copy avoidance (and the filterNulls turns into a no-op.)

>> If we add a
>>
>> T reduce(Reducer, T defaultValueIfNone)
>
> Why would you add this method if you already have it? :-)
> I didn't necessarily mean that the default argument must always be last.
It was a too-subtle way of pointing out that this rule felt like we were extrapolating from the (currently) single data point of findXxx. From dl at cs.oswego.edu Sat Sep 22 10:16:06 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 22 Sep 2012 13:16:06 -0400 Subject: Nulls In-Reply-To: <505DEFA4.9080909@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> Message-ID: <505DF256.5050708@cs.oswego.edu> On 09/22/12 13:04, Brian Goetz wrote: > I would rather not punish everyone because some idiot puts nulls in a collection Unless that punishment is reduced to essentially nothing (even (especially?) if it leads to even worse punishment for offenders). Remember that JVMs must do null checks all the time anyway. Keeping track of whether you've even seen one, and thus must throw away and/or repack a destination seems too cheap to stand in the way of having a nicer rule. -Doug From brian.goetz at oracle.com Sat Sep 22 10:39:41 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 22 Sep 2012 13:39:41 -0400 Subject: Nulls In-Reply-To: <505DF256.5050708@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> Message-ID: <505DF7DD.8070900@oracle.com> This is a pretty simple (maybe simpler) rule: - Streams are completely null-oblivious (we don't treat them specially at all) - Option is null-hostile. 
The Stream just passes values along, null or not, whether it be to user-supplied lambdas, the Option ctor, the add() method of a collection provided to into(), etc; if that recipient can't handle it, it blows up there. If that recipient wants to ignore nulls, that's OK too -- it's outside of the Streams API spec. Then this mostly becomes a property of Optional. (And, if we provide a default-bearing version too, if people want the null, they can use the other version.) On 9/22/2012 1:16 PM, Doug Lea wrote: > On 09/22/12 13:04, Brian Goetz wrote: > >> I would rather not punish everyone because some idiot puts nulls in a >> collection > > Unless that punishment is reduced to essentially nothing > (even (especially?) if it leads to even worse punishment for offenders). > > Remember that JVMs must do null checks all the time anyway. > Keeping track of whether you've even seen one, and thus must > throw away and/or repack a destination seems too cheap to > stand in the way of having a nicer rule. > > -Doug > From joe.bowbeer at gmail.com Sat Sep 22 10:55:15 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Sat, 22 Sep 2012 10:55:15 -0700 Subject: Nulls In-Reply-To: <505DF7DD.8070900@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> Message-ID: On Sat, Sep 22, 2012 at 10:39 AM, Brian Goetz wrote: > This is a pretty simple (maybe simpler) rule: > - Streams are completely null-oblivious (we don't treat them specially at > all) > - Option is null-hostile. 
> > The Stream just passes values along, null or not, whether it be to > user-supplied lambdas, the Option ctor, the add() method of a collection > provided to into(), etc; if that recipient can't handle it, it blows up > there. If that recipient wants to ignore nulls, that's OK too -- it's > outside of the Streams API spec. Then this mostly becomes a property of > Optional. (And, if we provide a default-bearing version too, if people > want the null, they can use the other version.) I like these rules. The default-bearing version(s) can be emulated with the Option version with an additional Option2Default transform stage, so I would like to avoid the default-bearing version(s) if we can get away with it. Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120922/87ffbf58/attachment.html From brian.goetz at oracle.com Sat Sep 22 11:00:43 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 22 Sep 2012 14:00:43 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> Message-ID: <505DFCCB.4000902@oracle.com> > I like these rules. The default-bearing version(s) can be emulated with > the Option version with an additional Option2Default transform stage, so > I would like to avoid the default-bearing version(s) if we can get away > with it. Right. We can factor out the default-injection by having an ifEmpty op: list.filter(...) 
.ifEmpty(defaultValue) .optionBearingOp(); Not a big deal currently as we have relatively few Option-bearing ops, but probably better to factor these out rather than ad-hoc fusing. Similarly, we can have a filterNulls() op if users want the first non-null value: list.blah(...) .filterNulls() .findFirst(); (For stream sources where we *know* there are no nulls, such as TreeSet.keys(), the cost of the filterNulls() is O(1) since it can be optimized out of existence when we build the chain.) From tim at peierls.net Sat Sep 22 18:12:29 2012 From: tim at peierls.net (Tim Peierls) Date: Sat, 22 Sep 2012 21:12:29 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> Message-ID: Works for me. Btw, the syntax I use for default values currently: Result result = stream.findFirst(pred).or(defaultResult); I like that. --tim On Sat, Sep 22, 2012 at 1:55 PM, Joe Bowbeer wrote: > On Sat, Sep 22, 2012 at 10:39 AM, Brian Goetz wrote: > >> This is a pretty simple (maybe simpler) rule: >> - Streams are completely null-oblivious (we don't treat them specially >> at all) >> - Option is null-hostile. >> >> The Stream just passes values along, null or not, whether it be to >> user-supplied lambdas, the Option ctor, the add() method of a collection >> provided to into(), etc; if that recipient can't handle it, it blows up >> there. If that recipient wants to ignore nulls, that's OK too -- it's >> outside of the Streams API spec. Then this mostly becomes a property of >> Optional. (And, if we provide a default-bearing version too, if people >> want the null, they can use the other version.) > > > I like these rules. 
The default-bearing version(s) can be emulated with > the Option version with an additional Option2Default transform stage, so I > would like to avoid the default-bearing version(s) if we can get away with > it. > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120922/e8922ed6/attachment.html From dl at cs.oswego.edu Sun Sep 23 05:07:10 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 23 Sep 2012 08:07:10 -0400 Subject: Nulls In-Reply-To: <505DF7DD.8070900@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> Message-ID: <505EFB6E.9030906@cs.oswego.edu> On 09/22/12 13:39, Brian Goetz wrote: > This is a pretty simple (maybe simpler) rule: > - Streams are completely null-oblivious (we don't treat them specially at all) > - Option is null-hostile. > > The Stream just passes values along, null or not, whether it be to user-supplied > lambdas, the Option ctor, the add() method of a collection provided to into(), > etc; if that recipient can't handle it, it blows up there. If that recipient > wants to ignore nulls, that's OK too -- it's outside of the Streams API spec. > Then this mostly becomes a property of Optional. (And, if we provide a > default-bearing version too, if people want the null, they can use the other > version.) The main downside is that findAny is forced to lie (reporting absent) if a null item matches predicate. Unless you want to reconsider whether present Optionals can be null. Which no one seems to want to do. 
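The "forced to lie" case can be illustrated with a small sketch. It assumes java.util.Optional as it later shipped in Java 8; findFirstMatching is a hypothetical helper written for this illustration, not a Streams method, and it takes the "ignore" side of the throw-or-ignore question by wrapping the match with Optional.ofNullable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class NullHostileFind {
    // Hypothetical helper: returns the first element matching pred.
    // Because a present Optional can never hold null, a matching null
    // element is reported exactly like "nothing matched".
    static <T> Optional<T> findFirstMatching(List<T> source, Predicate<? super T> pred) {
        for (T element : source) {
            if (pred.test(element)) {
                return Optional.ofNullable(element);  // null match collapses to empty
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("x", null, "y");

        // A null element matches the predicate, yet the result says "absent".
        Optional<String> nullMatch = findFirstMatching(items, s -> s == null);
        System.out.println(nullMatch.isPresent());

        // Nothing matches at all; the caller sees the same answer.
        Optional<String> noMatch = findFirstMatching(items, s -> "z".equals(s));
        System.out.println(noMatch.isPresent());
    }
}
```

The caller cannot tell the two cases apart from the Optional alone. For comparison, the Stream.findFirst and Stream.findAny methods that shipped in Java 8 are documented to throw NullPointerException when the selected element is null, which corresponds to the "throw" alternative floated earlier in the thread.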
-Doug > > On 9/22/2012 1:16 PM, Doug Lea wrote: >> On 09/22/12 13:04, Brian Goetz wrote: >> >>> I would rather not punish everyone because some idiot puts nulls in a >>> collection >> >> Unless that punishment is reduced to essentially nothing >> (even (especially?) if it leads to even worse punishment for offenders). >> >> Remember that JVMs must do null checks all the time anyway. >> Keeping track of whether you've even seen one, and thus must >> throw away and/or repack a destination seems too cheap to >> stand in the way of having a nicer rule. >> >> -Doug >> > From tim at peierls.net Sun Sep 23 05:38:51 2012 From: tim at peierls.net (Tim Peierls) Date: Sun, 23 Sep 2012 08:38:51 -0400 Subject: Nulls In-Reply-To: <505EFB6E.9030906@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> Message-ID: On Sun, Sep 23, 2012 at 8:07 AM, Doug Lea
wrote: > The main downside is that findAny is forced to lie (reporting absent) if a > null item matches predicate. > Doesn't bother me. > Unless you want to reconsider whether present Optionals can be null. Which > no one seems to want to do. Right. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120923/8810b704/attachment.html From dl at cs.oswego.edu Sun Sep 23 06:12:00 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 23 Sep 2012 09:12:00 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> Message-ID: <505F0AA0.1000003@cs.oswego.edu> On 09/23/12 08:38, Tim Peierls wrote: > On Sun, Sep 23, 2012 at 8:07 AM, Doug Lea
> wrote: > > The main downside is that findAny is forced to lie (reporting absent) if a > null item matches predicate. > > > Doesn't bother me. It encounters the same antipattern seen when you need to establish that a Map key has no mapping: if (map.get(k) == null) // don't know if there is a mapping if (!map.containsKey(k)) // so recheck Which might not seem so terrible in particular cases where you are prepared to cope with mappings to null. But the issues make it impossible to write some generic Map utilities because the need to recheck forces non-atomicity. For findAny etc, the issue is even harder: if (!...findAny(...).isPresent()) // somehow recheck? And the need for recheck is even less obvious. -Doug From dl at cs.oswego.edu Sun Sep 23 06:53:41 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 23 Sep 2012 09:53:41 -0400 Subject: Nulls In-Reply-To: <505F0FED.7010703@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F0FED.7010703@univ-mlv.fr> Message-ID: <505F1465.7010507@cs.oswego.edu> On 09/23/12 09:34, Remi Forax wrote: > On 09/23/2012 03:12 PM, Doug Lea wrote: >> It encounters the same antipattern seen when you >> need to establish that a Map key has no mapping: >> if (map.get(k) == null) // don't know if there is a mapping >> if (!map.containsKey(k)) // so recheck > > yes, for Map, we need a getEntry() but introducing it now as a default method > will have the same problem that the default implementation is not atomic. Mostly an aside: Without further guarantees, even getEntry encounters problems: Map.Entry e = map.getEntry(k); // ... 
if (e.getValue() == null) // was e removed or is value null? In j.u.c ConcurrentMaps.entrySet().iterators, that return Entry objects, we guarantee snapshot semantics, which along with no-nulls rule, removes this uncertainty. Entry.setValue is still under-constrained though -- we do write-through, which revives an entry if had been removed. -Doug From tim at peierls.net Sun Sep 23 06:54:25 2012 From: tim at peierls.net (Tim Peierls) Date: Sun, 23 Sep 2012 09:54:25 -0400 Subject: Nulls In-Reply-To: <505F0AA0.1000003@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> Message-ID: On Sun, Sep 23, 2012 at 9:12 AM, Doug Lea
wrote: > The main downside is that findAny is forced to lie (reporting absent) >> if a null item matches predicate. >> >> Doesn't bother me. >> > > It encounters the same antipattern seen when you > need to establish that a Map key has no mapping: > if (map.get(k) == null) // don't know if there is a mapping > if (!map.containsKey(k)) // so recheck > Too bad Map.get doesn't return Optional ! :-) Map is a done deal, though. We're talking about weirdness that can arise when you tolerate null elements in streams. I don't want to bend over backwards for such usage. > Which might not seem so terrible in particular cases where > you are prepared to cope with mappings to null. People shouldn't be mapping to null in the first place, any more than they should be mapping to NaN. > But the issues make it impossible to write some generic > Map utilities because the need to recheck forces non-atomicity. > > For findAny etc, the issue is even harder: > if (!...findAny(...).isPresent()) > // somehow recheck? > And the need for recheck is even less obvious. So unobvious that I still don't see it. As long as you aren't treating null as an acceptable value in Streams or as the contents of an Optional, why do you need to re-check? --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120923/25ac9a21/attachment.html From dl at cs.oswego.edu Sun Sep 23 06:58:31 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 23 Sep 2012 09:58:31 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> Message-ID: <505F1587.3080407@cs.oswego.edu> On 09/23/12 09:54, Tim Peierls wrote: > > People shouldn't be mapping to null in the first place, any more than they > should be mapping to NaN. Yes. (This has been my stance for about 25 years straight :-) > > But the issues make it impossible to write some generic > Map utilities because the need to recheck forces non-atomicity. > > For findAny etc, the issue is even harder: > if (!...findAny(...).isPresent()) > // somehow recheck? > And the need for recheck is even less obvious. > > > So unobvious that I still don't see it. As long as you aren't treating null as > an acceptable value in Streams or as the contents of an Optional, why do you > need to re-check? > Suppose there are two elements that match predicate, one null, one nonnull. And suppose the findAny implementation finds the null one first and so reports an absent Optional. 
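[Illustration, not part of the original message.] A minimal sketch of the scenario just described: two elements match the predicate, one null and one non-null, and the null one is encountered first. It is written against the java.util.stream API as it later shipped in Java 8, not the draft API under discussion in this thread; that substitution is an assumption, and the class and method names are hypothetical. In the shipped API, findFirst() and findAny() throw NullPointerException when the selected element is null, so the sketch uses findFirst() to keep the outcome deterministic and shows the explicit null filter as the way out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

class FindFirstNullSketch {

    // Hypothetical predicate for the sketch: matches null and any string starting with "b".
    static boolean matches(String s) {
        return s == null || s.startsWith("b");
    }

    // Null-oblivious pipeline: nulls flow through to the terminal operation.
    public static Optional<String> firstMatch(List<String> items) {
        return items.stream()
                .filter(FindFirstNullSketch::matches)
                .findFirst();
    }

    // Same pipeline with the nulls filtered out explicitly before matching.
    public static Optional<String> firstNonNullMatch(List<String> items) {
        return items.stream()
                .filter(Objects::nonNull)
                .filter(FindFirstNullSketch::matches)
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", null, "b");
        try {
            firstMatch(items);
            System.out.println("no exception");
        } catch (NullPointerException e) {
            // The selected element was null, and Optional cannot hold null.
            System.out.println("NullPointerException from findFirst()");
        }
        System.out.println(firstNonNullMatch(items).isPresent());
    }
}
```

The caller chooses whether to filter; the stream itself does nothing special with nulls, and the failure surfaces at the point where an Optional would have to wrap one.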
-Doug From tim at peierls.net Sun Sep 23 07:10:00 2012 From: tim at peierls.net (Tim Peierls) Date: Sun, 23 Sep 2012 10:10:00 -0400 Subject: Nulls In-Reply-To: <505F1587.3080407@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> Message-ID: On Sun, Sep 23, 2012 at 9:58 AM, Doug Lea
wrote: > So unobvious that I still don't see it. As long as you aren't treating >> null as >> an acceptable value in Streams or as the contents of an Optional, why do >> you >> need to re-check? >> > > Suppose there are two elements that match predicate, one null, one > nonnull. And suppose the findAny implementation finds the null one > first and so reports an absent Optional. Brian said: I would rather not punish everyone because some idiot puts nulls in a > collection and that same idiot filters/maps it with lambdas that can't deal > with nulls. and I'm solidly behind that. (Although for marketing purposes, we should find a better word than "idiot".) What I'm taking away from Doug's message is that there are *no* lambdas that can deal with nulls. So I reject the premise ("Suppose there are two elements that match predicate, one null, one nonnull."). A predicate can't match null meaningfully, so let's not bend over backwards -- I'm using that expression a lot, I know -- to try to find complete consistency when nulls are involved. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120923/6d048d11/attachment.html From dl at cs.oswego.edu Sun Sep 23 07:11:43 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 23 Sep 2012 10:11:43 -0400 Subject: Nulls In-Reply-To: <505F1587.3080407@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> Message-ID: <505F189F.2080806@cs.oswego.edu> Oh, to clarify... 
On 09/23/12 09:58, Doug Lea wrote: > On 09/23/12 09:54, Tim Peierls wrote: >> For findAny etc, the issue is even harder: >> if (!...findAny(...).isPresent()) >> // somehow recheck? >> And the need for recheck is even less obvious. >> >> >> So unobvious that I still don't see it. As long as you aren't treating null as >> an acceptable value in Streams or as the contents of an Optional, why do you >> need to re-check? But in Brian's proposal, null IS acceptable. >> > > Suppose there are two elements that match predicate, one null, one > nonnull. And suppose the findAny implementation finds the null one > first and so reports an absent Optional. > > -Doug > > From tim at peierls.net Sun Sep 23 07:33:25 2012 From: tim at peierls.net (Tim Peierls) Date: Sun, 23 Sep 2012 10:33:25 -0400 Subject: Nulls In-Reply-To: <505F189F.2080806@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> <505F189F.2080806@cs.oswego.edu> Message-ID: On Sun, Sep 23, 2012 at 10:11 AM, Doug Lea
wrote: > So unobvious that I still don't see it. As long as you aren't treating >>> null as >>> an acceptable value in Streams or as the contents of an Optional, why do >>> you >>> need to re-check? >>> >> > But in Brian's proposal, null IS acceptable. It's tolerated, but what we're talking about -- are we not? -- is the extent to which this toleration distorts the overall design. So when I said "Doesn't bother me", I meant that the implications you point out for null-obliviousness -- e.g., findAny "lying"; repeating Map.get anti-pattern -- aren't important enough to me to warrant abandoning the proposed rules: > > - Streams are completely null-oblivious (we don't treat them specially > at all) > > > - Option is null-hostile. > > Yes, there's a risk that people won't get the "don't use nulls in your collections" memo, but as you point out by bringing up Map.get, it's something that folks have been dealing with for years. At least the ones who do get the memo won't be punished for the potential sins of those who don't. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120923/b1df5474/attachment.html From brian.goetz at oracle.com Sun Sep 23 07:55:21 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Sep 2012 10:55:21 -0400 Subject: Nulls In-Reply-To: <505EFB6E.9030906@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> Message-ID: <505F22D9.5020707@oracle.com> > The main downside is that findAny is forced to lie (reporting absent) > if a null item matches predicate. 
I'm not sure I'd describe throwing as "reporting absent", but I see the point -- we may make a distinction of "it's not the stream throwing, it's the optional", but users may not see the difference. But, in any case, as downsides go (and there has to be one), this one doesn't seem as bad as most of the alternatives. If users want a non-null result in cases like this, they can add a .filter(o -> o != null) and order is restored. (The same way they'd deal with any null-hostile target and null-friendly stream; explicitly filter out the nulls. If we want to make this idiom prettier, we can add Objects::isNull / Objects::isNonNull.) Giving the user the choice of filtering the nulls explicitly or not seems better than either implicitly filtering or implicitly throwing. (Yes, Option-bearing findXxx is implicitly throwing, which makes for a slightly more complex rule.) In the happy case, it doesn't make a difference, but in the unhappy case, users are in full control. > Unless you want to reconsider > whether present Optionals can be null. Which no one seems to want to do. Not me. From brian.goetz at oracle.com Sun Sep 23 08:00:35 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Sep 2012 11:00:35 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> Message-ID: <505F2413.5070706@oracle.com> > What I'm taking away from Doug's > message is that there are *no* lambdas that can deal with nulls. There's nothing intrinsically null-hostile about lambdas; lambdas are basically methods, you can pass null into them and they can return null.
However, many naively-written lambdas (e.g., .map(e -> e.toString())) will throw when null is presented to them. Again, this is not a problem when people expect nulls in their collection. When people don't expect nulls -- which is most of the time -- the reality is they'll get some unexpected NPE somewhere, and then they can put .filter(o -> o != null) in their pipeline. From brian.goetz at oracle.com Sun Sep 23 08:53:34 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Sep 2012 11:53:34 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> <505F189F.2080806@cs.oswego.edu> Message-ID: <505F307E.3060100@oracle.com> Suppose we were designing the foreach loop. Would we consider generating code in the compiler-generated loop header to skip over nulls? If we didn't do that, why should for (T t : collection) { blah(t); } and collection.stream().forEach(t -> { blah(t); }) have different behaviors? (ObNitpicking: They are different in the sense that the lambda can't capture mutable up-level locals, and the foreach loop can.) People write foreach loops all the time that will behave badly if the induction variable is null. People will write lambdas with the same characteristics. If we are going to do anything special for nulls, I would prefer to ban them/fail-fast, but people have argued that is impractical. Second best seems to be not trying to do anything special to protect people from them or imbue them with any special characteristics? On 9/23/2012 10:33 AM, Tim Peierls wrote: > On Sun, Sep 23, 2012 at 10:11 AM, Doug Lea
> wrote: > > So unobvious that I still don't see it. As long as you > aren't treating null as > an acceptable value in Streams or as the contents of an > Optional, why do you > need to re-check? > > > But in Brian's proposal, null IS acceptable. > > > It's tolerated, but what we're talking about -- are we not? -- is the > extent to which this toleration distorts the overall design. > > So when I said "Doesn't bother me", I meant that the implications you > point out for null-obliviousness -- e.g., findAny "lying"; repeating > Map.get anti-pattern -- aren't important enough to me to warrant > abandoning the proposed rules: > > * Streams are completely null-oblivious (we don't treat them > specially at all) > > * Option is null-hostile. > > Yes, there's a risk that people won't get the "don't use nulls in your > collections" memo, but as you point out by bringing up Map.get, it's > something that folks have been dealing with for years. At least the ones > who do get the memo won't be punished for the potential sins of those > who don't. > > --tim From brian.goetz at oracle.com Sun Sep 23 12:21:14 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Sep 2012 15:21:14 -0400 Subject: Issue tracker Message-ID: <505F612A.3070507@oracle.com> As part of the transition to JCP 2.8, we need a publicly accessible issue tracker. We had, of course, hoped to have our new JIRA instance available for this purpose long ago, but there have been delays with that (though progress is still being made.) In order to not further delay the transition to JCP 2.8, we will use a temporary issue tracker on java.net under the OpenJDK specification terms of use (same as this mailing list.) Please send me (privately) your java.net IDs and I will add you as members of the project. 
From forax at univ-mlv.fr Sun Sep 23 07:12:19 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 23 Sep 2012 16:12:19 +0200 Subject: Nulls In-Reply-To: <505F1465.7010507@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F0FED.7010703@univ-mlv.fr> <505F1465.7010507@cs.oswego.edu> Message-ID: <505F18C3.6080506@univ-mlv.fr> On 09/23/2012 03:53 PM, Doug Lea wrote: > On 09/23/12 09:34, Remi Forax wrote: >> On 09/23/2012 03:12 PM, Doug Lea wrote: > >>> It encounters the same antipattern seen when you >>> need to establish that a Map key has no mapping: >>> if (map.get(k) == null) // don't know if there is a mapping >>> if (!map.containsKey(k)) // so recheck > >> >> yes, for Map, we need a getEntry() but introducing it now as a >> default method >> will have the same problem that the default implementation is not >> atomic. > > Mostly an aside: Without further guarantees, even getEntry > encounters problems: > Map.Entry e = map.getEntry(k); > // ... > if (e.getValue() == null) // was e removed or is value null? > > In j.u.c ConcurrentMaps.entrySet().iterators, that return > Entry objects, we guarantee snapshot semantics, which along with > no-nulls rule, removes this uncertainty. Entry.setValue is > still under-constrained though -- we do write-through, which > revives an entry if had been removed. I see, getEntry() will have to create a new Entry to implement the snapshot semantics, and allocating an object makes getEntry() useless.
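[Illustration, not part of the original message.] A minimal sketch of the Map.get / containsKey recheck quoted above, using only long-standing java.util classes. The class and helper names are hypothetical. It shows why a null from get() is ambiguous on a null-permitting map, why the recheck is a second, separate call (and so not atomic), and that ConcurrentHashMap sidesteps the question by rejecting null values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MapNullAmbiguitySketch {

    // The two-step check from the thread. The get() and containsKey() calls are
    // separate operations, so another thread could modify the map between them.
    public static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "mapped to " + value;
        }
        // get() returned null: either no mapping, or a mapping to null. Recheck.
        return map.containsKey(key) ? "mapped to null" : "no mapping";
    }

    // Reports whether the given map refuses a null value.
    public static boolean rejectsNullValue(Map<String, String> map) {
        try {
            map.put("probe", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("present", "v");
        map.put("nulled", null);

        System.out.println(describe(map, "present"));
        System.out.println(describe(map, "nulled"));
        System.out.println(describe(map, "missing"));

        System.out.println(rejectsNullValue(new HashMap<>()));
        System.out.println(rejectsNullValue(new ConcurrentHashMap<>()));
    }
}
```

With a no-nulls map, a null from get() can only mean "no mapping", so the single call is sufficient and the atomicity concern does not arise.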
> > -Doug > > Rémi From forax at univ-mlv.fr Sun Sep 23 06:34:37 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Sun, 23 Sep 2012 15:34:37 +0200 Subject: Nulls In-Reply-To: <505F0AA0.1000003@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> Message-ID: <505F0FED.7010703@univ-mlv.fr> On 09/23/2012 03:12 PM, Doug Lea wrote: > On 09/23/12 08:38, Tim Peierls wrote: >> On Sun, Sep 23, 2012 at 8:07 AM, Doug Lea
> > wrote: >> >> The main downside is that findAny is forced to lie (reporting >> absent) if a >> null item matches predicate. >> >> >> Doesn't bother me. > > It encounters the same antipattern seen when you > need to establish that a Map key has no mapping: > if (map.get(k) == null) // don't know if there is a mapping > if (!map.containsKey(k)) // so recheck > Which might not seem so terrible in particular cases where > you are prepared to cope with mappings to null. But the > issues make it impossible to write some generic > Map utilities because the need to recheck forces non-atomicity. yes, for Map, we need a getEntry() but introducing it now as a default method will have the same problem that the default implementation is not atomic. > > For findAny etc, the issue is even harder: > if (!...findAny(...).isPresent()) > // somehow recheck? > And the need for recheck is even less obvious. > > > -Doug > R?mi From forax at univ-mlv.fr Sat Sep 22 10:40:42 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 22 Sep 2012 19:40:42 +0200 Subject: Nulls In-Reply-To: <505DF7DD.8070900@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> Message-ID: <505DF81A.10009@univ-mlv.fr> On 09/22/2012 07:39 PM, Brian Goetz wrote: > This is a pretty simple (maybe simpler) rule: > - Streams are completely null-oblivious (we don't treat them > specially at all) > - Option is null-hostile. > > The Stream just passes values along, null or not, whether it be to > user-supplied lambdas, the Option ctor, the add() method of a > collection provided to into(), etc; if that recipient can't handle it, > it blows up there. 
If that recipient wants to ignore nulls, that's OK > too -- it's outside of the Streams API spec. Then this mostly becomes > a property of Optional. (And, if we provide a default-bearing version > too, if people want the null, they can use the other version.) +1 Rémi > > On 9/22/2012 1:16 PM, Doug Lea wrote: >> On 09/22/12 13:04, Brian Goetz wrote: >> >>> I would rather not punish everyone because some idiot puts nulls in a >>> collection >> >> Unless that punishment is reduced to essentially nothing >> (even (especially?) if it leads to even worse punishment for offenders). >> >> Remember that JVMs must do null checks all the time anyway. >> Keeping track of whether you've even seen one, and thus must >> throw away and/or repack a destination seems too cheap to >> stand in the way of having a nicer rule. >> >> -Doug >> From forax at univ-mlv.fr Sat Sep 22 05:25:30 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 22 Sep 2012 14:25:30 +0200 Subject: Nulls In-Reply-To: <505DA847.80207@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> Message-ID: <505DAE3A.1000702@univ-mlv.fr> On 09/22/2012 02:00 PM, Doug Lea wrote: > > One more push for the what I still think is the most defensible > (and simplest to explain) set of design rules: > > 1. Stream operations ignore null elements. > 2. Each operation that can return a "nothing there" result has two forms > Optional<T> op(...) > T op(..., T defaultValueIfNone); The main issue with rule 1 is that the JDK already uses meaningful null, for example, a class with a null classloader doesn't mean that there is no classloader. List<Class<?>> list = ... Map<Class<?>, ClassLoader> classLoadedMap = list.mapped(Class::getClassLoader).into(new HashMap<>()); so Optional should be able to store null.
rule 2 is in my opinion a good compromise if by default the method that returns an Optional is written using the method that takes a default value as last parameter. R?mi > > Notes: > > Rule (1) means that stream.forEach(action) must be implemented as > x = getNextElement(); if (x != null) action.apply(x) > and so on. As I mentioned, it would arguably be better to require > that Streams themselves never produce nulls, but this can't be done > without losing the ability to rely on iterators for existing collections. > > The only argument I know against this rule is that it can have the > effect of delaying any consequences of using an element that should > have been nonnull but was null due to a programming error. > I'm sympathetic, but I think that burdening Streams with > this is misdirected: Early detection of such errors is one > reason why (dense) collections themselves shouldn't allow nulls. > If people have chosen to use collections allowing nulls, > they have already made a choice about this. > > Rule (1) removes the ambiguity of an Optional with value null. > (And enables the spec for Optional to say that a present > optional is never null.) > > Rule (2) enables fluent styles without requiring them. > This reflects the fact an "Optional" type is required > only in languages that support "value types" that can never be null. > This includes Java, but only for primitive types. However, > some types (like String) act so much like value types that > using this style is appropriate. As an unrelated byproduct, > Optional supports more fluent expression-y style that some people > love so much they cannot otherwise cope, and others want to > at least sometimes use. My guess is that once some of the > newness of fluency wears off, most people will be in the > second group, so will want multiple options. 
> > The other (default arg) method form applies in contexts where null > (or here, extended to arbitrary default values) returns have > their traditional meanings without forcing an unneeded > second level of wrapping. This in part reflects the fact that > Optional and boxing are essentially the same idea, and so > an optional around a box is just pure wasted overhead that > conscientious developers may wish to avoid. (During Java5 > development, some people thought that boxing overheads weren't > important enough to provide systematic alternatives to. As it > turns out, they were very wrong. We can at least profit > from the lessons learned here.) > > Finally, among the best arguments for these rules is that they > apply equally well to value types (primitives, plus any future > compound value types). So any API conforming to them has a chance > of being specialized in its entirety to, say, streams of doubles. > Although one remaining messy part is that "Optional<double>" > (if such a thing were legal) is basically an alias for > existing class Double, and there seems to be no reasonable > way to force them to be the same nominal type. My Numerics posts > address one way to reduce impact, but I still don't see a > general backward compatible solution. > > -Doug > From dl at cs.oswego.edu Mon Sep 24 04:37:38 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 24 Sep 2012 07:37:38 -0400 Subject: Nulls In-Reply-To: <505DAE3A.1000702@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> Message-ID: <50604602.5040306@cs.oswego.edu> On 09/22/12 08:25, Remi Forax wrote: > On 09/22/2012 02:00 PM, Doug Lea wrote: >> 1. Stream operations ignore null elements. >> 2. Each operation that can return a "nothing there" result has two forms >> Optional<T> op(...)
>> T op(..., T defaultValueIfNone); > > The main issue with rule 1 is that the JDK already uses meaningful null, > for example, a class with a null classloader doesn't mean that there is no > classloader. > > List<Class<?>> list = ... > Map<Class<?>, ClassLoader> classLoadedMap = > list.mapped(Class::getClassLoader).into(new HashMap<>()); > > so Optional should be able to store null. This is how all these discussions seem to go: Some existing or potential abuse of null leads to rules allowing further abuse. Every time I lose the straight "null means nothing there" argument, I figure that I'm complicit in adding a few million dollars to Hoare's billion dollar mistake. (http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare ; http://mattcallanan.blogspot.com/2010/09/tony-hoare-billion-dollar-mistake.html) At least in j.u.c, we pretty much fixed this (although even there I got talked into allowing nulls in the ConcurrentMap interface, but still disallow them in all j.u.c map implementations. Weird.) > > rule 2 is in my opinion a good compromise if by default the method that returns > an Optional is written using the method that takes a default value as last > parameter. If present Optionals can be null, the world is probably better off without them; so methods like findAny ONLY take the valueIfNone form. -Doug From tim at peierls.net Mon Sep 24 05:03:47 2012 From: tim at peierls.net (Tim Peierls) Date: Mon, 24 Sep 2012 08:03:47 -0400 Subject: Nulls In-Reply-To: <50604602.5040306@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> Message-ID: On Mon, Sep 24, 2012 at 7:37 AM, Doug Lea
wrote: > >> List<Class<?>> list = ... >> Map<Class<?>, ClassLoader> classLoadedMap = >> list.mapped(Class::getClassLoader).into(new HashMap<>()); >> >> so Optional should be able to store null. >> > No, Optional shouldn't store null to be able to support this usage. Use a for loop or write an adapter for getClassLoader, but let's not distort things just to support mapping to null. (Allow, yes; encourage, no.) > This is how all these discussions seem to go: Some existing > or potential abuse of null leads to rules allowing further > abuse. Time to take a stronger stand! :-) > rule 2 is in my opinion a good compromise if by default the method that >> returns >> an Optional is written using the method that takes a default value as last >> parameter. >> > > If present Optionals can be null, the world is probably better off > without them; so methods like findAny ONLY take the valueIfNone form. But present Optionals should not be null; there's no need for the valueIfNone form; and poisonous things like getClassLoader should be dealt with closer to the source, and not be allowed to pollute things downstream. --tim -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120924/7e748799/attachment-0001.html From dl at cs.oswego.edu Mon Sep 24 06:17:04 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 24 Sep 2012 09:17:04 -0400 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DD31C.1090005@oracle.com> <505DECE9.6030401@cs.oswego.edu> <505DEFA4.9080909@oracle.com> <505DF256.5050708@cs.oswego.edu> <505DF7DD.8070900@oracle.com> <505EFB6E.9030906@cs.oswego.edu> <505F0AA0.1000003@cs.oswego.edu> <505F1587.3080407@cs.oswego.edu> Message-ID: <50605D50.1030907@cs.oswego.edu> On 09/23/12 10:10, Tim Peierls wrote: > Brian said: > > I would rather not punish everyone because some idiot puts nulls in a > collection and that same idiot filters/maps it with lambdas that can't deal > with nulls. > > and I'm solidly behind that. Fun fact: ignoring nulls is in general (barely) cheaper than throwing NPEs on null. And JVMs will perform null checks at some point on any dereferenced argument for user-supplied lambda unless subsumed by an explicit check they can piggy-back on. So if "punish" means poorer performance, then I don't see the argument. My suggested implementation strategy of throwing away work in a few ops only on seeing null likely has no measurable performance impact except on those people who do have null elements. -Doug From aleksey.shipilev at oracle.com Mon Sep 24 07:01:28 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Mon, 24 Sep 2012 18:01:28 +0400 Subject: recursive lambdas Message-ID: <506067B8.1040306@oracle.com> Hi guys, What is our current stance on recursive lambdas? I.e. should this code considered to be correct? public static IntUnaryOperator fib = (n) -> (n < 2) ? 
1 : fib.operate(n - 1) + fib.operate(n - 2); The current EDR spec says nothing about this, except for "Remove support for recursive lambdas" in the changelog, but both latest public b56 and the nightly binary builds of lambda/lambda forest in OpenJDK do accept this as the correct code. The reason I ask is that I keep bugging IDEA guys about their lambda support, fixing a bug here, there, and everywhere, as this helps people to start hacking on lambdas and provide the useful feedback: http://youtrack.jetbrains.com/issue/IDEA-91986 I think everyone realizes the spec is in flux, but it is important to understand what are we leaning towards. Thanks, Aleksey From brian.goetz at oracle.com Mon Sep 24 07:28:04 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 24 Sep 2012 10:28:04 -0400 Subject: recursive lambdas In-Reply-To: <506067B8.1040306@oracle.com> References: <506067B8.1040306@oracle.com> Message-ID: <50606DF4.2060107@oracle.com> We decided to disallow this form for the time being, in part to simplify the metafactory protocol and therefore increase the chance of intrinsification of lambda capture sites. On 9/24/2012 10:01 AM, Aleksey Shipilev wrote: > Hi guys, > > What is our current stance on recursive lambdas? I.e. should this code > considered to be correct? > > public static IntUnaryOperator fib = > (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); > > The current EDR spec says nothing about this, except for "Remove support > for recursive lambdas" in the changelog, but both latest public b56 and > the nightly binary builds of lambda/lambda forest in OpenJDK do accept > this as the correct code. 
> > The reason I ask is that I keep bugging IDEA guys about their lambda > support, fixing a bug here, there, and everywhere, as this helps people > to start hacking on lambdas and provide the useful feedback: > http://youtrack.jetbrains.com/issue/IDEA-91986 > > I think everyone realizes the spec is in flux, but it is important to > understand what are we leaning towards. > > Thanks, > Aleksey > From aleksey.shipilev at oracle.com Mon Sep 24 07:50:42 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Mon, 24 Sep 2012 18:50:42 +0400 Subject: recursive lambdas In-Reply-To: <50606DF4.2060107@oracle.com> References: <506067B8.1040306@oracle.com> <50606DF4.2060107@oracle.com> Message-ID: <50607342.4010703@oracle.com> Let me be more clear. As I understand the spec prohibits this: void test() { IntUnaryOperator fib = (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); } ...but still allows this (still a recursive lambda): public IntUnaryOperator fib = (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); ...because the current JLS does not prohibit this. Also, the plain old inner classes work as well: public IntUnaryOperator fib = new IntUnaryOperator() { @Override public int operate(int n) { return (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); } }; public interface IntUnaryOperator { int operate(int v); } Am I correct? Thanks, Aleksey. On 09/24/2012 06:28 PM, Brian Goetz wrote: > We decided to disallow this form for the time being, in part to simplify > the metafactory protocol and therefore increase the chance of > intrinsification of lambda capture sites. > > On 9/24/2012 10:01 AM, Aleksey Shipilev wrote: >> Hi guys, >> >> What is our current stance on recursive lambdas? I.e. should this code >> considered to be correct? >> >> public static IntUnaryOperator fib = >> (n) -> (n < 2) ? 
1 : fib.operate(n - 1) + fib.operate(n - 2); >> >> The current EDR spec says nothing about this, except for "Remove support >> for recursive lambdas" in the changelog, but both latest public b56 and >> the nightly binary builds of lambda/lambda forest in OpenJDK do accept >> this as the correct code. >> >> The reason I ask is that I keep bugging IDEA guys about their lambda >> support, fixing a bug here, there, and everywhere, as this helps people >> to start hacking on lambdas and provide the useful feedback: >> http://youtrack.jetbrains.com/issue/IDEA-91986 >> >> I think everyone realizes the spec is in flux, but it is important to >> understand what are we leaning towards. >> >> Thanks, >> Aleksey >> From forax at univ-mlv.fr Mon Sep 24 07:57:47 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 24 Sep 2012 16:57:47 +0200 Subject: recursive lambdas In-Reply-To: <50607342.4010703@oracle.com> References: <506067B8.1040306@oracle.com> <50606DF4.2060107@oracle.com> <50607342.4010703@oracle.com> Message-ID: <506074EB.7040600@univ-mlv.fr> On 09/24/2012 04:50 PM, Aleksey Shipilev wrote: > Let me be more clear. As I understand the spec prohibits this: > > void test() { > IntUnaryOperator fib = > (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); > } > > ...but still allows this (still a recursive lambda): > > public IntUnaryOperator fib = > (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); > > ...because the current JLS does not prohibit this. Also, the plain old > inner classes work as well: > > public IntUnaryOperator fib = > new IntUnaryOperator() { > @Override > public int operate(int n) { > return (n < 2) ? 1 : > fib.operate(n - 1) + fib.operate(n - 2); > } > }; > > public interface IntUnaryOperator { > int operate(int v); > } > > > Am I correct? Yes. The former tries to capture a local variable that is not yet initialized, while the latter captures this and can therefore access this.fib when called. 
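The field-versus-local distinction can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the java.util.function.IntUnaryOperator interface as it later shipped in Java 8, whose method is applyAsInt rather than the operate method used in the messages above, and the class name RecursiveLambdaSketch is hypothetical.

```java
import java.util.function.IntUnaryOperator;

public class RecursiveLambdaSketch {
    // Instance field: the lambda captures `this` and reads this.fib only when
    // it is invoked, by which time the field has already been assigned.
    public IntUnaryOperator fib =
        n -> (n < 2) ? 1 : this.fib.applyAsInt(n - 1) + this.fib.applyAsInt(n - 2);

    // The local-variable form is rejected by the compiler, because a local must
    // be definitely assigned before a lambda body can capture it:
    //
    //   IntUnaryOperator f = n -> (n < 2) ? 1 : f.applyAsInt(n - 1) + f.applyAsInt(n - 2);

    public static void main(String[] args) {
        // With the thread's base case (1 for n < 2): 1, 1, 2, 3, 5, 8
        System.out.println(new RecursiveLambdaSketch().fib.applyAsInt(5)); // prints 8
    }
}
```

The explicit this. qualifier mirrors the "access this.fib when called" wording above.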
cheers, Rémi From brian.goetz at oracle.com Mon Sep 24 08:11:39 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 24 Sep 2012 11:11:39 -0400 Subject: recursive lambdas In-Reply-To: <50607342.4010703@oracle.com> References: <506067B8.1040306@oracle.com> <50606DF4.2060107@oracle.com> <50607342.4010703@oracle.com> Message-ID: <5060782B.8080609@oracle.com> Yes, it works for fields but not for locals. This is because the init-before-use rules for locals are stricter. On 9/24/2012 10:50 AM, Aleksey Shipilev wrote: > Let me be more clear. As I understand the spec prohibits this: > > void test() { > IntUnaryOperator fib = > (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); > } > > ...but still allows this (still a recursive lambda): > > public IntUnaryOperator fib = > (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); > > ...because the current JLS does not prohibit this. Also, the plain old > inner classes work as well: > > public IntUnaryOperator fib = > new IntUnaryOperator() { > @Override > public int operate(int n) { > return (n < 2) ? 1 : > fib.operate(n - 1) + fib.operate(n - 2); > } > }; > > public interface IntUnaryOperator { > int operate(int v); > } > > > Am I correct? > > Thanks, > Aleksey. > > On 09/24/2012 06:28 PM, Brian Goetz wrote: >> We decided to disallow this form for the time being, in part to simplify >> the metafactory protocol and therefore increase the chance of >> intrinsification of lambda capture sites. >> >> On 9/24/2012 10:01 AM, Aleksey Shipilev wrote: >>> Hi guys, >>> >>> What is our current stance on recursive lambdas? I.e. should this code >>> considered to be correct? >>> >>> public static IntUnaryOperator fib = >>> (n) -> (n < 2) ? 
1 : fib.operate(n - 1) + fib.operate(n - 2); >>> >>> The current EDR spec says nothing about this, except for "Remove support >>> for recursive lambdas" in the changelog, but both latest public b56 and >>> the nightly binary builds of lambda/lambda forest in OpenJDK do accept >>> this as the correct code. >>> >>> The reason I ask is that I keep bugging IDEA guys about their lambda >>> support, fixing a bug here, there, and everywhere, as this helps people >>> to start hacking on lambdas and provide the useful feedback: >>> http://youtrack.jetbrains.com/issue/IDEA-91986 >>> >>> I think everyone realizes the spec is in flux, but it is important to >>> understand what are we leaning towards. >>> >>> Thanks, >>> Aleksey >>> > From brian.goetz at oracle.com Mon Sep 24 09:08:09 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 24 Sep 2012 12:08:09 -0400 Subject: Nulls In-Reply-To: <50604602.5040306@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> Message-ID: <50608569.7070603@oracle.com> >> so Optional should be able to store null. > > This is how all these discussions seem to go: Some existing > or potential abuse of null leads to rules allowing further > abuse. Every time I lose the straight "null means nothing there" > argument, I figure that I'm complicit in adding a few million > dollars to Hoare's billion dollar mistake. I think we're trying to not encourage further abuse. There's consensus that present Optional should NOT store null. Similarly, the Joe/Tim/Brian cabal is trying to NOT burden the streams API/implementation with any special null-awareness, again for the sake of not coddling abuse. 
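As a rough illustration of "a present Optional should not store null", the sketch below uses java.util.Optional as it later shipped in Java 8. That is an assumption relative to this thread: the Optional class under discussion here was still a draft and may have differed. The class name OptionalNullSketch and the helper presentRejects are hypothetical.

```java
import java.util.Optional;

public class OptionalNullSketch {
    /** Returns true if building a present Optional from the value throws NPE. */
    static boolean presentRejects(Object value) {
        try {
            Optional.of(value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A present Optional cannot wrap null.
        System.out.println(presentRejects(null));                         // true
        System.out.println(presentRejects("x"));                          // false
        // Null maps to "absent" only when the caller asks for that explicitly.
        System.out.println(Optional.ofNullable(null).isPresent());        // false
        // The valueIfNone style is still available on the absent case.
        System.out.println(Optional.<String>empty().orElse("fallback"));  // fallback
    }
}
```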
> If present Optionals can be null, the world is probably better off > without them; so methods like findAny ONLY take the valueIfNone form. I don't see any value to present Optionals being able to take null. From sam at sampullara.com Mon Sep 24 09:17:12 2012 From: sam at sampullara.com (Sam Pullara) Date: Mon, 24 Sep 2012 09:17:12 -0700 Subject: recursive lambdas In-Reply-To: <5060782B.8080609@oracle.com> References: <506067B8.1040306@oracle.com> <50606DF4.2060107@oracle.com> <50607342.4010703@oracle.com> <5060782B.8080609@oracle.com> Message-ID: The fact that this works for fields helps. I think we will want to make it work for local variables in the future but it is nice that this is possible right now. Sam On Mon, Sep 24, 2012 at 8:11 AM, Brian Goetz wrote: > Yes, it works for fields but not for locals. This is because the > init-before-use rules for locals are stricter. > > > On 9/24/2012 10:50 AM, Aleksey Shipilev wrote: >> >> Let me be more clear. As I understand the spec prohibits this: >> >> void test() { >> IntUnaryOperator fib = >> (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); >> } >> >> ...but still allows this (still a recursive lambda): >> >> public IntUnaryOperator fib = >> (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); >> >> ...because the current JLS does not prohibit this. Also, the plain old >> inner classes work as well: >> >> public IntUnaryOperator fib = >> new IntUnaryOperator() { >> @Override >> public int operate(int n) { >> return (n < 2) ? 1 : >> fib.operate(n - 1) + fib.operate(n - 2); >> } >> }; >> >> public interface IntUnaryOperator { >> int operate(int v); >> } >> >> >> Am I correct? >> >> Thanks, >> Aleksey. >> >> On 09/24/2012 06:28 PM, Brian Goetz wrote: >>> >>> We decided to disallow this form for the time being, in part to simplify >>> the metafactory protocol and therefore increase the chance of >>> intrinsification of lambda capture sites. 
>>> >>> On 9/24/2012 10:01 AM, Aleksey Shipilev wrote: >>>> >>>> Hi guys, >>>> >>>> What is our current stance on recursive lambdas? I.e. should this code >>>> considered to be correct? >>>> >>>> public static IntUnaryOperator fib = >>>> (n) -> (n < 2) ? 1 : fib.operate(n - 1) + fib.operate(n - 2); >>>> >>>> The current EDR spec says nothing about this, except for "Remove support >>>> for recursive lambdas" in the changelog, but both latest public b56 and >>>> the nightly binary builds of lambda/lambda forest in OpenJDK do accept >>>> this as the correct code. >>>> >>>> The reason I ask is that I keep bugging IDEA guys about their lambda >>>> support, fixing a bug here, there, and everywhere, as this helps people >>>> to start hacking on lambdas and provide the useful feedback: >>>> http://youtrack.jetbrains.com/issue/IDEA-91986 >>>> >>>> I think everyone realizes the spec is in flux, but it is important to >>>> understand what are we leaning towards. >>>> >>>> Thanks, >>>> Aleksey >>>> >> > From dl at cs.oswego.edu Mon Sep 24 16:21:43 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Mon, 24 Sep 2012 19:21:43 -0400 Subject: Nulls In-Reply-To: <50608569.7070603@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> Message-ID: <5060EB07.5060007@cs.oswego.edu> On 09/24/12 12:08, Brian Goetz wrote: > I think we're trying to not encourage further abuse. There's consensus that > present Optional should NOT store null. Similarly, the Joe/Tim/Brian cabal is > trying to NOT burden the streams API/implementation with any special > null-awareness, again for the sake of not coddling abuse. > I'm still a little bit in disbelief about the proposal. The spec for findAny would look like: * @return ... 
an absent Optional if no element matches * the given predicate, or if the predicate reports true for * a null argument and the stream contains a null element, * possibly even if the predicate also holds for another * nonnull element OK? -Doug From tim at peierls.net Mon Sep 24 17:01:32 2012 From: tim at peierls.net (Tim Peierls) Date: Mon, 24 Sep 2012 20:01:32 -0400 Subject: Nulls In-Reply-To: <5060EB07.5060007@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> Message-ID: As far as I'm concerned it can say that behavior is undefined if the collection contains null. No coddling. On Mon, Sep 24, 2012 at 7:21 PM, Doug Lea
wrote: > On 09/24/12 12:08, Brian Goetz wrote: > >> I think we're trying to not encourage further abuse. There's consensus >> that >> present Optional should NOT store null. Similarly, the Joe/Tim/Brian >> cabal is >> trying to NOT burden the streams API/implementation with any special >> null-awareness, again for the sake of not coddling abuse. >> >> > I'm still a little bit in disbelief about the proposal. > The spec for findAny would look like: > > * @return ... an absent Optional if no element matches > * the given predicate, or if the predicate reports true for > * a null argument and the stream contains a null element, > * possibly even if the predicate also holds for another > * nonnull element > > OK? > > -Doug > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120924/b8631b6c/attachment.html From brian.goetz at oracle.com Mon Sep 24 17:10:35 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 24 Sep 2012 20:10:35 -0400 Subject: Nulls In-Reply-To: <5060EB07.5060007@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> Message-ID: <5060F67B.6060802@oracle.com> Your disbelief is well-founded, since that's not what is being suggested. What we're suggesting is that findAny (or, more precisely, the Optional constructor called by findAny) could throw NPE if the stream contains a null, even if the stream also contains a non-null. 
So the spec looks more like: @return an absent Optional if the stream is empty, or a present optional containing a selected element from the stream @throw NPE if the selected element of the stream is null On 9/24/2012 7:21 PM, Doug Lea wrote: > On 09/24/12 12:08, Brian Goetz wrote: >> I think we're trying to not encourage further abuse. There's >> consensus that >> present Optional should NOT store null. Similarly, the Joe/Tim/Brian >> cabal is >> trying to NOT burden the streams API/implementation with any special >> null-awareness, again for the sake of not coddling abuse. >> > > I'm still a little bit in disbelief about the proposal. > The spec for findAny would look like: > > * @return ... an absent Optional if no element matches > * the given predicate, or if the predicate reports true for > * a null argument and the stream contains a null element, > * possibly even if the predicate also holds for another > * nonnull element > > OK? > > -Doug > > > > > > From aleksey.shipilev at oracle.com Tue Sep 25 02:45:32 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 25 Sep 2012 13:45:32 +0400 Subject: runOnce Message-ID: <50617D3C.60905@oracle.com> Hi guys, This thread [1] makes me wonder whether we want to introduce something like runOnce() to record per-callsite invocation information. I.e. instead of messing with maps and flags to guard this behavior, one can possibly do: Runners.runOnce(() -> System.out.println("Boo!")); ...where runOnce will be desugared by javac into an indy call in which the bootstrap method calls the callee and then emits a constant callsite holding the result. This will effectively record the "invoked" property in the callsite, so that this code: Runners.runOnce(() -> System.out.println("foo")); Runners.runOnce(() -> System.out.println("bar")); would yield: foo bar It might be yet another convenient way to make singletons (err... 
thread-safe identity caches), like: public T getInstance() { return Runners.runOnce(() -> new T()); } Was something like this considered before? -Aleksey. [1] http://mail.openjdk.java.net/pipermail/lambda-dev/2012-September/006015.html From forax at univ-mlv.fr Tue Sep 25 02:59:44 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 25 Sep 2012 11:59:44 +0200 Subject: runOnce In-Reply-To: <50617D3C.60905@oracle.com> References: <50617D3C.60905@oracle.com> Message-ID: <50618090.7010006@univ-mlv.fr> without any javac translation: static class Runners {static { System.out.println("Boo!"); } static void runOnce() {}} public static void foo() { Runners.runOnce(); } Rémi On 09/25/2012 11:45 AM, Aleksey Shipilev wrote: > Hi guys, > > This thread [1] makes me thinking if we want to introduce something like > runOnce() to record per-callsite invocation information. I.e. instead of > messing with maps and flags to guard this behavior, one can possibly do: > > Runners.runOnce(() -> System.out.println("Boo!")); > > ...where runOnce will be desugared by Javac into indy call in which > bootstrap method calls the callee, and then emits the constant callsite > holding the result. This will effectively record the "invoked" property > into callsite, so that this code: > > Runners.runOnce(() -> System.out.println("foo")); > Runners.runOnce(() -> System.out.println("bar")); > > would yield: > foo > bar > > It might be yet another convenient way to make singletons (err... > thread-safe identity caches), like: > > public T getInstance() { return Runners.runOnce(() -> new T()); } > > Was something like this considered before? > > -Aleksey. 
> > [1] > http://mail.openjdk.java.net/pipermail/lambda-dev/2012-September/006015.html From aleksey.shipilev at oracle.com Tue Sep 25 03:16:21 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Tue, 25 Sep 2012 14:16:21 +0400 Subject: runOnce In-Reply-To: <50618090.7010006@univ-mlv.fr> References: <50617D3C.60905@oracle.com> <50618090.7010006@univ-mlv.fr> Message-ID: <50618475.9090606@oracle.com> This is bulky and requires a Runners class for each call site. BTW, I realized this seems to be a pure JSR 292 topic, not JSR 335. -Aleksey. On 09/25/2012 01:59 PM, Remi Forax wrote: > without any javac translation: > > static class Runners {static { System.out.println("Boo!"); } static > void runOnce() {}} > public static void foo() { > Runners.runOnce(); > } > > Rémi > > On 09/25/2012 11:45 AM, Aleksey Shipilev wrote: >> Hi guys, >> >> This thread [1] makes me thinking if we want to introduce something like >> runOnce() to record per-callsite invocation information. I.e. instead of >> messing with maps and flags to guard this behavior, one can possibly do: >> >> Runners.runOnce(() -> System.out.println("Boo!")); >> >> ...where runOnce will be desugared by Javac into indy call in which >> bootstrap method calls the callee, and then emits the constant callsite >> holding the result. This will effectively record the "invoked" property >> into callsite, so that this code: >> >> Runners.runOnce(() -> System.out.println("foo")); >> Runners.runOnce(() -> System.out.println("bar")); >> >> would yield: >> foo >> bar >> >> It might be yet another convenient way to make singletons (err... >> thread-safe identity caches), like: >> >> public T getInstance() { return Runners.runOnce(() -> new T()); } >> >> Was something like this considered before? >> >> -Aleksey. 
>> >> [1] >> http://mail.openjdk.java.net/pipermail/lambda-dev/2012-September/006015.html >> > From forax at univ-mlv.fr Tue Sep 25 03:44:14 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 25 Sep 2012 12:44:14 +0200 Subject: runOnce In-Reply-To: <50618475.9090606@oracle.com> References: <50617D3C.60905@oracle.com> <50618090.7010006@univ-mlv.fr> <50618475.9090606@oracle.com> Message-ID: <50618AFE.7000507@univ-mlv.fr> On 09/25/2012 12:16 PM, Aleksey Shipilev wrote: > This is bulky and requires Runners class per each call site. > > BTW, realized this seem to be pure jsr292 topic, not jsr335. Like everyone who has used JSR 292 in Java, you clearly see the lack of support for invokedynamic in javac. Here, you're asking the compiler to emit an invokedynamic instead of an invokestatic when calling Runner.runOnce(). I've already discussed with Brian and John Rose having a way to do exactly that in a type-safe way, but it will not be scheduled for Java 8. So it's not a pure JSR 292 issue; it's an issue between JSR 292 and Java the language: we have to finish the integration of JSR 292 into Java the language. BTW, here is the current way to write what you want in Java with JSR 292 and, as you say, you still need a static somewhere. 
private static final MethodHandle mh =
    Runner.runOnce(() -> System.out.println("Boo!"));

public static void bar() {
  try {
    mh.invokeExact();
  } catch(Throwable t){}
}

static class Runner {
  public static void nop() {}

  static MethodHandle runOnce(Runnable runnable) {
    MethodType type = MethodType.methodType(void.class);
    MutableCallSite cs = new MutableCallSite(type);
    Lookup lookup = MethodHandles.lookup();
    MethodHandle mh, mh2;
    try {
      mh = lookup.bind(runnable, "run", type);
      mh2 = lookup.bind(cs, "setTarget",
              MethodType.methodType(void.class, MethodHandle.class))
          .bindTo(lookup.findStatic(Runner.class, "nop", type));
    } catch (NoSuchMethodException | IllegalAccessException e) {
      throw new AssertionError(e);
    }
    cs.setTarget(MethodHandles.filterReturnValue(mh, mh2));
    return cs.dynamicInvoker();
  }
}

> -Aleksey.

Rémi

> > On 09/25/2012 01:59 PM, Remi Forax wrote: >> without any javac translation: >> >> static class Runners {static { System.out.println("Boo!"); } static >> void runOnce() {}} >> public static void foo() { >> Runners.runOnce(); >> } >> >> Rémi >> >> On 09/25/2012 11:45 AM, Aleksey Shipilev wrote: >>> Hi guys, >>> >>> This thread [1] makes me thinking if we want to introduce something like >>> runOnce() to record per-callsite invocation information. I.e. instead of >>> messing with maps and flags to guard this behavior, one can possibly do: >>> >>> Runners.runOnce(() -> System.out.println("Boo!")); >>> >>> ...where runOnce will be desugared by Javac into indy call in which >>> bootstrap method calls the callee, and then emits the constant callsite >>> holding the result. This will effectively record the "invoked" property >>> into callsite, so that this code: >>> >>> Runners.runOnce(() -> System.out.println("foo")); >>> Runners.runOnce(() -> System.out.println("bar")); >>> >>> would yield: >>> foo >>> bar >>> >>> It might be yet another convenient way to make singletons (err... 
>>> thread-safe identity caches), like: >>> >>> public T getInstance() { return Runners.runOnce(() -> new T()); } >>> >>> Was something like this considered before? >>> >>> -Aleksey. >>> >>> [1] >>> http://mail.openjdk.java.net/pipermail/lambda-dev/2012-September/006015.html >>> From dl at cs.oswego.edu Tue Sep 25 04:37:48 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Tue, 25 Sep 2012 07:37:48 -0400 Subject: Nulls In-Reply-To: <5060F67B.6060802@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> Message-ID: <5061978C.5040605@cs.oswego.edu> On 09/24/12 20:10, Brian Goetz wrote: > Your disbelief is well-founded, since that's not what is being suggested. What > we're suggesting is that findAny (or, more precisely, the Optional constructor > called by findAny) could throw NPE if the stream contains a null, even if the > stream also contains a non-null. So the spec looks more like: > > @return an absent Optional if the stream is empty, or a present > optional containing a selected element from the stream > @throw NPE if the selected element of the stream is null > Which does make findAny null-aware. And in a less than helpful way, because if a user gets NPE, then they know that the predicate DOES hold for at least one element (null). As usual, my main concern is about impact on composition (aka modular reasoning). Any general-purpose higher-level utility using findAny without knowing if the source may include nulls will need to do something like: boolean present; T x; try { Optional r = ...findAny(...); if (present = r.isPresent()) x = r.get(); } catch(NPE ex) { present = true; x = null; } Not very nice. 
I hate to be a pest about this, but the only choices I know that compose at all remain: 1. All stream ops throw NPE on any null element 2. All stream ops ignore nulls. 3. No use of Optionals; rely only on valueIfAbsent constructions And of these, choice (2) still seems most defensible. -Doug From joe.bowbeer at gmail.com Tue Sep 25 05:25:42 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 25 Sep 2012 05:25:42 -0700 Subject: Nulls In-Reply-To: <5061978C.5040605@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> Message-ID: No one would write that code. They would add a filter to remove nulls, or to replace nulls with None, or to replace null with another value, and then invoke findAny. Or they would invoke the version of findAny (is there one?) that replaces null with a default value. When I think of ignoring nulls in streams, I think of zipWithIndex, and then I reconsider. On Sep 25, 2012 4:38 AM, "Doug Lea"
wrote: > On 09/24/12 20:10, Brian Goetz wrote: > >> Your disbelief is well-founded, since that's not what is being suggested. >> What >> we're suggesting is that findAny (or, more precisely, the Optional >> constructor >> called by findAny) could throw NPE if the stream contains a null, even if >> the >> stream also contains a non-null. So the spec looks more like: >> >> @return an absent Optional if the stream is empty, or a present >> optional containing a selected element from the stream >> @throw NPE if the selected element of the stream is null >> >> > Which does make findAny null-aware. And in a less than helpful > way, because if a user gets NPE, then they know that the predicate > DOES hold for at least one element (null). > > As usual, my main concern is about impact on composition > (aka modular reasoning). Any general-purpose > higher-level utility using findAny without knowing > if the source may include nulls will need to do > something like: > > boolean present; > T x; > try { > Optional r = ...findAny(...); > if (present = r.isPresent()) x = r.get(); > } catch(NPE ex) { > present = true; > x = null; > } > > Not very nice. > > I hate to be a pest about this, but the only choices > I know that compose at all remain: > > 1. All stream ops throw NPE on any null element > 2. All stream ops ignore nulls. > 3. No use of Optionals; rely only on valueIfAbsent constructions > > And of these, choice (2) still seems most defensible. > > -Doug > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120925/9928c2ae/attachment.html From tim at peierls.net Tue Sep 25 05:27:53 2012 From: tim at peierls.net (Tim Peierls) Date: Tue, 25 Sep 2012 08:27:53 -0400 Subject: Nulls In-Reply-To: <5061978C.5040605@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> Message-ID: On Tue, Sep 25, 2012 at 7:37 AM, Doug Lea
wrote: > As usual, my main concern is about impact on composition (aka modular > reasoning). Any general-purpose higher-level utility using findAny without > knowing if the source may include nulls will need to do something like: > > boolean present; > T x; > try { > Optional r = ...findAny(...); > if (present = r.isPresent()) x = r.get(); > } catch(NPE ex) { > present = true; > x = null; > } > Or just: Optional r = ...filter(notNull()).findAny(...); Not very nice. > The latter *is* pretty nice, reads very clearly, and if you apply filter upstream, you won't have to do it every time you findAny(). > I hate to be a pest about this, but the only choices I know that compose > at all remain: > > 1. All stream ops throw NPE on any null element > Bleah: "Someone put a tack on teacher's chair, so the entire class has to do extra homework." > 2. All stream ops ignore nulls. > OK with me, but I'm betting not with Brian (because not size-preserving). 3. No use of Optionals; rely only on valueIfAbsent constructions > Bleah, opportunity for safer coding lost. > And of these, choice (2) still seems most defensible. What's wrong with prophylactic filtering if you think there might be nulls? The "don't do anything special with nulls" approach that Brian is advocating works quite well in practice in Guava. I've been using Guava heavily, and I do not have to use anything like the contorted example that Doug gave above. In a few cases I've had to use filter(notNull()) on a FluentIterable (a "stream"). The analogue of Stream.into(...) in Guava is FluentIterable.toXXX(...), where XXX is an immutable list or set (possibly sorted). Immutable collections don't permit null elements, so you eventually get the NPE downstream if you don't take care of it upstream. This is the best of both worlds: You're free to deal with nulls in the stream as long as you get rid of them by the time you .into(...) it. 
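The prophylactic filtering described above can be sketched against java.util.stream as it later shipped in Java 8. That is an assumption relative to this thread: the draft API under discussion used names such as into(...) and notNull() that are not reproduced here, and Objects::nonNull stands in for the notNull() predicate. The class and method names (NullFilterSketch, anyNonNull, firstThrowsOnNull) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

public class NullFilterSketch {
    /** Filters nulls upstream, so the terminal op never has to wrap a null. */
    static Optional<String> anyNonNull(List<String> source) {
        return source.stream().filter(Objects::nonNull).findAny();
    }

    /** Returns true if findFirst() throws NPE because the selected element is null. */
    static boolean firstThrowsOnNull(List<String> source) {
        try {
            source.stream().findFirst();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> withNulls = Arrays.asList(null, "a", null);
        System.out.println(anyNonNull(withNulls).get());        // a
        System.out.println(firstThrowsOnNull(withNulls));       // true
        System.out.println(anyNonNull(Arrays.<String>asList(null, null)).isPresent()); // false
    }
}
```

findFirst() is used for the failing case only because its selected element is deterministic; the null handling is the same for findAny().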
There is an exception to the "don't do anything special with nulls" rule in Guava, and as you might expect, it's for the analogue to findAny(), FluentIterable.firstMatch(Predicate): *Warning:* avoid using a predicate that matches null. If null is matched in this fluent iterable, a NullPointerException will be thrown. That's not so hard, is it? --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120925/6557a1a6/attachment.html From forax at univ-mlv.fr Tue Sep 25 05:55:31 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 25 Sep 2012 14:55:31 +0200 Subject: Nulls In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> Message-ID: <5061A9C3.9090105@univ-mlv.fr> On 09/25/2012 02:27 PM, Tim Peierls wrote: > On Tue, Sep 25, 2012 at 7:37 AM, Doug Lea
> wrote: > > As usual, my main concern is about impact on composition (aka > modular reasoning). Any general-purpose higher-level utility using > findAny without knowing if the source may include nulls will need > to do something like: > > boolean present; > T x; > try { > Optional r = ...findAny(...); > if (present = r.isPresent()) x = r.get(); > } catch(NPE ex) { > present = true; > x = null; > } > > > Or just: > > Optional r = ...filter(notNull()).findAny(...); > > Not very nice. > > > The latter *is* pretty nice, reads very clearly, and if you apply > filter upstream, you won't have do it every time you findAny(). > > I hate to be a pest about this, but the only choices I know that > compose at all remain: > > 1. All stream ops throw NPE on any null element > > > Bleah: "Someone put a tack on teacher's chair, so the entire class has > to do extra homework." > > 2. All stream ops ignore nulls. > > > OK with me, but I'm betting not with Brian (because not size-preserving). And, as Joe said, it doesn't preserve indices either. > > 3. No use of Optionals; rely only on valueIfAbsent constructions > > > Bleah, opportunity for safer coding lost. Or it requires duplicating every call that returns an Optional, e.g. findAny/findAnyOrNull. > > And of these, choice (2) still seems most defensible. > > > What's wrong with prophylactic filtering if you think there might be > nulls? > > The "don't do anything special with nulls" approach that Brian is > advocating works quite well in practice in Guava. I've been using > Guava heavily, and I do not have to use anything like the contorted > example that Doug gave above. In a few cases I've had to use > filter(notNull()) on a FluentIterable (a "stream"). > > The analogue of Stream.into(...) in Guava is > FluentIterable.toXXX(...), where XXX is an immutable list or set > (possibly sorted). Immutable collections don't permit null elements, > so you eventually get the NPE downstream if you don't take care of it > upstream.
This is the best of both worlds: You're free to deal with > nulls in the stream as long as you get rid of them by the time you > .into(...) it. > > There is an exception to the "don't do anything special with nulls" > rule in Guava, and as you might expect, it's for the analogue to > findAny(), FluentIterable.firstMatch(Predicate): > > *Warning:* avoid using a |predicate| that matches |null|. If |null| is > matched in this fluent iterable, a |NullPointerException| > will > be thrown. > > That's not so hard, is it? > > --tim Rémi From brian.goetz at oracle.com Tue Sep 25 07:48:28 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 25 Sep 2012 10:48:28 -0400 Subject: Nulls In-Reply-To: <5061978C.5040605@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> Message-ID: <5061C43C.10501@oracle.com> >> Your disbelief is well-founded, since that's not what is being >> suggested. What >> we're suggesting is that findAny (or, more precisely, the Optional >> constructor >> called by findAny) could throw NPE if the stream contains a null, even >> if the >> stream also contains a non-null. So the spec looks more like: >> >> @return an absent Optional if the stream is empty, or a present >> optional containing a selected element from the stream >> @throw NPE if the selected element of the stream is null > > Which does make findAny null-aware. And in a less than helpful > way, because if a user gets NPE, then they know that the predicate > DOES hold for at least one element (null). We could hide behind "it's not the op throwing, it's the Optional", and compare it to .into(nullHostileCollection).
But, even not trying to hide, I'm not sure it's null-aware as much as simply null-inappropriate -- i.e., don't use this method if you have nulls in your streams -- which means that the user does have to be null-aware. Which I think is your point -- the user has to reason about it. > As usual, my main concern is about impact on composition > (aka modular reasoning). Any general-purpose > higher-level utility using findAny without knowing > if the source may include nulls will need to do > something like: > > boolean present; > T x; > try { > Optional r = ...findAny(...); > if (present = r.isPresent()) x = r.get(); > } catch(NPE ex) { > present = true; > x = null; > } > > Not very nice. Or, more nicely: ....filter(o -> o != null).findAny(); > I hate to be a pest about this, but the only choices > I know that compose at all remain: > > 1. All stream ops throw NPE on any null element > 2. All stream ops ignore nulls. > 3. No use of Optionals; rely only on valueIfAbsent constructions > > And of these, choice (2) still seems most defensible. Not trying to be clever, but I do think that the behavior we are proposing *is* ignoring nulls. It's just that findAny is fused to another abstraction (Optional) that is null-hostile. I could get behind (1), but the only real justification for it at this point is "eat your vegetables, they're good for you." Which seems a little heavy-handed. At this point it seems your primary objection is that because findXxx (and possibly other methods) are fused to Optional, we have a collision of worlds, and the user has to keep track of the demarcation between the null-friendly world and the null-hostile world.
The choices are: - Accept a slightly more complex user model (the user has to reason about which ops are null-safe) - Distort Optional to be null-friendly (I think this has been roundly rejected) - Eliminate the Optional-bearing ops (Tim would call this punishing the innocent) - Distort streams to be null-absorbing (This is your (2) above) - Make streams null-hostile (your (1) above) But there's no free lunch -- there's distortion and complexity everywhere, and the user *still* has to keep track of the demarcation no matter what we do. (If we choose (2), the user may be surprised to wonder "where did my nulls go? why did my size change?".) I guess I still prefer the "make the model slightly more complex to reason about", as long as we can make all that complexity fall at the feet of the null lovers -- which the current proposal does. After all, if you've got nulls in your collection, you *already* have to reason about "I can use FooList as a target but not BarList". Adding "I can't use Optional as a target" to that list of things to reason about does not seem to make it qualitatively harder. And putting all the complexity on those who have nulls means those who don't use nulls get everything they want -- including a simple and safe model. From brian.goetz at oracle.com Tue Sep 25 15:04:55 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 25 Sep 2012 18:04:55 -0400 Subject: This week in the repo Message-ID: <50622A87.8080608@oracle.com> I plan to do a weekly summary of the putbacks to the repo, for those that are not following the putback messages. Here is the first one. *Week of Sept 17, 2012* *Cleanup of size protocols in StreamAccessor and Spliterator (Mike). *Replaced various size(), getSize(), getSizeIfKnown(), getOrEstimateSize() methods with a pair of methods: getSizeIfKnown (returns -1 if unknown) and estimateSize() on Spliterator and StreamAccessor. 
Estimates are used largely for decomposition decisions and may be inaccurate (in the worst case, return MAX_VALUE); explicit sizes are used for optimizations such as pre-sizing target arrays. Also added getNaturalSplits() to Spliterator to indicate the most natural split arity from the perspective of the data structure; operations are free to ignore this. *Rename of stream shapes (Henry). *The array/collection-like stream shape that had been known as "Linear" was renamed to "Value", and various other supporting classes and methods (ValuePipeline, chainValue) were renamed accordingly. *Op merging (Brian). *For some Op implementations, it was possible to merge across related functions (merge {Any,All,None}MatchOp into MatchOp) and across shapes (merge MatchOp and BiMatchOp into one.) In some cases, this results in dramatic reduction in code duplication; in others (especially stateless intermediate ops), almost none. *More specialized StreamSource implementations (Mike). *Add parallel() specialization to ArrayList, Vector. Still a lot of work left to do here. *Migration to CountedCompleter (Brian). *All currently implemented parallel ops (except cumulate) are now based on CountedCompleter, and many based on an abstract base task class AbstractTask. AbstractTask now supports n-way splits instead of only binary splits, guided by Spliterator.getNaturalSplits. *More parallel implementations (Brian). *Added parallel implementation of short-circuiting {any,all,none}Match. *New ops (Paul). *Added limit(n), skip(n) stateful intermediate operations to Stream and MapStream. Added a stream concatenation operation to Stream and MapStream for serial and parallel operation. *Infinite stream (Paul). *Added exploratory support for infinite stream generators. *Refactored intermediate operation test helpers (Paul)*: Operations can be tested that may perform side-effects such as Tee and Concat. Easier to test a subset of operations. 
*Refactored Stream to extend BaseStream (Brian)*: Streamable now generic in stream type. Streamable becomes Streamable>, MapStreamable becomes Streamable>. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120925/0d1dd86e/attachment.html From aleksey.shipilev at oracle.com Wed Sep 26 05:53:53 2012 From: aleksey.shipilev at oracle.com (Aleksey Shipilev) Date: Wed, 26 Sep 2012 16:53:53 +0400 Subject: Lazy and memoizers Message-ID: <5062FAE1.1000707@oracle.com>

Hi guys,

Another thing we've been discussing internally is library support for memoization. It would be nice to have a Lazy class as a shortcut for memoizing lambda expressions, somewhat similar to the C#-ish Lazy<T>, i.e.:

    class Lazy<T> implements Factory<T> {
        public static <T> Lazy<T> of(Factory<T> f) {
            return new Lazy<T>(f);
        }

        private final Factory<T> f;
        private volatile Holder<T> holder;

        public Lazy(Factory<T> f) {
            this.f = f;
        }

        @Override
        public T make() {
            if (holder == null) {
                T t = f.make();
                // pseudocode: compare-and-set the holder field
                CAS(holder, null, new Holder<T>(t));
            }
            return holder.value;
        }

        static class Holder<T> {
            public final T value;
            public Holder(T t) { value = t; }
        }
    }

There is an open question whether we want to make sure f.make() is executed only once (this example code does not guarantee that, and guaranteeing it would require some sort of locking, so I wonder if this belongs in the jsr166 additions), but we can spin another class for that.

Then, we can do something like:

    foo(Factory f);
    foo(Lazy.of(() -> new MyHeavyAndBoringObject()));

...or even use that to simulate call-by-need in specific places. I'm sure something like this has already been considered; is there any history on deciding whether it is viable and needed?

-Aleksey.
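A runnable variant of the memoizer above might look like the following sketch. It assumes java.util.function.Supplier (the name that later shipped in Java 8) in place of the draft Factory interface, and it uses an AtomicReference for the compare-and-set step; both choices are assumptions made for illustration, not part of the proposal. Like the original, it does not guarantee that the factory runs only once under contention, only that all callers observe the same value.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class Lazy<T> implements Supplier<T> {
    // Wrapper so that a null result from the factory is still memoized.
    private static final class Holder<T> {
        final T value;
        Holder(T value) { this.value = value; }
    }

    private final Supplier<? extends T> factory;
    private final AtomicReference<Holder<T>> holder = new AtomicReference<>();

    private Lazy(Supplier<? extends T> factory) { this.factory = factory; }

    static <T> Lazy<T> of(Supplier<? extends T> factory) {
        return new Lazy<T>(factory);
    }

    @Override
    public T get() {
        Holder<T> h = holder.get();
        if (h == null) {
            // Several threads may compute a value; only the first CAS wins,
            // and every caller then reads the winning holder.
            holder.compareAndSet(null, new Holder<T>(factory.get()));
            h = holder.get();
        }
        return h.value;
    }
}
```

A caller would write something like Lazy.of(() -> new MyHeavyAndBoringObject()) and pass the result wherever a Supplier is expected.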
From dl at cs.oswego.edu Wed Sep 26 06:18:21 2012 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 26 Sep 2012 09:18:21 -0400 Subject: Nulls In-Reply-To: <5061C43C.10501@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> Message-ID: <5063009D.9060307@cs.oswego.edu> On 09/25/12 10:48, Brian Goetz wrote: > null-aware as much as simply null-inappropriate -- i.e., don't use this method > if you have nulls in your streams -- which means that the user does has to be > null-aware. Which I think is your point -- the user has to reason about it. Yes. If the base framework kicks the nullness problem to its users, then every user must deal with it. But many of the "users" of java.util are not applications programs, but other utilities and frameworks that do not know whether the issues apply or not. >> boolean present; >> T x; >> try { >> Optional r = ...findAny(...); >> if (present = r.isPresent()) x = r.get(); >> } catch(NPE ex) { >> present = true; >> x = null; >> } >> >> Not very nice. > > Or more nice: > > ....filter(o -> o != null).findAny(); No. If this is part of a utility as opposed to application program, then it doesn't know if nulls are allowed or not, so has to be correct in either case; or else advertise its policy as an unchecked precondition. Many people will not do either of these. This is the where most of the mistakes will appear. So in the likely event that you choose these rules anyway, it would be nice to at least alert the bug-detector folks to look for such constructions. 
-Doug From forax at univ-mlv.fr Wed Sep 26 06:26:35 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 26 Sep 2012 15:26:35 +0200 Subject: Non interference enforcement In-Reply-To: <5061C43C.10501@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> Message-ID: <5063028B.8040800@univ-mlv.fr> We currently ask users to write lambdas that don't interfere with the source collection of a stream, but this is not enforced in the code. For example, list.stream().forEach(e -> { list.remove(e); }); may or may not work, depending on how the pipeline is implemented. This is a serious departure from the way java.util collections currently work, and I wonder if we should not keep the fail-fast guarantee for those collections. Rémi From brian.goetz at oracle.com Wed Sep 26 06:28:37 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Sep 2012 09:28:37 -0400 Subject: Non interference enforcement In-Reply-To: <5063028B.8040800@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> Message-ID: <50630305.2000103@oracle.com> We do piggyback on some degree of fail-fast when we use their iterators.
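For reference, the iterator-based fail-fast behavior mentioned above can be seen in a small sketch like this one. The class and method names are hypothetical, and the example only covers the iterator path of java.util.ArrayList; it makes no claim about what a stream pipeline does.

```java
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastSketch {
    // Returns true if the list's iterator detected the interfering removal.
    static boolean removeWhileIterating(List<String> list) {
        try {
            for (String e : list) {
                // Structural modification behind the iterator's back.
                list.remove(e);
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }
}
```

The check is best-effort: it relies on a modification count that the iterator compares on each call to next(), so it should not be depended on for correctness.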
On 9/26/2012 9:26 AM, Remi Forax wrote: > We currently ask users to write lambdas that doesn't interfere with the > source collection > of a stream but it's not enforced in the code. > > By example, > list.stream().forEach(e -> { list.remove(e); }); > may works or not depending how the pipeline is implemented. > > This is a serous departure for the current way java.util collections works > and I wonder if we should not keep the fail-fast guarantee for those > collections. > > R?mi > From forax at univ-mlv.fr Wed Sep 26 06:35:05 2012 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 26 Sep 2012 15:35:05 +0200 Subject: Non interference enforcement In-Reply-To: <50630305.2000103@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <50630305.2000103@oracle.com> Message-ID: <50630489.4040708@univ-mlv.fr> On 09/26/2012 03:28 PM, Brian Goetz wrote: > We do piggyback on some degree of fail-fast when we use their iterators. yes, but using an iterator is not mandatory, right ? R?mi > > On 9/26/2012 9:26 AM, Remi Forax wrote: >> We currently ask users to write lambdas that doesn't interfere with the >> source collection >> of a stream but it's not enforced in the code. >> >> By example, >> list.stream().forEach(e -> { list.remove(e); }); >> may works or not depending how the pipeline is implemented. >> >> This is a serous departure for the current way java.util collections >> works >> and I wonder if we should not keep the fail-fast guarantee for those >> collections. 
>> >> R?mi >> From joe.bowbeer at gmail.com Wed Sep 26 06:36:58 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 26 Sep 2012 06:36:58 -0700 Subject: Nulls In-Reply-To: <5063009D.9060307@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063009D.9060307@cs.oswego.edu> Message-ID: All of the rules have implications for writers of new code that interfaces with legacy code. But only writers of new code are affected, right? On Wed, Sep 26, 2012 at 6:18 AM, Doug Lea wrote: > > On 09/25/12 10:48, Brian Goetz wrote: > >> null-aware as much as simply null-inappropriate -- i.e., don't use this >> method >> if you have nulls in your streams -- which means that the user does has >> to be >> null-aware. Which I think is your point -- the user has to reason about >> it. >> > > Yes. If the base framework kicks the nullness problem to its users, > then every user must deal with it. But many of the "users" of java.util > are not applications programs, but other utilities and frameworks that > do not know whether the issues apply or not. > > > boolean present; >>> T x; >>> try { >>> Optional r = ...findAny(...); >>> if (present = r.isPresent()) x = r.get(); >>> } catch(NPE ex) { >>> present = true; >>> x = null; >>> } >>> >>> Not very nice. >>> >> >> Or more nice: >> >> ....filter(o -> o != null).findAny(); >> > > No. If this is part of a utility as opposed to application program, > then it doesn't know if nulls are allowed or not, so has to be > correct in either case; or else advertise its policy as an > unchecked precondition. 
Many people will not do either of these. > This is the where most of the mistakes will appear. > So in the likely event that you choose these rules anyway, > it would be nice to at least alert the bug-detector folks > to look for such constructions. > > > -Doug > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120926/debe022c/attachment-0001.html From brian.goetz at oracle.com Wed Sep 26 06:37:02 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Sep 2012 09:37:02 -0400 Subject: Non interference enforcement In-Reply-To: <50630489.4040708@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <50630305.2000103@oracle.com> <50630489.4040708@univ-mlv.fr> Message-ID: <506304FE.2060002@oracle.com> Right. The existing enforcement uses a nonvolatile count; if we were to check it during a forEach (the other option), it would likely be hoisted out of the loop anyway. On 9/26/2012 9:35 AM, Remi Forax wrote: > On 09/26/2012 03:28 PM, Brian Goetz wrote: >> We do piggyback on some degree of fail-fast when we use their iterators. > > yes, but using an iterator is not mandatory, right ? > > Rémi > >> >> On 9/26/2012 9:26 AM, Remi Forax wrote: >>> We currently ask users to write lambdas that doesn't interfere with the >>> source collection >>> of a stream but it's not enforced in the code. >>> >>> By example, >>> list.stream().forEach(e -> { list.remove(e); }); >>> may works or not depending how the pipeline is implemented.
>>> >>> This is a serous departure for the current way java.util collections >>> works >>> and I wonder if we should not keep the fail-fast guarantee for those >>> collections. >>> >>> R?mi >>> > From joe.bowbeer at gmail.com Wed Sep 26 06:45:15 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 26 Sep 2012 06:45:15 -0700 Subject: Non interference enforcement In-Reply-To: <5063028B.8040800@univ-mlv.fr> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> Message-ID: Btw, at first glance, this code snippet looked like ListIterator code that would be allowed. What about the following? Is it legal? listIterator.forEach(e -> { listIterator.remove(e); }); On Wed, Sep 26, 2012 at 6:26 AM, Remi Forax wrote: > We currently ask users to write lambdas that doesn't interfere with the > source collection > of a stream but it's not enforced in the code. > > By example, > list.stream().forEach(e -> { list.remove(e); }); > may works or not depending how the pipeline is implemented. > > This is a serous departure for the current way java.util collections works > and I wonder if we should not keep the fail-fast guarantee for those > collections. > > R?mi > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120926/6f8fc2b1/attachment.html From brian.goetz at oracle.com Wed Sep 26 06:52:29 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Sep 2012 09:52:29 -0400 Subject: Non interference enforcement In-Reply-To: References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> Message-ID: <5063089D.5000804@oracle.com> ListIterator does not currently have a forEach method (nor does Iterator, for that matter). And its remove method is nullary. So I guess I do not follow your question? On 9/26/2012 9:45 AM, Joe Bowbeer wrote: > Btw, at first glance, this code snippet looked like ListIterator code > that would be allowed. > > What about the following? Is it legal? > > listIterator.forEach(e -> { listIterator.remove(e); }); > > > On Wed, Sep 26, 2012 at 6:26 AM, Remi Forax > wrote: > > We currently ask users to write lambdas that doesn't interfere with > the source collection > of a stream but it's not enforced in the code. > > By example, > list.stream().forEach(e -> { list.remove(e); }); > may works or not depending how the pipeline is implemented. > > This is a serous departure for the current way java.util collections > works > and I wonder if we should not keep the fail-fast guarantee for those > collections.
> > Rémi > > From brian.goetz at oracle.com Wed Sep 26 06:52:49 2012 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Sep 2012 09:52:49 -0400 Subject: Nulls In-Reply-To: <5063009D.9060307@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063009D.9060307@cs.oswego.edu> Message-ID: <506308B1.1010107@oracle.com> Which is not wonderful, but not unlike the current status quo with collections, where the general contract may allow nulls, but the individual collections may or may not. Any library method that lets you pass in a collection now has to make the conservative assumptions that (a) you might read nulls from it and (b) you can't write nulls to it. Nulls are a big problem, and this approach does not solve it or even make it any better. Arguably it makes it a little worse, in that the surface area of the existing problem expands slightly, but is it qualitatively worse than the status quo? There's no good answer here. The alternatives are all some form of: - Ban nulls - Distort the API to try and accommodate nulls - Maintain the status quo of benign neglect I could go for the first but the path we're trying to chart here seems less heavy-handed? On 9/26/2012 9:18 AM, Doug Lea wrote: > > On 09/25/12 10:48, Brian Goetz wrote: >> null-aware as much as simply null-inappropriate -- i.e., don't use >> this method >> if you have nulls in your streams -- which means that the user does >> has to be >> null-aware. Which I think is your point -- the user has to reason >> about it. > > Yes.
If the base framework kicks the nullness problem to its users, > then every user must deal with it. But many of the "users" of java.util > are not applications programs, but other utilities and frameworks that > do not know whether the issues apply or not. > >>> boolean present; >>> T x; >>> try { >>> Optional r = ...findAny(...); >>> if (present = r.isPresent()) x = r.get(); >>> } catch(NPE ex) { >>> present = true; >>> x = null; >>> } >>> >>> Not very nice. >> >> Or more nice: >> >> ....filter(o -> o != null).findAny(); > > No. If this is part of a utility as opposed to application program, > then it doesn't know if nulls are allowed or not, so has to be > correct in either case; or else advertise its policy as an > unchecked precondition. Many people will not do either of these. > This is the where most of the mistakes will appear. > So in the likely event that you choose these rules anyway, > it would be nice to at least alert the bug-detector folks > to look for such constructions. > > > -Doug > From tim at peierls.net Wed Sep 26 06:53:23 2012 From: tim at peierls.net (Tim Peierls) Date: Wed, 26 Sep 2012 09:53:23 -0400 Subject: Nulls In-Reply-To: <5063009D.9060307@cs.oswego.edu> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063009D.9060307@cs.oswego.edu> Message-ID: On Wed, Sep 26, 2012 at 9:18 AM, Doug Lea
wrote: > If the base framework kicks the nullness problem to its users, > then every user must deal with it. But many of the "users" of java.util > are not applications programs, but other utilities and frameworks that > do not know whether the issues apply or not. > > > ....filter(o -> o != null).findAny(); >> >> > No. If this is part of a utility as opposed to application program, > then it doesn't know if nulls are allowed or not, so has to be > correct in either case; or else advertise its policy as an > unchecked precondition. If such a utility says "Do not pass me nulls or I will do random things", it's up to the author of that utility to decide what random things to do. That could mean checking a collection for nulls when it's passed to the utility or ignoring nulls with filter(o -> o != null) at the last second. Or it could mean not checking at all. But at least the author of the utility has that choice. > Many people will not do either of these. > Writers of utilities and frameworks are a small subset of the total user base. I don't think it's unreasonable to ask them to make a choice between advertising a null-averse policy (with whatever level of checking they feel is appropriate) and a null-tolerant policy that kicks the problem on to their users. > This is the where most of the mistakes will appear. > As Joe points out, we're talking about mistakes in new code mixed with legacy (null-embracing) code. It's probably true that a lot of those mistakes will be due either to a failure to advertise (or to observe) null-averseness or to difficulties defining or following a null-tolerant policy. I bet the vast majority of these will result in runtime exceptions that are easily traced back to the source of the mistake. Is failing a little faster worth ruling out some valid usages and imposing a (tiny) performance cost? My experience with the Guava analogues of these constructs tells me it isn't. 
> So in the likely event that you choose these rules anyway, > it would be nice to at least alert the bug-detector folks > to look for such constructions. Yes. For filter(o -> o != null) or for the longer try ... catch (NPE) construction? (Maybe both?) --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20120926/1807d588/attachment.html From joe.bowbeer at gmail.com Wed Sep 26 06:55:30 2012 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 26 Sep 2012 06:55:30 -0700 Subject: Non interference enforcement In-Reply-To: <5063089D.5000804@oracle.com> References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <5063089D.5000804@oracle.com> Message-ID: I'm asking whether ListIterator can be looped on, and if it can whether its remove method will work inside a loop. My take-away from your response is that ListIterator should *not* have a forEach method? On Wed, Sep 26, 2012 at 6:52 AM, Brian Goetz wrote: > ListIterator does not currently have a forEach method (nor does Iterator, > for that matter.) And its remove method is nilary. So I guess I do not > follow your question? > > On 9/26/2012 9:45 AM, Joe Bowbeer wrote: > >> Btw, at first glance, this code snippet looked like ListIterator code >> that would be allowed. >> >> What about the following? Is it legal? 
>>
>> listIterator.forEach(e -> { listIterator.remove(e); });
>>
>>
>> On Wed, Sep 26, 2012 at 6:26 AM, Remi Forax wrote:
>>
>> We currently ask users to write lambdas that don't interfere with
>> the source collection
>> of a stream but it's not enforced in the code.
>>
>> For example,
>> list.stream().forEach(e -> { list.remove(e); });
>> may work or not depending on how the pipeline is implemented.
>>
>> This is a serious departure from the way java.util collections
>> currently work,
>> and I wonder if we should not keep the fail-fast guarantee for those
>> collections.
>>
>> Rémi

From brian.goetz at oracle.com Wed Sep 26 06:58:48 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Sep 2012 09:58:48 -0400
Subject: Non interference enforcement
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <5063089D.5000804@oracle.com>
Message-ID: <50630A18.1080404@oracle.com>

So far, that's right. We're not currently proposing to add forEach to Iterator, though I wouldn't rule that out either.

ListIterator.remove(), though, removes the current element (the one last returned by next() or previous()). If we did:

interface ListIterator<T> {
    default void forEach(Block<T> block) {
        while (hasNext())
            block.accept(next());
    }
}

Then

listIterator.forEach(e -> { listIterator.remove(); });

would work but mostly by accident. (If the subclass overrode forEach to, say, do lookahead, then it would not.)
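The "works mostly by accident" point can be tried out with a small sketch. This is not the proposed API: ListIterator has no forEach, so a hypothetical static helper stands in for the default method, and java.util.function.Consumer (the Java 8 name) is assumed in place of the prototype's Block. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.function.Consumer;

public class ListIteratorForEachSketch {
    // Hypothetical stand-in for the default method discussed above: a plain
    // hasNext()/next() loop that hands each element to the action.
    static <T> void forEach(ListIterator<T> it, Consumer<? super T> action) {
        while (it.hasNext()) {
            action.accept(it.next());
        }
    }

    // Empties a copy of the source by calling the nullary remove() from
    // inside the action.
    static List<String> removeAll(List<String> source) {
        List<String> list = new ArrayList<>(source);
        ListIterator<String> it = list.listIterator();
        // remove() deletes the element most recently returned by next(),
        // which happens to be the element the action is looking at. This
        // holds only because the loop calls next() exactly once per action.
        forEach(it, e -> it.remove());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeAll(Arrays.asList("a", "b", "c")));
    }
}
```

An implementation that fetched elements ahead of the action (the lookahead case mentioned above) would break this, since remove() would then target a different element or throw IllegalStateException.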
From forax at univ-mlv.fr Wed Sep 26 07:16:47 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 26 Sep 2012 16:16:47 +0200
Subject: Non interference enforcement
In-Reply-To: <506304FE.2060002@oracle.com>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <50630305.2000103@oracle.com> <50630489.4040708@univ-mlv.fr> <506304FE.2060002@oracle.com>
Message-ID: <50630E4F.8090707@univ-mlv.fr>

On 09/26/2012 03:37 PM, Brian Goetz wrote:
> Right.
>
> The existing enforcement uses a nonvolatile count; if we were to check
> it during a forEach (the other option), it would likely be hoisted out
> of the loop anyway.

non-volatile -> not an issue, java.util collections aren't concurrent.
Let's restrict ourselves to the case where the forEach loop and the collection mutation appear in the same thread.
The count check is done in the forEach and the mutator increments the count so value can be hoisted.

cheers,
Rémi

>
> On 9/26/2012 9:35 AM, Remi Forax wrote:
>> On 09/26/2012 03:28 PM, Brian Goetz wrote:
>>> We do piggyback on some degree of fail-fast when we use their
>>> iterators.
>>
>> yes, but using an iterator is not mandatory, right ?
>>
>> Rémi
>>
>>>
>>> On 9/26/2012 9:26 AM, Remi Forax wrote:
>>>> We currently ask users to write lambdas that don't interfere with
>>>> the
>>>> source collection
>>>> of a stream but it's not enforced in the code.
>>>>
>>>> For example,
>>>> list.stream().forEach(e -> { list.remove(e); });
>>>> may work or not depending on how the pipeline is implemented.
>>>>
>>>> This is a serious departure from the way java.util collections
>>>> currently work,
>>>> and I wonder if we should not keep the fail-fast guarantee for those
>>>> collections.
>>>>
>>>> Rémi
>>>>
>>

From forax at univ-mlv.fr Wed Sep 26 07:21:43 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 26 Sep 2012 16:21:43 +0200
Subject: Non interference enforcement
In-Reply-To: <50630E4F.8090707@univ-mlv.fr>
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063028B.8040800@univ-mlv.fr> <50630305.2000103@oracle.com> <50630489.4040708@univ-mlv.fr> <506304FE.2060002@oracle.com> <50630E4F.8090707@univ-mlv.fr>
Message-ID: <50630F77.9040508@univ-mlv.fr>

On 09/26/2012 04:16 PM, Remi Forax wrote:
> On 09/26/2012 03:37 PM, Brian Goetz wrote:
>> Right.
>>
>> The existing enforcement uses a nonvolatile count; if we were to check
>> it during a forEach (the other option), it would likely be hoisted
>> out of the loop anyway.
>

correction:

non-volatile -> not an issue, java.util collections aren't concurrent.
Let's restrict ourselves to the case where the forEach loop and the collection mutation appear in the same thread.
The count check is done in the forEach and the mutator increments the count, so the value can *NOT* be hoisted.
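The single-threaded argument above can be illustrated with a minimal sketch. CountedList is a hypothetical toy class, not the java.util implementation, and Consumer (the Java 8 name) is assumed for the callback type. The modification count is a plain int; the check sits inside the loop because the action itself may run a mutator on the same thread.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.function.Consumer;

public class FailFastForEachSketch {
    // Hypothetical minimal array-backed list; only what the sketch needs.
    static final class CountedList {
        private Object[] items = new Object[8];
        private int size;
        private int modCount;   // plain (non-volatile) field, single-threaded use

        void add(Object o) {
            if (size == items.length) {
                items = Arrays.copyOf(items, size * 2);
            }
            items[size++] = o;
            modCount++;          // every mutator bumps the count
        }

        void forEach(Consumer<Object> action) {
            int expected = modCount;
            for (int i = 0; i < size; i++) {
                action.accept(items[i]);
                // The action may have called a mutator on this same thread,
                // so the count is re-read after every call.
                if (modCount != expected) {
                    throw new ConcurrentModificationException();
                }
            }
        }
    }

    // Returns true if mutating the list from inside forEach is detected.
    static boolean mutationDetected() {
        CountedList list = new CountedList();
        list.add("a");
        list.add("b");
        try {
            list.forEach(e -> list.add("c"));
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mutationDetected());
    }
}
```

Whether a JIT can still hoist the check when the action is fully inlined and provably non-mutating is outside what this sketch shows; it only demonstrates the same-thread detection.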
cheers,
Rémi

From dl at cs.oswego.edu Wed Sep 26 07:27:30 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 26 Sep 2012 10:27:30 -0400
Subject: Nulls
In-Reply-To:
References: <505399F5.80600@oracle.com> <50548FAB.5070602@cs.oswego.edu> <50573CFC.6070607@oracle.com> <505C626F.3080900@univ-mlv.fr> <505C9196.8020301@univ-mlv.fr> <505C9510.1090606@oracle.com> <505DA847.80207@cs.oswego.edu> <505DAE3A.1000702@univ-mlv.fr> <50604602.5040306@cs.oswego.edu> <50608569.7070603@oracle.com> <5060EB07.5060007@cs.oswego.edu> <5060F67B.6060802@oracle.com> <5061978C.5040605@cs.oswego.edu> <5061C43C.10501@oracle.com> <5063009D.9060307@cs.oswego.edu>
Message-ID: <506310D2.80100@cs.oswego.edu>

On 09/26/12 09:36, Joe Bowbeer wrote:
> All of the rules have implications for writers of new code that interfaces with
> legacy code.
>
> But only writers of new code are affected, right?
>

I think this mis-states the situation. Collections are not treated as "legacy". Instead a new framework is being offered (interposed), that is most appropriate for dense (streamy/listy) sources, that are often collections. So one useful goal is to have consistent rules that apply in these situations, not necessarily for all possible uses of all possible collections/arrays/maps/aggregates.

-Doug

From brian.goetz at oracle.com Wed Sep 26 13:00:54 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Sep 2012 16:00:54 -0400
Subject: Nulls
Message-ID: <50635EF6.9030300@oracle.com>

Trying again to categorize the choices and identify pros/cons...

Seems like there are four buckets here:

1. Ban nulls. This means that feeding nulls into a Stream MUST produce an NPE.

2. Ignore nulls.

3. Tolerate nulls. Streams API takes no position on nulls, but may well pass elements to less tolerant destinations (e.g., user-provided lambdas, user-provided collections, Optional constructor.) Nulls may cause NPEs in these cases.

4. Embrace nulls. Ensure that every operation can deal with nulls in a well-defined manner.
(This entails, for example, either dropping the Optional-bearing methods or making present Optional deal with null.)

I think it's safe to say that for each of these, there is some subset of us who finds it undesirable.

Doug proposed (2) and (4). I proposed (3). Nearly everyone has some sympathy for (1) but no one really wants to be that intolerant.

Attempted summary of pros/cons:

1 PRO: Predictable, simple
1 CON: Might be overly harsh, interferes with cases where the user might actually want to see nulls and can deal with them accordingly

2 PRO: Simple
2 CON: size() lies, interferes with optimizations, interferes with cases where the user might actually want to see nulls and can deal with them accordingly

3 PRO: Minimizes distortion on API, implementation in the null-free case
3 CON: more complex reasoning about what might happen, op behavior may change subtly over time as implementation changes

4 PRO: Predictable
4 CON: sacrifices functionality/safety for sake of a corner case

From sam at sampullara.com Wed Sep 26 15:13:43 2012
From: sam at sampullara.com (Sam Pullara)
Date: Wed, 26 Sep 2012 15:13:43 -0700
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID:

I choose 3. Easy to filter them out if you like 2 or fail if you like 1. 4 is just weird to me.

Sam

From crazybob at crazybob.org Wed Sep 26 18:09:25 2012
From: crazybob at crazybob.org (Bob Lee)
Date: Wed, 26 Sep 2012 20:09:25 -0500
Subject: Nulls
In-Reply-To:
References: <50635EF6.9030300@oracle.com>
Message-ID:

3.

Bob

From kevinb at google.com Wed Sep 26 15:13:02 2012
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 26 Sep 2012 15:13:02 -0700
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID:

My teammate Colin and I discussed this some and:

On Wed, Sep 26, 2012 at 1:00 PM, Brian Goetz wrote:

> Trying again to categorize the choices and identify pros/cons...
>
> Seems like there are four buckets here:
>
> 1. Ban nulls. This means that feeding nulls into a Stream MUST produce
> an NPE.

This is more null-hostile than even we'd be comfortable with. Does anyone actually back this?

> 2. Ignore nulls.

imho, way too surprising. Is anyone backing this?

> 3. Tolerate nulls. Streams API takes no position on nulls, but may well
> pass elements to less tolerant destinations (e.g., user-provided lambdas,
> user-provided collections, Optional constructor.) Nulls may cause NPEs in
> these cases.

This is both reasonable and what Guava does.

> 4. Embrace nulls. Ensure that every operation can deal with nulls in a
> well-defined manner. (This entails, for example, either dropping the
> Optional-bearing methods or making present Optional deal with null.)

I'm not sure what exactly this means when I supply a lambda that can't handle null. You're saying the NPE thrown by that predicate should be *caught* and handled somehow?

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From joe.bowbeer at gmail.com Wed Sep 26 18:11:08 2012
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Wed, 26 Sep 2012 18:11:08 -0700
Subject: Nulls
In-Reply-To:
References: <50635EF6.9030300@oracle.com>
Message-ID:

3 -- if you can pull it off.

If tolerance isn't possible then I'd embrace, ban, and, as a last resort, ignore.

From brian.goetz at oracle.com Wed Sep 26 18:23:27 2012
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Sep 2012 21:23:27 -0400
Subject: Nulls
In-Reply-To:
References: <50635EF6.9030300@oracle.com>
Message-ID: <5063AA8F.4040308@oracle.com>

> 3 -- if you can pull it off.

Having done a brief audit through the code, here are the places where the NPE is triggered not by user-provided code, but situationally:

 - Option-bearing findXxx -- this has already been discussed; if you have a stream that has nulls and non-nulls, it's possible find might select a null, and then try to pass it to the Option ctor, which will NPE. Workaround: filter out nulls with .filter(o -> o != null).

 - removeDuplicates. As currently written, we'd NPE on null, but this can probably be addressed if we cared.

The rest of the cases would involve nulls being sent to lambdas or to collections that couldn't deal with them.

From david.lloyd at redhat.com Wed Sep 26 19:04:27 2012
From: david.lloyd at redhat.com (David M. Lloyd)
Date: Wed, 26 Sep 2012 21:04:27 -0500
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID: <5063B42B.5040706@redhat.com>

#3 seems like common sense to me...

--
- DML

From forax at univ-mlv.fr Wed Sep 26 23:47:55 2012
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 27 Sep 2012 08:47:55 +0200
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID: <5063F69B.6080900@univ-mlv.fr>

3 is in my opinion the only reasonable choice.

Rémi

From Donald.Raab at gs.com Thu Sep 27 06:19:56 2012
From: Donald.Raab at gs.com (Raab, Donald)
Date: Thu, 27 Sep 2012 09:19:56 -0400
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID: <6712820CB52CFB4D842561213A77C05403A01E9655@GSCMAMP09EX.firmwide.corp.gs.com>

3 seems the most reasonable option to me.

Don

From dl at cs.oswego.edu Thu Sep 27 06:51:45 2012
From: dl at cs.oswego.edu (Doug Lea)
Date: Thu, 27 Sep 2012 09:51:45 -0400
Subject: Nulls
In-Reply-To: <50635EF6.9030300@oracle.com>
References: <50635EF6.9030300@oracle.com>
Message-ID: <506459F1.6030204@cs.oswego.edu>

Just for the record, I'm still sure that (3) is a mistake.
Oh well.

-Doug