From brian.goetz at oracle.com Wed May 1 11:01:17 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 01 May 2013 14:01:17 -0400
Subject: Concat and zip
Message-ID: <5181586D.80509@oracle.com>

We're almost finished syncing the Streams code from the lambda sandbox into the jdk8 repositories. Among the last items on the list are the "concat" and "zip" methods we've currently got in "Streams".

First, these aren't "done" because we only have reference versions, not Int/Long/Double versions. So more work would be required to finish these, or we'd have to take the hit for only having reference versions (and endless requests to add specializations. And we're also missing the SAMs that would be needed for primitive versions of zip.)

Second, I'm starting to have YAGNI thoughts on these, especially zip. Zip shows up all over the place in functional languages, but the efficacy of these idioms relies on tuples being cheap. Also zip parallelizes basically not at all.

From joe.bowbeer at gmail.com Wed May 1 13:03:41 2013
From: joe.bowbeer at gmail.com (Joe Bowbeer)
Date: Wed, 1 May 2013 13:03:41 -0700
Subject: Concat and zip
In-Reply-To: <5181586D.80509@oracle.com>
References: <5181586D.80509@oracle.com>
Message-ID:

In defense of zip: I think it is fairly important. Ease of expression (and productivity) are still the most valuable benefits of new features, and zip is important for a class of problems. zip exists in Python, Scala, and Groovy (as the "transpose" method), to name a few.

I would keep zip without Int/Long/Double versions. This also avoids all the varieties of mixed primitive/Object pairings in zipped streams.

On Wed, May 1, 2013 at 11:01 AM, Brian Goetz wrote:

> We're almost finished syncing the Streams code from the lambda sandbox
> into the jdk8 repositories. Among the last items on the list are the
> "concat" and "zip" methods we've currently got in "Streams".
> > First, these aren't "done" because we only have reference versions, not > Int/Long/Double versions. So more work would be required to finish these, > or we'd have to take the hit for only having reference versions (and > endless requests to add specializations. And we're also missing the SAMs > that would be needed for primitive versions of zip.) > > Second, I'm starting to have YAGNI thoughts on these, especially zip. Zip > shows up all over the place in functional languages, but the efficacy of > these idioms relies on tuples being cheap. Also zip parallelizes basically > not at all. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130501/e27d563d/attachment.html From tim at peierls.net Wed May 1 13:25:02 2013 From: tim at peierls.net (Tim Peierls) Date: Wed, 1 May 2013 16:25:02 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> Message-ID: On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer wrote: > In defense of zip: I think it is fairly important. Ease of expression > (and productivity) are still the most valuable benefits of new features, > and zip is important for a class of problems. zip exists in Python, Scala, > and Groovy (as "transpose" method), to name a few. > > I would keep zip without Int/Long/Double versions. This also avoids all > the varieties of mixed primitive/Object pairings in zipped streams. > That sounds reasonable. I was going to say make the ref/ref version an example in javadoc and then people can roll their own primitive/ref versions, but maybe that's asking too much. --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130501/9d878e04/attachment.html From brian.goetz at oracle.com Wed May 1 13:35:57 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 01 May 2013 16:35:57 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> Message-ID: <51817CAD.4040003@oracle.com> > In defense of zip: I think it is fairly important. Ease of expression > (and productivity) are still the most valuable benefits of new features, > and zip is important for a class of problems. zip exists in Python, > Scala, and Groovy (as "transpose" method), to name a few. Yes, that's all good. My worry is that the idiom doesn't transport well, and users will end up having a bad experience? From brian.goetz at oracle.com Wed May 1 13:43:08 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 01 May 2013 16:43:08 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> Message-ID: <51817E5C.1000300@oracle.com> Yes, a key aspect of this YAGNI suggestion is that this is one of the things people CAN do for themselves, possibly with an example. And, because there are multiple ways to do zip (pad out shorter stream to size of longer stream, stop at end of shorter stream, throw if streams are not of equal length) people may well want to roll their own. On 5/1/2013 4:25 PM, Tim Peierls wrote: > On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer > wrote: > > In defense of zip: I think it is fairly important. Ease of > expression (and productivity) are still the most valuable benefits > of new features, and zip is important for a class of problems. zip > exists in Python, Scala, and Groovy (as "transpose" method), to name > a few. > > I would keep zip without Int/Long/Double versions. This also avoids > all the varieties of mixed primitive/Object pairings in zipped streams. > > > That sounds reasonable. 
I was going to say make the ref/ref version an > example in javadoc and then people can roll their own primitive/ref > versions, but maybe that's asking too much. > > --tim From Donald.Raab at gs.com Wed May 1 13:51:33 2013 From: Donald.Raab at gs.com (Raab, Donald [Tech]) Date: Wed, 1 May 2013 16:51:33 -0400 Subject: Concat and zip In-Reply-To: <51817E5C.1000300@oracle.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> I'm polling to see how many users of GS Collections have used zip internally. We have it on our object API, not on our primitive API. I'll let you know the response I get in the next day or so. > -----Original Message----- > From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- > libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz > Sent: Wednesday, May 01, 2013 4:43 PM > To: Tim Peierls > Cc: lambda-libs-spec-experts at openjdk.java.net > Subject: Re: Concat and zip > > Yes, a key aspect of this YAGNI suggestion is that this is one of the > things people CAN do for themselves, possibly with an example. And, > because there are multiple ways to do zip (pad out shorter stream to > size of longer stream, stop at end of shorter stream, throw if streams > are not of equal length) people may well want to roll their own. > > On 5/1/2013 4:25 PM, Tim Peierls wrote: > > On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer > > wrote: > > > > In defense of zip: I think it is fairly important. Ease of > > expression (and productivity) are still the most valuable > benefits > > of new features, and zip is important for a class of problems. > zip > > exists in Python, Scala, and Groovy (as "transpose" method), to > name > > a few. > > > > I would keep zip without Int/Long/Double versions. This also > avoids > > all the varieties of mixed primitive/Object pairings in zipped > streams. > > > > > > That sounds reasonable. 
I was going to say make the ref/ref version > an > > example in javadoc and then people can roll their own primitive/ref > > versions, but maybe that's asking too much. > > > > --tim From joe.bowbeer at gmail.com Wed May 1 14:05:00 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 1 May 2013 14:05:00 -0700 Subject: Concat and zip In-Reply-To: <51817E5C.1000300@oracle.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> Message-ID: I do foresee wanting to use zip in Java8, I don't want to write it myself, and I'm not worried about transportation problems (how much worse than Python can it be?). On Wed, May 1, 2013 at 1:43 PM, Brian Goetz wrote: > Yes, a key aspect of this YAGNI suggestion is that this is one of the > things people CAN do for themselves, possibly with an example. And, > because there are multiple ways to do zip (pad out shorter stream to size > of longer stream, stop at end of shorter stream, throw if streams are not > of equal length) people may well want to roll their own. > > > On 5/1/2013 4:25 PM, Tim Peierls wrote: > >> On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer > **> wrote: >> >> In defense of zip: I think it is fairly important. Ease of >> expression (and productivity) are still the most valuable benefits >> of new features, and zip is important for a class of problems. zip >> exists in Python, Scala, and Groovy (as "transpose" method), to name >> a few. >> >> I would keep zip without Int/Long/Double versions. This also avoids >> all the varieties of mixed primitive/Object pairings in zipped >> streams. >> >> >> That sounds reasonable. I was going to say make the ref/ref version an >> example in javadoc and then people can roll their own primitive/ref >> versions, but maybe that's asking too much. >> >> --tim >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130501/f9c2919c/attachment.html From joe.bowbeer at gmail.com Wed May 1 14:05:58 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Wed, 1 May 2013 14:05:58 -0700 Subject: Concat and zip In-Reply-To: <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: I've used zip in Python and Scala, so I'm pretty sure I'll be looking for it in Java8. On Wed, May 1, 2013 at 1:51 PM, Raab, Donald [Tech] wrote: > I'm polling to see how many users of GS Collections have used zip > internally. We have it on our object API, not on our primitive API. I'll > let you know the response I get in the next day or so. > > > -----Original Message----- > > From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- > > libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz > > Sent: Wednesday, May 01, 2013 4:43 PM > > To: Tim Peierls > > Cc: lambda-libs-spec-experts at openjdk.java.net > > Subject: Re: Concat and zip > > > > Yes, a key aspect of this YAGNI suggestion is that this is one of the > > things people CAN do for themselves, possibly with an example. And, > > because there are multiple ways to do zip (pad out shorter stream to > > size of longer stream, stop at end of shorter stream, throw if streams > > are not of equal length) people may well want to roll their own. > > > > On 5/1/2013 4:25 PM, Tim Peierls wrote: > > > On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer > > > wrote: > > > > > > In defense of zip: I think it is fairly important. Ease of > > > expression (and productivity) are still the most valuable > > benefits > > > of new features, and zip is important for a class of problems. 
> > zip > > > exists in Python, Scala, and Groovy (as "transpose" method), to > > name > > > a few. > > > > > > I would keep zip without Int/Long/Double versions. This also > > avoids > > > all the varieties of mixed primitive/Object pairings in zipped > > streams. > > > > > > > > > That sounds reasonable. I was going to say make the ref/ref version > > an > > > example in javadoc and then people can roll their own primitive/ref > > > versions, but maybe that's asking too much. > > > > > > --tim > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130501/8200b539/attachment-0001.html From brian.goetz at oracle.com Wed May 1 14:10:33 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 01 May 2013 17:10:33 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <518184C9.8070006@oracle.com> Right, but the question is, how badly can we implement it and have it not be worse than nothing? And, with the current performance characteristics (new object per element), are we below that threshold? My problem is the same as with flatMap -- these are idioms from other languages that *translate poorly* to Java because of the lack of tuples and other structural types. (The flatMap we got left with -- which I reluctantly supported as the lesser of evils -- is, coincidentally, the only other stream operation that has allocation-per-element.) At what level of translation-fidelity loss do we say "yeah, it works great in that other environment, but too much is lost in translation"? I don't doubt the utility of zip, or the fact that Joe-alikes will want it, and would be bummed to not find it. My question is whether the crappy zip we can have is better than no zip. 
(Where better doesn't just mean "better than nothing", but carries its weight.) On 5/1/2013 5:05 PM, Joe Bowbeer wrote: > I've used zip in Python and Scala, so I'm pretty sure I'll be looking > for it in Java8. > > > On Wed, May 1, 2013 at 1:51 PM, Raab, Donald [Tech] > wrote: > > I'm polling to see how many users of GS Collections have used zip > internally. We have it on our object API, not on our primitive API. > I'll let you know the response I get in the next day or so. > > > -----Original Message----- > > From: lambda-libs-spec-experts-bounces at openjdk.java.net > > [mailto:lambda- > > libs-spec-experts-bounces at openjdk.java.net > ] On Behalf Of > Brian Goetz > > Sent: Wednesday, May 01, 2013 4:43 PM > > To: Tim Peierls > > Cc: lambda-libs-spec-experts at openjdk.java.net > > > Subject: Re: Concat and zip > > > > Yes, a key aspect of this YAGNI suggestion is that this is one of the > > things people CAN do for themselves, possibly with an example. And, > > because there are multiple ways to do zip (pad out shorter stream to > > size of longer stream, stop at end of shorter stream, throw if > streams > > are not of equal length) people may well want to roll their own. > > > > On 5/1/2013 4:25 PM, Tim Peierls wrote: > > > On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer > > > > >> > wrote: > > > > > > In defense of zip: I think it is fairly important. Ease of > > > expression (and productivity) are still the most valuable > > benefits > > > of new features, and zip is important for a class of problems. > > zip > > > exists in Python, Scala, and Groovy (as "transpose" method), to > > name > > > a few. > > > > > > I would keep zip without Int/Long/Double versions. This also > > avoids > > > all the varieties of mixed primitive/Object pairings in zipped > > streams. > > > > > > > > > That sounds reasonable. 
I was going to say make the ref/ref version > > an > > > example in javadoc and then people can roll their own primitive/ref > > > versions, but maybe that's asking too much. > > > > > > --tim > > From spullara at gmail.com Wed May 1 14:12:45 2013 From: spullara at gmail.com (Sam Pullara) Date: Wed, 1 May 2013 14:12:45 -0700 Subject: Concat and zip In-Reply-To: <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: It seems useful enough for the Object case to keep it in but not worth its weight to add it to the primitive stuff. Sam On May 1, 2013, at 1:51 PM, "Raab, Donald [Tech]" wrote: > I'm polling to see how many users of GS Collections have used zip internally. We have it on our object API, not on our primitive API. I'll let you know the response I get in the next day or so. > >> -----Original Message----- >> From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- >> libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz >> Sent: Wednesday, May 01, 2013 4:43 PM >> To: Tim Peierls >> Cc: lambda-libs-spec-experts at openjdk.java.net >> Subject: Re: Concat and zip >> >> Yes, a key aspect of this YAGNI suggestion is that this is one of the >> things people CAN do for themselves, possibly with an example. And, >> because there are multiple ways to do zip (pad out shorter stream to >> size of longer stream, stop at end of shorter stream, throw if streams >> are not of equal length) people may well want to roll their own. >> >> On 5/1/2013 4:25 PM, Tim Peierls wrote: >>> On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer >> > wrote: >>> >>> In defense of zip: I think it is fairly important. Ease of >>> expression (and productivity) are still the most valuable >> benefits >>> of new features, and zip is important for a class of problems. 
>> zip >>> exists in Python, Scala, and Groovy (as "transpose" method), to >> name >>> a few. >>> >>> I would keep zip without Int/Long/Double versions. This also >> avoids >>> all the varieties of mixed primitive/Object pairings in zipped >> streams. >>> >>> >>> That sounds reasonable. I was going to say make the ref/ref version >> an >>> example in javadoc and then people can roll their own primitive/ref >>> versions, but maybe that's asking too much. >>> >>> --tim From brian.goetz at oracle.com Wed May 1 14:15:25 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 01 May 2013 17:15:25 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <518185ED.6050008@oracle.com> So far I'm seeing: - No one throwing themselves in front of the train for concat - No one throwing themselves in front of the train for primitive zip - Widespread regret if we were to ditch reference zip Is that about right? On 5/1/2013 5:12 PM, Sam Pullara wrote: > It seems useful enough for the Object case to keep it in but not worth its weight to add it to the primitive stuff. > > Sam > > On May 1, 2013, at 1:51 PM, "Raab, Donald [Tech]" wrote: > >> I'm polling to see how many users of GS Collections have used zip internally. We have it on our object API, not on our primitive API. I'll let you know the response I get in the next day or so. >> >>> -----Original Message----- >>> From: lambda-libs-spec-experts-bounces at openjdk.java.net [mailto:lambda- >>> libs-spec-experts-bounces at openjdk.java.net] On Behalf Of Brian Goetz >>> Sent: Wednesday, May 01, 2013 4:43 PM >>> To: Tim Peierls >>> Cc: lambda-libs-spec-experts at openjdk.java.net >>> Subject: Re: Concat and zip >>> >>> Yes, a key aspect of this YAGNI suggestion is that this is one of the >>> things people CAN do for themselves, possibly with an example. 
And, >>> because there are multiple ways to do zip (pad out shorter stream to >>> size of longer stream, stop at end of shorter stream, throw if streams >>> are not of equal length) people may well want to roll their own. >>> >>> On 5/1/2013 4:25 PM, Tim Peierls wrote: >>>> On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer >>> > wrote: >>>> >>>> In defense of zip: I think it is fairly important. Ease of >>>> expression (and productivity) are still the most valuable >>> benefits >>>> of new features, and zip is important for a class of problems. >>> zip >>>> exists in Python, Scala, and Groovy (as "transpose" method), to >>> name >>>> a few. >>>> >>>> I would keep zip without Int/Long/Double versions. This also >>> avoids >>>> all the varieties of mixed primitive/Object pairings in zipped >>> streams. >>>> >>>> >>>> That sounds reasonable. I was going to say make the ref/ref version >>> an >>>> example in javadoc and then people can roll their own primitive/ref >>>> versions, but maybe that's asking too much. >>>> >>>> --tim > From joe.bowbeer at gmail.com Thu May 2 08:08:04 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 2 May 2013 08:08:04 -0700 Subject: Concat and zip In-Reply-To: <518185ED.6050008@oracle.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> <518185ED.6050008@oracle.com> Message-ID: Can anyone who has used Stream.zip or concat comment on their experience? I thought I remembered some zip uses in Sam's code (twitterprocessor or somewhere else?) but I can't find them now. On Wed, May 1, 2013 at 2:15 PM, Brian Goetz wrote: > So far I'm seeing: > > - No one throwing themselves in front of the train for concat > - No one throwing themselves in front of the train for primitive zip > - Widespread regret if we were to ditch reference zip > > Is that about right? 
> > > On 5/1/2013 5:12 PM, Sam Pullara wrote: > >> It seems useful enough for the Object case to keep it in but not worth >> its weight to add it to the primitive stuff. >> >> Sam >> >> On May 1, 2013, at 1:51 PM, "Raab, Donald [Tech]" >> wrote: >> >> I'm polling to see how many users of GS Collections have used zip >>> internally. We have it on our object API, not on our primitive API. I'll >>> let you know the response I get in the next day or so. >>> >>> -----Original Message----- >>>> From: lambda-libs-spec-experts-**bounces at openjdk.java.net[mailto: >>>> lambda- >>>> libs-spec-experts-bounces@**openjdk.java.net] >>>> On Behalf Of Brian Goetz >>>> Sent: Wednesday, May 01, 2013 4:43 PM >>>> To: Tim Peierls >>>> Cc: lambda-libs-spec-experts@**openjdk.java.net >>>> Subject: Re: Concat and zip >>>> >>>> Yes, a key aspect of this YAGNI suggestion is that this is one of the >>>> things people CAN do for themselves, possibly with an example. And, >>>> because there are multiple ways to do zip (pad out shorter stream to >>>> size of longer stream, stop at end of shorter stream, throw if streams >>>> are not of equal length) people may well want to roll their own. >>>> >>>> On 5/1/2013 4:25 PM, Tim Peierls wrote: >>>> >>>>> On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer >>>> **> wrote: >>>>> >>>>> In defense of zip: I think it is fairly important. Ease of >>>>> expression (and productivity) are still the most valuable >>>>> >>>> benefits >>>> >>>>> of new features, and zip is important for a class of problems. >>>>> >>>> zip >>>> >>>>> exists in Python, Scala, and Groovy (as "transpose" method), to >>>>> >>>> name >>>> >>>>> a few. >>>>> >>>>> I would keep zip without Int/Long/Double versions. This also >>>>> >>>> avoids >>>> >>>>> all the varieties of mixed primitive/Object pairings in zipped >>>>> >>>> streams. >>>> >>>>> >>>>> >>>>> That sounds reasonable. 
I was going to say make the ref/ref version >>>>> >>>> an >>>> >>>>> example in javadoc and then people can roll their own primitive/ref >>>>> versions, but maybe that's asking too much. >>>>> >>>>> --tim >>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130502/beb67742/attachment.html From Donald.Raab at gs.com Thu May 2 09:41:09 2013 From: Donald.Raab at gs.com (Raab, Donald [Tech]) Date: Thu, 2 May 2013 12:41:09 -0400 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F16AF@GSCMAMP09EX.firmwide.corp.gs.com> OK, so a day later, we had 3 folks report back with 4 instances of using zip with GS Collections. Two of the folks are fluent in Python and Lisp. There are probably more instances. One of the interesting things about zip is that unless you have programmed in a functional language, you probably won't know it even exists as a pattern. But if you have, you might go looking for it. > -----Original Message----- > From: Sam Pullara [mailto:spullara at gmail.com] > Sent: Wednesday, May 01, 2013 5:13 PM > To: Raab, Donald [Tech] > Cc: 'Brian Goetz'; 'Tim Peierls'; 'lambda-libs-spec- > experts at openjdk.java.net' > Subject: Re: Concat and zip > > It seems useful enough for the Object case to keep it in but not worth > its weight to add it to the primitive stuff. > > Sam > > On May 1, 2013, at 1:51 PM, "Raab, Donald [Tech]" > wrote: > > > I'm polling to see how many users of GS Collections have used zip > internally. We have it on our object API, not on our primitive API. > I'll let you know the response I get in the next day or so. 
> > > >> -----Original Message----- > >> From: lambda-libs-spec-experts-bounces at openjdk.java.net > >> [mailto:lambda- libs-spec-experts-bounces at openjdk.java.net] On > Behalf > >> Of Brian Goetz > >> Sent: Wednesday, May 01, 2013 4:43 PM > >> To: Tim Peierls > >> Cc: lambda-libs-spec-experts at openjdk.java.net > >> Subject: Re: Concat and zip > >> > >> Yes, a key aspect of this YAGNI suggestion is that this is one of > the > >> things people CAN do for themselves, possibly with an example. And, > >> because there are multiple ways to do zip (pad out shorter stream to > >> size of longer stream, stop at end of shorter stream, throw if > >> streams are not of equal length) people may well want to roll their > own. > >> > >> On 5/1/2013 4:25 PM, Tim Peierls wrote: > >>> On Wed, May 1, 2013 at 4:03 PM, Joe Bowbeer >>> > wrote: > >>> > >>> In defense of zip: I think it is fairly important. Ease of > >>> expression (and productivity) are still the most valuable > >> benefits > >>> of new features, and zip is important for a class of problems. > >> zip > >>> exists in Python, Scala, and Groovy (as "transpose" method), to > >> name > >>> a few. > >>> > >>> I would keep zip without Int/Long/Double versions. This also > >> avoids > >>> all the varieties of mixed primitive/Object pairings in zipped > >> streams. > >>> > >>> > >>> That sounds reasonable. I was going to say make the ref/ref version > >> an > >>> example in javadoc and then people can roll their own primitive/ref > >>> versions, but maybe that's asking too much. > >>> > >>> --tim From paul.sandoz at oracle.com Thu May 2 12:41:23 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 2 May 2013 21:41:23 +0200 Subject: Ranges Message-ID: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Hi, Any thoughts below on the following? use-cases? experiences? Paul. -- At the moment we have the {Int, Long, Double}Stream.range methods for a half open range of a step of 1 or a configurable step >= 0. 
Do we really require the configurable step methods? Perhaps they are not that common; if so, should we remove them?

Certainly one can achieve a step > 1 using map(i -> i * n), albeit less efficiently. Note that this will not check for overflow when the upper bound is declared to be the maximum value.

--

The fact that we only have a half-open range confuses some. Developers, I think, tend to prefer:

  IntStream.rangeClosed('A', 'Z')

rather than:

  IntStream.range('A', 'Z' + 1)

So it seems useful to add {Int, Long}Stream.rangeClosed().

--

Do we require a method for descending ranges, for example {Int, Long}Stream.rangeDec? What about closed descending ranges?

Again, those can be achieved using map, albeit less efficiently and with overflow errors, and perhaps developers will make simple errors expressing the mapping?

--

For DoubleStream.range I think supporting +ve/-ve steps is useful, e.g. [0, 2 * pi, pi / 180] and [2 * pi, 0, -pi / 180].

Should the range always be closed, half open, or both?

Half-open ranges for Int/Long make sense because of the correlation with array indexes.

--

Should we support sizes for LongStream.range that are > Long.MAX_VALUE? For example:

  LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE)

The top-level Spliterator of such a stream cannot report SIZED or SUBSIZED and would report an estimate of Long.MAX_VALUE.

DoubleStream.range is restricted to a maximum size of Long.MAX_VALUE. This simplifies the implementation, since it is equivalent to:

  double size = Math.ceil((endExclusive - startInclusive) / step);
  LongStream.range(0, (long) size).mapToDouble(i -> startInclusive + i * step);

and a simplified version of the long range spliterator is used.
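
The range idioms raised in this message can be sketched roughly as follows. This is only an illustration, assuming a released Java 8 or later JDK in which IntStream.range, IntStream.rangeClosed and LongStream.range are available. The helper names steppedRange, descending and doubleRange are hypothetical and not part of any API, and none of them guards against overflow.

```java
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

final class RangeSketch {

    // Hypothetical helper: half-open stepped range [start, endExclusive).
    // Assumes step > 0 and that the arithmetic below does not overflow.
    static IntStream steppedRange(int start, int endExclusive, int step) {
        int count = endExclusive <= start ? 0 : (endExclusive - start + step - 1) / step;
        return IntStream.range(0, count).map(i -> start + i * step);
    }

    // Hypothetical helper: descending range from startInclusive down to,
    // but not including, endExclusive, expressed with map.
    static IntStream descending(int startInclusive, int endExclusive) {
        int count = Math.max(0, startInclusive - endExclusive);
        return IntStream.range(0, count).map(i -> startInclusive - i);
    }

    // Hypothetical helper following the equivalence quoted in the message:
    // compute the element count, then map a long index to a double value.
    static DoubleStream doubleRange(double startInclusive, double endExclusive, double step) {
        double size = Math.ceil((endExclusive - startInclusive) / step);
        return LongStream.range(0, (long) size).mapToDouble(i -> startInclusive + i * step);
    }

    public static void main(String[] args) {
        // Closed versus half-open spelling of the same 'A'..'Z' range.
        System.out.println(IntStream.rangeClosed('A', 'Z').count());
        System.out.println(IntStream.range('A', 'Z' + 1).count());

        steppedRange(0, 10, 3).forEach(System.out::println);
        descending(3, 0).forEach(System.out::println);
        doubleRange(0.0, 1.0, 0.25).forEach(System.out::println);
    }
}
```

The map-based helpers make the trade-off in the message concrete: they keep the API surface small, but each caller has to get the count and the mapping right, and overflow near the int limits goes unchecked.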
From paul.sandoz at oracle.com Fri May 3 01:46:21 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 3 May 2013 10:46:21 +0200 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: Following on from this are the idiomatic stream creation methods, we have previously discussed: IntStream.ints(); LongStream.longs(); DoubleStream.doubles(); Half-open, or closed? I think a closed range would be the most likely expectation, and could be implemented using rangeClosed. doubles() would be restricted to [0.0, 2.0^53]. Paul. On May 2, 2013, at 9:41 PM, Paul Sandoz wrote: > Hi, > > Any thoughts below on the following? use-cases? experiences? > > Paul. > > -- > > At the moment we have the {Int, Long, Double}Stream.range methods for a half open range of a step of 1 or a configurable step >= 0. > > Do we really require the configurable step methods? Perhaps they are not that common, if so should we remove them? > > Certainly one can achieve a step > 1 using map(i -> i * n), albeit less efficiently. Note that this will not check for overflow when the upper bound declared to be the maximum value. > > -- > > The fact that we only have a half open range confuses some. Developers i think tend to prefer: > > IntStream.rangeClosed('A', 'Z') > > rather than: > > IntStream.range('A', 'Z' + 1) > > So it seems useful to add {Int, Long}Stream.rangeClosed(). > > -- > > Do we require a method for descending ranges, for example {Int, Long}Stream.rangeDec? what about for closed descending ranges? > > Again those can be achieved using map, albeit less efficiently and with overflow errors, and perhaps developers will make simple errors expressing the mapping? > > -- > > For DoubleStream.range i think supporting +ve/-ve steps is useful e.g. [0, 2 * pi, pi / 180] and [2 * pi, 0, -pi / 180] > > Should the range always be closed, half open or both? 
> > Half open ranges for Int/Long make sense because of the correlation with array indexes. > > -- > > Should we support sizes for LongStream.range that are > Long.MAX_VALUE? for example: > > LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE) > > The top-level Spliterator of such a stream cannot report SIZED or SUBSIZED and would report an estimate of Long.MAX_VALUE. > > DoubleStream.range is restricted to a maximum of Long.MAX_VALUE. This simplifies the implementation, since it is equivalent to: > > double size = Math.ceil((endExclusive - startInclusive) / step); > LongStream.range(0, (long) size).mapToDouble(i -> startInclusive + i * step); > > and a simplified version of the long range spliterator is used. > From tim at peierls.net Fri May 3 05:56:12 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 08:56:12 -0400 Subject: Ranges In-Reply-To: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: On Thu, May 2, 2013 at 3:41 PM, Paul Sandoz wrote: > At the moment we have the {Int, Long, Double}Stream.range methods for a > half open range of a step of 1 or a configurable step >= 0. > > Do we really require the configurable step methods? Perhaps they are not > that common, if so should we remove them? > > Certainly one can achieve a step > 1 using map(i -> i * n), albeit less > efficiently. Note that this will not check for overflow when the upper > bound declared to be the maximum value. > Feels like unnecessary clutter in the API. Make it into a one-liner example in the javadoc and you kill two birds with one stone: (1) Teach a person to fish and (2) make it easy to find a fishing pole. > The fact that we only have a half open range confuses some. Developers i > think tend to prefer: > IntStream.rangeClosed('A', 'Z') > rather than: > IntStream.range('A', 'Z' + 1) > So it seems useful to add {Int, Long}Stream.rangeClosed(). > Yes. 
My default expectation in general is half-open, e.g., [0, n), but in certain contexts, like the ['A', 'Z'] case, my expectation is different. This does *not* seem like clutter to me. > Do we require a method for descending ranges, for example {Int, > Long}Stream.rangeDec? what about for closed descending ranges? > > Again those can be achieved using map, albeit less efficiently and with > overflow errors, and perhaps developers will make simple errors expressing > the mapping? > Maybe, but like the step case, it would be clutter for no compelling reason. Another great example for javadoc. Should we support sizes for LongStream.range that are > Long.MAX_VALUE? for > example: > > LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE) > > The top-level Spliterator of such a stream cannot report SIZED or SUBSIZED > and would report an estimate of Long.MAX_VALUE. > Put a warning on the method that for ranges > Long.MAX_VALUE, the implementation will split badly? --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/512ba39a/attachment.html From tim at peierls.net Fri May 3 06:00:35 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 09:00:35 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: On Fri, May 3, 2013 at 4:46 AM, Paul Sandoz wrote: > Following on from this are the idiomatic stream creation methods, we have > previously discussed: > > IntStream.ints(); > LongStream.longs(); > DoubleStream.doubles(); > > Half-open, or closed? I think a closed range would be the most likely > expectation, and could be implemented using rangeClosed. > > doubles() would be restricted to [0.0, 2.0^53]. > I've missed something. How will the user see a difference between ints() implemented as a closed range vs. a half-open range? 
--tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/835b49cd/attachment.html From paul.sandoz at oracle.com Fri May 3 07:53:03 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 3 May 2013 16:53:03 +0200 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: On May 3, 2013, at 3:00 PM, Tim Peierls wrote: > On Fri, May 3, 2013 at 4:46 AM, Paul Sandoz wrote: > Following on from this are the idiomatic stream creation methods, we have previously discussed: > > IntStream.ints(); > LongStream.longs(); > DoubleStream.doubles(); > > Half-open, or closed? I think a closed range would be the most likely expectation, and could be implemented using rangeClosed. > > doubles() would be restricted to [0.0, 2.0^53]. > > I've missed something. How will the user see a difference between ints() implemented as a closed range vs. a half-open range? > The closed range will be for all non-negative int values, whereas the half-open range will be for all non-negative int values except Integer.MAX_VALUE. e.g. the difference between: for (int i = 0; i < Integer.MAX_VALUE; i++) { ... } and: for (int i = 0; i <= Integer.MAX_VALUE && i >= 0; i++) { ... } Paul. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/6b00968b/attachment.html From paul.sandoz at oracle.com Fri May 3 08:09:59 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 3 May 2013 17:09:59 +0200 Subject: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: <58B728C8-1906-4E06-9D77-13BB4CCA5B88@oracle.com> On May 3, 2013, at 2:56 PM, Tim Peierls wrote: > On Thu, May 2, 2013 at 3:41 PM, Paul Sandoz wrote: > At the moment we have the {Int, Long, Double}Stream.range methods for a half open range of a step of 1 or a configurable step >= 0. > > Do we really require the configurable step methods? Perhaps they are not that common, if so should we remove them? > > Certainly one can achieve a step > 1 using map(i -> i * n), albeit less efficiently. Note that this will not check for overflow when the upper bound declared to be the maximum value. > > Feels like unnecessary clutter in the API. Make it into a one-liner example in the javadoc and you kill two birds with one stone: (1) Teach a person to fish and (2) make it easy to find a fishing pole. > > > > The fact that we only have a half open range confuses some. Developers i think tend to prefer: > IntStream.rangeClosed('A', 'Z') > rather than: > IntStream.range('A', 'Z' + 1) > So it seems useful to add {Int, Long}Stream.rangeClosed(). > > Yes. My default expectation in general is half-open, e.g., [0, n), but in certain contexts, like the ['A', 'Z'] case, my expectation is different. This does not seem like clutter to me. > > > Do we require a method for descending ranges, for example {Int, Long}Stream.rangeDec? what about for closed descending ranges? > > Again those can be achieved using map, albeit less efficiently and with overflow errors, and perhaps developers will make simple errors expressing the mapping? > > Maybe, but like the step case, it would be clutter for no compelling reason. Another great example for javadoc. 
> Yes, i don't know how common overflow cases might be, but in general we can provide simple examples in JavaDoc for the very common cases where map can be used. Unless there are further objections i propose to remove the configurable step on int/long ranges. The same argument can be made for removing DoubleStream.range since one can use LongStream and map, but i am hesitant to go that far since we might be able to do something useful handling many edge cases associated with floating point numbers. > > > Should we support sizes for LongStream.range that are > Long.MAX_VALUE? for example: > > LongStream.range(Long.MIN_VALUE, Long.MAX_VALUE) > > The top-level Spliterator of such a stream cannot report SIZED or SUBSIZED and would report an estimate of Long.MAX_VALUE. > > Put a warning on the method that for ranges > Long.MAX_VALUE, the implementation will split badly? > It will split OK, since it can be divided into the negative and non-negative sub-ranges [Long.MIN_VALUE, 0) and [0, Long.MAX_VALUE). But i agree a warning on the behaviour of the estimated size is useful. FWIW it is edge cases like this that can complicate the implementation at the expense of the most common cases. Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/ac0294e3/attachment-0001.html From tim at peierls.net Fri May 3 09:06:54 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 12:06:54 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: On Fri, May 3, 2013 at 10:53 AM, Paul Sandoz wrote: > On May 3, 2013, at 3:00 PM, Tim Peierls wrote: > > I've missed something. How will the user see a difference between ints() > implemented as a closed range vs. a half-open range?
> > The closed range will be for all non-negative int values, whereas the > half open will be for all non-negative int values except Integer.MAX_VALUE. > If you were originally asking whether ints() should include MAX_VALUE, then my answer is yes. *How* its implementation accomplishes this isn't that important to me, whether as rangeClosed(0, Integer.MAX_VALUE) or something else. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/c34d9efd/attachment.html From tim at peierls.net Fri May 3 09:11:29 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 12:11:29 -0400 Subject: Ranges In-Reply-To: <58B728C8-1906-4E06-9D77-13BB4CCA5B88@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <58B728C8-1906-4E06-9D77-13BB4CCA5B88@oracle.com> Message-ID: On Fri, May 3, 2013 at 11:09 AM, Paul Sandoz wrote: > The same argument can be made for removing DoubleStream.range since one > can use LongStream and map, but i am hesitant to go that far since we might > be able to do something useful handling many edge cases associated with > floating point numbers. > Yes, that sounds like trickier stuff. > FWIW it is edge cases like this [LongStream.range()] that can complicate > the implementation at the expense of the most common cases. > Can't you have two implementations, one for ranges wider than Long.MAX_VALUE, one for the rest (the common case), and pick which one to use at runtime? --tim -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/ba6df01a/attachment.html From paul.sandoz at oracle.com Fri May 3 09:11:48 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 3 May 2013 18:11:48 +0200 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: <74C95924-D694-4779-B3DA-E0DC10DD27A5@oracle.com> On May 3, 2013, at 6:06 PM, Tim Peierls wrote: > On Fri, May 3, 2013 at 10:53 AM, Paul Sandoz wrote: > On May 3, 2013, at 3:00 PM, Tim Peierls wrote: >> I've missed something. How will the user see a difference between ints() implemented as a closed range vs. a half-open range? > > The closed range will be for all non-negative int values, whereas the half open will be for all non-negative int values except Integer.MAX_VALUE. > > If you were originally asking whether ints() should include MAX_VALUE, then my answer is yes. I was :-) Paul. > How its implementation accomplishes this isn't that important to me, whether as rangeClosed(0, Integer.MAX_VALUE) or something else. > > --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/83822c3f/attachment.html From brian.goetz at oracle.com Fri May 3 09:20:06 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 03 May 2013 12:20:06 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> Message-ID: <5183E3B6.5010508@oracle.com> There are two situations that give rise to ranges: math problems (which generally want closed ranges) and indexing (which generally wants open ranges.) And it's hard to satisfy both with one method. One question is: should they be spelled the same way? Would indexes(start, bound) and range(start, end) be better than having range(start, bound) and rangeClosed(start, end) ?
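To make the two use cases concrete, here is a minimal sketch. It assumes the range and rangeClosed methods as they later shipped on IntStream in Java 8; the API was still in flux at the time of this thread. The class name RangeFlavors and its helper methods are hypothetical and exist only for illustration.

```java
import java.util.stream.IntStream;

public class RangeFlavors {
    // Indexing use case: the half-open range [0, a.length) lines up with array indexes.
    static int sumByIndex(int[] a) {
        return IntStream.range(0, a.length).map(i -> a[i]).sum();
    }

    // "Math" use case: the closed range ['A', 'Z'] includes both end points.
    static String alphabet() {
        return IntStream.rangeClosed('A', 'Z')
                .collect(StringBuilder::new,
                         StringBuilder::appendCodePoint,
                         StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(sumByIndex(new int[] {1, 2, 3}));
        System.out.println(alphabet());
    }
}
```

With only a half-open method, the second helper would have to be written as range('A', 'Z' + 1), which is the spelling the thread describes as confusing.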
On 5/3/2013 12:06 PM, Tim Peierls wrote: > On Fri, May 3, 2013 at 10:53 AM, Paul Sandoz > wrote: > > On May 3, 2013, at 3:00 PM, Tim Peierls > wrote: >> I've missed something. How will the user see a difference between >> ints() implemented as a closed range vs. a half-open range? > The closed range will be for all non-negative int values, whereas > the half open will be for all non-negative int values except > Integer.MAX_VALUE. > > If you were originally asking whether ints() should include MAX_VALUE, > then my answer is yes. *How* its implementation accomplishes this isn't > that important to me, whether as rangeClosed(0, Integer.MAX_VALUE) or > something else. > > --tim From paul.sandoz at oracle.com Fri May 3 09:24:34 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 3 May 2013 18:24:34 +0200 Subject: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <58B728C8-1906-4E06-9D77-13BB4CCA5B88@oracle.com> Message-ID: <350E3D26-BD06-4FF9-8858-DBCE2343A8A0@oracle.com> On May 3, 2013, at 6:11 PM, Tim Peierls wrote: > FWIW it is edge cases like this [LongStream.range()] that can complicate the implementation at the expense of the most common cases. > > Can't you have two implementations, one for ranges wider than Long.MAX_VALUE, one for the rest (the common case), and pick which one to use at runtime? > Yes. IMO i don't think it should stop us doing the right thing API wise. I am just complaining a little bit :-) Paul. -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/5e026e5f/attachment.html From tim at peierls.net Fri May 3 10:24:16 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 13:24:16 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: <5183E3B6.5010508@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <5183E3B6.5010508@oracle.com> Message-ID: On Fri, May 3, 2013 at 12:20 PM, Brian Goetz wrote: > There are two situations that give rise to ranges: math problems (which > generally want closed ranges) and indexing (which generally wants open > ranges.) And its hard to satisfy both with one method. > > One question is: should they be spelled the same way? Would > indexes(start, bound) and range(start, end) be better than having > range(start, bound) and rangeClosed(start, end) ? I don't think it would. While "indexes" does capture the essence of a common use of this method, I think of [start, bound) as a range, and would look it up under that name. range and rangeClosed are very similar concepts; they should have similar names. The potential for confusion doesn't seem that significant to me. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/38ec75a0/attachment.html From brian.goetz at oracle.com Fri May 3 10:26:47 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 03 May 2013 13:26:47 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <5183E3B6.5010508@oracle.com> Message-ID: <5183F357.7060603@oracle.com> > I don't think it would. While "indexes" does capture the essence of a > common use of this method, I think of [start, bound) as a range, and > would look it up under that name. range and rangeClosed are very similar > concepts; they should have similar names. 
The potential for confusion > doesn't seem that significant to me. With two forms, I agree. But the next two questions are: - what about decreasing ranges? - what about decreasing closed/half-open ranges (whichever the above isn't)? Now, if the answer to these incremental questions is "you lose", then range and rangeClosed are fine, and we're done. From tim at peierls.net Fri May 3 10:35:31 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 3 May 2013 13:35:31 -0400 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: <5183F357.7060603@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <5183E3B6.5010508@oracle.com> <5183F357.7060603@oracle.com> Message-ID: On Fri, May 3, 2013 at 1:26 PM, Brian Goetz wrote: > I don't think it would. While "indexes" does capture the essence of a >> common use of this method, I think of [start, bound) as a range, and >> would look it up under that name. range and rangeClosed are very similar >> concepts; they should have similar names. The potential for confusion >> doesn't seem that significant to me. >> > > With two forms, I agree. But the next two questions are: > - what about decreasing ranges? > - what about decreasing closed/half-open ranges (whichever the above > isn't)? > > Now, if the answer to these incremental questions is "you lose", then > range and rangeClosed are fine, and we're done. > All of these variations seem like things that aren't hard to roll for yourself, especially with a few good examples in the javadoc. But if range and rangeClosed really don't feel like enough to you, then retain the stepped versions (with a different, uglier name) but add an open/closed argument so it's only one extra method for each of the primitive types and not two. rangeFarAfield(start, bound, step, HALF_OPEN); Ugly enough that people won't use it unless they have to. --tim -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130503/e5c871c9/attachment.html From daniel.smith at oracle.com Fri May 3 15:31:10 2013 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 3 May 2013 16:31:10 -0600 Subject: Concat and zip In-Reply-To: <518185ED.6050008@oracle.com> References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> <518185ED.6050008@oracle.com> Message-ID: On May 1, 2013, at 3:15 PM, Brian Goetz wrote: > So far I'm seeing: > > - No one throwing themselves in front of the train for concat That's crazy talk. _Of course_ we need concat. IMHO, one of the most important use cases for lazy lists is being able to paste things together on the front or back without iterating/making copies. Is there a reasonable workaround if the libraries don't provide it? It would be more natural if concat were a default method: Stream.andThen, Stream.after. I also think primitive concat is pretty fundamental for primitive streams. > - No one throwing themselves in front of the train for primitive zip > - Widespread regret if we were to ditch reference zip I'm much less adamant about this, but as a PL guy, I often find myself wanting to match up two lists (e.g., arguments with parameters). Again, what's the workaround? The current code in Streams does not look particularly easy to get right if I'm rolling my own. One possible concern that hasn't been mentioned: will this play well with a future BiStream version, or will it look out of place at that point? (Presumably we'll be able to say 'Streams.zip(x, y).combine(SomeClass::new)' someday.) > Right, but the question is, how badly can we implement it and have it not be worse than nothing? And, with the current performance characteristics (new object per element), are we below that threshold?
> Zip shows up all over the place in functional languages, but the efficacy of these idioms relies on tuples being cheap. Also zip parallelizes basically not at all. - What's great about the BiFunction style (vs. a Pair-generating style) is that you don't need any tuples at all. Hence, I'm not seeing "new object per element" in the code. Am I missing something? - I wouldn't take away too many points if a feature that is primarily about expressiveness does not parallelize nicely. --- I'm pretty sure you're not asking for new suggestions :-), but I wouldn't complain about a cross-product method to complement zip. --Dan From joe.bowbeer at gmail.com Mon May 6 00:08:40 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 6 May 2013 00:08:40 -0700 Subject: ints(), longs(), doubles() Re: Ranges In-Reply-To: References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <5183E3B6.5010508@oracle.com> <5183F357.7060603@oracle.com> Message-ID: I like the approach Tim suggests. "range" should behave one way (e.g., half-open), but there should also be some way to generate a closed range -- and the programmer who needs one will probably need to consult the javadoc every time it is needed. Joe On Fri, May 3, 2013 at 10:35 AM, Tim Peierls wrote: > On Fri, May 3, 2013 at 1:26 PM, Brian Goetz wrote: > >> I don't think it would. While "indexes" does capture the essence of a >>> common use of this method, I think of [start, bound) as a range, and >>> would look it up under that name. range and rangeClosed are very similar >>> concepts; they should have similar names. The potential for confusion >>> doesn't seem that significant to me. >>> >> >> With two forms, I agree. But the next two questions are: >> - what about decreasing ranges? >> - what about decreasing closed/half-open ranges (whichever the above >> isn't)? >> >> Now, if the answer to these incremental questions is "you lose", then >> range and rangeClosed are fine, and we're done.
>> > > All of these variations seem like things that aren't hard to roll for > yourself, especially with a few good examples in the javadoc. > > But if range and rangeClosed really don't feel like enough to you, then > retain the stepped versions (with a different, uglier name) but add an > open/closed argument so it's only one extra method for each of the > primitive types and not two. > > rangeFarAfield(start, bound, step, HALF_OPEN); > > Ugly enough that people won't use it unless they have to. > > --tim > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/dfb638c5/attachment.html From joe.bowbeer at gmail.com Mon May 6 00:10:14 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 6 May 2013 00:10:14 -0700 Subject: Concat and zip In-Reply-To: References: <5181586D.80509@oracle.com> <51817E5C.1000300@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1632@GSCMAMP09EX.firmwide.corp.gs.com> <518185ED.6050008@oracle.com> Message-ID: I like concat, too, but AFAICT zip was the only one on the chopping block, so that's the only one I posted a defense for. On Fri, May 3, 2013 at 3:31 PM, Dan Smith wrote: > On May 1, 2013, at 3:15 PM, Brian Goetz wrote: > > > So far I'm seeing: > > > > - No one throwing themselves in front of the train for concat > > That's crazy talk. _Of course_ we need concat. IMHO, one of the most > important use cases for lazy lists is being able to paste things together > on the front or back without iterating/making copies. > > Is there a reasonable workaround if the libraries don't provide it? > > It would be more natural if concat were a default method: Stream.andThen, > Stream.after. > > I also think primitive concat is pretty fundamental for primitive streams. 
> > > - No one throwing themselves in front of the train for primitive zip > > - Widespread regret if we were to ditch reference zip > > I'm much less adamant about this, but as a PL guy, I often find myself > wanting to match up two lists (e.g., arguments with parameters). > > Again, what's the workaround? The current code in Streams does not look > particularly easy to get right if I'm rolling my own. > > One possible concern that hasn't been mentioned: will this play well with > a future BiStream version, or will it look out of place at that point? > (Presumably we'll be able to say 'Streams.zip(x, > y).combine(SomeClass::new)' someday.) > > > Right, but the question is, how badly can we implement it and have it > not be worse than nothing? And, with the current performance > characteristics (new object per element), are we below that threshold? > > > Zip shows up all over the place in functional languages, but the > efficacy of these idioms relies on tuples being cheap. Also zip > parallelizes basically not at all. > > - What's great about the BiFunction style (vs. a Pair-generating style) is > that you don't need any tuples at all. Hence, I'm not seeing "new object > per element" in the code. Am I missing something? > > - I wouldn't take away too many points if a feature that is primarily > about expressiveness does not parallelize nicely. > > --- > > I'm pretty sure you're not asking for new suggestions :-), but I wouldn't > complain about a cross-product method to complement zip. > > --Dan -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/58c259bd/attachment.html From joe.bowbeer at gmail.com Mon May 6 00:18:34 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 6 May 2013 00:18:34 -0700 Subject: Static methods on Stream and friends In-Reply-To: <51742DB7.1000302@oracle.com> References: <51742DB7.1000302@oracle.com> Message-ID: I just updated to b88 and I needed to update my code to adjust for the migration of methods out of Streams. Now in b88, Streams contains only concat and zip, which seems odd. Isn't there more for Streams to do? Or is Streams going away? http://download.java.net/lambda/b88/docs/api/java/util/stream/Streams.html Joe On Sun, Apr 21, 2013 at 11:19 AM, Brian Goetz wrote: > I moved the following from Streams to Stream: > > Stream.builder() > Stream.empty() > Stream.singleton(T) > Stream.of(T...) > Stream.iterate(T, T -> T) > Stream.generate(i -> T) > > with the same on {Int,Long,Double}Stream, and also > > {Int,Long,Double}Stream.range(start, end) > {Int,Long,Double}Stream.range(start, end, step) > > It was suggested on lambda-dev that we should rename singleton to simply > be an overload of "of": > > Stream.of(T) > Stream.of(T...) > > which seems reasonable. > > Remaining open issues: > - Some people are unhappy that range is half-open (which also means > people are constrained to ranges topping out at MAX_VALUE-1 rather than > MAX_VALUE). Some options: > - Add XxxStream.rangeExclusive(start, end) > - Further doc hints, such as renaming the parameters to startInclusive > / endExclusive > - Nothing > - Paul has suggested that generate be finite. While this is kind of > yucky, the practical difference between infinite and long-sized is pretty > much negligible, and the version based on LongStream.range().map() > parallelizes much better. > > I propose to accept the suggestion of s/singleton/of/, go the "doc hint" > route on range, and go finite on generate.
> > Also never closed on whether there was value to ints() / longs() -- these > show up in lots of teaching examples, though less so in real-world code. > Still, teaching people how to think about this stuff is important. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/d29287fe/attachment.html From joe.bowbeer at gmail.com Mon May 6 00:19:49 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 6 May 2013 00:19:49 -0700 Subject: StringJoiner in b88 Message-ID: In b88, StringJoiner does not implement a stream destination? Is the "into" example in the javadoc no longer valid? http://download.java.net/lambda/b88/docs/api/java/util/StringJoiner.html Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/361d1a79/attachment.html From paul.sandoz at oracle.com Mon May 6 00:39:56 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 6 May 2013 09:39:56 +0200 Subject: Concat Re: Ranges In-Reply-To: <350E3D26-BD06-4FF9-8858-DBCE2343A8A0@oracle.com> References: <868E9A0F-D4A0-4F80-AAD5-CE43B4F1406E@oracle.com> <58B728C8-1906-4E06-9D77-13BB4CCA5B88@oracle.com> <350E3D26-BD06-4FF9-8858-DBCE2343A8A0@oracle.com> Message-ID: <7B1A9F1D-8AAB-412A-B028-562CDE894969@oracle.com> On May 3, 2013, at 6:24 PM, Paul Sandoz wrote: > On May 3, 2013, at 6:11 PM, Tim Peierls wrote: >> FWIW it is edge cases like this [LongStream.range()] that can complicate the implementation at the expense of the most common cases. >> >> Can't you have two implementations, one for ranges wider than Long.MAX_VALUE, one for the rest (the common case), and pick which one to use at runtime? >> > > Yes. IMO i don't think it should stop us doing the right thing API wise. 
I am just complaining a little bit :-) > FWIW it just occurred to me if we keep Streams.concat then that can be used to concatenate the two streams for ranges of known size. Paul. From paul.sandoz at oracle.com Mon May 6 01:28:42 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 6 May 2013 10:28:42 +0200 Subject: StringJoiner in b88 In-Reply-To: References: Message-ID: On May 6, 2013, at 9:19 AM, Joe Bowbeer wrote: > In b88, StringJoiner does not implement a stream destination? > > Is the "into" example in the javadoc no longer valid? > Right, it's an error. > http://download.java.net/lambda/b88/docs/api/java/util/StringJoiner.html > It should be something like: String commaSeparatedNames = people.stream().map(p -> p.getName()).collect(Collectors.toStringJoiner(", ")).toString(); I quickly updated the docs to: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/3a44a6038054 Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/a99142a0/attachment-0001.html From brian.goetz at oracle.com Mon May 6 05:56:43 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 06 May 2013 08:56:43 -0400 Subject: StringJoiner in b88 In-Reply-To: References: Message-ID: <5187A88B.6000601@oracle.com> There has not been a "stream destination" type or an "into" method for a very long time. But, if you want to use a string joiner as a stream target, do: stream.collect(toStringJoiner()); On 5/6/2013 3:19 AM, Joe Bowbeer wrote: > In b88, StringJoiner does not implement a stream destination? > > Is the "into" example in the javadoc no longer valid? 
> > http://download.java.net/lambda/b88/docs/api/java/util/StringJoiner.html > > Joe From brian.goetz at oracle.com Mon May 6 06:07:47 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 06 May 2013 09:07:47 -0400 Subject: Static methods on Stream and friends In-Reply-To: References: <51742DB7.1000302@oracle.com> Message-ID: <5187AB23.9010603@oracle.com> > Now in b88, Streams contains only concat and zip, which seems odd. > Isn't there more for Streams to do? Or is Streams going away? > > http://download.java.net/lambda/b88/docs/api/java/util/stream/Streams.html And this was my next question. There's a lot of package-private code, but at this point, no other public entry points. With static methods in interfaces, a lot of code that used to go in Foos can now go in Foo. But I think it is definitely possible to take this too far (think of the 5000 LoC currently in Collections; would we want to move it *all* to the various Collection interfaces?) Still, concat() has a decidedly different flavor to the other Stream methods; all the other stream methods take stateless (and therefore reusable) behavioral descriptions of how to produce transform a stream, whereas concat() takes another *stream*, which will be consumed in the process. So my inclination would be, if we keep these methods, to leave them in Streams. From paul.sandoz at oracle.com Mon May 6 09:56:48 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 6 May 2013 18:56:48 +0200 Subject: Spliterator.OfPrimitive Message-ID: Hi, Brian and I re-introduced a simpler form of Spliterator.OfPrimitive. public interface OfPrimitive> extends Spliterator This enables more sharing of primitive-based spliterator code, for example: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fad441986ad6 This should also reduce the bloat of primitive-based concat spliterator implementations, if we choose to add concatenation of primitive-based streams. 
We could go further and provide limits for the types declared by OfPrimitive: - T extends PrimitiveWrapper, and require Integer extends PrimitiveWrapper - T_CONS extends PrimitiveConsumer, and require IntConsumer extends PrimitiveConsumer But i am inclined to leave it as is. Thoughts? Paul. From joe.bowbeer at gmail.com Mon May 6 10:12:45 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Mon, 6 May 2013 10:12:45 -0700 Subject: StringJoiner in b88 In-Reply-To: <5187A88B.6000601@oracle.com> References: <5187A88B.6000601@oracle.com> Message-ID: I think these doc snippets at the destinations are very useful. Are there others that are missing or incorrect? On May 6, 2013 5:56 AM, "Brian Goetz" wrote: > There has not been a "stream destination" type or an "into" method for a > very long time. > > But, if you want to use a string joiner as a stream target, do: > > stream.collect(toStringJoiner()); > > On 5/6/2013 3:19 AM, Joe Bowbeer wrote: > >> In b88, StringJoiner does not implement a stream destination? >> >> Is the "into" example in the javadoc no longer valid? >> >> http://download.java.net/lambda/b88/docs/api/java/util/StringJoiner.html >> >> Joe >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130506/9f79984a/attachment.html From paul.sandoz at oracle.com Wed May 8 12:05:35 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 8 May 2013 21:05:35 +0200 Subject: Ranges redux Message-ID: Hi, I think we reached consensus on: - {Int, Long}Stream.range/rangeClosed for step of 1 IMO that seems good enough, and we can provide examples using map for step > 1 and descending ranges. We have still to converge on DoubleStream.range, which might suggest no strong opinions or we lack the use-cases. There are some tricky edge-cases to deal with. I am very close to proposing we drop it... Paul.
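A minimal sketch of the map-based approach mentioned above, for a step greater than 1 and for a descending range. It assumes IntStream.range and IntStream.rangeClosed as they shipped in Java 8; the class name RangeSketch and the helpers stepped and descendingClosed are hypothetical names chosen for illustration, not part of any proposed API.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class RangeSketch {
    // Half-open stepped range [start, endExclusive) with step >= 1,
    // built from a unit-step index range plus map.
    static IntStream stepped(int start, int endExclusive, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        // Compute the element count in long arithmetic so the subtraction cannot overflow.
        long span = (long) endExclusive - start;
        // toIntExact throws ArithmeticException if the count does not fit in an int.
        int count = span <= 0 ? 0 : Math.toIntExact((span + step - 1) / step);
        return IntStream.range(0, count).map(i -> start + i * step);
    }

    // Closed descending range: high, high - 1, ..., low.
    static IntStream descendingClosed(int high, int low) {
        return IntStream.rangeClosed(low, high).map(i -> high - (i - low));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(stepped(0, 10, 3).toArray()));
        System.out.println(Arrays.toString(descendingClosed(5, 1).toArray()));
    }
}
```

The simpler map(i -> i * n) form from earlier in the thread works when the range starts at zero, but as noted there it does not check for overflow near the maximum value; computing the count up front is one way to keep the end bound exact.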
From brian.goetz at oracle.com Wed May 8 12:24:37 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 08 May 2013 15:24:37 -0400
Subject: Ranges redux
In-Reply-To:
References:
Message-ID: <518AA675.5000501@oracle.com>

I am OK with cutting back on ranges further. range/rangeClosed seem good enough to me for integral ranges (along with convenience ints/longs methods).

On 5/8/2013 3:05 PM, Paul Sandoz wrote:
> Hi,
>
> I think we reached consensus on:
>
> - {Int, Long}Stream.range/rangeClosed for step of 1
>
> IMO that seems good enough, and we can provide examples using map for step > 1 and descending ranges.
>
> We have still to converge on DoubleStream.range, which might suggest no strong opinions or we lack the use-cases. There are some tricky edge-cases to deal with. I am very close to proposing we drop it...
>
> Paul.
>

From dl at cs.oswego.edu Wed May 8 12:27:16 2013
From: dl at cs.oswego.edu (Doug Lea)
Date: Wed, 08 May 2013 15:27:16 -0400
Subject: Ranges redux
In-Reply-To: <518AA675.5000501@oracle.com>
References: <518AA675.5000501@oracle.com>
Message-ID: <518AA714.8000402@cs.oswego.edu>

On 05/08/13 15:24, Brian Goetz wrote:
> I am OK with cutting back on ranges further. range/rangeClosed seem good enough

rangeClosed -> closedRange?

-Doug

From tim at peierls.net Wed May 8 12:40:39 2013
From: tim at peierls.net (Tim Peierls)
Date: Wed, 8 May 2013 15:40:39 -0400
Subject: Ranges redux
In-Reply-To: <518AA714.8000402@cs.oswego.edu>
References: <518AA675.5000501@oracle.com> <518AA714.8000402@cs.oswego.edu>
Message-ID:

On Wed, May 8, 2013 at 3:27 PM, Doug Lea
wrote: > On 05/08/13 15:24, Brian Goetz wrote: > >> I am OK with cutting back on ranges further. range/rangeClosed seem good >> enough >> > > rangeClosed -> closedRange? Maybe, but it's nice when they sort together in the javadoc method list. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130508/c6b8d4cf/attachment.html From brian.goetz at oracle.com Thu May 9 12:14:38 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 09 May 2013 15:14:38 -0400 Subject: Loose-ends wrapup Message-ID: <518BF59E.1090405@oracle.com> The majority of the lambda libraries code has been put back to the jdk8 repositories. I'm gathering a list of loose ends that we might want to circle back to. The bar for nontrivial new features at this point is high, but there are plenty of things in the "small tweaks" category that we can do. There's also lots of work remaining in improving the implementation and especially the documentation and specification. This is a really great time to contribute improvements in this area. Streams -- lingering feature ideas - Additional tweaking on range generators (see Paul's emails) - Convenience ints() and longs() generator methods? (ditto) - Collector for frequency counting? - Support for state-based cancelation (e.g., cancelWhen(BooleanSupplier)) - Support for content-based limiting (takeWhile, skipUntil) - Convenience methods like toList() on Stream SAMs - Additional static or default methods on standard SAMs? Point lambdafications - Gotta be more of these? Helper classes - (I hesitate to even suggest): Optional.{filter,map,flatMap} Now that Stream.flatMap is settled, it becomes reasonable to consider these. What others have I missed? 
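
For the "collector for frequency counting" item in the list above, a minimal sketch is shown below. It assumes the Collectors.groupingBy and Collectors.counting methods as they exist in released JDK 8, which may differ from the API at the time of this thread; the class and method names FrequencyExample and frequencies are hypothetical.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FrequencyExample {
    /** Counts how often each distinct element occurs in the stream. */
    static <T> Map<T, Long> frequencies(Stream<T> stream) {
        // Group equal elements together and count the members of each group.
        return stream.collect(
                Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(frequencies(Stream.of("a", "b", "a")));
    }
}
```

Whether a dedicated convenience collector is worth adding on top of this composition is exactly the open question in the list.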
From brian.goetz at oracle.com Thu May 9 12:42:13 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 09 May 2013 15:42:13 -0400 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <518BFC15.8020503@oracle.com> > Streams -- lingering feature ideas > > - Additional tweaking on range generators (see Paul's emails) > - Convenience ints() and longs() generator methods? (ditto) > - Collector for frequency counting? > - Support for state-based cancelation (e.g., cancelWhen(BooleanSupplier)) > - Support for content-based limiting (takeWhile, skipUntil) > - Convenience methods like toList() on Stream Plus: - Moving stream() / parallelStream() to Iterable From david.lloyd at redhat.com Thu May 9 12:51:15 2013 From: david.lloyd at redhat.com (David M. Lloyd) Date: Thu, 09 May 2013 14:51:15 -0500 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <518BFE33.3010008@redhat.com> On 05/09/2013 02:14 PM, Brian Goetz wrote: > The majority of the lambda libraries code has been put back to the jdk8 > repositories. I'm gathering a list of loose ends that we might want to > circle back to. The bar for nontrivial new features at this point is > high, but there are plenty of things in the "small tweaks" category that > we can do. > > There's also lots of work remaining in improving the implementation and > especially the documentation and specification. This is a really great > time to contribute improvements in this area. > > Streams -- lingering feature ideas > > - Additional tweaking on range generators (see Paul's emails) > - Convenience ints() and longs() generator methods? (ditto) > - Collector for frequency counting? 
> - Support for state-based cancelation (e.g., cancelWhen(BooleanSupplier))
> - Support for content-based limiting (takeWhile, skipUntil)
> - Convenience methods like toList() on Stream
>
> SAMs
> - Additional static or default methods on standard SAMs?
>
> Point lambdafications
> - Gotta be more of these?
>
> Helper classes
> - (I hesitate to even suggest): Optional.{filter,map,flatMap}
>   Now that Stream.flatMap is settled, it becomes reasonable to
>   consider these.
>
> What others have I missed?

Bringing it up for the last time: we should not allow serialization of capturing lambdas. Instead they should be proactively rejected, as they are even less stable than anonymous classes, relying on order of capture rather than (at least somewhat semantically relevant) name. Representing an EE vendor, I'm certain we will end up dealing with fallout otherwise.

This is assuming that switching to name-based serialization is still off the table, which is also not great but it is at least no worse than what we have today.

--
- DML

From Donald.Raab at gs.com Thu May 9 16:53:04 2013
From: Donald.Raab at gs.com (Raab, Donald [Tech])
Date: Thu, 9 May 2013 19:53:04 -0400
Subject: Performance of default Spliterators
Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com>

Apologies if this was already discussed, thought about, planned or in progress.

Right now in the build I am using (about a week old) spliterator returns the following:

    default Spliterator<E> spliterator() {
        return Spliterators.spliterator(this, Spliterator.ORDERED);
    }

For ArrayList this is overridden to return an ArrayListSpliterator. I think there should be an instanceof check in spliterator to check for RandomAccess so performance is better for other RandomAccess lists that might be implemented in other libraries. So the following code would need to be changed from this:

    public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
                                                 int additionalCharacteristics) {
        return new IteratorSpliterator<>(Objects.requireNonNull(c),
                                         additionalCharacteristics);
    }

To this:

    public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
                                                 int additionalCharacteristics) {
        if (c instanceof RandomAccess)
            return new RandomAccessSpliterator<>(c, additionalCharacteristics);
        return new IteratorSpliterator<>(Objects.requireNonNull(c),
                                         additionalCharacteristics);
    }

RandomAccessSpliterator would of course need to be implemented.

Thoughts? Make sense? Already planned?

From brian.goetz at oracle.com Thu May 9 17:53:46 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 09 May 2013 20:53:46 -0400
Subject: Performance of default Spliterators
In-Reply-To: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com>
References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com>
Message-ID: <518C451A.9080902@oracle.com>

Entirely reasonable. Actually, might not even need much implementing. You could do:

    return IntStream.range(0, size()).map(List::get);

(if you could tolerate the early binding to size.)

The efficacy question is: what List implementations implement RA that don't already have their own specialized spliterator?

On 5/9/2013 7:53 PM, Raab, Donald [Tech] wrote:
> Apologies if this was already discussed, thought about, planned or in
> progress.
> Right now in the build I am using (about a week old) spliterator returns
> the following:
> default Spliterator<E> spliterator() {
>     return Spliterators.spliterator(this, Spliterator.ORDERED);
> }
> For ArrayList this is overridden to return an ArrayListSpliterator.
I > think there should be an instance of check in spliterator to check for > RandomAccess so performance is better for other RandomAccess lists that > might be implemented in other libraries. So the following code would > need to be changed from this: > public static Spliterator spliterator(Collection c, > int > additionalCharacteristics) { > return new IteratorSpliterator<>(Objects.requireNonNull(c), > additionalCharacteristics); > } > To this: > public static Spliterator spliterator(Collection c, > int > additionalCharacteristics) { > if (c instanceof RandomAccess) > return new RandomAccessSpliterator<>(c, additionalCharacteristics); > return new IteratorSpliterator<>(Objects.requireNonNull(c), > additionalCharacteristics); > } > RandomAccessSpliterator would of course need to be implemented. > Thoughts? Make sense? Already planned? From Donald.Raab at gs.com Thu May 9 18:14:59 2013 From: Donald.Raab at gs.com (Raab, Donald [Tech]) Date: Thu, 9 May 2013 21:14:59 -0400 Subject: Performance of default Spliterators In-Reply-To: <518C451A.9080902@oracle.com> References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <518C451A.9080902@oracle.com> Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F1A43@GSCMAMP09EX.firmwide.corp.gs.com> Any RandomAccess Lists not currently in the JDK repository. For example: http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/collect/ImmutableList.html GS Collections has quite a few as well. I'd like for parallel streams to be performant when used there. :-) > -----Original Message----- > From: Brian Goetz [mailto:brian.goetz at oracle.com] > Sent: Friday, May 10, 2013 1:54 AM > To: Raab, Donald [Tech] > Cc: lambda-libs-spec-experts at openjdk.java.net > Subject: Re: Performance of default Spliterators > > Entirely reasonable. Actually, might not even need much implementing. 
> You could do: > > return IntStream.range(0, size()).map(List::get); > > (if you could tolerate the early binding to size.) > > The efficacy question is: what List implementations implement RA that > don't already have their own specialized spliterator? > > On 5/9/2013 7:53 PM, Raab, Donald [Tech] wrote: > > Apologies if this was already discussed, thought about, planned or in > > progress. > > Right now in the build I am using (about a week old) spliterator > > returns the following: > > default Spliterator spliterator() { > > return Spliterators.spliterator(this, Spliterator.ORDERED); > > } > > For ArrayList this is overridden to return an ArrayListSpliterator. > I > > think there should be an instance of check in spliterator to check > for > > RandomAccess so performance is better for other RandomAccess lists > > that might be implemented in other libraries. So the following code > > would need to be changed from this: > > public static Spliterator spliterator(Collection extends T> c, > > int > > additionalCharacteristics) { > > return new IteratorSpliterator<>(Objects.requireNonNull(c), > > additionalCharacteristics); > > } > > To this: > > public static Spliterator spliterator(Collection extends T> c, > > int > > additionalCharacteristics) { > > if (c instanceof RandomAccess) > > return new RandomAccessSpliterator<>(c, > additionalCharacteristics); > > return new IteratorSpliterator<>(Objects.requireNonNull(c), > > additionalCharacteristics); > > } > > RandomAccessSpliterator would of course need to be implemented. > > Thoughts? Make sense? Already planned? 
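
Brian's index-based shorthand quoted above (IntStream.range(0, size()) mapped through get) can be spelled out as the sketch below. It assumes the released JDK 8 API, where mapping an int index to an object uses mapToObj rather than map; the names IndexedStreams and byIndex are hypothetical. As Brian notes, the size is bound early: it is read once when the stream is created.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class IndexedStreams {
    /** Streams a list by index; sensible only for RandomAccess lists. */
    static <T> Stream<T> byIndex(List<T> list) {
        // The index range splits evenly, so parallel runs get balanced work.
        return IntStream.range(0, list.size()).mapToObj(list::get);
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");
        System.out.println(byIndex(source).parallel().collect(Collectors.toList()));
    }
}
```

This avoids writing a spliterator at all, at the cost of going through an intermediate int stream.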
From Donald.Raab at gs.com Thu May 9 18:28:48 2013
From: Donald.Raab at gs.com (Raab, Donald [Tech])
Date: Thu, 9 May 2013 21:28:48 -0400
Subject: Performance of default Spliterators
In-Reply-To: <518C451A.9080902@oracle.com>
References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <518C451A.9080902@oracle.com>
Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F1A44@GSCMAMP09EX.firmwide.corp.gs.com>

Here are some performance tests I was trying out tonight with GSC. The results here are for filtering a list of size 1 million 3 times with different predicates on a 2 core Windows machine. The tests here were warmed up 100 times then executed 200 times so a human doesn't mind waiting. The results of the tests for JDK-Parallel were different enough for me to notice, so I started digging into the implementation details a bit.

FastList is our mutable RandomAccess List which is getting the IteratorSpliterator for JDK-Parallel.

JDK-Serial Filter:   FastList size: 1,000,000 Count: 200 Total(ms): 17,735 Avg(ms): 88.679
GSC-Serial Select:   FastList size: 1,000,000 Count: 200 Total(ms): 12,754 Avg(ms): 63.771

JDK-Parallel Filter: FastList size: 1,000,000 Count: 200 Total(ms): 23,415 Avg(ms): 117.076
GSC-Parallel Select: FastList size: 1,000,000 Count: 200 Total(ms): 7,121 Avg(ms): 35.607
GSC-ForkJoin Select: FastList size: 1,000,000 Count: 200 Total(ms): 6,524 Avg(ms): 32.62

Compare this to java.util.ArrayList, which had a much better JDK-Parallel result because of the specialized override.

JDK-Serial Filter:   ArrayList size: 1,000,000 Count: 200 Total(ms): 14,019 Avg(ms): 70.097
GSC-Serial Select:   ArrayList size: 1,000,000 Count: 200 Total(ms): 12,398 Avg(ms): 61.99

JDK-Parallel Filter: ArrayList size: 1,000,000 Count: 200 Total(ms): 7,591 Avg(ms): 37.959
GSC-ForkJoin Select: ArrayList size: 1,000,000 Count: 200 Total(ms): 6,612 Avg(ms): 33.064
GSC-Parallel Select: ArrayList size: 1,000,000 Count: 200 Total(ms): 6,638 Avg(ms): 33.195

> -----Original Message-----
> From: Brian Goetz [mailto:brian.goetz at oracle.com]
> Sent: Friday, May 10, 2013 1:54 AM
> To: Raab, Donald [Tech]
> Cc: lambda-libs-spec-experts at openjdk.java.net
> Subject: Re: Performance of default Spliterators
>
> Entirely reasonable. Actually, might not even need much implementing.
> You could do:
>
>     return IntStream.range(0, size()).map(List::get);
>
> (if you could tolerate the early binding to size.)
>
> The efficacy question is: what List implementations implement RA that
> don't already have their own specialized spliterator?
>
> On 5/9/2013 7:53 PM, Raab, Donald [Tech] wrote:
> > Apologies if this was already discussed, thought about, planned or in
> > progress.
> > Right now in the build I am using (about a week old) spliterator
> > returns the following:
> > default Spliterator<E> spliterator() {
> >     return Spliterators.spliterator(this, Spliterator.ORDERED);
> > }
> > For ArrayList this is overridden to return an ArrayListSpliterator. I
> > think there should be an instanceof check in spliterator to check for
> > RandomAccess so performance is better for other RandomAccess lists
> > that might be implemented in other libraries.
> > So the following code would need to be changed from this:
> > public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
> >                                              int additionalCharacteristics) {
> >     return new IteratorSpliterator<>(Objects.requireNonNull(c),
> >                                      additionalCharacteristics);
> > }
> > To this:
> > public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
> >                                              int additionalCharacteristics) {
> >     if (c instanceof RandomAccess)
> >         return new RandomAccessSpliterator<>(c, additionalCharacteristics);
> >     return new IteratorSpliterator<>(Objects.requireNonNull(c),
> >                                      additionalCharacteristics);
> > }
> > RandomAccessSpliterator would of course need to be implemented.
> > Thoughts? Make sense? Already planned?

From paul.sandoz at oracle.com Fri May 10 01:33:21 2013
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 10 May 2013 10:33:21 +0200
Subject: Performance of default Spliterators
In-Reply-To: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com>
References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com>
Message-ID: <1403BEE0-393C-47B7-AF80-974C2524F1E4@oracle.com>

On May 10, 2013, at 1:53 AM, "Raab, Donald [Tech]" wrote:

> Apologies if this was already discussed, thought about, planned or in progress.
>
> Right now in the build I am using (about a week old) spliterator returns the following:
>
>     default Spliterator<E> spliterator() {
>         return Spliterators.spliterator(this, Spliterator.ORDERED);
>     }
>
> For ArrayList this is overridden to return an ArrayListSpliterator. I think there should be an instanceof check in spliterator to check for RandomAccess so performance is better for other RandomAccess lists that might be implemented in other libraries. So the following code would need to be changed from this:
>
>     public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
>                                                  int additionalCharacteristics) {
>         return new IteratorSpliterator<>(Objects.requireNonNull(c),
>                                          additionalCharacteristics);
>     }
>
> To this:
>
>     public static <T> Spliterator<T> spliterator(Collection<? extends T> c,
>                                                  int additionalCharacteristics) {
>         if (c instanceof RandomAccess)
>             return new RandomAccessSpliterator<>(c, additionalCharacteristics);
>
>         return new IteratorSpliterator<>(Objects.requireNonNull(c),
>                                          additionalCharacteristics);
>     }
>

Perhaps it would be better to change the default spliterator() implementation in List, since Collection & RandomAccess is not very useful:

    @Override
    default Spliterator<E> spliterator() {
        if (this instanceof RandomAccess) {
            return new RandomAccessListSpliterator(this);
        } else {
            return Spliterators.spliterator(this, Spliterator.ORDERED);
        }
    }

> RandomAccessSpliterator would of course need to be implemented.
>
> Thoughts? Make sense? Already planned?
>

Seems OK to me.

Paul.

From paul.sandoz at oracle.com Fri May 10 02:02:16 2013
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 10 May 2013 11:02:16 +0200
Subject: Performance of default Spliterators
In-Reply-To: <6712820CB52CFB4D842561213A77C05404CC6F1A44@GSCMAMP09EX.firmwide.corp.gs.com>
References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <518C451A.9080902@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1A44@GSCMAMP09EX.firmwide.corp.gs.com>
Message-ID: <04B9B057-0EBB-45AF-AFB2-89D852DC89AD@oracle.com>

Hi Donald,

I dunno if you work from the lambda source; attached is a patch implementing the random access list spliterator. If you don't work from source, you could copy the spliterator and explicitly hook it up using the constructors in StreamSupport to see if the performance improves.

Also, what terminal operation are you using for the JDK test code? collect(toList()) ?

Paul.
-------------- next part -------------- On May 10, 2013, at 3:28 AM, "Raab, Donald [Tech]" wrote: > Here are some performance tests I was trying out tonight with GSC. The results here are for filtering a list of size 1 million 3 times with different predicates on a 2 core Windows machine. The tests here were warmed up 100 times then executed 200 times so a human doesn't mind waiting. The result of the tests for JDK-Parallel were significantly different enough for me to notice so I started digging into the implementation details a bit. > > FastList is our mutable RandomAccess List which is getting the IteratorSpliterator for JDK-Parallel. > > **JDK-Serial Filter: FastList size: 1,000,000 Count: 200 Total(ms): 17,735 Avg(ms): 88.679 > GSC-Serial** Select: FastList size: 1,000,000 Count: 200 Total(ms): 12,754 Avg(ms): 63.771 > > JDK-Parallel Filter: FastList size: 1,000,000 Count: 200 Total(ms): 23,415 Avg(ms): 117.076 > GSC-Parallel Select: FastList size: 1,000,000 Count: 200 Total(ms): 7,121 Avg(ms): 35.607 > GSC-ForkJoin Select: FastList size: 1,000,000 Count: 200 Total(ms): 6,524 Avg(ms): 32.62 > > Compared to java.util.ArrayList which had a much better JDK-Parallel result because of the specialized override. 
> > **JDK-Serial Filter: ArrayList size: 1,000,000 Count: 200 Total(ms): 14,019 Avg(ms): 70.097 > GSC-Serial** Select: ArrayList size: 1,000,000 Count: 200 Total(ms): 12,398 Avg(ms): 61.99 > > JDK-Parallel Filter: ArrayList size: 1,000,000 Count: 200 Total(ms): 7,591 Avg(ms): 37.959 > GSC-ForkJoin Select: ArrayList size: 1,000,000 Count: 200 Total(ms): 6,612 Avg(ms): 33.064 > GSC-Parallel Select: ArrayList size: 1,000,000 Count: 200 Total(ms): 6,638 Avg(ms): 33.195 From paul.sandoz at oracle.com Fri May 10 02:18:57 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 10 May 2013 11:18:57 +0200 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <43524DA3-63E7-4588-A4BC-74D1D3179B98@oracle.com> On May 9, 2013, at 9:14 PM, Brian Goetz wrote: > The majority of the lambda libraries code has been put back to the jdk8 repositories. I'm gathering a list of loose ends that we might want to circle back to. The bar for nontrivial new features at this point is high, but there are plenty of things in the "small tweaks" category that we can do. > > There's also lots of work remaining in improving the implementation and especially the documentation and specification. This is a really great time to contribute improvements in this area. > > Streams -- lingering feature ideas > > - Additional tweaking on range generators (see Paul's emails) > - Convenience ints() and longs() generator methods? (ditto) > - Collector for frequency counting? > - Support for state-based cancelation (e.g., cancelWhen(BooleanSupplier)) > - Support for content-based limiting (takeWhile, skipUntil) > - Convenience methods like toList() on Stream > - Support for primitive stream concatenation (now that it is easier to share code for primitive spliterators) - Concatenation for > 2 streams? Paul. > SAMs > - Additional static or default methods on standard SAMs? > > Point lambdafications > - Gotta be more of these? 
>
> Helper classes
> - (I hesitate to even suggest): Optional.{filter,map,flatMap}
>   Now that Stream.flatMap is settled, it becomes reasonable to consider these.
>
> What others have I missed?
>
>
>

From paul.sandoz at oracle.com Fri May 10 04:51:28 2013
From: paul.sandoz at oracle.com (Paul Sandoz)
Date: Fri, 10 May 2013 13:51:28 +0200
Subject: Performance of default Spliterators
In-Reply-To: <04B9B057-0EBB-45AF-AFB2-89D852DC89AD@oracle.com>
References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <518C451A.9080902@oracle.com> <6712820CB52CFB4D842561213A77C05404CC6F1A44@GSCMAMP09EX.firmwide.corp.gs.com> <04B9B057-0EBB-45AF-AFB2-89D852DC89AD@oracle.com>
Message-ID: <3B1F7DE8-92BF-4E35-9C2A-FB3D47C90E1D@oracle.com>

Grrr... attachments are silently stripped.

Paul.

# HG changeset patch
# Parent 0a76b1e789a9725ecb89c7ab3566150f5f12359c

diff -r 0a76b1e789a9 -r 0ef8e2fc2de3 src/share/classes/java/util/AbstractList.java
--- a/src/share/classes/java/util/AbstractList.java	Fri May 10 00:36:02 2013 -0700
+++ b/src/share/classes/java/util/AbstractList.java	Fri May 10 10:55:38 2013 +0200
@@ -25,6 +25,8 @@
 
 package java.util;
 
+import java.util.function.Consumer;
+
 /**
  * This class provides a skeletal implementation of the {@link List}
  * interface to minimize the effort required to implement this interface
@@ -608,6 +610,72 @@
     private String outOfBoundsMsg(int index) {
         return "Index: "+index+", Size: "+size();
     }
+
+    /** Index-based split-by-two, lazily initialized Spliterator */
+    static class RandomAccessListSpliterator<E> implements Spliterator<E> {
+        private final List<E> list;
+        private int index; // current index, modified on advance/split
+        private int fence; // -1 until used; then one past last index
+
+        RandomAccessListSpliterator(List<E> list) {
+            this(list, 0, -1);
+            assert list instanceof RandomAccess;
+        }
+
+        /** Create new spliterator covering the given range */
+        private RandomAccessListSpliterator(List<E> list, int origin, int fence) {
+            this.list = list;
+            this.index = origin;
+            this.fence = fence;
+        }
+
+        private int getFence() { // initialize fence to size on first use
+            int hi;
+            if ((hi = fence) < 0) {
+                hi = fence = list.size();
+            }
+            return hi;
+        }
+
+        public Spliterator<E> trySplit() {
+            int hi = getFence(), lo = index, mid = (lo + hi) >>> 1;
+            return (lo >= mid) ? null : // divide range in half unless too small
+                    new RandomAccessListSpliterator<>(list, lo, index = mid);
+        }
+
+        public boolean tryAdvance(Consumer<? super E> action) {
+            if (action == null)
+                throw new NullPointerException();
+            int hi = getFence(), i = index;
+            if (i < hi) {
+                index = i + 1;
+                action.accept(list.get(i));
+                return true;
+            }
+            return false;
+        }
+
+        public void forEachRemaining(Consumer<? super E> action) {
+            if (action == null)
+                throw new NullPointerException();
+            int hi = getFence(), i = index;
+            List<E> lst = list;
+            if (i < hi) {
+                index = hi;
+                for (; i < hi; ++i) {
+                    action.accept(lst.get(i));
+                }
+            }
+        }
+
+        public long estimateSize() {
+            return (long) (getFence() - index);
+        }
+
+        public int characteristics() {
+            return Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED;
+        }
+    }
 }
 
 class SubList<E> extends AbstractList<E> {
diff -r 0a76b1e789a9 -r 0ef8e2fc2de3 src/share/classes/java/util/List.java
--- a/src/share/classes/java/util/List.java	Fri May 10 00:36:02 2013 -0700
+++ b/src/share/classes/java/util/List.java	Fri May 10 10:55:38 2013 +0200
@@ -680,7 +680,11 @@
      */
     @Override
     default Spliterator<E> spliterator() {
-        return Spliterators.spliterator(this, Spliterator.ORDERED);
+        if (this instanceof RandomAccess) {
+            return new AbstractList.RandomAccessListSpliterator<>(this);
+        } else {
+            return Spliterators.spliterator(this, Spliterator.ORDERED);
+        }
     }
 }
 
diff -r 0a76b1e789a9 -r 0ef8e2fc2de3 test/java/util/Spliterator/SpliteratorTraversingAndSplittingTest.java
--- a/test/java/util/Spliterator/SpliteratorTraversingAndSplittingTest.java	Fri May 10 00:36:02 2013 -0700
+++ b/test/java/util/Spliterator/SpliteratorTraversingAndSplittingTest.java	Fri May 10 10:55:38 2013 +0200
@@ -50,6 +50,7 @@
 import java.util.List;
 import java.util.Map;
 import java.util.PriorityQueue;
+import java.util.RandomAccess;
 import java.util.Set;
 import java.util.SortedSet;
 import java.util.Spliterator;
@@ -322,6 +323,25 @@
             db.add("Arrays.asList().spliterator()",
                    () -> Spliterators.spliterator(Arrays.asList(exp.toArray(new Integer[0])), 0));
 
+            class RandomAccessList extends AbstractList<Integer> implements RandomAccess {
+                List<Integer> list;
+
+                RandomAccessList(Collection<Integer> list) {
+                    this.list = new ArrayList<>(list);
+                }
+
+                @Override
+                public Integer get(int index) {
+                    return list.get(index);
+                }
+
+                @Override
+                public int size() {
+                    return list.size();
+                }
+            }
+            db.addList(RandomAccessList::new);
+
             db.addList(ArrayList::new);
 
             db.addList(LinkedList::new);

On May 10, 2013, at 11:02 AM, Paul Sandoz wrote:

> Hi Donald,
>
> I dunno if you work from the lambda source; attached is a patch implementing the random access list spliterator. If you don't work from source, you could copy the spliterator and explicitly hook it up using the constructors in StreamSupport to see if the performance improves.
>
> Also, what terminal operation are you using for the JDK test code? collect(toList()) ?
>
> Paul.
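
Paul's suggestion to copy the spliterator and hook it up explicitly through StreamSupport might look like the sketch below. It is a simplified stand-in and not the patch itself: the fence is fixed eagerly at construction instead of lazily, and the names IndexSpliteratorDemo, IndexSpliterator, and parallelFilter are hypothetical. It assumes the StreamSupport.stream(Spliterator, boolean) factory as it exists in released JDK 8, and a list whose size does not change while the stream runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class IndexSpliteratorDemo {
    /** Illustrative index-based spliterator over a RandomAccess list. */
    static final class IndexSpliterator<E> implements Spliterator<E> {
        private final List<E> list;
        private int index;       // next index to visit
        private final int fence; // one past the last index to visit

        IndexSpliterator(List<E> list, int origin, int fence) {
            this.list = list;
            this.index = origin;
            this.fence = fence;
        }

        @Override
        public Spliterator<E> trySplit() {
            int lo = index, mid = (lo + fence) >>> 1;
            if (lo >= mid) return null;     // too small to split
            index = mid;                    // this half keeps [mid, fence)
            return new IndexSpliterator<>(list, lo, mid);
        }

        @Override
        public boolean tryAdvance(Consumer<? super E> action) {
            Objects.requireNonNull(action);
            if (index >= fence) return false;
            action.accept(list.get(index++));
            return true;
        }

        @Override
        public long estimateSize() {
            return fence - index;
        }

        @Override
        public int characteristics() {
            return ORDERED | SIZED | SUBSIZED;
        }
    }

    /** Filters a list in parallel using the index-based spliterator. */
    static <E> List<E> parallelFilter(List<E> source, Predicate<? super E> keep) {
        Spliterator<E> split = new IndexSpliterator<>(source, 0, source.size());
        return StreamSupport.stream(split, true)
                .filter(keep)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 1000; i++) source.add(i);
        System.out.println(parallelFilter(source, i -> i % 2 == 0).size());
    }
}
```

A library with its own RandomAccess list could use this route to get balanced splits without waiting for a change to the List default.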
From dl at cs.oswego.edu Fri May 10 05:00:20 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 08:00:20 -0400 Subject: Performance of default Spliterators In-Reply-To: <1403BEE0-393C-47B7-AF80-974C2524F1E4@oracle.com> References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <1403BEE0-393C-47B7-AF80-974C2524F1E4@oracle.com> Message-ID: <518CE154.9010207@cs.oswego.edu> On 05/10/13 04:33, Paul Sandoz wrote: > Perhaps it would be better to change the default spliterator() implementation in List, since Collection & RandomAccess is not very useful: > > @Override > default Spliterator spliterator() { > if (this instanceof RandomAccess) { > return new RandomAccessListSpliterator(this); > } else { > return Spliterators.spliterator(this, Spliterator.ORDERED); > } > } > > >> RandomAccessSpliterator would of course need to be implemented. >> >> Thoughts? Make sense? Already planned? >> > > Seems OK to me. > My first thought was: anyone able to create a new RandomAccess List is surely able to write a Spliterator for it, and almost surely a better one than we could supply as a default, so why burden all other cases with an extra instanceof check to dispatch to it? On the other hand, since most/all commonly used forms of Lists will override default anyway, the impact is surely too small to notice, so it would be fine to do this. (The main problem here that we have seen many times is that there is no named type RandomAccessList, so there is no good place to put a default.) 
-Doug From Donald.Raab at gs.com Fri May 10 05:10:33 2013 From: Donald.Raab at gs.com (Raab, Donald [Tech]) Date: Fri, 10 May 2013 08:10:33 -0400 Subject: Performance of default Spliterators In-Reply-To: <518CE154.9010207@cs.oswego.edu> References: <6712820CB52CFB4D842561213A77C05404CC6F1A42@GSCMAMP09EX.firmwide.corp.gs.com> <1403BEE0-393C-47B7-AF80-974C2524F1E4@oracle.com> <518CE154.9010207@cs.oswego.edu> Message-ID: <6712820CB52CFB4D842561213A77C05404CC6F1A5B@GSCMAMP09EX.firmwide.corp.gs.com> > (The main problem here that we have seen many times is that there is no > named type RandomAccessList, so there is no good place to put a > default.) > > -Doug So why don't we add it? From paul.sandoz at oracle.com Fri May 10 08:53:45 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 10 May 2013 17:53:45 +0200 Subject: Stream.generate redux Message-ID: <0FB13BF1-F629-47AA-B729-6F369A135EC0@oracle.com> We have three choices for implementations for Int/Long/Double/Stream.generate. Note that the expectation is Stream.generate will be used in conjunction with a short-circuiting operation, such as limit, and generate serves as the implementation for generating streams of random numbers with Random. The 3 choices: 1) infinite, ordered, iterator-based, right-balanced tree This is the current state of affairs and tends to parallelize poorly due to it being iterator-based, which results in the production of right-balanced (unbalanced) trees. Infinite is potentially a desirable property, but infinite and right-balanced combined are a poor combination. Ordered is not intrinsically bad. It depends on what constraints are applied to invoking the supplier. If the constraint is the supplier may be called concurrently or in no particular order then the stream should not report any encounter order since non-deterministic results could be produced, contradicting that the stream has an encounter order. 
Consider this example:

  LongStream.range(0, Long.MAX_VALUE).parallel().map(i -> supplier.get());

The stream input to the map has an encounter order, but the supplier is called in a temporal order, so the elements in the stream output from the map will be jumbled up in no particular order.

It just so happens for 1) that the supplier is not called concurrently; this is a consequence of being an iterator-based supplier.


2) finite of known size, unordered, using LongStream.range(0, Long.MAX_VALUE).map(i -> s.get()).unordered()

A maximum of Long.MAX_VALUE elements will be generated.

We need to explicitly make the stream unordered, since map, reasonably, preserves order, assuming there is a correlation between the input and output.

This implementation can now be combined with limit without throwing OOMEs, since we recently optimized limit with sized streams in the lambda repo.


3) infinite, unordered, using infinite supplying spliterator

This can also be combined with limit without throwing OOMEs, pending changes to the limit implementation (I have a patch). Previously I thought it would be tricky but Brian and I have managed to optimize the limit case for unordered streams.


I like the fact that 3) is now possible :-) There is a price to be paid though for limiting, since it requires some buffering of elements and CASing on an AtomicLong (optimistically buffering reduces the number of CASes). 2) requires no buffering or use of concurrent data structures.

So... I think it primarily comes down to:

  finite; or infinite with some extra cost

?

Is Long.MAX_VALUE generated elements sufficient for most needs?

FWIW perhaps an infinite source of generated values makes more sense if there were additional ways of short-circuiting such as cancellation (as we previously had and could add back) or if there were limitWhile/takeWhile operations.

Paul.
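
Option 2) above can be sketched as follows. This assumes the released JDK 8 API, where a LongStream is mapped to objects with mapToObj rather than map; the names GenerateSketch and finiteGenerate are hypothetical and this is not the implementation that was eventually chosen for Stream.generate.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

final class GenerateSketch {
    /** Option 2: at most Long.MAX_VALUE supplied elements, reported as unordered. */
    static <T> Stream<T> finiteGenerate(Supplier<T> supplier) {
        // The range only drives how many times the supplier is called;
        // the index itself is ignored, so the result carries no encounter order.
        return LongStream.range(0, Long.MAX_VALUE)
                .mapToObj(i -> supplier.get())
                .unordered();
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        List<Integer> firstFive = GenerateSketch.<Integer>finiteGenerate(counter::getAndIncrement)
                .limit(5)
                .collect(Collectors.toList());
        System.out.println(firstFive.size());
    }
}
```

Because the source is a sized range, it splits into balanced halves, which is the property option 1) lacks.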
From joe.bowbeer at gmail.com Fri May 10 09:33:23 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 10 May 2013 09:33:23 -0700 Subject: Stream.generate redux In-Reply-To: <0FB13BF1-F629-47AA-B729-6F369A135EC0@oracle.com> References: <0FB13BF1-F629-47AA-B729-6F369A135EC0@oracle.com> Message-ID: Concerning the question regarding Long.MAX_VALUE values, I bet this is enough for most use cases, but is that the right question -- when one is dealing with an infinite stream? I "might" be OK with limiting the number of values in the case of primitive streams, because of the inherent constraints on primitive values, but I would not want to limit the number of values generated for an infinite non-primitive Stream. Btw, what would happen after Long.MAX_VALUES were generated? The stream-driven computation would stop? Joe On Fri, May 10, 2013 at 8:53 AM, Paul Sandoz wrote: > We have three choices for implementations for > Int/Long/Double/Stream.generate. Note that the expectation is > Stream.generate will be used in conjunction with a short-circuiting > operation, such as limit, and generate serves as the implementation for > generating streams of random numbers with Random. > > The 3 choices: > > 1) infinite, ordered, iterator-based, right-balanced tree > > This is the current state of affairs and tends to parallelize poorly due > to it being iterator-based, which results in the production of > right-balanced (unbalanced) trees. > > Infinite is potentially a desirable property, but infinite and > right-balanced combined are a poor combination. > > Ordered is not intrinsically bad. It depends on what constraints are > applied to invoking the supplier. If the constraint is the supplier may be > called concurrently or in no particular order then the stream should not > report any encounter order since non-deterministic results could be > produced, contradicting that the stream has an encounter order. 
> > Consider this example: > > LongStream.range(0, Long.MAX_VALUE).parallel().map(i -> supplier.get()); > > The stream input to the map has an encounter but the supplier is called in > a temporal order, so the elements in the stream output from the map will be > jumbled up in no particular order. > > It just so happens for 1) that the supplier is not called concurrently, > this is a consequence of being an iterator-based supplier. > > > 2) finite of known size, unordered, using LongStream.range(0, > Long.MAX_VALUE).map(i -> s.get()).unordered() > > A maximum of Long.MAX_VALUE elements will be generated. > > We need to explicitly make the stream unordered, since map, reasonably, > preserves order, assuming there is a correlation between the input and > output. > > This implementation can now be combined with limit with out throwing > OOMEs, since we recently optimized limit with sized streams in the lambda > repo. > > > 3) infinite, unordered, using infinite supplying spliterator > > This can also be combined with limit with out throwing OOMEs, pending > changes to the limit implementation (I have a patch). Previously i thought > it would be tricky but Brian and I have managed to optimize the limit case > for unordered streams. > > > I like the fact that 3) is now possible :-) there is a price to be paid > though for limiting since it requires some buffering of elements and CASing > on an AtomicLong (optimistically buffering reduces the number of CASes). 2) > requires no buffering or use of concurrent data structures. > > So... i think it primarily comes down to: > > finite; or infinite with some extra cost > > ? > > Is Long.MAX_VALUE generated elements sufficient for most needs? > > FWIW perhaps an infinite source of generated values makes more sense if > there were additional ways of short-circuiting such as cancellation (as we > previously had and could add back) or if there are a limitWhile/takeWhile > operations. > > Paul. 
From brian.goetz at oracle.com Fri May 10 09:47:48 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 10 May 2013 12:47:48 -0400 Subject: Stream.generate redux In-Reply-To: References: <0FB13BF1-F629-47AA-B729-6F369A135EC0@oracle.com> Message-ID: <518D24B4.7070407@oracle.com> Mostly this comes down to splitting characteristics, which have a significant effect on whether parallel pipelines do as you expect them to. We can easily generate an infinite stream from an Iterator, and can even parallelize it somewhat for problems with enough Q, but the splitting for that will always suck. The naive split is (first, rest), and while there are less naive versions, they still come down to (a few, a lot), which results in a lopsided computation tree that is very deep on the right. These right-heavy trees really suck. They often give poor parallelism, and have bad space utilization behaviors (Paul has been chasing OOMEs over these for weeks now). The main nice property about the finite generator is that it can be implemented as basically: range(0, BIG).map(i -> f()) What we gain here is that now the source spliterator for our stream is balanced and sized; this means we'll get nicely balanced trees and predictable splits. This enables some big optimizations, and avoids some big potholes. You are right that there's a semantic reason to limit primitive streams (the wraparound will likely cause errors) which is not present with an object stream. But the splitting behavior is still a big problem. As to your question, yes, after Long.MAX_VALUE elements, the stream would terminate. It would be a finite stream.
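[Editorial sketch, not part of the original message.] The difference between the two split shapes described above can be observed directly on the spliterators. The following is a rough sketch assuming the Java 8 API as it eventually shipped; the class and method names are hypothetical, and the exact sizes printed depend on the JDK's internal batching, so no particular output is claimed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.LongStream;

public class SplitShapeSketch {

    // Splits a sized range source once and returns
    // { estimated size of the split-off prefix, estimated size of the remainder }.
    static long[] splitSizedRange(long n) {
        Spliterator.OfLong rest = LongStream.range(0, n).spliterator();
        Spliterator.OfLong prefix = rest.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    // Splits an iterator-backed source of unknown size once and returns the same pair.
    static long[] splitEndlessIterator() {
        Iterator<Long> endless = new Iterator<Long>() {
            private long counter = 0;
            @Override public boolean hasNext() { return true; }
            @Override public Long next() { return counter++; }
        };
        Spliterator<Long> rest = Spliterators.spliteratorUnknownSize(endless, 0);
        Spliterator<Long> prefix = rest.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        // Sized source: the two parts together account for exactly n elements.
        System.out.println("range:    " + Arrays.toString(splitSizedRange(1024)));
        // Iterator source: a bounded batch is split off; the remainder stays "unknown".
        System.out.println("iterator: " + Arrays.toString(splitEndlessIterator()));
    }
}
```

The iterator-backed case is the (a few, a lot) split: every further split peels another batch off the front, which is what produces the right-heavy tree.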
On 5/10/2013 12:33 PM, Joe Bowbeer wrote: > Concerning the question regarding Long.MAX_VALUE values, I bet this is > enough for most use cases, but is that the right question -- when one is > dealing with an infinite stream? > > I "might" be OK with limiting the number of values in the case of > primitive streams, because of the inherent constraints on primitive > values, but I would not want to limit the number of values generated for > an infinite non-primitive Stream. Btw, what would happen after > Long.MAX_VALUES were generated? The stream-driven computation would stop? > > Joe > > > On Fri, May 10, 2013 at 8:53 AM, Paul Sandoz > wrote: > > We have three choices for implementations for > Int/Long/Double/Stream.generate. Note that the expectation is > Stream.generate will be used in conjunction with a short-circuiting > operation, such as limit, and generate serves as the implementation > for generating streams of random numbers with Random. > > The 3 choices: > > 1) infinite, ordered, iterator-based, right-balanced tree > > This is the current state of affairs and tends to parallelize poorly > due to it being iterator-based, which results in the production of > right-balanced (unbalanced) trees. > > Infinite is potentially a desirable property, but infinite and > right-balanced combined are a poor combination. > > Ordered is not intrinsically bad. It depends on what constraints are > applied to invoking the supplier. If the constraint is the supplier > may be called concurrently or in no particular order then the stream > should not report any encounter order since non-deterministic > results could be produced, contradicting that the stream has an > encounter order. > > Consider this example: > > LongStream.range(0, Long.MAX_VALUE).parallel().map(i -> > supplier.get()); > > The stream input to the map has an encounter but the supplier is > called in a temporal order, so the elements in the stream output > from the map will be jumbled up in no particular order. 
> > It just so happens for 1) that the supplier is not called > concurrently, this is a consequence of being an iterator-based supplier. > > > 2) finite of known size, unordered, using LongStream.range(0, > Long.MAX_VALUE).map(i -> s.get()).unordered() > > A maximum of Long.MAX_VALUE elements will be generated. > > We need to explicitly make the stream unordered, since map, > reasonably, preserves order, assuming there is a correlation between > the input and output. > > This implementation can now be combined with limit with out throwing > OOMEs, since we recently optimized limit with sized streams in the > lambda repo. > > > 3) infinite, unordered, using infinite supplying spliterator > > This can also be combined with limit with out throwing OOMEs, > pending changes to the limit implementation (I have a patch). > Previously i thought it would be tricky but Brian and I have > managed to optimize the limit case for unordered streams. > > > I like the fact that 3) is now possible :-) there is a price to be > paid though for limiting since it requires some buffering of > elements and CASing on an AtomicLong (optimistically buffering > reduces the number of CASes). 2) requires no buffering or use of > concurrent data structures. > > So... i think it primarily comes down to: > > finite; or infinite with some extra cost > > ? > > Is Long.MAX_VALUE generated elements sufficient for most needs? > > FWIW perhaps an infinite source of generated values makes more sense > if there were additional ways of short-circuiting such as > cancellation (as we previously had and could add back) or if there > are a limitWhile/takeWhile operations. > > Paul. 
> > From dl at cs.oswego.edu Fri May 10 10:02:39 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 13:02:39 -0400 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <518D282F.4040405@cs.oswego.edu> On 05/09/13 15:14, Brian Goetz wrote: > What others have I missed? The lambda-dev post by John Rose reminded me that we were going to revisit the need for ConcurrentHashBag: A (massive) simplification of CHM that only conforms to Collection interface (so among other things, duplicates are allowed), and is handy for shoving unordered elements for concurrent aggregation. I've had a version of this sitting around for a year or so... -Doug From brian.goetz at oracle.com Fri May 10 10:51:59 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 10 May 2013 13:51:59 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D282F.4040405@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> Message-ID: <518D33BF.3090908@oracle.com> Many slippery-slope questions come to mind -- doesn't this beg for: - Bag interface - Bag decorators (unmodifiableBag, synchronizedBag) - Non-concurrent implementation, perhaps based on HashSet Given all that, though, a toBag() collector is nice, and sidesteps issues of merge functions. On 5/10/2013 1:02 PM, Doug Lea wrote: > On 05/09/13 15:14, Brian Goetz wrote: > >> What others have I missed? > > The lambda-dev post by John Rose reminded me that we were going > to revisit the need for ConcurrentHashBag: A (massive) simplification > of CHM that only conforms to Collection interface (so among > other things, duplicates are allowed), and is handy > for shoving unordered elements for concurrent aggregation. > I've had a version of this sitting around for a year or so... 
> > -Doug > > From kevinb at google.com Fri May 10 10:55:34 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 10 May 2013 10:55:34 -0700 Subject: Loose-ends wrapup In-Reply-To: <518D282F.4040405@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> Message-ID: Hey Doug, Can you compare/contrast the API and behavior you have in mind with ConcurrentHashMultiset ? On Fri, May 10, 2013 at 10:02 AM, Doug Lea
wrote: > On 05/09/13 15:14, Brian Goetz wrote: > > What others have I missed? >> > > The lambda-dev post by John Rose reminded me that we were going > to revisit the need for ConcurrentHashBag: A (massive) simplification > of CHM that only conforms to Collection interface (so among > other things, duplicates are allowed), and is handy > for shoving unordered elements for concurrent aggregation. > I've had a version of this sitting around for a year or so... > > -Doug > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130510/bffe5613/attachment.html From dl at cs.oswego.edu Fri May 10 11:09:17 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 14:09:17 -0400 Subject: Loose-ends wrapup In-Reply-To: References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> Message-ID: <518D37CD.3030807@cs.oswego.edu> On 05/10/13 13:55, Kevin Bourrillion wrote: > Hey Doug, > > Can you compare/contrast the API and behavior you have in mind with > ConcurrentHashMultiset > ? > > It only supports the methods defined in the Collection API. It uses JDK8-CHM-based segmentless mechanics. It is the most efficient class I know for unordered concurrent aggregation. -Doug From dl at cs.oswego.edu Fri May 10 11:09:54 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 14:09:54 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D33BF.3090908@oracle.com> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D33BF.3090908@oracle.com> Message-ID: <518D37F2.902@cs.oswego.edu> On 05/10/13 13:51, Brian Goetz wrote: > Many slippery-slope questions come to mind -- doesn't this beg for: > - Bag interface YAGNI. 
"Bag" just means "not a Set or List or other Collection subinterface" > - Bag decorators (unmodifiableBag, synchronizedBag) unmodifiableCollection would work fine. > - Non-concurrent implementation, perhaps based on HashSet Yes. This is one thing that stalled previous discussion. It is plausible, and would be a good alternative to ArrayList for uses that call contains() frequently, but not all that compelling otherwise. -Doug > > Given all that, though, a toBag() collector is nice, and sidesteps issues of > merge functions. > > On 5/10/2013 1:02 PM, Doug Lea wrote: >> On 05/09/13 15:14, Brian Goetz wrote: >> >>> What others have I missed? >> >> The lambda-dev post by John Rose reminded me that we were going >> to revisit the need for ConcurrentHashBag: A (massive) simplification >> of CHM that only conforms to Collection interface (so among >> other things, duplicates are allowed), and is handy >> for shoving unordered elements for concurrent aggregation. >> I've had a version of this sitting around for a year or so... >> >> -Doug >> >> > From brian.goetz at oracle.com Fri May 10 11:12:23 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 10 May 2013 14:12:23 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D37F2.902@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D33BF.3090908@oracle.com> <518D37F2.902@cs.oswego.edu> Message-ID: <518D3887.2030007@oracle.com> >> Many slippery-slope questions come to mind -- doesn't this beg for: >> - Bag interface > > YAGNI. "Bag" just means "not a Set or List or other Collection > subinterface" Maybe. A multiset typically implements a method like: int count(Object) > unmodifiableCollection would work fine. except then, even if CHB implemented a count(o) method, it becomes hidden by the wrapper. So, are you saying that count(Object) is a YAGNI issue too? 
From dl at cs.oswego.edu Fri May 10 11:16:56 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 14:16:56 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D3887.2030007@oracle.com> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D33BF.3090908@oracle.com> <518D37F2.902@cs.oswego.edu> <518D3887.2030007@oracle.com> Message-ID: <518D3998.8060301@cs.oswego.edu> On 05/10/13 14:12, Brian Goetz wrote: >>> Many slippery-slope questions come to mind -- doesn't this beg for: >>> - Bag interface >> >> YAGNI. "Bag" just means "not a Set or List or other Collection >> subinterface" > > Maybe. A multiset typically implements a method like: > > int count(Object) > >> unmodifiableCollection would work fine. > > except then, even if CHB implemented a count(o) method, it becomes hidden by the > wrapper. > > So, are you saying that count(Object) is a YAGNI issue too? > Yes. Please no count method. For the same reason we don't have a count method in List either, even though it would be equally applicable. -Doug From dl at cs.oswego.edu Fri May 10 11:53:54 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 14:53:54 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D37CD.3030807@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D37CD.3030807@cs.oswego.edu> Message-ID: <518D4242.1000200@cs.oswego.edu> On 05/10/13 14:09, Doug Lea wrote: > On 05/10/13 13:55, Kevin Bourrillion wrote: >> Hey Doug, >> >> Can you compare/contrast the API and behavior you have in mind with >> ConcurrentHashMultiset >> ? >> >> >> > > It only supports the methods defined in the Collection API. > It uses JDK8-CHM-based segmentless mechanics. > It is the most efficient class I know for unordered concurrent > aggregation. > Oh, and it does not allow add(null). 
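[Editorial sketch, not part of the original messages.] On the count(Object) exchange above: the point that an extra method is hidden once the collection is wrapped can be shown with a small example. This is only an illustration; CountingBag is a hypothetical stand-in (backed by ArrayList purely for brevity) and has nothing to do with the actual ConcurrentHashBag code. Collections.frequency is an existing JDK method that answers the same question for any Collection.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

public class HiddenCountSketch {

    // Hypothetical bag-like type that adds a count(Object) method; not a JDK class.
    static class CountingBag<E> extends ArrayList<E> {
        int count(Object o) {
            return Collections.frequency(this, o);
        }
    }

    // Works on any Collection, including an unmodifiable wrapper,
    // without needing a dedicated count method on the type.
    static int countThroughWrapper(Collection<?> c, Object o) {
        return Collections.frequency(c, o);
    }

    public static void main(String[] args) {
        CountingBag<String> bag = new CountingBag<>();
        Collections.addAll(bag, "a", "b", "a");

        Collection<String> view = Collections.unmodifiableCollection(bag);
        // view.count("a");   // would not compile: the wrapper is only a Collection

        System.out.println(bag.count("a"));
        System.out.println(countThroughWrapper(view, "a"));
    }
}
```

Collections.frequency is a linear scan, so this says nothing about how efficiently a real multiset could answer the same query.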
-Doug From dl at cs.oswego.edu Fri May 10 12:07:52 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 15:07:52 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D4242.1000200@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D37CD.3030807@cs.oswego.edu> <518D4242.1000200@cs.oswego.edu> Message-ID: <518D4588.6080008@cs.oswego.edu> On 05/10/13 14:53, Doug Lea wrote: > On 05/10/13 14:09, Doug Lea wrote: >> On 05/10/13 13:55, Kevin Bourrillion wrote: >>> Hey Doug, >>> >>> Can you compare/contrast the API and behavior you have in mind with >>> ConcurrentHashMultiset >>> ? >>> >>> >>> >>> >> >> It only supports the methods defined in the Collection API. >> It uses JDK8-CHM-based segmentless mechanics. >> It is the most efficient class I know for unordered concurrent >> aggregation. >> > > Oh, and it does not allow add(null). > But after dusting off this code, I changed to do so with very little cost, which removes the need for streams to figure out whether it could use it. Probably the best move, even though I cringe about it. -Doug From kevinb at google.com Fri May 10 12:32:43 2013 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 10 May 2013 12:32:43 -0700 Subject: Loose-ends wrapup In-Reply-To: <518D37CD.3030807@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D37CD.3030807@cs.oswego.edu> Message-ID: On Fri, May 10, 2013 at 11:09 AM, Doug Lea
wrote: It only supports the methods defined in the Collection API. > Oh. I believe that severely curtails its usefulness, but could try to back that up with stats from Google's codebase if necessary. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130510/83675c40/attachment.html From dl at cs.oswego.edu Fri May 10 13:00:32 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Fri, 10 May 2013 16:00:32 -0400 Subject: Loose-ends wrapup In-Reply-To: References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D37CD.3030807@cs.oswego.edu> Message-ID: <518D51E0.5060604@cs.oswego.edu> On 05/10/13 15:32, Kevin Bourrillion wrote: > On Fri, May 10, 2013 at 11:09 AM, Doug Lea
> wrote: > > It only supports the methods defined in the Collection API. > > > Oh. I believe that severely curtails its usefulness, but could try to back that > up with stats from Google's codebase if necessary. > Are you referring to the usages for which we decided to recommend (pasting from javadoc...) *

A ConcurrentHashMap can be used as scalable frequency map (a * form of histogram or multiset) by using {@link * java.util.concurrent.atomic.LongAdder} values and initializing via * {@link #computeIfAbsent computeIfAbsent}. For example, to add a count * to a {@code ConcurrentHashMap<String,LongAdder> freqs}, you can use * {@code freqs.computeIfAbsent(key, k -> new LongAdder()).increment();} * -Doug From brian.goetz at oracle.com Fri May 10 13:25:37 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 10 May 2013 16:25:37 -0400 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <518D57C1.3000308@oracle.com> Note that to move forward on these, someone will need to champion them. Otherwise most will implicitly fall to the YAGNI axe. On 5/9/2013 3:14 PM, Brian Goetz wrote: > The majority of the lambda libraries code has been put back to the jdk8 > repositories. I'm gathering a list of loose ends that we might want to > circle back to. The bar for nontrivial new features at this point is > high, but there are plenty of things in the "small tweaks" category that > we can do. > > There's also lots of work remaining in improving the implementation and > especially the documentation and specification. This is a really great > time to contribute improvements in this area. > > Streams -- lingering feature ideas > > - Additional tweaking on range generators (see Paul's emails) > - Convenience ints() and longs() generator methods? (ditto) > - Collector for frequency counting? > - Support for state-based cancelation (e.g., cancelWhen(BooleanSupplier)) > - Support for content-based limiting (takeWhile, skipUntil) > - Convenience methods like toList() on Stream > > SAMs > - Additional static or default methods on standard SAMs? > > Point lambdafications > - Gotta be more of these?
> > Helper classes > - (I hesitate to even suggest): Optional.{filter,map,flatMap} > Now that Stream.flatMap is settled, it becomes reasonable to > consider these. > > What others have I missed? > > > From dl at cs.oswego.edu Sat May 11 03:33:05 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 11 May 2013 06:33:05 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D33BF.3090908@oracle.com> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518D33BF.3090908@oracle.com> Message-ID: <518E1E61.2090306@cs.oswego.edu> On 05/10/13 13:51, Brian Goetz wrote: > Many slippery-slope questions come to mind -- doesn't this beg for: > - Bag interface > - Bag decorators (unmodifiableBag, synchronizedBag) > - Non-concurrent implementation, perhaps based on HashSet > > Given all that, though, a toBag() collector is nice, and sidesteps issues of > merge functions. If it causes people to be less prone to misinterpret, this could instead be called ConcurrentHashBuffer. -Doug > > On 5/10/2013 1:02 PM, Doug Lea wrote: >> On 05/09/13 15:14, Brian Goetz wrote: >> >>> What others have I missed? >> >> The lambda-dev post by John Rose reminded me that we were going >> to revisit the need for ConcurrentHashBag: A (massive) simplification >> of CHM that only conforms to Collection interface (so among >> other things, duplicates are allowed), and is handy >> for shoving unordered elements for concurrent aggregation. >> I've had a version of this sitting around for a year or so... >> >> -Doug >> >> > From dl at cs.oswego.edu Sat May 11 08:02:12 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sat, 11 May 2013 11:02:12 -0400 Subject: Loose-ends wrapup In-Reply-To: <518D282F.4040405@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> Message-ID: <518E5D74.3060702@cs.oswego.edu> Yet another self-reply... On 05/10/13 13:02, Doug Lea wrote: > On 05/09/13 15:14, Brian Goetz wrote: > >> What others have I missed? 
> > The lambda-dev post by John Rose reminded me that we were going > to revisit the need for ConcurrentHashBag: A (massive) simplification > of CHM that only conforms to Collection interface (so among > other things, duplicates are allowed), and is handy > for shoving unordered elements for concurrent aggregation. > I've had a version of this sitting around for a year or so... > The empirical question is what a (renamed) ConcurrentHashBuffer buys you. How common is collecting into an unordered non-Set, non-Map destination? (CHM and CHM.newKeySet suffice for the others.) And of those, what is the likelihood that these collections have mostly-distinct elements? If they are mostly the same, then hashing into a buffer will often be worse than other options because of all the collisions. Which all together, still seems on the marginal side for JDK inclusion. Maybe I should package up ConcurrentHashBuffer as one of our jsr166e.extra classes. (Aside: it would still be nice if there were a convenient way for people to collect into something providing our scalable frequency histogram idioms.) -Doug From brian.goetz at oracle.com Sat May 11 08:20:22 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 11 May 2013 11:20:22 -0400 Subject: Loose-ends wrapup In-Reply-To: <518E5D74.3060702@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <518D282F.4040405@cs.oswego.edu> <518E5D74.3060702@cs.oswego.edu> Message-ID: <518E61B6.9060002@oracle.com> Well, try it -- write a trivial Collector and compare parallel collect(toList()) vs collect(toConcurrentBag()). My guess is that concurrent bag will win, largely because toList() still does too much copying (and if it didn't, it would create view trees with higher per-element-access costs.) On 5/11/2013 11:02 AM, Doug Lea wrote: > Yet another self-reply... > > On 05/10/13 13:02, Doug Lea wrote: >> On 05/09/13 15:14, Brian Goetz wrote: >> >>> What others have I missed? 
>> >> The lambda-dev post by John Rose reminded me that we were going >> to revisit the need for ConcurrentHashBag: A (massive) simplification >> of CHM that only conforms to Collection interface (so among >> other things, duplicates are allowed), and is handy >> for shoving unordered elements for concurrent aggregation. >> I've had a version of this sitting around for a year or so... >> > > The empirical question is what a (renamed) ConcurrentHashBuffer > buys you. How common is collecting into an unordered > non-Set, non-Map destination? (CHM and CHM.newKeySet suffice > for the others.) And of those, what is the likelihood that these > collections have mostly-distinct elements? If they are mostly > the same, then hashing into a buffer will often be worse than other > options because of all the collisions. > > Which all together, still seems on the marginal side for JDK > inclusion. Maybe I should package up ConcurrentHashBuffer as > one of our jsr166e.extra classes. > > (Aside: it would still be nice if there were a convenient way > for people to collect into something providing our scalable > frequency histogram idioms.) > > -Doug > > > > > > From paul.sandoz at oracle.com Mon May 13 02:53:39 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 13 May 2013 11:53:39 +0200 Subject: Ranges redux In-Reply-To: References: <518AA675.5000501@oracle.com> <518AA714.8000402@cs.oswego.edu> Message-ID: On May 8, 2013, at 9:40 PM, Tim Peierls wrote: > On Wed, May 8, 2013 at 3:27 PM, Doug Lea

wrote: > >> On 05/08/13 15:24, Brian Goetz wrote: >> >>> I am OK with cutting back on ranges further. range/rangeClosed seem good >>> enough >>> >> >> rangeClosed -> closedRange? > > > Maybe, but it's nice when they sort together in the javadoc method list. > Yes; although I like the readability, I think sorting and grouping for auto-completion are useful. Paul. From paul.sandoz at oracle.com Mon May 13 03:30:08 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 13 May 2013 12:30:08 +0200 Subject: Ranges redux In-Reply-To: References: Message-ID: On May 8, 2013, at 9:05 PM, Paul Sandoz wrote: > Hi, > > I think we reached consensus on: > > - {Int, Long}Stream.range/rangeClosed for step of 1 > > IMO that seems good enough, and we can provide examples using map for step > 1 and descending ranges. > In the lambda repo: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/0e0e19f03f63 IntStream.ints() and LongStream.longs() are also included. Bikeshed opportunity? The static method LongStream.longs() could be confused with the method IntStream.longs() for widening from ints to longs. > We have still to converge on DoubleStream.range, which might suggest no strong opinions or we lack the use-cases. There are some tricky edge-cases to deal with. I am very close to proposing we drop it... > Perhaps a strong sign of whether we keep this or not is whether we can agree about the upper bound and half-open/closed ranges. Sequentially DoubleStream.range is currently equivalent to the following: * long size = (long) Math.ceil((endExclusive - startInclusive) / step); * long i = 0; * for (double v = startInclusive; i < size; i++, v = startInclusive + step * i) { * ... * } If startInclusive + step * size == endExclusive then the range could be closed, otherwise half-open. Is that a reasonable expectation? Paul. From tim at peierls.net Mon May 13 06:02:05 2013 From: tim at peierls.net (Tim Peierls) Date: Mon, 13 May 2013 09:02:05 -0400 Subject: Ranges redux In-Reply-To: References: Message-ID: On Mon, May 13, 2013 at 6:30 AM, Paul Sandoz wrote: > Sequentially DoubleStream.range is currently equivalent to the following: > > * long size = (long) Math.ceil((endExclusive - startInclusive) / > step); > * long i = 0; > * for (double v = startInclusive; i < size; i++, v = > startInclusive + step * i) { > * ... > * } > > If startInclusive + step * size == endExclusive then the range could be > closed, otherwise half-open. > > Is that a reasonable expectation? > This feels like it would surprise people more often than not. --tim From paul.sandoz at oracle.com Mon May 13 06:48:38 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Mon, 13 May 2013 15:48:38 +0200 Subject: Ranges redux In-Reply-To: References: Message-ID: <8C3493E9-79A6-4F75-BBFD-B66F7FA02781@oracle.com> On May 13, 2013, at 3:02 PM, Tim Peierls wrote: > On Mon, May 13, 2013 at 6:30 AM, Paul Sandoz wrote: > Sequentially DoubleStream.range is currently equivalent to the following: > > * long size = (long) Math.ceil((endExclusive - startInclusive) / step); > * long i = 0; > * for (double v = startInclusive; i < size; i++, v = startInclusive + step * i) { > * ... > * } > > If startInclusive + step * size == endExclusive then the range could be closed, otherwise half-open. > > Is that a reasonable expectation? > > This feels like it would surprise people more often than not. > Here is what Mathematica does for Range[0.0, 1.0, 1 / 3] vs.
Range[0.0, 1.0 - 0.01, 1 / 3]: http://www.wolframalpha.com/input/?i=Range%5B0.0%2C+1.0%2C+1+%2F+3%5D+vs.+Range%5B0.0%2C+1.0+-+0.01%2C+1+%2F+3%5D R appears to do the same; you can experiment with: http://alpha.kloudstat.com/index.php?do=/console/ seq(0.0, 1.0, by=1/3) However, I am struggling to find more precise details on the implementations of the above for inexact values and the upper bound. Perhaps the best thing to do is get out of the way and let numerical libraries integrate with streams... Paul. From dl at cs.oswego.edu Wed May 15 08:02:52 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 15 May 2013 11:02:52 -0400 Subject: Loose-ends wrapup In-Reply-To: <518BF59E.1090405@oracle.com> References: <518BF59E.1090405@oracle.com> Message-ID: <5193A39C.2040408@cs.oswego.edu> One more, that Brian and I have discussed off and on for a year or so, that always ends up as too hard to decide. Do we want to overload Collection.parallelStream(int minParSize)? minParSize is the minimum number of elements to process in parallel, else sequential. And/or overload Stream.parallel(). Arguments for: * Parallel performance for the combination of small N and cheap lambdas (like adding numbers) is crummy. There's little hope of automating decisions unless/until we can automate cost metrics of lambdas, which is not going to happen any time soon. * People might want to write code that uses parallelStreams only if they have a lot of elements, that they might not know ahead of time. Against: * Most people do not know what values are reasonable. * The "best" values are likely to change in the future. One compromise is to do this only for the custom CHM parallel task methods, that are most likely to be used only by people who are also most likely to want to tune performance.
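[Editorial sketch, not part of the original message.] The effect of the proposed overload can be approximated in user code with a small helper. This is a sketch under assumptions: streamFor and the class name are hypothetical, the threshold values in main are arbitrary, and nothing here reflects how a built-in parallelStream(int minParSize) would actually be implemented.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;

public class MinParSizeSketch {

    // Hypothetical helper: go parallel only when the collection has at least
    // minParSize elements, otherwise stay sequential.
    static <T> Stream<T> streamFor(Collection<T> source, int minParSize) {
        return source.size() >= minParSize
                ? source.parallelStream()
                : source.stream();
    }

    public static void main(String[] args) {
        List<Integer> small = Arrays.asList(1, 2, 3);

        // Same pipeline either way; only the execution mode differs.
        int sumSequential = streamFor(small, 10_000).mapToInt(Integer::intValue).sum();
        int sumParallel = streamFor(small, 2).mapToInt(Integer::intValue).sum();

        System.out.println(sumSequential + " " + sumParallel);
    }
}
```

The helper makes the "against" arguments concrete: the caller still has to pick the threshold, and a value that works today may not be the right one on future hardware or library versions.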
(I'm on my way out to a conference program committee meeting for two days, so replies might be slow.) -Doug From brian.goetz at oracle.com Wed May 15 08:14:21 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 15 May 2013 11:14:21 -0400 Subject: Loose-ends wrapup In-Reply-To: <5193A39C.2040408@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <5193A39C.2040408@cs.oswego.edu> Message-ID: <5193A64D.8080902@oracle.com> Even if we accept that hinting is needed, it's not clear what the best hints are; hints about Q might be more useful than attempting to set splitting parameters. (And, there is more than one splitting parameter we might want to set. For example, if the pipeline has a highly selective filter, we might want to adjust target chunk size.) And hints beget more hints. So I still don't think we know what we'd want even if we knew we wanted something :( On 5/15/2013 11:02 AM, Doug Lea wrote: > > One more, that Brian and I have discussed off and on for a year or so, > that always ends up as too hard to decide. > > Do we want to overload Collection.parallelStream(int minParSize)? > minParSize is the minimum number of elements to process in parallel, > else sequential. > > And/or overload Stream.parallel(). > > Arguments for: > > * Parallel performance for the combination of small N and > cheap lambdas (like adding numbers) is crummy. There's > little hope of automating decisions unless/until we can > automate cost metrics of lambdas, which is not going to happen > any time soon. > > * People might want to write code that uses parallelStreams > only if they have a lot of elements, which they might not > know ahead of time. > > Against: > > * Most people do not know what values are reasonable. > > * The "best" values are likely to change in the future. > > One compromise is to do this only for the custom CHM parallel > task methods, that are most likely to be used only by people > who are also most likely to want to tune performance.
> > (I'm on my way out to a conference program committee meeting > for two days, so replies might be slow.) > > -Doug > > > From joe.bowbeer at gmail.com Fri May 17 00:32:14 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Fri, 17 May 2013 00:32:14 -0700 Subject: StringJoiner in b88 In-Reply-To: <5187A88B.6000601@oracle.com> References: <5187A88B.6000601@oracle.com> Message-ID: Previously, when StringJoiner was a stream destination, I could collect into one using the three-arg variant: s.into(new StringJoiner(delimiter, prefix, suffix)); For example: elements.into(new StringJoiner(", ", "[", "]")); But I can't now because Collectors.toStringJoiner(delimiter) is the only option. What's the recommended workaround? Should Collectors.toStringJoiner(delimiter, prefix, suffix) be added? On Mon, May 6, 2013 at 5:56 AM, Brian Goetz wrote: > There has not been a "stream destination" type or an "into" method for a > very long time. > > But, if you want to use a string joiner as a stream target, do: > > stream.collect(toStringJoiner(**)); > > > On 5/6/2013 3:19 AM, Joe Bowbeer wrote: > >> In b88, StringJoiner does not implement a stream destination? >> >> Is the "into" example in the javadoc no longer valid? >> >> http://download.java.net/**lambda/b88/docs/api/java/util/** >> StringJoiner.html >> >> Joe >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130517/38ff5492/attachment.html From paul.sandoz at oracle.com Fri May 17 02:10:14 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 17 May 2013 11:10:14 +0200 Subject: StringJoiner in b88 In-Reply-To: References: <5187A88B.6000601@oracle.com> Message-ID: <886901D7-AC26-41E9-B212-4E94CC751835@oracle.com> On May 17, 2013, at 9:32 AM, Joe Bowbeer wrote: > Previously, when StringJoiner was a stream destination, I could collect > into one using the three-arg variant: > > s.into(new StringJoiner(delimiter, prefix, suffix)); > > For example: > > elements.into(new StringJoiner(", ", "[", "]")); > > But I can't now because Collectors.toStringJoiner(delimiter) is the only > option. > > What's the recommended workaround? > > Should Collectors.toStringJoiner(delimiter, prefix, suffix) be added? > The problem is, when executing in parallel, how to combine two such StringJoiner instances, since the prefix and suffix are only relevant to the final result and not the intermediate results. We would need another method, e.g.: StringJoiner merge(StringJoiner that) { return add(that.value.substring(that.prefix.length())); } i.e. throw away the prefix characters of "that". Paul. From brian.goetz at oracle.com Fri May 17 07:28:14 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 17 May 2013 10:28:14 -0400 Subject: StringJoiner in b88 In-Reply-To: References: <5187A88B.6000601@oracle.com> Message-ID: <51963E7E.7040704@oracle.com> Yes, this is a regrettable weakness of Collector which I spent some time trying to address and ultimately concluded it wasn't worth the added complexity. Here's the problem: the delimiter version of StringJoiner is basically a sequential construct.
When we parallelize a mutable reduction, the Collector primitives are: - make new result container - incorporate a new result into the container - merge two containers The invariant that is the equivalent of the associativity constraint for reduction is: add(new(), t1, t2) == combine(add(new(), t1), add(new(), t2)) StringJoiner with no prefix/suffix meets these rules; we can create two StringJoiners and stuff elements into them: sj1: a, b, c, d sj2: e, f, g, h and then combine them with sj1.add(sj2): sj1: a, b, c, d, e, f, g, h With prefix/suffix, it does not, because we'd have extra prefixes and suffixes: sj1: [ a, b, c, d ] sj2: [ e, f, g, h ] joined: [ a, b, c, d, [ e, f, g, h ] ] In order to fix this, Collector would have to know whether we're at the root of the computation tree, and if so, invoke new with different parameters. The workaround is simple, but not as pretty as you want: StringJoiner sj = new StringJoiner(",", "[", "]"); stream.blah.blah.blah.forEach(sj::add); String result = sj.toString(); In other words, you can still do exactly what you want with relatively few additional lines of code, but the penalty is dropping out of the fluent model. On 5/17/2013 3:32 AM, Joe Bowbeer wrote: > Previously, when StringJoiner was a stream destination, I could collect > into one using the three-arg variant: > > s.into(new StringJoiner(delimiter, prefix, suffix)); > > For example: > > elements.into(new StringJoiner(", ", "[", "]")); > > But I can't now because Collectors.toStringJoiner(delimiter) is the only > option. > > What's the recommended workaround? > > Should Collectors.toStringJoiner(delimiter, prefix, suffix) be added? > > > > On Mon, May 6, 2013 at 5:56 AM, Brian Goetz > wrote: > > There has not been a "stream destination" type or an "into" method > for a very long time.
> > But, if you want to use a string joiner as a stream target, do: > > stream.collect(toStringJoiner(__)); > > > On 5/6/2013 3:19 AM, Joe Bowbeer wrote: > > In b88, StringJoiner does not implement a stream destination? > > Is the "into" example in the javadoc no longer valid? > > http://download.java.net/lambda/b88/docs/api/java/util/StringJoiner.html > > > Joe > > From dl at cs.oswego.edu Sun May 19 12:25:40 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 19 May 2013 15:25:40 -0400 Subject: Loose-ends wrapup In-Reply-To: <5193A64D.8080902@oracle.com> References: <518BF59E.1090405@oracle.com> <5193A39C.2040408@cs.oswego.edu> <5193A64D.8080902@oracle.com> Message-ID: <51992734.6020008@cs.oswego.edu> On 05/15/13 11:14, Brian Goetz wrote: > Even if we accept that hinting is needed, it's not clear what the best hints are; > hints about Q might be more useful than attempting to set splitting parameters. > (And, there is more than one splitting parameter we might want to set. For > example, if the pipeline has a highly selective filter, we might want to adjust > target chunk size.) And hints beget more hints. > > So I still don't think we know what we'd want even if we knew we wanted > something :( What about the follow-up question? Suppose we did this in CHM only. That is, instead of xxxSequentially(fn, ...) vs xxxInParallel(fn, ...), there were only xxx(minParSize, fn, ...) Because there's no stream pipeline, there's no cascade of hinting. -Doug > > > On 5/15/2013 11:02 AM, Doug Lea wrote: >> >> One more, that Brian and I have discussed off and on for a year or so, >> that always ends up as too hard to decide. >> >> Do we want to overload Collection.parallelStream(int minParSize)? >> minParSize is the minimum number of elements to process in parallel, >> else sequential. >> >> And/or overload Stream.parallel(). >> >> Arguments for: >> >> * Parallel performance for the combination of small N and >> cheap lambdas (like adding numbers) is crummy.
There's >> little hope of automating decisions unless/until we can >> automate cost metrics of lambdas, which is not going to happen >> any time soon. >> >> * People might want to write code that uses parallelStreams >> only if they have a lot of elements, which they might not >> know ahead of time. >> >> Against: >> >> * Most people do not know what values are reasonable. >> >> * The "best" values are likely to change in the future. >> >> One compromise is to do this only for the custom CHM parallel >> task methods, that are most likely to be used only by people >> who are also most likely to want to tune performance. >> >> (I'm on my way out to a conference program committee meeting >> for two days, so replies might be slow.) >> >> -Doug >> >> >> > From brian.goetz at oracle.com Sun May 19 12:55:04 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 19 May 2013 15:55:04 -0400 Subject: Loose-ends wrapup In-Reply-To: <51992734.6020008@cs.oswego.edu> References: <518BF59E.1090405@oracle.com> <5193A39C.2040408@cs.oswego.edu> <5193A64D.8080902@oracle.com> <51992734.6020008@cs.oswego.edu> Message-ID: <51992E18.4030403@oracle.com> Our new mantra is "explicit but unobtrusive parallelism." As a low-level component, I think the xxx(minParSize...) is a perfectly reasonable API. And xxx(minParSize...) could be argued to meet the mantra, but only just barely, because of the Concurrent in the class name. If the bulk ops on CHM were cordoned off via some naming convention to make it clear which subset were possibly-parallel, that would help further. On 5/19/2013 3:25 PM, Doug Lea wrote: > On 05/15/13 11:14, Brian Goetz wrote: >> Even if we accept that hinting is needed, it's not clear what the best >> hints are; >> hints about Q might be more useful than attempting to set splitting >> parameters. >> (And, there is more than one splitting parameter we might want to >> set.
For >> example, if the pipeline has a highly selective filter, we might want >> to adjust >> target chunk size.) And hints beget more hints. >> >> So I still don't think we know what we'd want even if we knew we wanted >> something :( > > What about the follow-up question. Suppose we did this > in CHM only. That is, instead of xxxSequentially(fn, ...) vs > xxxInParallel(fn, ...), there were only xxx(minParSize, fn, ,,,) > > Because there's no stream pipeline, there's no cascade of > hinting. > > -Doug > > >> >> >> On 5/15/2013 11:02 AM, Doug Lea wrote: >>> >>> One more, that Brian and I have discussed off an on for a year or so, >>> that always ends up as too hard to decide. >>> >>> Do we want to overload Collection.parallelStream(int minParSize)? >>> minParSize is the minimum number of elements to process in parallel, >>> else sequential. >>> >>> And/or overload Stream.parallel(). >>> >>> Arguments for: >>> >>> * Parallel performance for the combination of small N and >>> cheap lambdas (like adding numbers) is crummy. There's >>> little hope of automating decisions unless/until we can >>> automate cost metrics of lambdas, which is not going to happen >>> any time soon. >>> >>> * People might want to write code that uses parallelStrerams >>> only if they have a lot of elements, that they might not >>> know ahead of time. >>> >>> Against: >>> >>> * Most people do not know what values are reasonable. >>> >>> * The "best" values are likely to change in the future. >>> >>> One compromise is to do this only for the custom CHM parallel >>> task methods, that are most likely to be used only by people >>> who are also most likely to want to tune performance. >>> >>> (I'm on my way out to a conference program committee meeting >>> for two days, so replies might be slow.) 
>>> >>> -Doug >>> >>> >>> >> > From dl at cs.oswego.edu Sun May 19 14:14:41 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Sun, 19 May 2013 17:14:41 -0400 Subject: Loose-ends wrapup In-Reply-To: <51992E18.4030403@oracle.com> References: <518BF59E.1090405@oracle.com> <5193A39C.2040408@cs.oswego.edu> <5193A64D.8080902@oracle.com> <51992734.6020008@cs.oswego.edu> <51992E18.4030403@oracle.com> Message-ID: <519940C1.8080403@cs.oswego.edu> On 05/19/13 15:55, Brian Goetz wrote: > Our new mantra is "explicit but unobtrusive parallelism." > > As a low-level component, I think the xxx(minParSize...) is a perfectly > reasonable API. And xxx(minParSize...) could be argued to meets the mantra, but > only just barely, because of the Concurrent in the class name. If the bulk ops > on CHM were cordoned off via some naming convention to make it clear which > subset were possibly-parallel, that would help further. They will be the only ones that have a leading, non-optional argument, so it's about as explicit as you can get. An example below. I'm about to do this. People using CHM bulk tasks have already seen a few API changes, so one final one is not likely to cause them to get especially angry, and less angry than they'll someday be if they don't have this simple tuning mechanism. ... class-level doc ... *

These bulk operations accept a {@code parallelismThreshold} * argument. Methods proceed sequentially if the current map size is * estimated to be less than the given threshold. Using a value of * {@code Long.MAX_VALUE} suppresses all parallelism. Using a value * of {@code 1} results in maximal parallelism. In-between values can * be used to trade off overhead versus throughput. * /** * Returns the result of accumulating the given transformation * of all (key, value) pairs using the given reducer to * combine values, and the given basis as an identity value. * * @param parallelismThreshold the (estimated) number of elements * needed for this operation to be executed in parallel. * @param transformer a function returning the transformation * for an element * @param basis the identity (initial default value) for the reduction * @param reducer a commutative associative combining function * @return the result of accumulating the given transformation * of all (key, value) pairs */ public long reduceToLong(long parallelismThreshold, ToLongBiFunction<? super K, ? super V> transformer, long basis, LongBinaryOperator reducer) { From brian.goetz at oracle.com Mon May 20 09:34:36 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 20 May 2013 12:34:36 -0400 Subject: iterator()/spliterator() and stateful operations Message-ID: <519A509C.2090900@oracle.com> "Well-behaved" pipelines such as IntStream.range(1,1000).filter(i -> i % 2 == 0) .map(i -> i*2) are fully lazy. We do not start generating, filtering, or mapping elements until the terminal operation begins, and thereafter, we only generate, filter, and map as many elements as are needed to satisfy the demand of whatever is consuming the pipeline. These pipelines are well-behaved in both serial and parallel environments, largely because the well-behaved (stateless) operations satisfy the homomorphism f(a||b) = f(a) || f(b) Pipelines with stateful operations, such as sorted(), distinct(), or limit(), lack this nice property.
You cannot consume any elements from a sorted pipeline until all the elements upstream of the sorting operation have been seen. The same is true to varying lesser degrees with distinct() and limit(). (For example, if the pipeline is *unordered*, distinct() can proceed lazily, backed by a concurrent hash set, but not if it is ordered.) Recall that the primary purpose of the iterator() and spliterator() methods is that they are "escape hatches" so that users can perform lazy traversals that are not directly supported by the library. This escape hatch mechanism is also key to being able to add default methods to Stream interfaces later that might not be directly implementable in terms of other stream methods. For example, we might implement in IntStream: default IntStream everyElementTwice() { IntStream asIntPipeline = new IntPipeline(() -> spliterator(), getSourceAndOpFlags(), isParallel()); return new IntStream.StatelessOp(asIntPipeline, ...); } In other words, for IntStream implementations not already backed by the standard implementation (IntPipeline) which will co-evolve with the Stream interface, we get the spliterator() and other metadata and use that to create a new IntPipeline lazily backed by the existing non-IntPipeline stream, and then chain a new lazy operation off of the new pipeline. With this as background, we need to make some decisions about the iterator() and spliterator() stream methods. Here's how spliterator() currently works (iterator() is based on spliterator()). 1. If the stream is composed only of source, just return the source spliterator. 2. If the stream is composed only of source and stateless operations, create a new spliterator that wraps the source spliterator and the operations. For example, intRange(...).map(...) will return a spliterator that delegates splitting and traversal to the intRange spliterator, but applies the mapping function to elements as they pass through. Spliterators generated in this manner are fully lazy. 3.
If the stream has stateful operations, slice the pipeline into "chunks" at stateful operation boundaries. For example: intRange(...).filter(...) .sorted(...) // end of chunk 1 .map(...) .distinct() // end of chunk 2 .flatMap(...) .filter(...) // end of chunk 3 .spliterator() For each chunk, we describe it with a spliterator as in (2), where the output spliterator of chunk k is the source of chunk k+1. If (2) can produce a fully lazy spliterator (such as distinct() on an unordered stream, or limit() on a subsized stream), great; if not, the spliterator will, at first traversal, evaluate the upstream pipeline and describe the result with a spliterator. We attempt to preserve as much laziness as possible, but for cases like: stream.sorted().spliterator() we're going to have to traverse the whole input when the caller finally gets around to asking for some elements. If the source is infinite, that's going to be a problem (but sorting an infinite stream is always a problem.) Here are some cases that currently loop infinitely: random.doubles().parallel().distinct().iterator().next(); random.doubles().parallel().distinct().spliterator().tryAdvance(c); The reason is that the spliterator wants to evaluate doubles().distinct() before it knows how to split it (splitting cannot block.) There are some things we can do here. - We can make Random.doubles() and similar generated streams be UNORDERED. The parallel implementations for distinct() and limit() have optimized paths for unordered streams that allow them to proceed lazily. This will fix both of these particular problems for a lot of sources, but we'd still have the same failure for: Stream.iterate(0, f).parallel().distinct().iterator().next(); since this stream has to be ordered. - We can force the stream on which iterator() is invoked to be sequential. This seems fairly natural, as calling iterator() is a commitment to sequential iteration of the results.
This fixes the iterator variant of this problem, but not the spliterator problem. These first two seem pretty natural and are likely uncontroversial. But do we want to go further to address the spliterator variant like: Stream.iterate(0, f).parallel() .distinct().spliterator().tryAdvance(c); A similar issue happens with limit() applied to large parallel ordered non-subsized streams with large limits (as you can see, we've done quite a bit of work already to narrow the damage.) Because limit is tied intrinsically to encounter order, parallel implementations sometimes need to do some sort of speculative evaluation and buffering. This can still be a win for high-Q problems like: bigNumbers.parallel().filter(n -> isPrime(n)).limit(1000) because the expensive operation -- filtering -- can still be parallelized. But for low-Q problems with large bounds, the buffering required can cause parallel pipelines with limit in them to blow up with OOME. An analogue of the above, but with poor space bounds instead of time bounds: Stream.iterate(0, f).parallel() .filter(...).limit(10M).spliterator().tryAdvance(c); This will likely OOME. At some point, the user asked for this, as in: infinite.parallel().sorted()... So, questions: - Should we go ahead with the proposed changes above for generators (make them unsized) and iterator (make it sequential)? - Should we go further, trimming the behavior of spliterator() to reduce time/space consumption, or are we OK here given its likely narrower usage profile?
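The difference between a stateless pipeline and one with a sorted() barrier can be observed with a small sketch. This is an illustration against the released java.util.stream API, not code from the thread; the class name LazyIteratorSketch and the helper pulledBeforeFirst are hypothetical, and the peek() counter is only an instrumentation device. It counts how many source elements have been pulled by the time iterator().next() returns its first value.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical demo class; not part of the JDK or of this thread.
class LazyIteratorSketch {

    // Returns how many source elements were pulled by the time
    // iterator().next() handed back its first result.
    static int pulledBeforeFirst(boolean withSorted) {
        AtomicInteger pulled = new AtomicInteger();
        IntStream s = IntStream.range(1, 1000)
                .peek(i -> pulled.incrementAndGet()) // counts source elements
                .filter(i -> i % 2 == 0)
                .map(i -> i * 2);
        if (withSorted) {
            s = s.sorted(); // stateful: must see all upstream elements first
        }
        s.iterator().next();
        return pulled.get();
    }

    public static void main(String[] args) {
        System.out.println("stateless:     " + pulledBeforeFirst(false));
        System.out.println("with sorted(): " + pulledBeforeFirst(true));
    }
}
```

With the stateless pipeline only a handful of elements should be pulled; with sorted() in place the whole range of 999 elements has to be consumed before the first result is available. The exact count in the stateless case is implementation behaviour rather than something the specification promises.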
From paul.sandoz at oracle.com Tue May 21 00:49:54 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 21 May 2013 08:49:54 +0100 Subject: iterator()/spliterator() and stateful operations In-Reply-To: <519A509C.2090900@oracle.com> References: <519A509C.2090900@oracle.com> Message-ID: <7D0754AF-7122-4D5C-9F8C-6079DED9E4B8@oracle.com> Note that sorted() is problematic for both sequential and parallel pipelines: Stream.generate(() -> "A").limit(WITH_REALLY_LARGE_VALUE).sorted().iterator().next(); since it is a barrier in both cases. On May 20, 2013, at 5:34 PM, Brian Goetz wrote: > So, questions: > - Should we go ahead with the proposed changes above for generators (make them unsized) Yes. In a previous email on Stream.generate we had the choice of SIZED with Long.MAX_VALUE or infinite. I think infinite makes more sense and it allows us to produce an infinite, unordered and IMMUTABLE stream of generated values (verified in patch) thereby working better in conjunction with distinct and substream/limit. > and iterator (make it sequential)? Yes. > - Should we go further, trimming the behavior of spliterator() to reduce time/space consumption, or are we OK here given its likely narrower usage profile? I am inclined not to interfere with it too much, or to try to make it "magical" in avoiding some but not all edge cases. One workaround is for the developer to place a sequential() before spliterator(), which is also the implementation solution for iterator: @Override public final Iterator iterator() { return Spliterators.iteratorFromSpliterator(sequential().spliterator()); } Paul. From brian.goetz at oracle.com Tue May 21 09:17:18 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 21 May 2013 12:17:18 -0400 Subject: In-person EG meeting Message-ID: <519B9E0E.6090706@oracle.com> Continuing the tradition of the last two years, I am happy to host an in-person meeting the day after JVM Language Summit in Santa Clara this year, which would be Thu August 1.
We're zooming in on the finish line, but I still think there are topics it is worthwhile to discuss in person. Let me know if you'll be in town and are interested! From brian.goetz at oracle.com Thu May 23 11:06:14 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 23 May 2013 14:06:14 -0400 Subject: Loose end: concat Message-ID: <519E5A96.2080606@oracle.com> I cleaned up concat() and wrote Int/Long/Double versions. (Fortunately, with the recent addition of Spliterator.OfPrimitive, the duplication quotient was much lower.) Currently these still live in Streams. Is that still the right place? The stream classes (Stream, IntStream, etc) seem a little wrong for them, but I can't quite put my finger on why. Specs: /** * Creates a lazy concatenated {@code Stream} whose elements are all the * elements of a first {@code Stream} succeeded by all the elements of the * second {@code Stream}. The resulting stream is ordered if both * of the input streams are ordered, and parallel if either of the input * streams is parallel. * * @param <T> The type of stream elements * @param a the first stream * @param b the second stream to concatenate on to end of the first * stream * @return the concatenation of the two input streams */ public static <T> Stream<T> concat(Stream<? extends T> a, Stream<? extends T> b) { /** * Creates a lazy concatenated {@code IntStream} whose elements are all the * elements of a first {@code IntStream} succeeded by all the elements of the * second {@code IntStream}. The resulting stream is ordered if both * of the input streams are ordered, and parallel if either of the input * streams is parallel. * * @param a the first stream * @param b the second stream to concatenate on to end of the first stream * @return the concatenation of the two streams */ public static IntStream concat(IntStream a, IntStream b) { (and similar for Long and Double).
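A short usage sketch of the spec above. It assumes the API as it eventually shipped in JDK 8, where concat is a static method on Stream and IntStream rather than on the Streams class mentioned in this message; the class name ConcatSketch and its helper methods are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class; the concat calls are the JDK 8 release API.
class ConcatSketch {

    // Reference version: elements of a followed by elements of b.
    static List<String> concatRefs() {
        Stream<String> a = Stream.of("a", "b");
        Stream<String> b = Stream.of("c", "d");
        return Stream.concat(a, b).collect(Collectors.toList());
    }

    // Primitive specialization: no boxing of the int elements.
    static int[] concatInts() {
        return IntStream.concat(IntStream.range(0, 3), IntStream.range(10, 12))
                .toArray();
    }

    // The result is parallel if either input is parallel.
    static boolean parallelIfEither() {
        return Stream.concat(Stream.of(1).parallel(), Stream.of(2)).isParallel();
    }

    public static void main(String[] args) {
        System.out.println(concatRefs());                          // [a, b, c, d]
        System.out.println(java.util.Arrays.toString(concatInts())); // [0, 1, 2, 10, 11]
        System.out.println(parallelIfEither());                    // true
    }
}
```

Because concat takes both streams as arguments to a static method, neither stream is the receiver and both are consumed once the concatenated stream is traversed.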
From paul.sandoz at oracle.com Thu May 23 11:13:08 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 23 May 2013 19:13:08 +0100 Subject: Remove DoubleStream.range methods Message-ID: <27B7843E-9EA2-424D-9D55-85E74C62FE28@oracle.com> Hi, Unless there are objections I plan to remove the DoubleStream.range methods from the lambda repo next week. I think we currently lack the use-cases to get this right and it is better if we leave it to numerical libraries to integrate as they see fit. Paul. From spullara at gmail.com Thu May 23 11:23:49 2013 From: spullara at gmail.com (Sam Pullara) Date: Thu, 23 May 2013 11:23:49 -0700 Subject: Loose end: concat In-Reply-To: <519E5A96.2080606@oracle.com> References: <519E5A96.2080606@oracle.com> Message-ID: *Stream seems like the right place for them to me. It is certainly the second place I would look; the first place would be as an instance method, but we've discussed that before. Sam On May 23, 2013, at 11:06 AM, Brian Goetz wrote: > I cleaned up concat() and wrote Int/Long/Double versions. (Fortunately, with the recent addition of Spliterator.OfPrimitive, the duplication quotient was much lower.) > > Currently these still live in Streams. Is that still the right place? The stream classes (Stream, IntStream, etc) seem a little wrong for them, but I can't quite put my finger on why. > > Specs: > > /** > * Creates a lazy concatenated {@code Stream} whose elements are all the > * elements of a first {@code Stream} succeeded by all the elements of the > * second {@code Stream}. The resulting stream is ordered if both > * of the input streams are ordered, and parallel if either of the input > * streams is parallel.
> * > * @param The type of stream elements > * @param a the first stream > * @param b the second stream to concatenate on to end of the first > * stream > * @return the concatenation of the two input streams > */ > public static Stream concat(Stream a, Stream b) { > > > /** > * Creates a lazy concatenated {@code IntStream} whose elements are all the > * elements of a first {@code IntStream} succeeded by all the elements of the > * second {@code IntStream}. The resulting stream is ordered if both > * of the input streams are ordered, and parallel if either of the input > * streams is parallel. > * > * @param a the first stream > * @param b the second stream to concatenate on to end of the first stream > * @return the concatenation of the two streams > */ > public static IntStream concat(IntStream a, IntStream b) { > > > (and similar for Long and Double). > From joe.bowbeer at gmail.com Thu May 23 13:15:02 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Thu, 23 May 2013 13:15:02 -0700 Subject: Loose end: concat In-Reply-To: References: <519E5A96.2080606@oracle.com> Message-ID: Why was an instance method rejected? I can't recall. s = s1.concat(s2).concat(s3); On Thu, May 23, 2013 at 11:23 AM, Sam Pullara wrote: > *Stream seems like the right place for them to me. It is certainly the > second place I would look ? first place would be as an instance method, but > we've discussed that before. > > Sam > > On May 23, 2013, at 11:06 AM, Brian Goetz wrote: > > > I cleaned up concat() and wrote Int/Long/Double versions. (Fortunately, > with the recent addition of Spliterator.OfPrimitive, the duplication > quotient was much lower.) > > > > Currently these still live in Streams. Is that still the right place? > The stream classes (Stream, IntStream, etc) seem a little wrong for them, > but I can't quite put my finger on why. 
> > > > Specs: > > > > /** > > * Creates a lazy concatenated {@code Stream} whose elements are all > the > > * elements of a first {@code Stream} succeeded by all the elements > of the > > * second {@code Stream}. The resulting stream is ordered if both > > * of the input streams are ordered, and parallel if either of the > input > > * streams is parallel. > > * > > * @param The type of stream elements > > * @param a the first stream > > * @param b the second stream to concatenate on to end of the first > > * stream > > * @return the concatenation of the two input streams > > */ > > public static Stream concat(Stream a, Stream extends T> b) { > > > > > > /** > > * Creates a lazy concatenated {@code IntStream} whose elements are > all the > > * elements of a first {@code IntStream} succeeded by all the > elements of the > > * second {@code IntStream}. The resulting stream is ordered if both > > * of the input streams are ordered, and parallel if either of the > input > > * streams is parallel. > > * > > * @param a the first stream > > * @param b the second stream to concatenate on to end of the first > stream > > * @return the concatenation of the two streams > > */ > > public static IntStream concat(IntStream a, IntStream b) { > > > > > > (and similar for Long and Double). > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130523/25c9e461/attachment.html From brian.goetz at oracle.com Thu May 23 14:57:33 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 23 May 2013 17:57:33 -0400 Subject: Loose end: concat In-Reply-To: References: <519E5A96.2080606@oracle.com> Message-ID: <519E90CD.80705@oracle.com> Same reason as into(Collection) -- these were the sole cases where the arguments to stream methods were stateful objects; everything else is a stateless recipe for how to transform a stream, one that does not intrinsically mutate or consume any of its arguments. This wasn't just a philosophical concern, though I do believe it makes the API stronger. The problems showed up immediately in our testing framework. Our testing framework is based on repeatable stream transforms; we have an abstraction for "repeatable data source" (which can deliver its data as a stream, parallel stream, iterator, or spliterator), and tested stream operations using functions that transformed a stream into a stream: s -> s.map(i -> i*2) With the repeatable data source, we were able to easily automate comparison between dozens of different ways to express the same result, such as: source.stream().map(...).toArray() source.parallelStream().map(...).toArray() source.stream().map(...).collect(toList()) source.parallelStream().map(...).collect(toList()) source.stream().map(...).iterator() ... etc When we saw how much trouble methods like into() and concat() were for frameworks that want to build on the idea of "repeatable stream source/transform", this was a big warning sign; our test framework could not possibly be the only framework that would want to treat streams functionally. Secondarily, the value of fluency here is pretty limited, since it's likely that the other stream operation will be enough of a mouthful that it will get "outlined" anyway.
So bottom line: with into(Collection) and concat(Stream) removed, any function

  s -> s.streamOp(nonInterferingLambda)

becomes a repeatable and stateless transform. No "except for ..." caveats needed.

On 5/23/2013 4:15 PM, Joe Bowbeer wrote:
> Why was an instance method rejected? I can't recall.
>
> s = s1.concat(s2).concat(s3);
>
>
> On Thu, May 23, 2013 at 11:23 AM, Sam Pullara wrote:
>
>     *Stream seems like the right place for them to me. It is certainly
>     the second place I would look - first place would be as an instance
>     method, but we've discussed that before.
>
>     Sam
>
>     On May 23, 2013, at 11:06 AM, Brian Goetz wrote:
>
>     > I cleaned up concat() and wrote Int/Long/Double versions.
>     > (Fortunately, with the recent addition of Spliterator.OfPrimitive,
>     > the duplication quotient was much lower.)
>     >
>     > Currently these still live in Streams. Is that still the right
>     > place? The stream classes (Stream, IntStream, etc) seem a little
>     > wrong for them, but I can't quite put my finger on why.
>     >
>     > Specs:
>     >
>     > /**
>     >  * Creates a lazy concatenated {@code Stream} whose elements are all the
>     >  * elements of a first {@code Stream} succeeded by all the elements of the
>     >  * second {@code Stream}. The resulting stream is ordered if both
>     >  * of the input streams are ordered, and parallel if either of the input
>     >  * streams is parallel.
>     >  *
>     >  * @param <T> The type of stream elements
>     >  * @param a the first stream
>     >  * @param b the second stream to concatenate on to end of the first
>     >  *        stream
>     >  * @return the concatenation of the two input streams
>     >  */
>     > public static <T> Stream<T> concat(Stream<? extends T> a, Stream<? extends T> b) {
>     >
>     >
>     > /**
>     >  * Creates a lazy concatenated {@code IntStream} whose elements are all the
>     >  * elements of a first {@code IntStream} succeeded by all the elements of the
>     >  * second {@code IntStream}.
>     >  * The resulting stream is ordered if both
>     >  * of the input streams are ordered, and parallel if either of the input
>     >  * streams is parallel.
>     >  *
>     >  * @param a the first stream
>     >  * @param b the second stream to concatenate on to end of the first stream
>     >  * @return the concatenation of the two streams
>     >  */
>     > public static IntStream concat(IntStream a, IntStream b) {
>     >
>     >
>     > (and similar for Long and Double).

From brian.goetz at oracle.com Fri May 24 12:20:23 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 24 May 2013 15:20:23 -0400
Subject: Loose ends: Optional
Message-ID: <519FBD77.9030400@oracle.com>

Proposed spec for methods on Optional, which would have the obvious counterparts in Optional{Int,Long,Double}.

These methods are known to be useful and seem mostly harmless now that other things have settled. (I don't think they greatly increase the moral hazard of Optional in general, and they do make it more expressive.)

    /**
     * If a value is present, and the value matches the given predicate,
     * return an {@code Optional} describing the value, otherwise return an
     * empty {@code Optional}.
     *
     * @param predicate a predicate to apply to the value, if present
     * @throws NullPointerException if the predicate is null
     * @return an {@code Optional} describing the value of this {@code Optional}
     * if a value is present and the value matches the given predicate,
     * otherwise an empty {@code Optional}
     */
    public Optional<T> filter(Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate);
        if (!isPresent())
            return this;
        else
            return predicate.test(value) ? this : empty();
    }

    /**
     * If a value is present, apply the provided mapping function to it,
     * and if the result is non-null, return an {@code Optional} describing the
     * result. Otherwise return an empty {@code Optional}.
     *
     * @param <U> The type of the result of the mapping function
     * @param mapper a mapping function to apply to the value, if present
     * @throws NullPointerException if the mapping function is null
     * @return an {@code Optional} describing the result of applying a mapping
     * function to the value of this {@code Optional}, if a value is present,
     * otherwise an empty {@code Optional}
     */
    public <U> Optional<U> map(Function<? super T, ? extends U> mapper) {
        Objects.requireNonNull(mapper);
        if (!isPresent())
            return empty();
        else {
            U result = mapper.apply(value);
            return result == null ? empty() : Optional.of(result);
        }
    }

    /**
     * If a value is present, apply the provided {@code Optional}-bearing
     * mapping function to it, return that result, otherwise return an empty
     * {@code Optional}.
     *
     * @param <U> The type parameter to the {@code Optional} returned by
     *            the mapping function
     * @param mapper a mapping function to apply to the value, if present
     * @throws NullPointerException if the mapping function is null or returns
     * a null result
     * @return the result of applying an {@code Optional}-bearing mapping
     * function to the value of this {@code Optional}, if a value is present,
     * otherwise an empty {@code Optional}
     */
    public <U> Optional<U> flatMap(Function<? super T, Optional<U>> mapper) {
        Objects.requireNonNull(mapper);
        if (!isPresent())
            return empty();
        else {
            return Objects.requireNonNull(mapper.apply(value));
        }
    }
}

From spullara at gmail.com Fri May 24 12:59:41 2013
From: spullara at gmail.com (Sam Pullara)
Date: Fri, 24 May 2013 12:59:41 -0700
Subject: Loose ends: Optional
In-Reply-To: <519FBD77.9030400@oracle.com>
References: <519FBD77.9030400@oracle.com>
Message-ID: <0C5F3DF5-9F47-4B09-BAE1-518F3C08B01B@gmail.com>

Awesome, thanks.

Sam

On May 24, 2013, at 12:20 PM, Brian Goetz wrote:

> Proposed spec for methods on Optional, which would have the obvious counterparts in Optional{Int,Long,Double}.
>
> These methods are known to be useful and seem mostly harmless now that other things have settled.
(I don't think they greatly increase the moral hazard of Optional in general, and they do make it more expressive.) > > > /** > * If a value is present, and the value matches the given predicate, > * return an {@code Optional} describing the value, otherwise return an > * empty {@code Optional}. > * > * @param predicate a predicate to apply to the value, if present > * @throws NullPointerException if the predicate is null > * @return an {@code Optional} describing the value of this {@code Optional} > * if a value is present and the value matches the given predicate, > * otherwise an empty {@code Optional} > */ > public Optional filter(Predicate predicate) { > Objects.requireNonNull(predicate); > if (!isPresent()) > return this; > else > return predicate.test(value) ? this : empty(); > } > > /** > * If a value is present, apply the provided mapping function to it, > * and if the result is non-null, return an {@code Optional} describing the > * result. Otherwise return an empty {@code Optional}. > * > * @param The type of the result of the mapping function > * @param mapper a mapping function to apply to the value, if present > * @throws NullPointerException if the mapping function is null > * @return an {@code Optional} describing the result of applying a mapping > * function to the value of this {@code Optional}, if a value is present, > * otherwise an empty {@code Optional} > */ > public Optional map(Function mapper) { > Objects.requireNonNull(mapper); > if (!isPresent()) > return empty(); > else { > U result = mapper.apply(value); > return result == null ? empty() : Optional.of(result); > } > } > > /** > * If a value is present, apply the provided {@code Optional}-bearing > * mapping function to it, return that result, otherwise return an empty > * {@code Optional}. 
> * > * @param The type parameter to the {@code Optional} returned by > * @param mapper a mapping function to apply to the value, if present > * the mapping function > * @throws NullPointerException if the mapping function is null or returns > * a null result > * @return the result of applying an {@code Optional}-bearing mapping > * function to the value of this {@code Optional}, if a value is present, > * otherwise an empty {@code Optional} > */ > public Optional flatMap(Function> mapper) { > Objects.requireNonNull(mapper); > if (!isPresent()) > return empty(); > else { > return Objects.requireNonNull(mapper.apply(value)); > } > } > } From tim at peierls.net Fri May 24 13:10:57 2013 From: tim at peierls.net (Tim Peierls) Date: Fri, 24 May 2013 16:10:57 -0400 Subject: Loose ends: Optional In-Reply-To: <519FBD77.9030400@oracle.com> References: <519FBD77.9030400@oracle.com> Message-ID: On Fri, May 24, 2013 at 3:20 PM, Brian Goetz wrote: > Proposed spec for methods on Optional, which would have the obvious > counterparts in Optional{Int,Long,Double}. > > These methods are known to be useful and seem mostly harmless now that > other things have settled. (I don't think they greatly increase the moral > hazard of Optional in general, and they do make it more expressive.) > I'm in the curious (unique?) position of both desperately wanting Optional and desperately *not* wanting lots of additional methods like these. If the price of having Optional is the presence of these methods, I'll suck it up, but "mostly harmless" is not exactly a ringing endorsement. --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130524/aac925c2/attachment.html

From brian.goetz at oracle.com Fri May 24 13:15:49 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 24 May 2013 16:15:49 -0400
Subject: Loose ends: Optional
In-Reply-To:
References: <519FBD77.9030400@oracle.com>
Message-ID: <519FCA75.6070600@oracle.com>

Optional has obvious upsides and downsides. Some of the downsides are:
 - It's a box. Boxing can be heavy.
 - The more general-purpose value-wrapper classes you have, the more some people fear an explosion of unreadable types like Map>, List>> in API signatures.

I think where we've tried to land is: do things that encourage people to use Optional only in return position. These methods make it more useful in return position while not increasing the temptation to use it elsewhere any more than we already have. Hence "mostly harmless".

On 5/24/2013 4:10 PM, Tim Peierls wrote:
> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz wrote:
>
>     Proposed spec for methods on Optional, which would have the obvious
>     counterparts in Optional{Int,Long,Double}.
>
>     These methods are known to be useful and seem mostly harmless now
>     that other things have settled. (I don't think they greatly
>     increase the moral hazard of Optional in general, and they do make
>     it more expressive.)
>
>
> I'm in the curious (unique?) position of both desperately wanting
> Optional and desperately *not* wanting lots of additional methods like
> these. If the price of having Optional is the presence of these methods,
> I'll suck it up, but "mostly harmless" is not exactly a ringing endorsement.
>
> --tim

From forax at univ-mlv.fr Sat May 25 10:12:07 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 25 May 2013 19:12:07 +0200
Subject: Loose ends: Optional
In-Reply-To: <519FCA75.6070600@oracle.com>
References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com>
Message-ID: <51A0F0E7.2090403@univ-mlv.fr>

On 05/24/2013 10:15 PM, Brian Goetz wrote:
> Optional has obvious upsides and downsides. Some of the downsides are:
>  - It's a box. Boxing can be heavy.
>  - The more general-purpose value-wrapper classes you have, the more
> some people fear an explosion of unreadable types like
> Map>, List List>> in API signatures.
>
> I think where we've tried to land is: do things that encourage people
> to use Optional only in return position. These methods make it more
> useful in return position while not increasing the temptation to use
> it elsewhere any more than we already have. Hence "mostly harmless".

I think you crossed a line without seeing it: filter, map and flatMap are lazy on Stream but not on Optional.

Rémi

>
> On 5/24/2013 4:10 PM, Tim Peierls wrote:
>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz wrote:
>>
>>     Proposed spec for methods on Optional, which would have the obvious
>>     counterparts in Optional{Int,Long,Double}.
>>
>>     These methods are known to be useful and seem mostly harmless now
>>     that other things have settled. (I don't think they greatly
>>     increase the moral hazard of Optional in general, and they do make
>>     it more expressive.)
>>
>>
>> I'm in the curious (unique?) position of both desperately wanting
>> Optional and desperately *not* wanting lots of additional methods like
>> these. If the price of having Optional is the presence of these methods,
>> I'll suck it up, but "mostly harmless" is not exactly a ringing
>> endorsement.
>>
>> --tim

From forax at univ-mlv.fr Tue May 28 09:10:26 2013
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 28 May 2013 18:10:26 +0200
Subject: Loose ends: Optional
In-Reply-To: <51A0F0E7.2090403@univ-mlv.fr>
References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr>
Message-ID: <51A4D6F2.2060408@univ-mlv.fr>

On lambda-dev: 05/28/2013 05:35 PM, brian.goetz at oracle.com wrote:
> Changeset: fde3666e6394
> Author: briangoetz
> Date: 2013-05-28 11:34 -0400
> URL:http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fde3666e6394
>
> Additional convenience methods on Optional
>
> ! src/share/classes/java/util/Optional.java
>
>

It seems I have missed one or more emails about adding eager versions of filter and map to Optional. The last email I received on that subject is the one below.

Rémi

On 05/25/2013 07:12 PM, Remi Forax wrote:
> On 05/24/2013 10:15 PM, Brian Goetz wrote:
>> Optional has obvious upsides and downsides. Some of the downsides are:
>>  - It's a box. Boxing can be heavy.
>>  - The more general-purpose value-wrapper classes you have, the more
>> some people fear an explosion of unreadable types like
>> Map>, List> List>> in API signatures.
>>
>> I think where we've tried to land is: do things that encourage people
>> to use Optional only in return position. These methods make it more
>> useful in return position while not increasing the temptation to use
>> it elsewhere any more than we already have. Hence "mostly harmless".
>
> I think you cross a line without seen it, filter, map and flatmap are
> lazy on Stream but not on Optional.
>
> Rémi
>
>>
>> On 5/24/2013 4:10 PM, Tim Peierls wrote:
>>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz wrote:
>>>
>>>     Proposed spec for methods on Optional, which would have the obvious
>>>     counterparts in Optional{Int,Long,Double}.
>>>
>>>     These methods are known to be useful and seem mostly harmless now
>>>     that other things have settled.
(I don't think they greatly >>> increase the moral hazard of Optional in general, and they do make >>> it more expressive.) >>> >>> >>> I'm in the curious (unique?) position of both desperately wanting >>> Optional and desperately *not* wanting lots of additional methods like >>> these. If the price of having Optional is the presence of these >>> methods, >>> I'll suck it up, but "mostly harmless" is not exactly a ringing >>> endorsement. >>> >>> --tim > From brian.goetz at oracle.com Tue May 28 09:19:01 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 12:19:01 -0400 Subject: Loose ends: Optional In-Reply-To: <51A4D6F2.2060408@univ-mlv.fr> References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> Message-ID: <51A4D8F5.9040000@oracle.com> No, you did not miss anything. We have never required that an issue be fully settled in the EG before committing it to the lambda repository. Having code to play with often plays a critical role in EG and community discussions, so we have regularly committed code to the lambda repository that has not yet been fully blessed by the EG. As to your concern, I see it slightly differently. It is not that the filter method is lazy on Stream and eager on Optional. It is that *Stream* itself is laziness-seeking (all methods that do not require an immediate result defer what computation they can) and Optional itself is eager (all methods produce a fully formed result or side-effect). > On lambda-dev: 05/28/2013 05:35 PM, brian.goetz at oracle.com wrote: >> Changeset: fde3666e6394 >> Author: briangoetz >> Date: 2013-05-28 11:34 -0400 >> URL:http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fde3666e6394 >> >> Additional convenience methods on Optional >> >> ! src/share/classes/java/util/Optional.java >> >> > > It seems, I have not received one or several emails about adding an > eager versions of filter, map to Optional. 
> The last email I received about that subject is the one below. > > R?mi > > > On 05/25/2013 07:12 PM, Remi Forax wrote: >> On 05/24/2013 10:15 PM, Brian Goetz wrote: >>> Optional has obvious upsides and downsides. Some of the downsides are: >>> - It's a box. Boxing can be heavy. >>> - The more general-purpose value-wrapper classes you have, the more >>> some people fear an explosion of unreadable types like >>> Map>, List>> List>> in API signatures. >>> >>> I think where we've tried to land is: do things that encourage people >>> to use Optional only in return position. These methods make it more >>> useful in return position while not increasing the temptation to use >>> it elsewhere any more than we already have. Hence "mostly harmless". >> >> I think you cross a line without seen it, filter, map and flatmap are >> lazy on Stream but not on Optional. >> >> R?mi >> >>> >>> On 5/24/2013 4:10 PM, Tim Peierls wrote: >>>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz >>> > wrote: >>>> >>>> Proposed spec for methods on Optional, which would have the obvious >>>> counterparts in Optional{Int,Long,Double}. >>>> >>>> These methods are known to be useful and seem mostly harmless now >>>> that other things have settled. (I don't think they greatly >>>> increase the moral hazard of Optional in general, and they do make >>>> it more expressive.) >>>> >>>> >>>> I'm in the curious (unique?) position of both desperately wanting >>>> Optional and desperately *not* wanting lots of additional methods like >>>> these. If the price of having Optional is the presence of these >>>> methods, >>>> I'll suck it up, but "mostly harmless" is not exactly a ringing >>>> endorsement. 
>>>>
>>>> --tim
>>
>

From brian.goetz at oracle.com Tue May 28 09:26:38 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 28 May 2013 12:26:38 -0400
Subject: Loose end: ints(), longs()
Message-ID: <51A4DABE.7070900@oracle.com>

Another loose end is a method to generate "all" ints / longs (which are sugar for ranges 0..MAX_VALUE.) These show up in pedagogical examples all the time:

  ints().filter(i -> isPrime(i)).limit(100)

The logical place for them is:

  IntStream.ints()
  LongStream.longs()

but some have raised concern that this might be confusing because we have an instance method on IntStream called "longs()" which widens the elements from int to long. While this isn't fatal, it might be confusing.

Perhaps it would be better to rename these conversion methods:

  IntStream: longs(), doubles(), boxed()
  LongStream: doubles(), boxed()
  DoubleStream: boxed()

to asInts(), asLongs(), and asBoxed()?

In retrospect, these seem better names anyway. And they also eliminate the conflict above.

From brian.goetz at oracle.com Tue May 28 09:34:46 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 28 May 2013 12:34:46 -0400
Subject: Loose end: concat
In-Reply-To: <519E5A96.2080606@oracle.com>
References: <519E5A96.2080606@oracle.com>
Message-ID: <51A4DCA6.8040301@oracle.com>

Seems we've seen no objections on the existence of these methods. So the remaining issue is where they live.

Candidates:

  Streams.concat(a, b) x 4
  Stream.concat(a, b), IntStream.concat(a, b), etc.

Over time, we've been moving away from "overloaded" stream methods towards more explicitly typed methods, which is a point against the first candidate.

I initially preferred the first version, but am more on the fence now. Sam prefers the second. Other opinions?

On 5/23/2013 2:06 PM, Brian Goetz wrote:
> I cleaned up concat() and wrote Int/Long/Double versions. (Fortunately,
> with the recent addition of Spliterator.OfPrimitive, the duplication
> quotient was much lower.)
> > Currently these still live in Streams. Is that still the right place? > The stream classes (Stream, IntStream, etc) seem a little wrong for > them, but I can't quite put my finger on why. > > Specs: > > /** > * Creates a lazy concatenated {@code Stream} whose elements are > all the > * elements of a first {@code Stream} succeeded by all the elements > of the > * second {@code Stream}. The resulting stream is ordered if both > * of the input streams are ordered, and parallel if either of the > input > * streams is parallel. > * > * @param The type of stream elements > * @param a the first stream > * @param b the second stream to concatenate on to end of the first > * stream > * @return the concatenation of the two input streams > */ > public static Stream concat(Stream a, Stream extends T> b) { > > > /** > * Creates a lazy concatenated {@code IntStream} whose elements are > all the > * elements of a first {@code IntStream} succeeded by all the > elements of the > * second {@code IntStream}. The resulting stream is ordered if both > * of the input streams are ordered, and parallel if either of the > input > * streams is parallel. > * > * @param a the first stream > * @param b the second stream to concatenate on to end of the first > stream > * @return the concatenation of the two streams > */ > public static IntStream concat(IntStream a, IntStream b) { > > > (and similar for Long and Double). > From joe.bowbeer at gmail.com Tue May 28 09:41:01 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 28 May 2013 09:41:01 -0700 Subject: Loose end: ints(), longs() In-Reply-To: <51A4DABE.7070900@oracle.com> References: <51A4DABE.7070900@oracle.com> Message-ID: I like the renamings you propose. I'm not really keen on the ints and longs methods though. I'm already used to intRange, which seems short enough, and is consistent with the other names, none of which are very short. 
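
For comparison, the "explicit range" spelling of the prime-filtering example can be sketched as below. This is a sketch under assumptions: it uses the method names from the released java.util.stream API (IntStream.range, IntStream.rangeClosed, asLongStream, boxed), which differ from the names being debated in this thread, and the class and helper names (IntRangeSketch, isPrime, firstPrimes, sumAsLongs) are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch; names are illustrative only.
public class IntRangeSketch {

    // Naive trial-division primality check, for illustration only.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // First `count` primes, drawn lazily from the explicit range
    // 0..Integer.MAX_VALUE; limit() stops the traversal early.
    static List<Integer> firstPrimes(int count) {
        return IntStream.rangeClosed(0, Integer.MAX_VALUE)
                .filter(IntRangeSketch::isPrime)
                .limit(count)
                .boxed()
                .collect(Collectors.toList());
    }

    // Widens an int range to longs before summing, i.e. the kind of
    // conversion method whose naming is under discussion.
    static long sumAsLongs(int fromInclusive, int toExclusive) {
        return IntStream.range(fromInclusive, toExclusive).asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println(firstPrimes(5));    // [2, 3, 5, 7, 11]
        System.out.println(sumAsLongs(1, 10)); // 45
    }
}
```

The explicit bounds make the limited range visible at the call site, which is the trade-off being weighed against a shorter name such as ints().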
On May 28, 2013 9:27 AM, "Brian Goetz" wrote: > Another loose end is a method to generate "all" ints / longs (which are > sugar for ranges 0..MAX_VALUE.) These show up in pedagogical examples all > the time: > > ints().filter(i -> isPrime(i)).limit(100) > > The logical place for them is: > > IntStream.ints() > LongStream.longs() > > but some have raised concern that this might be confusing because we have > an instance method on IntStream called "longs()" which widens the elements > from int to long. While this isn't fatal, it might be confusing. > > Perhaps it would be better to rename these conversion methods: > > IntStream: longs(), doubles(), boxed() > LongStream: doubles(), boxed() > DoubleStream: boxed() > > to asInts(), asLongs(), and asBoxed()? > > In retrospect, these seem better names anyway. And they also eliminate > the conflict above. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/d18caa31/attachment.html From joe.bowbeer at gmail.com Tue May 28 09:44:47 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 28 May 2013 09:44:47 -0700 Subject: Loose end: concat In-Reply-To: <51A4DCA6.8040301@oracle.com> References: <519E5A96.2080606@oracle.com> <51A4DCA6.8040301@oracle.com> Message-ID: I like the second, still, for its fluency. I didn't understand your stateful-based argument. Are you backing away from it? On May 28, 2013 9:35 AM, "Brian Goetz" wrote: > Seems we've seen no objections on the existence of these methods. So the > remaining issue is where they live. > > Candidates: > > Streams.concat(a, b) x 4 > Stream.concat(a, b), IntStream.concat(a, b), etc. > > Over time, we've been moving away from "overloaded" stream methods towards > more explicitly typed methods, which is a point against the first candidate. > > I initially preferred the first version, but am more on the fence now. Sam > prefers the second. Other opinions? 
> > > On 5/23/2013 2:06 PM, Brian Goetz wrote: > >> I cleaned up concat() and wrote Int/Long/Double versions. (Fortunately, >> with the recent addition of Spliterator.OfPrimitive, the duplication >> quotient was much lower.) >> >> Currently these still live in Streams. Is that still the right place? >> The stream classes (Stream, IntStream, etc) seem a little wrong for >> them, but I can't quite put my finger on why. >> >> Specs: >> >> /** >> * Creates a lazy concatenated {@code Stream} whose elements are >> all the >> * elements of a first {@code Stream} succeeded by all the elements >> of the >> * second {@code Stream}. The resulting stream is ordered if both >> * of the input streams are ordered, and parallel if either of the >> input >> * streams is parallel. >> * >> * @param The type of stream elements >> * @param a the first stream >> * @param b the second stream to concatenate on to end of the first >> * stream >> * @return the concatenation of the two input streams >> */ >> public static Stream concat(Stream a, Stream> extends T> b) { >> >> >> /** >> * Creates a lazy concatenated {@code IntStream} whose elements are >> all the >> * elements of a first {@code IntStream} succeeded by all the >> elements of the >> * second {@code IntStream}. The resulting stream is ordered if both >> * of the input streams are ordered, and parallel if either of the >> input >> * streams is parallel. >> * >> * @param a the first stream >> * @param b the second stream to concatenate on to end of the first >> stream >> * @return the concatenation of the two streams >> */ >> public static IntStream concat(IntStream a, IntStream b) { >> >> >> (and similar for Long and Double). >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/b80d2e89/attachment-0001.html From spullara at gmail.com Tue May 28 09:47:15 2013 From: spullara at gmail.com (Sam Pullara) Date: Tue, 28 May 2013 09:47:15 -0700 Subject: Loose end: ints(), longs() In-Reply-To: References: <51A4DABE.7070900@oracle.com> Message-ID: <8980F89E-4CEC-4D97-B05F-1B28D2F1D74A@gmail.com> I don't care about the new ints() / longs() methods as they seem more for example than actual code. However, I do like the renames. Sam On May 28, 2013, at 9:41 AM, Joe Bowbeer wrote: > I like the renamings you propose. > > I'm not really keen on the ints and longs methods though. I'm already used to intRange, which seems short enough, and is consistent with the other names, none of which are very short. > > On May 28, 2013 9:27 AM, "Brian Goetz" wrote: > Another loose end is a method to generate "all" ints / longs (which are sugar for ranges 0..MAX_VALUE.) These show up in pedagogical examples all the time: > > ints().filter(i -> isPrime(i)).limit(100) > > The logical place for them is: > > IntStream.ints() > LongStream.longs() > > but some have raised concern that this might be confusing because we have an instance method on IntStream called "longs()" which widens the elements from int to long. While this isn't fatal, it might be confusing. > > Perhaps it would be better to rename these conversion methods: > > IntStream: longs(), doubles(), boxed() > LongStream: doubles(), boxed() > DoubleStream: boxed() > > to asInts(), asLongs(), and asBoxed()? > > In retrospect, these seem better names anyway. And they also eliminate the conflict above. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/77c42bd2/attachment.html

From brian.goetz at oracle.com Tue May 28 09:47:57 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 28 May 2013 12:47:57 -0400
Subject: Loose end: ints(), longs()
In-Reply-To:
References: <51A4DABE.7070900@oracle.com>
Message-ID: <51A4DFBD.8000508@oracle.com>

> I'm not really keen on the ints and longs methods though. I'm already
> used to intRange, which seems short enough, and is consistent with the
> other names, none of which are very short.

Actually intRange was already shortened to range() when it got moved from Streams to IntStream:

  IntStream r = IntStream.range(1, 10)

The name itself is plenty short; it is the arguments that seem verbose and unnecessarily specific:

  IntStream.rangeClosed(0, Integer.MAX_VALUE)

seems more "leaky" than

  IntStream.allTheInts()

But are you saying you're fine with the verbose form? Or simply you'd rather have a name that sounds more like "range"?

>
> On May 28, 2013 9:27 AM, "Brian Goetz" wrote:
>
>     Another loose end is a method to generate "all" ints / longs (which
>     are sugar for ranges 0..MAX_VALUE.) These show up in pedagogical
>     examples all the time:
>
>     ints().filter(i -> isPrime(i)).limit(100)
>
>     The logical place for them is:
>
>     IntStream.ints()
>     LongStream.longs()
>
>     but some have raised concern that this might be confusing because we
>     have an instance method on IntStream called "longs()" which widens
>     the elements from int to long. While this isn't fatal, it might be
>     confusing.
>
>     Perhaps it would be better to rename these conversion methods:
>
>     IntStream: longs(), doubles(), boxed()
>     LongStream: doubles(), boxed()
>     DoubleStream: boxed()
>
>     to asInts(), asLongs(), and asBoxed()?
>
>     In retrospect, these seem better names anyway. And they also
>     eliminate the conflict above.
>

From brian.goetz at oracle.com Tue May 28 09:51:24 2013
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 28 May 2013 12:51:24 -0400
Subject: Loose end: concat
In-Reply-To:
References: <519E5A96.2080606@oracle.com> <51A4DCA6.8040301@oracle.com>
Message-ID: <51A4E08C.9060708@oracle.com>

No, sorry to be unclear. The second example suggests *STATIC* methods on *Stream:

  static <T> Stream<T> concat(Stream<? extends T> a, Stream<? extends T> b)

The signatures would be the same; the only difference is what class these static methods would live in.

On 5/28/2013 12:44 PM, Joe Bowbeer wrote:
> I like the second, still, for its fluency. I didn't understand your
> stateful-based argument. Are you backing away from it?
>
> On May 28, 2013 9:35 AM, "Brian Goetz" wrote:
>
>     Seems we've seen no objections on the existence of these methods.
>     So the remaining issue is where they live.
>
>     Candidates:
>
>     Streams.concat(a, b) x 4
>     Stream.concat(a, b), IntStream.concat(a, b), etc.
>
>     Over time, we've been moving away from "overloaded" stream methods
>     towards more explicitly typed methods, which is a point against the
>     first candidate.
>
>     I initially preferred the first version, but am more on the fence
>     now. Sam prefers the second. Other opinions?
>
>
>     On 5/23/2013 2:06 PM, Brian Goetz wrote:
>
>         I cleaned up concat() and wrote Int/Long/Double versions.
>         (Fortunately,
>         with the recent addition of Spliterator.OfPrimitive, the duplication
>         quotient was much lower.)
>
>         Currently these still live in Streams. Is that still the right
>         place?
>         The stream classes (Stream, IntStream, etc) seem a little wrong for
>         them, but I can't quite put my finger on why.
>
>         Specs:
>
>         /**
>          * Creates a lazy concatenated {@code Stream} whose elements are all the
>          * elements of a first {@code Stream} succeeded by all the elements of the
>          * second {@code Stream}. The resulting stream is ordered if both
>          * of the input streams are ordered, and parallel if either of the input
>          * streams is parallel.
> * > * @param The type of stream elements > * @param a the first stream > * @param b the second stream to concatenate on to end of > the first > * stream > * @return the concatenation of the two input streams > */ > public static Stream concat(Stream a, > Stream extends T> b) { > > > /** > * Creates a lazy concatenated {@code IntStream} whose > elements are > all the > * elements of a first {@code IntStream} succeeded by all the > elements of the > * second {@code IntStream}. The resulting stream is > ordered if both > * of the input streams are ordered, and parallel if > either of the > input > * streams is parallel. > * > * @param a the first stream > * @param b the second stream to concatenate on to end of > the first > stream > * @return the concatenation of the two streams > */ > public static IntStream concat(IntStream a, IntStream b) { > > > (and similar for Long and Double). > From forax at univ-mlv.fr Tue May 28 10:12:02 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 28 May 2013 19:12:02 +0200 Subject: Loose ends: Optional In-Reply-To: <51A4D8F5.9040000@oracle.com> References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> Message-ID: <51A4E562.1080003@univ-mlv.fr> On 05/28/2013 06:19 PM, Brian Goetz wrote: > No, you did not miss anything. We have never required that an issue > be fully settled in the EG before committing it to the lambda > repository. Having code to play with often plays a critical role in EG > and community discussions, so we have regularly committed code to the > lambda repository that has not yet been fully blessed by the EG. > I thought it was before we start to sync lambda with jdk8 repo, sorry, my mistake. > > As to your concern, I see it slightly differently. It is not that the > filter method is lazy on Stream and eager on Optional. 
It is that > *Stream* itself is laziness-seeking (all methods that do not require > an immediate result defer what computation they can) and Optional > itself is eager (all methods produce a fully formed result or > side-effect). OK, filter is not simply filter in an object world; it's Stream.filter or Optional.filter. But in that case, why is there no eager implementation of filter on List? It's convenient too. Having methods like filter or map defined on Optional with different semantics from the ones on Stream will just introduce doubt and confusion, so it isn't worth it. Rémi > >> On lambda-dev: 05/28/2013 05:35 PM, brian.goetz at oracle.com wrote: >>> Changeset: fde3666e6394 >>> Author: briangoetz >>> Date: 2013-05-28 11:34 -0400 >>> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fde3666e6394 >>> >>> Additional convenience methods on Optional >>> >>> ! src/share/classes/java/util/Optional.java >>> >>> >> >> It seems, I have not received one or several emails about adding an >> eager versions of filter, map to Optional. >> The last email I received about that subject is the one below. >> >> Rémi >> >> >> On 05/25/2013 07:12 PM, Remi Forax wrote: >>> On 05/24/2013 10:15 PM, Brian Goetz wrote: >>>> Optional has obvious upsides and downsides. Some of the downsides >>>> are: >>>> - It's a box. Boxing can be heavy. >>>> - The more general-purpose value-wrapper classes you have, the more >>>> some people fear an explosion of unreadable types like >>>> Map>, List>>> List>> in API signatures. >>>> >>>> I think where we've tried to land is: do things that encourage people >>>> to use Optional only in return position. These methods make it more >>>> useful in return position while not increasing the temptation to use >>>> it elsewhere any more than we already have. Hence "mostly harmless". >>> >>> I think you cross a line without seen it, filter, map and flatmap are >>> lazy on Stream but not on Optional.
>>> >>> Rémi >>> >>>> >>>> On 5/24/2013 4:10 PM, Tim Peierls wrote: >>>>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz >>>> > wrote: >>>>> >>>>> Proposed spec for methods on Optional, which would have the >>>>> obvious >>>>> counterparts in Optional{Int,Long,Double}. >>>>> >>>>> These methods are known to be useful and seem mostly harmless now >>>>> that other things have settled. (I don't think they greatly >>>>> increase the moral hazard of Optional in general, and they do >>>>> make >>>>> it more expressive.) >>>>> >>>>> >>>>> I'm in the curious (unique?) position of both desperately wanting >>>>> Optional and desperately *not* wanting lots of additional methods >>>>> like >>>>> these. If the price of having Optional is the presence of these >>>>> methods, >>>>> I'll suck it up, but "mostly harmless" is not exactly a ringing >>>>> endorsement. >>>>> >>>>> --tim >>> >> From joe.bowbeer at gmail.com Tue May 28 10:22:47 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 28 May 2013 10:22:47 -0700 Subject: Loose end: ints(), longs() In-Reply-To: <51A4DFBD.8000508@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> Message-ID: Another reason I'm not keen on ints() and longs() is that they have limited range despite their cool names. I'd like them more, and they would still be useful for demos, if they generated Big numbers. In other words, "range()" makes it clear to me that its range is limited, but "ints()" does not. This isn't to say that I'm opposed to ints() but I'm definitely not keen on it. Here are a couple other ideas for the renamings you propose: 1. toIntStream(), toLongStream(), etc. 2, Leave "boxed()" as is. There's no reason to rename this, right? --Joe On Tue, May 28, 2013 at 9:47 AM, Brian Goetz wrote: > I'm not really keen on the ints and longs methods though. I'm already >> used to intRange, which seems short enough, and is consistent with the >> other names, none of which are very short.
>> > > Actually intRange was already shortened to range() when it got moved from > Streams to IntStream: > > IntStream r = IntStream.range(1, 10) > > The name itself is plenty short; it is the arguments that seem verbose and > unnecessarily specific: > > IntStream.rangeClosed(0, Integer.MAX_VALUE) > > seems more "leaky" than > > IntStream.allTheInts() > > But are you saying you're fine with the verbose form? Or simply you'd > rather have a name that sounds more like "range"? > > >> On May 28, 2013 9:27 AM, "Brian Goetz" > > wrote: >> >> Another loose end is a method to generate "all" ints / longs (which >> are sugar for ranges 0..MAX_VALUE.) These show up in pedagogical >> examples all the time: >> >> ints().filter(i -> isPrime(i)).limit(100) >> >> The logical place for them is: >> >> IntStream.ints() >> LongStream.longs() >> >> but some have raised concern that this might be confusing because we >> have an instance method on IntStream called "longs()" which widens >> the elements from int to long. While this isn't fatal, it might be >> confusing. >> >> Perhaps it would be better to rename these conversion methods: >> >> IntStream: longs(), doubles(), boxed() >> LongStream: doubles(), boxed() >> DoubleStream: boxed() >> >> to asInts(), asLongs(), and asBoxed()? >> >> In retrospect, these seem better names anyway. And they also >> eliminate the conflict above. >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/662c9998/attachment-0001.html From brian.goetz at oracle.com Tue May 28 10:25:30 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 13:25:30 -0400 Subject: Loose end: ints(), longs() In-Reply-To: References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> Message-ID: <51A4E88A.6090202@oracle.com> > 2, Leave "boxed()" as is. There's no reason to rename this, right? 
Well, it seems that "widen to a long" and "box to a Long" are the same kind of operation -- give me the same value in a different representation -- which suggests a similar naming convention? From joe.bowbeer at gmail.com Tue May 28 10:29:15 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 28 May 2013 10:29:15 -0700 Subject: Loose end: ints(), longs() In-Reply-To: <51A4E88A.6090202@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> <51A4E88A.6090202@oracle.com> Message-ID: widen and box are different... asBoxed() is a mouthful. Shouldn't it just be "box()" or "boxed()"? I prefer the toIntStream() and toLongStream() names anyway, and then asBoxed() has no leg to stand on. On Tue, May 28, 2013 at 10:25 AM, Brian Goetz wrote: > 2, Leave "boxed()" as is. There's no reason to rename this, right? >> > > Well, it seems that "widen to a long" and "box to a Long" are the same > kind of operation -- give me the same value in a different representation > -- which suggests a similar naming convention? > > > From brian.goetz at oracle.com Tue May 28 10:31:29 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 13:31:29 -0400 Subject: Loose end: ints(), longs() In-Reply-To: References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> <51A4E88A.6090202@oracle.com> Message-ID: <51A4E9F1.6020709@oracle.com> To be clear: I wasn't suggesting toIntStream, as much as toInts(). On 5/28/2013 1:29 PM, Joe Bowbeer wrote: > widen and box are different... > > asBoxed() is a mouthful. Shouldn't it just be "box()" or "boxed()"? > > I prefer the toIntStream() and toLongStream() names anyway, and then > asBoxed() has no leg to stand on. > > > On Tue, May 28, 2013 at 10:25 AM, Brian Goetz > wrote: > > 2, Leave "boxed()" as is.
There's no reason to rename this, right? > > > Well, it seems that "widen to a long" and "box to a Long" are the > same kind of operation -- give me the same value in a different > representation -- which suggests a similar naming convention? > > > From mike.duigou at oracle.com Tue May 28 10:34:47 2013 From: mike.duigou at oracle.com (Mike Duigou) Date: Tue, 28 May 2013 10:34:47 -0700 Subject: Loose end: ints(), longs() In-Reply-To: <51A4DFBD.8000508@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> Message-ID: <0D22B943-D310-4F2C-9763-3AD7BF8A21F0@oracle.com> On May 28 2013, at 09:47 , Brian Goetz wrote: >> I'm not really keen on the ints and longs methods though. I'm already >> used to intRange, which seems short enough, and is consistent with the >> other names, none of which are very short. > > Actually intRange was already shortened to range() when it got moved from Streams to IntStream: > > IntStream r = IntStream.range(1, 10) > > The name itself is plenty short; it is the arguments that seem verbose and unnecessarily specific: > > IntStream.rangeClosed(0, Integer.MAX_VALUE) > > seems more "leaky" than > > IntStream.allTheInts() > > But are you saying you're fine with the verbose form? Or simply you'd rather have a name that sounds more like "range"? > >> >> On May 28, 2013 9:27 AM, "Brian Goetz" > > wrote: >> >> Another loose end is a method to generate "all" ints / longs (which >> are sugar for ranges 0..MAX_VALUE.) I was personally kind of confused why the range began at zero rather than MIN_VALUE. An equally simple mechanism for covering the entire range seems desirable. If it's going to be common to want the positive non-zero range then we'll see lots of ints().skip(1). I'm not sure ints() is general enough. I feel like I'm almost as likely to want MIN_VALUE/MAX_VALUE, 1/MAX_VALUE, MIN_VALUE/-1, MIN_VALUE/0, etc.
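For readers following along, the ranges listed above can all be written with the two factories already named in this thread, IntStream.range and IntStream.rangeClosed. What follows is only a minimal sketch: the class RangeSketch and the helper firstThree are made up for illustration, and limit(3) is there purely so that the huge ranges produce something short.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class RangeSketch {
    // Hypothetical helper: first three elements of a (possibly huge) range.
    static int[] firstThree(IntStream s) {
        return s.limit(3).toArray();
    }

    public static void main(String[] args) {
        // What ints() was proposed as sugar for: 0..MAX_VALUE, inclusive.
        int[] fromZero = firstThree(IntStream.rangeClosed(0, Integer.MAX_VALUE));

        // Positive non-zero range, written directly instead of ints().skip(1).
        int[] positive = firstThree(IntStream.rangeClosed(1, Integer.MAX_VALUE));

        // The entire int range, MIN_VALUE..MAX_VALUE.
        int[] everything = firstThree(IntStream.rangeClosed(Integer.MIN_VALUE, Integer.MAX_VALUE));

        // Strictly negative range, MIN_VALUE..-1; range() excludes its upper bound.
        int[] negative = firstThree(IntStream.range(Integer.MIN_VALUE, 0));

        System.out.println(Arrays.toString(fromZero));
        System.out.println(Arrays.toString(positive));
        System.out.println(Arrays.toString(everything));
        System.out.println(Arrays.toString(negative));
    }
}
```

Each variant costs one explicit pair of bounds, which is the trade-off being weighed against a dedicated ints() method.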
>> but some have raised concern that this might be confusing because we >> have an instance method on IntStream called "longs()" which widens >> the elements from int to long. While this isn't fatal, it might be >> confusing. >> >> Perhaps it would be better to rename these conversion methods: >> >> IntStream: longs(), doubles(), boxed() >> LongStream: doubles(), boxed() >> DoubleStream: boxed() >> >> to asInts(), asLongs(), and asBoxed()? >> >> In retrospect, these seem better names anyway. And they also >> eliminate the conflict above. This does seem like an improvement. From brian.goetz at oracle.com Tue May 28 10:37:42 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 13:37:42 -0400 Subject: Loose end: ints(), longs() In-Reply-To: <51A4E9F1.6020709@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> <51A4E88A.6090202@oracle.com> <51A4E9F1.6020709@oracle.com> Message-ID: <51A4EB66.1000505@oracle.com> BTW, asObjects() could be another name for boxed(): asInts() asLongs() asObjects() On 5/28/2013 1:31 PM, Brian Goetz wrote: > To be clear: I wasn't suggesting toIntStream, as much as toInts(). > > > > On 5/28/2013 1:29 PM, Joe Bowbeer wrote: >> widen and box are different... >> >> asBoxed() is a mouthful. Shouldn't it just be "box()" or "boxed()"? >> >> I prefer the toIntStream() and toLongStream() names anyway, and then >> asBoxed() has no leg to stand on. >> >> >> On Tue, May 28, 2013 at 10:25 AM, Brian Goetz > > wrote: >> >> 2, Leave "boxed()" as is. There's no reason to rename this, >> right? >> >> >> Well, it seems that "widen to a long" and "box to a Long" are the >> same kind of operation -- give me the same value in a different >> representation -- which suggests a similar naming convention? 
>> >> >> From brian.goetz at oracle.com Tue May 28 10:40:31 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 13:40:31 -0400 Subject: Loose end: ints(), longs() In-Reply-To: <0D22B943-D310-4F2C-9763-3AD7BF8A21F0@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4DFBD.8000508@oracle.com> <0D22B943-D310-4F2C-9763-3AD7BF8A21F0@oracle.com> Message-ID: <51A4EC0F.2010702@oracle.com> This is a pretty good argument against ints() and friends; the proposed semantics mostly extrapolate from toy examples found in functional textbooks, rather than real-world problems. > If it's going to be common to want the positive non-zero range then > we'll see lots of ints().skip(1). > > I'm not sure ints() is general enough. I feel like I'm > almost as likely to want MIN_VALUE/MAX_VALUE, > 1/MAX_VALUE, MIN_VALUE/-1, MIN_VALUE/0, etc. From brian.goetz at oracle.com Tue May 28 11:05:15 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 14:05:15 -0400 Subject: Loose end: ints(), longs() In-Reply-To: <51A4DABE.7070900@oracle.com> References: <51A4DABE.7070900@oracle.com> Message-ID: <51A4F1DB.2020802@oracle.com> OK, so we have some compelling arguments against ints() and longs(). I'm willing to drop those. I think Paul's (previously discussed) proposal to add rangeClosed to handle closed ranges is also uncontroversial? There seems to be broad support behind renaming the existing xxxs() methods. Though there now are almost as many proposals as there are EG members :( as{Ints,Longs,Boxed} as{Ints,Longs,Objs} // most consistent with map/flatMap naming as{Ints,Longs} but keep boxed() to{Int,Long}Stream but keep boxed() etc... On 5/28/2013 12:26 PM, Brian Goetz wrote: > Another loose end is a method to generate "all" ints / longs (which are > sugar for ranges 0..MAX_VALUE.)
These show up in pedagogical examples > all the time: > > ints().filter(i -> isPrime(i)).limit(100) > > The logical place for them is: > > IntStream.ints() > LongStream.longs() > > but some have raised concern that this might be confusing because we > have an instance method on IntStream called "longs()" which widens the > elements from int to long. While this isn't fatal, it might be confusing. > > Perhaps it would be better to rename these conversion methods: > > IntStream: longs(), doubles(), boxed() > LongStream: doubles(), boxed() > DoubleStream: boxed() > > to asInts(), asLongs(), and asBoxed()? > > In retrospect, these seem better names anyway. And they also eliminate > the conflict above. > From spullara at gmail.com Tue May 28 11:16:12 2013 From: spullara at gmail.com (Sam Pullara) Date: Tue, 28 May 2013 11:16:12 -0700 Subject: Loose end: ints(), longs() In-Reply-To: <51A4F1DB.2020802@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4F1DB.2020802@oracle.com> Message-ID: <5669AF4D-BF88-4643-AC6E-33859B5503B5@gmail.com> On May 28, 2013, at 11:05 AM, Brian Goetz wrote: > I think Paul's addition of rangeClosed to handle closed ranges (previously discussed) proposal is also uncontroversial? or just have an override that has: range(start, isInclusive, end, isInclusive) to cover the bases. > > as{Ints,Longs} but keep boxed() I vote for these. Sam > > etc... > > > On 5/28/2013 12:26 PM, Brian Goetz wrote: >> Another loose end is a method to generate "all" ints / longs (which are >> sugar for ranges 0..MAX_VALUE.) These show up in pedagogical examples >> all the time: >> >> ints().filter(i -> isPrime(i)).limit(100) >> >> The logical place for them is: >> >> IntStream.ints() >> LongStream.longs() >> >> but some have raised concern that this might be confusing because we >> have an instance method on IntStream called "longs()" which widens the >> elements from int to long. While this isn't fatal, it might be confusing. 
>> >> Perhaps it would be better to rename these conversion methods: >> >> IntStream: longs(), doubles(), boxed() >> LongStream: doubles(), boxed() >> DoubleStream: boxed() >> >> to asInts(), asLongs(), and asBoxed()? >> >> In retrospect, these seem better names anyway. And they also eliminate >> the conflict above. >> From tim at peierls.net Tue May 28 11:16:59 2013 From: tim at peierls.net (Tim Peierls) Date: Tue, 28 May 2013 14:16:59 -0400 Subject: Loose end: ints(), longs() In-Reply-To: <51A4F1DB.2020802@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4F1DB.2020802@oracle.com> Message-ID: On Tue, May 28, 2013 at 2:05 PM, Brian Goetz wrote: > There seems to be broad support behind renaming the existing xxxs() > methods. Though there now are almost as many proposals as there are EG > members :( > > as{Ints,Longs,Boxed} > as{Ints,Longs,Objs} // most consistent with map/flatMap naming > as{Ints,Longs} but keep boxed() > to{Int,Long}Stream but keep boxed() > The variety isn't necessarily a sign of strong disagreement. I don't think any of these are unworkable, but I do have a preference for the last one -- toIntStream(), toLongStream(), and boxed() -- because those names seem less prone to misinterpretation than in the other cases. --tim
From brian.goetz at oracle.com Tue May 28 11:19:17 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 14:19:17 -0400 Subject: Loose end: concat In-Reply-To: References: <519E5A96.2080606@oracle.com> <51A4DCA6.8040301@oracle.com> Message-ID: <51A4F525.5010908@oracle.com> So, for the record, the instance version is really "false fluency". Consider: list.filter(...) .map(...) .concat(otherList.filter(....) .map(....) .distinct() .sorted()) .forEach(...); is not really very fluent either -- the argument to concat is likely to be quite a mouthful. If you separate out: Stream other = otherList.filter(....) .map(....) .distinct() .sorted(); list.filter(...) .map(...) .concat(other) .forEach(...); now you've lost half your "fluency" by having to have a backward ref to other anyway. Making: Stream a = otherList.filter(....) .map(....) .distinct() .sorted(); Stream b = list.filter(...) .map(...); concat(a, b).forEach(...); not all that much worse. But that simply undermines the (only) "pro" for making it an instance method. The real reason is that the "con" is overwhelming: all other stream transforms are pure functions, whereas concat() is irretrievably stateful. This means that it becomes impossible to build functional layers atop streams. Given that the only benefit of the instance route is fluency, and it isn't really very fluent anyway, the answer seems obvious? On 5/28/2013 12:44 PM, Joe Bowbeer wrote: > I like the second, still, for its fluency. I didn't understand your > stateful-based argument. Are you backing away from it? > > On May 28, 2013 9:35 AM, "Brian Goetz" > wrote: > > Seems we've seen no objections on the existence of these methods. > So the remaining issue is where they live. > > Candidates: > > Streams.concat(a, b) x 4 > Stream.concat(a, b), IntStream.concat(a, b), etc. > > Over time, we've been moving away from "overloaded" stream methods > towards more explicitly typed methods, which is a point against the > first candidate. > > I initially preferred the first version, but am more on the fence > now. Sam prefers the second. Other opinions? > > > On 5/23/2013 2:06 PM, Brian Goetz wrote: > > I cleaned up concat() and wrote Int/Long/Double versions. > (Fortunately, > with the recent addition of Spliterator.OfPrimitive, the duplication > quotient was much lower.) > > Currently these still live in Streams. Is that still the right > place?
> The stream classes (Stream, IntStream, etc) seem a little wrong for > them, but I can't quite put my finger on why. > > Specs: > > /** > * Creates a lazy concatenated {@code Stream} whose > elements are > all the > * elements of a first {@code Stream} succeeded by all the > elements > of the > * second {@code Stream}. The resulting stream is ordered > if both > * of the input streams are ordered, and parallel if > either of the > input > * streams is parallel. > * > * @param <T> The type of stream elements > * @param a the first stream > * @param b the second stream to concatenate on to end of > the first > * stream > * @return the concatenation of the two input streams > */ > public static <T> Stream<T> concat(Stream<? extends T> a, > Stream<? extends T> b) { > > > /** > * Creates a lazy concatenated {@code IntStream} whose > elements are > all the > * elements of a first {@code IntStream} succeeded by all the > elements of the > * second {@code IntStream}. The resulting stream is > ordered if both > * of the input streams are ordered, and parallel if > either of the > input > * streams is parallel. > * > * @param a the first stream > * @param b the second stream to concatenate on to end of > the first > stream > * @return the concatenation of the two streams > */ > public static IntStream concat(IntStream a, IntStream b) { > > > (and similar for Long and Double). > From brian.goetz at oracle.com Tue May 28 11:25:49 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 14:25:49 -0400 Subject: Loose end: ints(), longs() In-Reply-To: References: <51A4DABE.7070900@oracle.com> <51A4F1DB.2020802@oracle.com> Message-ID: <51A4F6AD.6090704@oracle.com> Separating.... There are at least three votes for treating "boxed" separately. Orthogonal to that is how we name the remaining primitive conversion methods.
There is unfortunately no consistent rule for differentiating asXxx from toXxx, but we do try to lean towards: use asXxx when the transformation is a "lightweight" or lazy one (Arrays.asList provides a lazy List view of an array, rather than populating a new ArrayList) whereas toXxx performs a more expensive and eager transformation. Obviously exceptions abound, but this guideline strongly suggests "asXxxStream" over "toXxxStream". Which I don't think the toXxxStream contingent will object to? So combining the votes received thus far, seems like the equilibrium is: boxed() as{Int,Long,Double}Stream ? On 5/28/2013 2:16 PM, Tim Peierls wrote: > On Tue, May 28, 2013 at 2:05 PM, Brian Goetz > wrote: > > There seems to be broad support behind renaming the existing xxxs() > methods. Though there now are almost as many proposals as there are > EG members :( > > as{Ints,Longs,Boxed} > as{Ints,Longs,Objs} // most consistent with map/flatMap naming > as{Ints,Longs} but keep boxed() > to{Int,Long}Stream but keep boxed() > > > The variety isn't necessarily a sign of strong disagreement. I don't > think any of these are unworkable, but I do have a preference for the > last one -- toIntStream(), toLongStream(), and boxed() -- because those > names seem less prone to misinterpretation than in the other cases. > > --tim From tim at peierls.net Tue May 28 11:38:02 2013 From: tim at peierls.net (Tim Peierls) Date: Tue, 28 May 2013 14:38:02 -0400 Subject: Loose end: ints(), longs() In-Reply-To: <51A4F6AD.6090704@oracle.com> References: <51A4DABE.7070900@oracle.com> <51A4F1DB.2020802@oracle.com> <51A4F6AD.6090704@oracle.com> Message-ID: On Tue, May 28, 2013 at 2:25 PM, Brian Goetz wrote: > So combining the votes received thus far, seems like the equilibrium is: > > boxed() > as{Int,Long,Double}Stream > Works for me. --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/0b645f13/attachment.html From spullara at gmail.com Tue May 28 13:07:13 2013 From: spullara at gmail.com (Sam Pullara) Date: Tue, 28 May 2013 13:07:13 -0700 Subject: Loose ends: Optional In-Reply-To: <51A4E562.1080003@univ-mlv.fr> References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> Message-ID: On May 28, 2013, at 10:12 AM, Remi Forax wrote: > On 05/28/2013 06:19 PM, Brian Goetz wrote: >> >> As to your concern, I see it slightly differently. It is not that the filter method is lazy on Stream and eager on Optional. It is that *Stream* itself is laziness-seeking (all methods that do not require an immediate result defer what computation they can) and Optional itself is eager (all methods produce a fully formed result or side-effect). > > Ok, filter is not filter in an object world, it's Stream.filter or Optional.filter, > but in that case, why there is no eager implementation of filter on List, it's convenient too ? Manipulating an Optional with if/then statements basically makes it no better than null. For that reason alone we should have these methods and possibly a few more like exists(). I still struggle with the fact that with the current API I still have to make my own object to Optional converter to handle nulls in what I think is a sane way. > Having methods like filter or map defined on Optional with a different semantics as the ones of Stream > will just introduce doubt and confusion, so it doesn't worth it. The implementation decides if something can be lazy, I'm ok with these having the same methods. In fact, I suggest we change the name of isPresent to forEach. 
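As a rough illustration of the if/then point, here is a small sketch contrasting the two styles. It assumes the map, filter, orElse, and ifPresent methods as they appear on java.util.Optional in the released Java 8 API; the method set was still being debated in this thread, and exists() was only a suggestion, so it is not used. The class OptionalSketch and the lookup helper are hypothetical.

```java
import java.util.Optional;

class OptionalSketch {
    // Hypothetical lookup that may or may not find a value.
    static Optional<String> lookup(String key) {
        return "user".equals(key) ? Optional.of("  alice ") : Optional.empty();
    }

    // if/then style: unwrap by hand, structurally the same as a null check.
    static String viaIfThen(String key) {
        Optional<String> o = lookup(key);
        if (o.isPresent()) {
            String s = o.get().trim();
            if (!s.isEmpty()) {
                return s.toUpperCase();
            }
        }
        return "UNKNOWN";
    }

    // Method style: the absent case is threaded through map/filter.
    static String viaMethods(String key) {
        return lookup(key)
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(String::toUpperCase)
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        System.out.println(viaIfThen("user"));
        System.out.println(viaMethods("user"));
        System.out.println(viaMethods("nobody"));
        // ifPresent(Consumer) runs the action only when a value exists.
        lookup("user").ifPresent(v -> System.out.println("found: " + v.trim()));
    }
}
```

Both styles compute the same result; the difference is only in how much of the presence check the caller writes out by hand.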
Sam > > Rémi > >> >>> On lambda-dev: 05/28/2013 05:35 PM, brian.goetz at oracle.com wrote: >>>> Changeset: fde3666e6394 >>>> Author: briangoetz >>>> Date: 2013-05-28 11:34 -0400 >>>> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fde3666e6394 >>>> >>>> Additional convenience methods on Optional >>>> >>>> ! src/share/classes/java/util/Optional.java >>>> >>>> >>> >>> It seems, I have not received one or several emails about adding an >>> eager versions of filter, map to Optional. >>> The last email I received about that subject is the one below. >>> >>> Rémi >>> >>> >>> On 05/25/2013 07:12 PM, Remi Forax wrote: >>>> On 05/24/2013 10:15 PM, Brian Goetz wrote: >>>>> Optional has obvious upsides and downsides. Some of the downsides are: >>>>> - It's a box. Boxing can be heavy. >>>>> - The more general-purpose value-wrapper classes you have, the more >>>>> some people fear an explosion of unreadable types like >>>>> Map>, List>>>> List>> in API signatures. >>>>> >>>>> I think where we've tried to land is: do things that encourage people >>>>> to use Optional only in return position. These methods make it more >>>>> useful in return position while not increasing the temptation to use >>>>> it elsewhere any more than we already have. Hence "mostly harmless". >>>> >>>> I think you cross a line without seen it, filter, map and flatmap are >>>> lazy on Stream but not on Optional. >>>> >>>> Rémi >>>> >>>>> >>>>> On 5/24/2013 4:10 PM, Tim Peierls wrote: >>>>>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz >>>>> > wrote: >>>>>> >>>>>> Proposed spec for methods on Optional, which would have the obvious >>>>>> counterparts in Optional{Int,Long,Double}. >>>>>> >>>>>> These methods are known to be useful and seem mostly harmless now >>>>>> that other things have settled. (I don't think they greatly >>>>>> increase the moral hazard of Optional in general, and they do make >>>>>> it more expressive.) >>>>>> >>>>>> >>>>>> I'm in the curious (unique?)
position of both desperately wanting >>>>>> Optional and desperately *not* wanting lots of additional methods like >>>>>> these. If the price of having Optional is the presence of these >>>>>> methods, >>>>>> I'll suck it up, but "mostly harmless" is not exactly a ringing >>>>>> endorsement. >>>>>> >>>>>> --tim >>>> >>> > From brian.goetz at oracle.com Tue May 28 13:22:49 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 16:22:49 -0400 Subject: Loose ends: Optional In-Reply-To: References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> Message-ID: >> Having methods like filter or map defined on Optional with a different semantics as the ones of Stream >> will just introduce doubt and confusion, so it doesn't worth it. > > The implementation decides if something can be lazy, I'm ok with these having the same methods. In fact, I suggest we change the name of isPresent to forEach. I'm OK with this. Its a little weird since there is at most one element to each, but the connection to methods on other containers is nice. From tim at peierls.net Tue May 28 13:26:41 2013 From: tim at peierls.net (Tim Peierls) Date: Tue, 28 May 2013 16:26:41 -0400 Subject: Loose ends: Optional In-Reply-To: References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> Message-ID: On Tue, May 28, 2013 at 4:22 PM, Brian Goetz wrote: > > The implementation decides if something can be lazy, I'm ok with these > having the same methods. In fact, I suggest we change the name of isPresent > to forEach. > > I'm OK with this. Its a little weird since there is at most one element > to each, but the connection to methods on other containers is nice. > Please, no! 
That would make Optional unusable for all the people who would actually benefit from it. There must be some serious disconnect. I thought this was all settled. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/4517e2ed/attachment.html From brian.goetz at oracle.com Tue May 28 13:28:56 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 16:28:56 -0400 Subject: Loose ends: Optional In-Reply-To: References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> Message-ID: <64F6EE0B-0E85-4C2E-B161-D14AA7F4E8D8@oracle.com> Sorry, Sam typo'ed and I missed it the first time. He meant change "ifPresent(Consumer)" to "forEach(Consumer)". Not isPresent. On May 28, 2013, at 4:26 PM, Tim Peierls wrote: > On Tue, May 28, 2013 at 4:22 PM, Brian Goetz wrote: > > The implementation decides if something can be lazy, I'm ok with these having the same methods. In fact, I suggest we change the name of isPresent to forEach. > > I'm OK with this. Its a little weird since there is at most one element to each, but the connection to methods on other containers is nice. > > Please, no! > > That would make Optional unusable for all the people who would actually benefit from it. > > There must be some serious disconnect. I thought this was all settled. > > --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/7cf3e8b4/attachment.html From tim at peierls.net Tue May 28 13:32:38 2013 From: tim at peierls.net (Tim Peierls) Date: Tue, 28 May 2013 16:32:38 -0400 Subject: Loose ends: Optional In-Reply-To: <64F6EE0B-0E85-4C2E-B161-D14AA7F4E8D8@oracle.com> References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> <64F6EE0B-0E85-4C2E-B161-D14AA7F4E8D8@oracle.com> Message-ID: Oh. Emily Litella. On Tue, May 28, 2013 at 4:28 PM, Brian Goetz wrote: > Sorry, Sam typo'ed and I missed it the first time. He meant change > "ifPresent(Consumer)" to "forEach(Consumer)". Not isPresent. > > On May 28, 2013, at 4:26 PM, Tim Peierls wrote: > > On Tue, May 28, 2013 at 4:22 PM, Brian Goetz wrote: > >> > The implementation decides if something can be lazy, I'm ok with these >> having the same methods. In fact, I suggest we change the name of isPresent >> to forEach. >> >> I'm OK with this. Its a little weird since there is at most one element >> to each, but the connection to methods on other containers is nice. >> > > Please, no! > > That would make Optional unusable for all the people who would actually > benefit from it. > > There must be some serious disconnect. I thought this was all settled. > > --tim > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/fd9c7e88/attachment.html From joe.bowbeer at gmail.com Tue May 28 14:08:38 2013 From: joe.bowbeer at gmail.com (Joe Bowbeer) Date: Tue, 28 May 2013 14:08:38 -0700 Subject: Loose end: ints(), longs() In-Reply-To: References: <51A4DABE.7070900@oracle.com> <51A4F1DB.2020802@oracle.com> <51A4F6AD.6090704@oracle.com> Message-ID: On Tue, May 28, 2013 at 11:38 AM, Tim Peierls wrote: > On Tue, May 28, 2013 at 2:25 PM, Brian Goetz wrote: > >> So combining the votes received thus far, seems like the equilibrium is: >> >> boxed() >> as{Int,Long,Double}Stream >> > > Works for me. > > --tim > Cool! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/d356a56c/attachment.html From forax at univ-mlv.fr Tue May 28 15:30:44 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 29 May 2013 00:30:44 +0200 Subject: Loose ends: Optional In-Reply-To: References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> Message-ID: <51A53014.4010409@univ-mlv.fr> On 05/28/2013 10:07 PM, Sam Pullara wrote: > On May 28, 2013, at 10:12 AM, Remi Forax wrote: >> On 05/28/2013 06:19 PM, Brian Goetz wrote: >>> As to your concern, I see it slightly differently. It is not that the filter method is lazy on Stream and eager on Optional. It is that *Stream* itself is laziness-seeking (all methods that do not require an immediate result defer what computation they can) and Optional itself is eager (all methods produce a fully formed result or side-effect). >> Ok, filter is not filter in an object world, it's Stream.filter or Optional.filter, >> but in that case, why there is no eager implementation of filter on List, it's convenient too ? 
> Manipulating an Optional with if/then statements basically makes it no better than null. For that reason alone we should have these methods and possibly a few more like exists(). You mean null for signalling that there is no value, I suppose. It's better than null in that case because Optional forces the developer to check whether the value is present or not. If you want to call any method on it (besides the ones that already exist), what you want is not Optional but null-safe navigation (the ?. operator of Groovy, for example). This pattern is far more powerful, but it was rejected by Project Coin; that's the Gordian knot of this issue. So Optional is a pale replacement, I agree. But you can't add every method that exists on earth to it. > I still struggle with the fact that with the current API I still have to make my own object to Optional converter to handle nulls in what I think is a sane way. yes, Optional does not have the same semantics as Optional (from Guava). > >> Having methods like filter or map defined on Optional with a different semantics as the ones of Stream >> will just introduce doubt and confusion, so it doesn't worth it. > The implementation decides if something can be lazy, I'm ok with these having the same methods. In fact, I suggest we change the name of isPresent to forEach. No, it's not the implementation that decides, it's the spec. You cannot realistically change the implementation of a method from lazy to eager or vice versa. That is Brian's argument: the spec of Optional will say that the filter method of Optional is not lazy. But relying on people reading the documentation is something you should not do (hint: people don't read the spec). Or you should do it only for the most important concepts of the API, not for something as minor as Optional. For Optional, it should just work, without thinking too much. 
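A minimal sketch of the lazy/eager distinction argued over above, assuming the Optional.filter and Stream.filter methods as they later shipped in Java 8. The class and method names are illustrative only and do not come from the thread. It counts how often the predicate has run before any result is requested.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class; only Optional.filter and Stream.filter are real JDK API.
final class LazyVsEagerFilter {

    // Optional.filter is eager: the predicate has already run when filter() returns.
    static int optionalCallsBeforeUse() {
        AtomicInteger calls = new AtomicInteger();
        Optional<String> filtered = Optional.of("a").filter(s -> {
            calls.incrementAndGet();
            return true;
        });
        int seen = calls.get();   // read the counter before touching the result
        filtered.isPresent();     // using the result afterwards changes nothing
        return seen;
    }

    // Stream.filter is lazy: the predicate runs only once a terminal operation starts.
    // Returns { calls before the terminal operation, calls after it }.
    static int[] streamCalls() {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> filtered = Stream.of("a").filter(s -> {
            calls.incrementAndGet();
            return true;
        });
        int before = calls.get();
        filtered.forEach(s -> { });   // terminal operation pulls the element through
        int after = calls.get();
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] stream = streamCalls();
        System.out.println("Optional predicate calls before use: " + optionalCallsBeforeUse());
        System.out.println("Stream predicate calls before/after terminal op: "
                + stream[0] + "/" + stream[1]);
    }
}
```

The same method name therefore has observably different timing on the two types, which is the source of the confusion raised in the message.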
> > Sam R?mi > >> R?mi >> >>>> On lambda-dev: 05/28/2013 05:35 PM, brian.goetz at oracle.com wrote: >>>>> Changeset: fde3666e6394 >>>>> Author: briangoetz >>>>> Date: 2013-05-28 11:34 -0400 >>>>> URL:http://hg.openjdk.java.net/lambda/lambda/jdk/rev/fde3666e6394 >>>>> >>>>> Additional convenience methods on Optional >>>>> >>>>> ! src/share/classes/java/util/Optional.java >>>>> >>>>> >>>> It seems, I have not received one or several emails about adding an >>>> eager versions of filter, map to Optional. >>>> The last email I received about that subject is the one below. >>>> >>>> R?mi >>>> >>>> >>>> On 05/25/2013 07:12 PM, Remi Forax wrote: >>>>> On 05/24/2013 10:15 PM, Brian Goetz wrote: >>>>>> Optional has obvious upsides and downsides. Some of the downsides are: >>>>>> - It's a box. Boxing can be heavy. >>>>>> - The more general-purpose value-wrapper classes you have, the more >>>>>> some people fear an explosion of unreadable types like >>>>>> Map>, List>>>>> List>> in API signatures. >>>>>> >>>>>> I think where we've tried to land is: do things that encourage people >>>>>> to use Optional only in return position. These methods make it more >>>>>> useful in return position while not increasing the temptation to use >>>>>> it elsewhere any more than we already have. Hence "mostly harmless". >>>>> I think you cross a line without seen it, filter, map and flatmap are >>>>> lazy on Stream but not on Optional. >>>>> >>>>> R?mi >>>>> >>>>>> On 5/24/2013 4:10 PM, Tim Peierls wrote: >>>>>>> On Fri, May 24, 2013 at 3:20 PM, Brian Goetz >>>>>> > wrote: >>>>>>> >>>>>>> Proposed spec for methods on Optional, which would have the obvious >>>>>>> counterparts in Optional{Int,Long,Double}. >>>>>>> >>>>>>> These methods are known to be useful and seem mostly harmless now >>>>>>> that other things have settled. (I don't think they greatly >>>>>>> increase the moral hazard of Optional in general, and they do make >>>>>>> it more expressive.) 
>>>>>>> >>>>>>> >>>>>>> I'm in the curious (unique?) position of both desperately wanting >>>>>>> Optional and desperately *not* wanting lots of additional methods like >>>>>>> these. If the price of having Optional is the presence of these >>>>>>> methods, >>>>>>> I'll suck it up, but "mostly harmless" is not exactly a ringing >>>>>>> endorsement. >>>>>>> >>>>>>> --tim From brian.goetz at oracle.com Tue May 28 14:59:48 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 17:59:48 -0400 Subject: Extending Collector to handle a post-transform Message-ID: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> Recall that it was a frequently-requested feature during the development of Collector to support an optional post-transform function, decoupling the intermediate accumulation state from the final result. At the time, I took a swing at implementing this, but all the options that made sense then added complexity or cost; since then there have been some refinements to the Collector API that have made this more practical. This message is in two parts: this one is about how to extend Collector, and the next about how this might affect the standard set of Collectors. Currently Collector looks like: interface Collector<T, R> { Supplier<R> resultSupplier(); // make a new container BiFunction<R, T, R> accumulator(); // incorporate one new value BinaryOperator<R> combiner(); // combine two containers Set<Characteristics> characteristics(); } where the characteristics are drawn from an enum { CONCURRENT, UNORDERED, STRICTLY_MUTATIVE }. Each of these is a pure optimization that enables frameworks to take advantage of known properties of the collector; if a framework ignores the characteristics, it should still be able to arrive at the correct result. The proposed post-transform function would take the final result (after accumulation for serial operation, after combination for parallel operation), and apply a final transform function to it.
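A small sketch of the post-transform idea, written against the Collector API as it eventually shipped in Java 8 rather than the draft in this message. In the shipped API the pieces are named supplier() and finisher() instead of resultSupplier() and transformer(), and the accumulator is a BiConsumer; the class and method names below are hypothetical. It accumulates into a StringBuilder and hands back a String.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical demo class; Collector.of is the real Java 8 factory method.
final class PostTransformSketch {

    // Input type CharSequence, intermediate type StringBuilder, result type String.
    static Collector<CharSequence, StringBuilder, String> concatenating() {
        return Collector.<CharSequence, StringBuilder, String>of(
                StringBuilder::new,                  // make a new container
                (sb, s) -> sb.append(s),             // incorporate one new value
                (left, right) -> left.append(right), // combine two containers
                StringBuilder::toString);            // post-transform: StringBuilder -> String
    }

    static String demo() {
        return Stream.of("a", "b", "c").collect(concatenating());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Callers of demo() only ever see a String; the StringBuilder stays an implementation detail, at the price of the third type parameter in the Collector signature.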
Motivations for this feature include: - Use of a different type for accumulation and result. For example, use a StringBuilder to accumulate but then return a String when done. - Allowing the Collector to impose invariants on the result that may not be efficiently maintainable by the accumulator function. - Enabling a Collector to return an immutable result even though mutation is integral to what collect() does. Adding a post-function is entirely straightforward, but there are a few disadvantages. The first is that Collector acquires an extra type parameter; instead of input/output types, there is a third type for the intermediate type. This adds somewhat to the API surface area. The other is a performance concern; for combinators like groupingBy(f, collector), we have a choice of ways to implement the post-function for each value of the resulting map, but none is perfect. One option is to update the elements in-place with Map.replaceAll(); the other is to return a "view" map. The former is O(n) in the number of map keys; the latter defers a potentially significant fraction of the collect() work until after the user thinks the collect() is finished (and may lead to redundant work if we don't cache the results of applying the post-function to a specific bucket.) If the post function is the identity function, this is even worse; we're doing potentially a lot of work for a no-op. The addition of the characteristics allows us to identify explicitly when the post-transform is a no-op; we can have a characteristic flag for that. So Collector becomes: interface Collector<T, I, R> { Supplier<I> resultSupplier(); // make a new container BiFunction<I, T, I> accumulator(); // incorporate one new value BinaryOperator<I> combiner(); // combine two containers Function<I, R> transformer(); Set<Characteristics> characteristics(); } and the characteristic enum acquires IDENTITY_TRANSFORM. This means that the cost can be completely eliminated if the feature is not used. What's bad? - More generics in Collector signatures.
Collectors that don't want to export their intermediate type are declared as Collector<T, ?, R>, which users may find disturbing. (The obvious attempts to make the extra type arg go away don't work.) - Reliance on erasure. For collectors like groupingBy() that take a Supplier, we either need to take two suppliers (one for Map<K, I> and the other for Map<K, R>) or explicitly spec that the Map will be used to contain values of either I or R. While this is not actually a problem for all the Map implementations in the JDK, it is kind of smelly. (Don't bother raising the issue "and it won't work with reification"; the set of things that already don't is so large that ten more won't make it worse.) This only shows up in the few Collector forms that take an explicit supplier argument; it is a pure implementation detail for the rest. Overall I think this is a reasonable price to pay for making the abstraction more powerful. From brian.goetz at oracle.com Tue May 28 15:23:22 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 18:23:22 -0400 Subject: Post-transform and the standard Collectors Message-ID: Adding the ability to have a post-transform function raises some questions about how the standard collectors should change to accommodate it. These fall into two categories: - Should we? - How? For collectors like toStringBuilder, we can now collect to a String and not expose the intermediate StringBuilder type. This is both closer to what the user wants and allows for better implementation hiding: static Collector toStringBuilder() { ... } Of course, now the name is wrong. So it would need a new name. (Ditto for toStringJoiner.) It also makes sense to have a new combinator that can attach a post-transform to an existing Collector (name is just a placeholder): Collector transforming(Function, Collector) A harder question is how much to introduce immutability.
For example, one negative of the current toList() collector is that the returned list is sometimes, but not always, immutable. It would be nice to be able to commit to something. We could easily make it immutable with a post-transform of Collections::unmodifiableList. At first, this seems a no-brainer. But after more thought, it's definitely a "should we?" Consider how this plays as a downstream collector. The simplest form of groupingBy -- groupingBy(f) -- expands to groupingBy(f, toList()). If we made toList always return an immutable List, then we would have to apply the post-transform to every value of the resulting map, likely via a (sequential) Map.replaceAll on the simplest groupingBy operation, even when the user didn't care about immutability. Making every groupingBy user pay for this seems like a lot. (Alternatively, the default toList() could still return an immutable list, but the default groupingBy could use a different downstream collector.) One option is to have mutable and immutable versions of every Collection/Map-bearing Collector. But this is a 2x explosion of Collectors, after we did so much work to pare back the size of the Collector set. Another is to have combinators for adding immutability to Collection, List, Set, and Map. Then an immutable groupingBy would be: collect(asImmutableMap(groupingBy(f, asImmutableList(toList())))); Wordy, but not terrible, and probably better than imposing the costs on everyone? 
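The opt-in immutability combinators described above can be approximated with the API that later shipped in Java 8. The sketch below assumes Collectors.collectingAndThen as a stand-in for the placeholder names asImmutableMap and asImmutableList, and uses the unmodifiable wrappers from java.util.Collections; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class; the Collectors and Collections calls are real JDK API.
final class ImmutableGroupingSketch {

    static Map<Integer, List<String>> byLength(Collection<String> words) {
        // Downstream collector: gather into a list, then wrap it as unmodifiable.
        Collector<String, ?, List<String>> unmodifiableLists =
                Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList);

        // Group by word length, using the wrapped lists as map values.
        Collector<String, ?, Map<Integer, List<String>>> grouping =
                Collectors.groupingBy(String::length, unmodifiableLists);

        // Post-transform the whole map as well, so neither level can be mutated.
        Collector<String, ?, Map<Integer, List<String>>> immutableGrouping =
                Collectors.collectingAndThen(grouping, Collections::unmodifiableMap);

        return words.stream().collect(immutableGrouping);
    }

    public static void main(String[] args) {
        System.out.println(byLength(Arrays.asList("one", "two", "three")));
    }
}
```

Only callers who ask for the wrappers pay for them; a plain groupingBy(f) keeps its cheaper default downstream collector, which is the trade-off weighed in the message.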
From brian.goetz at oracle.com Tue May 28 15:33:12 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 28 May 2013 18:33:12 -0400 Subject: Loose ends: Optional In-Reply-To: References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> <64F6EE0B-0E85-4C2E-B161-D14AA7F4E8D8@oracle.com> Message-ID: <08791AC0-DD81-4229-BAA9-FF36138C50AE@oracle.com> OK, so the current proposal on the table is: - Add filter, map, flatMap to Optional (these have a lot less utility on the primitive versions, plus map/flatMap would require more forms) - Rename ifPresent to forEach to Optional and Optional{Int,Long,Double} Anyone who hasn't already expressed an opinion want to weigh in (or comment on the new aspects)? On May 28, 2013, at 4:32 PM, Tim Peierls wrote: > Oh. Emily Litella. > > On Tue, May 28, 2013 at 4:28 PM, Brian Goetz wrote: > Sorry, Sam typo'ed and I missed it the first time. He meant change "ifPresent(Consumer)" to "forEach(Consumer)". Not isPresent. > > On May 28, 2013, at 4:26 PM, Tim Peierls wrote: > >> On Tue, May 28, 2013 at 4:22 PM, Brian Goetz wrote: >> > The implementation decides if something can be lazy, I'm ok with these having the same methods. In fact, I suggest we change the name of isPresent to forEach. >> >> I'm OK with this. Its a little weird since there is at most one element to each, but the connection to methods on other containers is nice. >> >> Please, no! >> >> That would make Optional unusable for all the people who would actually benefit from it. >> >> There must be some serious disconnect. I thought this was all settled. >> >> --tim > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130528/94fc94a1/attachment.html From forax at univ-mlv.fr Wed May 29 06:30:00 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 29 May 2013 15:30:00 +0200 Subject: Loose ends: Optional In-Reply-To: <08791AC0-DD81-4229-BAA9-FF36138C50AE@oracle.com> References: <519FBD77.9030400@oracle.com> <519FCA75.6070600@oracle.com> <51A0F0E7.2090403@univ-mlv.fr> <51A4D6F2.2060408@univ-mlv.fr> <51A4D8F5.9040000@oracle.com> <51A4E562.1080003@univ-mlv.fr> <64F6EE0B-0E85-4C2E-B161-D14AA7F4E8D8@oracle.com> <08791AC0-DD81-4229-BAA9-FF36138C50AE@oracle.com> Message-ID: <51A602D8.2070309@univ-mlv.fr> On 05/29/2013 12:33 AM, Brian Goetz wrote: > OK, so the current proposal on the table is: > > - Add filter, map, flatMap to Optional (these have a lot less utility > on the primitive versions, plus map/flatMap would require more forms) > - Rename ifPresent to forEach to Optional and Optional{Int,Long,Double} > > Anyone who hasn't already expressed an opinion want to weigh in (or > comment on the new aspects)? I think the two options are linked. If ifPresent is renamed to forEach it means that Optional is seen as a collection of elements (with zero or one element). R?mi > > On May 28, 2013, at 4:32 PM, Tim Peierls wrote: > >> Oh. Emily Litella. >> >> On Tue, May 28, 2013 at 4:28 PM, Brian Goetz > > wrote: >> >> Sorry, Sam typo'ed and I missed it the first time. He meant >> change "ifPresent(Consumer)" to "forEach(Consumer)". Not isPresent. >> >> On May 28, 2013, at 4:26 PM, Tim Peierls wrote: >> >>> On Tue, May 28, 2013 at 4:22 PM, Brian Goetz >>> > wrote: >>> >>> > The implementation decides if something can be lazy, I'm >>> ok with these having the same methods. In fact, I suggest we >>> change the name of isPresent to forEach. >>> >>> I'm OK with this. Its a little weird since there is at most >>> one element to each, but the connection to methods on other >>> containers is nice. 
>>> >>> >>> Please, no! >>> >>> That would make Optional unusable for all the people who would >>> actually benefit from it. >>> >>> There must be some serious disconnect. I thought this was all >>> settled. >>> >>> --tim >> >> > From tim at peierls.net Wed May 29 06:57:17 2013 From: tim at peierls.net (Tim Peierls) Date: Wed, 29 May 2013 09:57:17 -0400 Subject: Extending Collector to handle a post-transform In-Reply-To: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> References: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> Message-ID: On Tue, May 28, 2013 at 5:59 PM, Brian Goetz wrote: > Overall I think this is a reasonable price to pay for making the > abstraction more powerful. > For me, this tips over into unacceptable territory. It's a lot of API complexity that most of the time wouldn't be used, so users would be grappling with an extra type parameter unnecessarily. I think people are going to be confused enough already by things like groupingBy; adding post-transform would put even the simplest usage examples out of reach of ordinary users. If post-collect transformation is truly essential, I'd even go for having two variants of each method that currently takes a Collector, one to take non-post-transforming Collector and one to take CollectorAndTransformer. I don't love adding methods, but at least that way regular users could avoid the scary version. --tim -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130529/855fdc62/attachment.html From brian.goetz at oracle.com Wed May 29 08:06:37 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 29 May 2013 08:06:37 -0700 Subject: Extending Collector to handle a post-transform In-Reply-To: References: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> Message-ID: <29A61A00-440A-4CD6-8AE9-FAAB58C20880@oracle.com> I get the "that could be scary" reaction, but let's be more explicit about who has to deal with what incremental complexity. Most people will not write Collectors; most collectors that are used will have been written by Kevin or me. The proposed functionality would NOT affect any user code. Most users use canned collectors. The user code would look like: stream.collect(toList()) in either case. Where this intrudes is that the static method toList() in Collectors returns a Collector<T, List<T>, List<T>> instead of Collector<T, List<T>>. That's the extent of the intrusion. It's not wonderful, but it's not quite as awful as you make it sound. I agree that we should try to hide this where possible. I tried several ways of doing this, but the combinators -- the ones that take a Collector and return a new Collector -- which are most of the reason this API exists -- are the issue. We'd need a 2x explosion in the groupingBy combinators, and we just did a LOT of work to get the size of that set down. So, can we be a bit more explicit about what you're scared that users will be scared by? Maybe it's not so scary. On May 29, 2013, at 6:57 AM, Tim Peierls wrote: > On Tue, May 28, 2013 at 5:59 PM, Brian Goetz wrote: > Overall I think this is a reasonable price to pay for making the abstraction more powerful. > > For me, this tips over into unacceptable territory. It's a lot of API complexity that most of the time wouldn't be used, so users would be grappling with an extra type parameter unnecessarily.
I think people are going to be confused enough already by things like groupingBy; adding post-transform would put even the simplest usage examples out of reach of ordinary users. > > If post-collect transformation is truly essential, I'd even go for having two variants of each method that currently takes a Collector, one to take non-post-transforming Collector and one to take CollectorAndTransformer. I don't love adding methods, but at least that way regular users could avoid the scary version. > > --tim > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130529/5866b0c3/attachment-0001.html From dl at cs.oswego.edu Wed May 29 08:11:51 2013 From: dl at cs.oswego.edu (Doug Lea) Date: Wed, 29 May 2013 11:11:51 -0400 Subject: Extending Collector to handle a post-transform In-Reply-To: References: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> Message-ID: <51A61AB7.6050302@cs.oswego.edu> On 05/29/13 09:57, Tim Peierls wrote: > On Tue, May 28, 2013 at 5:59 PM, Brian Goetz > wrote: > > Overall I think this is a reasonable price to pay for making the abstraction > more powerful. > > > For me, this tips over into unacceptable territory. It's a lot of API complexity > that most of the time wouldn't be used, This is the multiple-audience problem that we grapple with all the time. Most application-level stuff will never explicitly use a Collector at all. So the main audience is people building layered stuff. And those are likely to be the people most frustrated that they cannot do stuff available in other similar languages/frameworks. Especially considering that we are well aware of the limitations of not allowing cascading transforms, it seems like a bad idea not to do this. 
-Doug From paul.sandoz at oracle.com Wed May 29 08:33:43 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Wed, 29 May 2013 17:33:43 +0200 Subject: Extending Collector to handle a post-transform In-Reply-To: <29A61A00-440A-4CD6-8AE9-FAAB58C20880@oracle.com> References: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> <29A61A00-440A-4CD6-8AE9-FAAB58C20880@oracle.com> Message-ID: <98A8B3F7-0B4E-414C-9170-5F7988D6F0F9@oracle.com> On May 29, 2013, at 5:06 PM, Brian Goetz wrote: > I get the "that could be scary" reaction, but let's be more explicit about who has to deal with what incremental complexity. > > Most people will not write Collectors; most collectors that are used will have been written by Kevin or I. > > The proposed functionality would NOT affect any user code. Most users use canned collectors. > The only unpleasant smell i can detect is the use of a user supplied map, declared with values of type R, to store values of type I and R while collecting. I am prepared to hold my nose in such cases :-) The following is perhaps worse than just documenting the restriction: public static , M1 extends Map> Collector groupingBy(Function classifier, Supplier mapFactory, Collector downstream) { Paul. From tim at peierls.net Wed May 29 09:08:18 2013 From: tim at peierls.net (Tim Peierls) Date: Wed, 29 May 2013 12:08:18 -0400 Subject: Extending Collector to handle a post-transform In-Reply-To: <29A61A00-440A-4CD6-8AE9-FAAB58C20880@oracle.com> References: <1E45AE50-35CF-4005-ACEB-B5612D3BCE3D@oracle.com> <29A61A00-440A-4CD6-8AE9-FAAB58C20880@oracle.com> Message-ID: On Wed, May 29, 2013 at 11:06 AM, Brian Goetz wrote: > I get the "that could be scary" reaction, but let's be more explicit about > who has to deal with what incremental complexity. > > Most people will not write Collectors; most collectors that are used will > have been written by Kevin or I. > > The proposed functionality would NOT affect any user code. Most users use > canned collectors. 
> Right, but they would still have to assemble things out of the canned parts. Browsing javadocs to find those parts and learn how to fit them together ... that's the jungle that I'm afraid will scare people away. But I could be overly alarmist about this -- draft javadocs might show that the extra stuff is not so bad after all. However, if the feeling is, "Don't worry about the javadocs, your IDE will take care of all that for you," then , um, I'm still worried. --tim -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/lambda-libs-spec-experts/attachments/20130529/9aeaea5d/attachment.html From paul.sandoz at oracle.com Thu May 30 01:42:51 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 30 May 2013 10:42:51 +0200 Subject: Loose ends: Spliterators.iteratorFromSpliterator -> Spliterators.iterator Message-ID: <4512241A-CED4-4119-A3AA-DC43C61EC50A@oracle.com> Bikeshed opportunity... The overloaded methods Spliterators.iteratorFromSpliterator stand out a bit like a sore thumb compared to the Spliterators.spliterator methods. I think it would be more consistent to rename then Spliterators.iterator. Thoughts? Paul. From paul.sandoz at oracle.com Thu May 30 02:00:06 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 30 May 2013 11:00:06 +0200 Subject: Loose ends: PrimitiveIterator.forEachRemaining(T_CONS action) Message-ID: We recently added Spliterator.OfPrimitive and it has proved very useful reducing some primitive-based spliterator code, specifically wrapping-based spliterators, such as for concatenation. The same technique can be applied to PrimitiveIterator; a method forEachRemaining(T_CONS action) can be defined: public interface PrimitiveIterator extends Iterator { void forEachRemaining(T_CONS action); ... } This has little or no impact on the current implementation, since PrimitiveIterator is not widely used. 
However, i still think there are some advantages: 1) consistency with Spliterator.OfPrimitive; and 2) there may be cases in the future where this is useful and if we don't do it now it becomes difficult to do so later on. The change could be categorised as "mostly harmless and potentially useful in the future". Paul. From forax at univ-mlv.fr Thu May 30 02:06:47 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 30 May 2013 11:06:47 +0200 Subject: Loose ends: Spliterators.iteratorFromSpliterator -> Spliterators.iterator In-Reply-To: <4512241A-CED4-4119-A3AA-DC43C61EC50A@oracle.com> References: <4512241A-CED4-4119-A3AA-DC43C61EC50A@oracle.com> Message-ID: <51A716A7.8000103@univ-mlv.fr> On 05/30/2013 10:42 AM, Paul Sandoz wrote: > Bikeshed opportunity... > > The overloaded methods Spliterators.iteratorFromSpliterator stand out a bit like a sore thumb compared to the Spliterators.spliterator methods. I think it would be more consistent to rename then Spliterators.iterator. yes > > Thoughts? no :) > > Paul. R?mi From paul.sandoz at oracle.com Thu May 30 02:18:43 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Thu, 30 May 2013 11:18:43 +0200 Subject: Loose ends: Collection.toArray(IntFunction generator) Message-ID: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> A while ago there was some lively discussion on toArray and we finally settled on the following method in Stream: A[] toArray(IntFunction generator); Due to "noise" of other stuff i think we forgot to discuss whether it is valuable to add a similar default method to Collection (as recently suggested by Peter Levart on the lambda-dev list which motivated this email). The functionality can already be achieved with: c.stream().toArray(String[]::new); but is not likely to be as efficient as the existing concrete Collection.toArray implementations (e.g. 
see ArrayList.toArray) This is also the way to produce an array in parallel: c.parallelStream().toArray(String[]::new); where efficiencies can be obtained if the collection is large enough. But we could also add to Collection: default T[] toArray(IntFunction generator) { return toArray(generator.apply(size())); } c.toArray(String[]::new); Seems like a reasonable thing to do. The sequentially efficiency is retained without the "stink" of using the other toArray method accepting an array instance (of or not of the right length). Paul. From forax at univ-mlv.fr Thu May 30 03:09:57 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 30 May 2013 12:09:57 +0200 Subject: Loose ends: Collection.toArray(IntFunction generator) In-Reply-To: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> References: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> Message-ID: <51A72575.5000504@univ-mlv.fr> On 05/30/2013 11:18 AM, Paul Sandoz wrote: > A while ago there was some lively discussion on toArray and we finally settled on the following method in Stream: > > A[] toArray(IntFunction generator); > > Due to "noise" of other stuff i think we forgot to discuss whether it is valuable to add a similar default method to Collection (as recently suggested by Peter Levart on the lambda-dev list which motivated this email). > > > The functionality can already be achieved with: > > c.stream().toArray(String[]::new); > > but is not likely to be as efficient as the existing concrete Collection.toArray implementations (e.g. see ArrayList.toArray) > > This is also the way to produce an array in parallel: > > c.parallelStream().toArray(String[]::new); > > where efficiencies can be obtained if the collection is large enough. > > > But we could also add to Collection: > > default T[] toArray(IntFunction generator) { > return toArray(generator.apply(size())); > } > > c.toArray(String[]::new); > > Seems like a reasonable thing to do. 
The sequentially efficiency is retained without the "stink" of using the other toArray method accepting an array instance (of or not of the right length). > > Paul. Good idea, the current toArray(T[]) is awkward to use and has weird semantics. Rémi From forax at univ-mlv.fr Thu May 30 07:57:45 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 30 May 2013 16:57:45 +0200 Subject: Premature Optimization in the implementation of Collectors.toList() Message-ID: <51A768E9.2060300@univ-mlv.fr> The current implementation can return three different implementations: an empty list, a singleton list, and an ArrayList. While trying to minimize memory consumption may be a good idea, this will make every call site that uses the resulting list (for example, in a for-each loop) megamorphic, effectively disabling any possible inlining of the iterator (if we stay with our for-each loop example). Given that the Stream API is not used in the wild for now, this optimization is premature, because we have no idea whether this implementation is better or worse than the dumb implementation, the one similar to the one used in toSet(). We should stick with the dumb implementation and try to optimize later, when we have enough data about the usage of the resulting list. Rémi From paul.sandoz at oracle.com Fri May 31 00:41:47 2013 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Fri, 31 May 2013 09:41:47 +0200 Subject: Loose ends: Collection.toArray(IntFunction generator) In-Reply-To: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> References: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> Message-ID: <9A44C6F3-512D-4647-A66D-3C252ED612CC@oracle.com> Brian pointed out an incompatibility issue on lambda-dev: toArray(null) is now ambiguous and will not compile. This is likely in practice to only affect tests checking the method throws an NPE. Paul. 
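For reference, a sketch of the proposed default method written as a static helper so it compiles on any Java 8 or later JDK. The helper class and its name are hypothetical; the body mirrors the default method proposed in this thread, which sizes the array with size(). The comments note where the toArray(null) ambiguity would come from.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper class; it stands in for the proposed Collection default method.
final class ToArraySketch {

    // Asks the generator for an array of the collection's size, then delegates to
    // the existing Collection.toArray(T[]), which fills and returns that array.
    static <T> T[] toArray(Collection<? extends T> c, IntFunction<T[]> generator) {
        return c.toArray(generator.apply(c.size()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b");
        String[] array = toArray(words, String[]::new);
        System.out.println(Arrays.toString(array));

        // With both toArray(T[]) and toArray(IntFunction) declared on Collection,
        // a bare words.toArray(null) would match either overload. A cast names
        // the intended one, for example: words.toArray((String[]) null)
    }
}
```

The array-constructor reference String[]::new carries the element type, so the caller gets a correctly typed array without allocating a throwaway instance first.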
On May 30, 2013, at 11:18 AM, Paul Sandoz wrote: > A while ago there was some lively discussion on toArray and we finally settled on the following method in Stream: > > A[] toArray(IntFunction generator); > > Due to "noise" of other stuff i think we forgot to discuss whether it is valuable to add a similar default method to Collection (as recently suggested by Peter Levart on the lambda-dev list which motivated this email). > > > The functionality can already be achieved with: > > c.stream().toArray(String[]::new); > > but is not likely to be as efficient as the existing concrete Collection.toArray implementations (e.g. see ArrayList.toArray) > > This is also the way to produce an array in parallel: > > c.parallelStream().toArray(String[]::new); > > where efficiencies can be obtained if the collection is large enough. > > > But we could also add to Collection: > > default T[] toArray(IntFunction generator) { > return toArray(generator.apply(size())); > } > > c.toArray(String[]::new); > > Seems like a reasonable thing to do. The sequentially efficiency is retained without the "stink" of using the other toArray method accepting an array instance (of or not of the right length). > > Paul. From forax at univ-mlv.fr Fri May 31 01:50:46 2013 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 31 May 2013 10:50:46 +0200 Subject: Loose ends: Collection.toArray(IntFunction generator) In-Reply-To: <9A44C6F3-512D-4647-A66D-3C252ED612CC@oracle.com> References: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> <9A44C6F3-512D-4647-A66D-3C252ED612CC@oracle.com> Message-ID: <51A86466.5040804@univ-mlv.fr> On 05/31/2013 09:41 AM, Paul Sandoz wrote: > Brian pointed out a incompatibility issue on lambda-dev: > > toArray(null) is now ambiguous and will not compile > > This is likely in practice to only affect tests checking the method throws an NPE. > > Paul. yes, and the test can be change by adding a cast. 
R?mi > > On May 30, 2013, at 11:18 AM, Paul Sandoz wrote: > >> A while ago there was some lively discussion on toArray and we finally settled on the following method in Stream: >> >> A[] toArray(IntFunction generator); >> >> Due to "noise" of other stuff i think we forgot to discuss whether it is valuable to add a similar default method to Collection (as recently suggested by Peter Levart on the lambda-dev list which motivated this email). >> >> >> The functionality can already be achieved with: >> >> c.stream().toArray(String[]::new); >> >> but is not likely to be as efficient as the existing concrete Collection.toArray implementations (e.g. see ArrayList.toArray) >> >> This is also the way to produce an array in parallel: >> >> c.parallelStream().toArray(String[]::new); >> >> where efficiencies can be obtained if the collection is large enough. >> >> >> But we could also add to Collection: >> >> default T[] toArray(IntFunction generator) { >> return toArray(generator.apply(size())); >> } >> >> c.toArray(String[]::new); >> >> Seems like a reasonable thing to do. The sequentially efficiency is retained without the "stink" of using the other toArray method accepting an array instance (of or not of the right length). >> >> Paul. From brian.goetz at oracle.com Fri May 31 06:42:15 2013 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 31 May 2013 06:42:15 -0700 Subject: Loose ends: Collection.toArray(IntFunction generator) In-Reply-To: <51A86466.5040804@univ-mlv.fr> References: <8E65C784-1F49-48CF-8A51-E7896226CDBB@oracle.com> <9A44C6F3-512D-4647-A66D-3C252ED612CC@oracle.com> <51A86466.5040804@univ-mlv.fr> Message-ID: <0DE32BAC-191A-444B-B7E4-99B33CF6B28A@oracle.com> Of course the test can be changed, that wasn't the point. The point is the cost of this will be higher than one might initially think since there will almost certainly be JCK tests that have to be changed too. 
On May 31, 2013, at 1:50 AM, Remi Forax wrote: > On 05/31/2013 09:41 AM, Paul Sandoz wrote: >> Brian pointed out a incompatibility issue on lambda-dev: >> >> toArray(null) is now ambiguous and will not compile >> >> This is likely in practice to only affect tests checking the method throws an NPE. >> >> Paul. > > yes, > and the test can be change by adding a cast. > > R?mi > >> >> On May 30, 2013, at 11:18 AM, Paul Sandoz wrote: >> >>> A while ago there was some lively discussion on toArray and we finally settled on the following method in Stream: >>> >>> A[] toArray(IntFunction generator); >>> >>> Due to "noise" of other stuff i think we forgot to discuss whether it is valuable to add a similar default method to Collection (as recently suggested by Peter Levart on the lambda-dev list which motivated this email). >>> >>> >>> The functionality can already be achieved with: >>> >>> c.stream().toArray(String[]::new); >>> >>> but is not likely to be as efficient as the existing concrete Collection.toArray implementations (e.g. see ArrayList.toArray) >>> >>> This is also the way to produce an array in parallel: >>> >>> c.parallelStream().toArray(String[]::new); >>> >>> where efficiencies can be obtained if the collection is large enough. >>> >>> >>> But we could also add to Collection: >>> >>> default T[] toArray(IntFunction generator) { >>> return toArray(generator.apply(size())); >>> } >>> >>> c.toArray(String[]::new); >>> >>> Seems like a reasonable thing to do. The sequentially efficiency is retained without the "stink" of using the other toArray method accepting an array instance (of or not of the right length). >>> >>> Paul. >