From nradov at axolotl.com Wed Nov 14 02:21:37 2007
From: nradov at axolotl.com (Nick Radov)
Date: Tue, 13 Nov 2007 18:21:37 -0800
Subject: String.ltrim() and rtrim() methods RFE
Message-ID: 

I would like to reopen discussion of Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696> for possible inclusion in jdk7. It seems to me that RFE was closed prematurely, without proper consideration. Many developers have needed to left- or right-trim a String, and I'm sure the same code has been rewritten thousands of times in applications. It would really help to have these methods in the standard library, and the impact on compiled file size would be minimal. The evaluation reason given for closing the bug was "You can write this yourself and get reasonable performance with a modern VM." Of course we can get reasonable performance, but that isn't really the point. The reason for adding these methods to the standard library is to reduce the amount of redundant, low-level code that application developers have to write. Having to write them ourselves also forces the creation of "StringUtilities" classes with a variety of static methods, which somewhat defeats the purpose of OO design.

If we can get this reopened I would be happy to take care of making the actual code changes. I just requested the Developer role on the jdk project, so hopefully that will be approved soon.

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116
800 West El Camino Real Suite 270 Mountain View CA 94040
Frost and Sullivan Awards | Market Leadership | Business Development Strategy Leadership

The information contained in this e-mail transmission may contain confidential information. It is intended for the use of the addressee. If you are not the intended recipient, any disclosure, copying, or distribution of this information is strictly prohibited. If you receive this message in error, please inform the sender immediately and remove any record of this message.

From Phil.Race at Sun.COM Wed Nov 14 04:01:16 2007
From: Phil.Race at Sun.COM (Phil Race)
Date: Tue, 13 Nov 2007 20:01:16 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: 
References: 
Message-ID: <473A730C.4080408@sun.com>

Nick,
I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open, whereas popular issues quickly accumulate hundreds of votes. Furthermore, only one person has found it important enough to comment in the bug parade comments. So I think you'd need to show that it's needed by a lot more than some low single-digit number of developers.

-phil.

Nick Radov wrote:
> I would like to reopen discussion of Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696> for possible inclusion in jdk7. [...]
From tobrien at discursive.com Fri Nov 16 22:11:05 2007
From: tobrien at discursive.com (Tim O'Brien)
Date: Fri, 16 Nov 2007 17:11:05 -0500
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473A730C.4080408@sun.com>
References: <473A730C.4080408@sun.com>
Message-ID: <7260cf030711161411k3ab9a21nf4231d1fa5a47010@mail.gmail.com>

ltrim and rtrim are exceedingly useful. I've often wondered why Sun didn't bother to include them. It is just another one of those reasons why people tend to say that string manipulation in Java leaves much to be desired. This should take all of two minutes to implement, and has a sub-trivial impact. Nick's observation is true:

> The reason for adding those methods to the standard library is to reduce the amount of redundant, low-level code that application developers have to write. Having to write those methods ourselves in applications also forces the creation of "StringUtilities" classes with a variety of static methods, which somewhat defeats the purpose of OO design.

I've worked on Commons Lang, but I have to tell you that it is an ongoing embarrassment that every Java program ever developed has to either include a set of static methods or reimplement low-level String parsing functions because Sun just doesn't think it is a big idea. If Java had open classes this really wouldn't be such a big deal.
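For concreteness, the static-method workaround Tim describes is what Commons Lang already ships as StringUtils.stripStart and stripEnd. A small usage sketch (Commons Lang 2.x, where a null strip-chars argument means "strip whitespace"):

import org.apache.commons.lang.StringUtils;

public class StripDemo {
    public static void main(String[] args) {
        // Strip from one side only; null means whitespace.
        System.out.println(StringUtils.stripStart("  hello  ", null)); // "hello  "
        System.out.println(StringUtils.stripEnd("  hello  ", null));   // "  hello"
        // Or strip a caller-supplied set of characters.
        System.out.println(StringUtils.stripEnd("config;;;", ";"));    // "config"
    }
}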
On 11/13/07, Phil Race wrote:
> Nick,
> I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open, whereas popular issues quickly accumulate hundreds of votes. [...]

--
------
Tim O'Brien: (847) 863-7045

From nradov at axolotl.com Fri Nov 16 23:19:50 2007
From: nradov at axolotl.com (Nick Radov)
Date: Fri, 16 Nov 2007 15:19:50 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473A730C.4080408@sun.com>
References: <473A730C.4080408@sun.com>
Message-ID: 

Phil,

The bug voting mechanism doesn't really work for trivial RFEs like this. First, the bug has been closed for a while, and no one is going to vote for a closed bug. Second, everyone only gets a few votes, so they're going to put them on the most critical issues. Annoyances like this are left to languish.

Let's look at this RFE a different way. Is there any reason not to implement it? All of the necessary functionality is already present in the trim() method. We just need to break it out into two public ltrim() and rtrim() methods.
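To make "break it out" concrete: trim() already runs one scan from each end, so the split is mostly mechanical. A minimal sketch, written as static helpers purely for illustration (the real change would be instance methods next to trim(), which treats anything <= '\u0020' as whitespace):

static String ltrim(String s) {
    int st = 0;
    final int len = s.length();
    // The same left-hand scan trim() performs.
    while (st < len && s.charAt(st) <= ' ') {
        st++;
    }
    return (st > 0) ? s.substring(st) : s;
}

static String rtrim(String s) {
    int len = s.length();
    // The same right-hand scan trim() performs.
    while (len > 0 && s.charAt(len - 1) <= ' ') {
        len--;
    }
    return (len < s.length()) ? s.substring(0, len) : s;
}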
Microsoft's .NET 3.5 framework has equivalent methods on the System.String class as TrimStart and TrimEnd. Don't we want to keep Java competitive with .NET? <http://msdn2.microsoft.com/en-us/library/system.string_methods(VS.90).aspx>

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116

From: Phil Race
To: Nick Radov
Cc: core-libs-dev at openjdk.java.net
Date: 11/13/2007 07:58 PM
Subject: Re: String.ltrim() and rtrim() methods RFE

> Nick,
> I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open [...]

From scolebourne at joda.org Sat Nov 17 00:22:18 2007
From: scolebourne at joda.org (Stephen Colebourne)
Date: Sat, 17 Nov 2007 00:22:18 +0000
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: 
References: <473A730C.4080408@sun.com>
Message-ID: <473E343A.3090506@joda.org>

There is some talk of adding a group of new methods to low-level classes: http://smallwig.blogspot.com/2007/11/minor-api-fixes-for-jdk-7.html

Personally, I would like to see this extended to become a simple JSR where ideas such as ltrim/rtrim and so on can be evaluated properly, i.e. take each JDK class and work through them one by one, considering what needs adding (by evaluating open and closed source libraries, and taking the most common).

Any JSR does need to be very open though, and it needs to be supported by (and funded by) a major company, probably Sun.

Stephen

Nick Radov wrote:
> The bug voting mechanism doesn't really work for trivial RFEs like this. [...]
From freds at jfrog.org Sat Nov 17 01:55:15 2007
From: freds at jfrog.org (freds at jfrog.org)
Date: Fri, 16 Nov 2007 23:55:15 -0200
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473E343A.3090506@joda.org>
References: <473A730C.4080408@sun.com> <473E343A.3090506@joda.org>
Message-ID: 

The cost/benefit should be evaluated, and for this one the cost looks very low. So...

On 11/16/07, Stephen Colebourne wrote:
> There is some talk of adding a group of new methods to low level classes: http://smallwig.blogspot.com/2007/11/minor-api-fixes-for-jdk-7.html
>
> Personally, I would like to see this extended to become a simple JSR where ideas such as ltrim/rtrim and so on can be evaluated properly. [...]
--
http://freddy33.bglogspot.com/
http://www.jfrog.org/

From mr at sun.com Sat Nov 17 04:19:27 2007
From: mr at sun.com (Mark Reinhold)
Date: Fri, 16 Nov 2007 20:19:27 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: nradov@axolotl.com; Fri, 16 Nov 2007 15:19:50 PST;
Message-ID: <20071117041927.3965AC13@callebaut.niobe.net>

> Date: Fri, 16 Nov 2007 15:19:50 -0800
> From: Nick Radov

> The bug voting mechanism doesn't really work for trivial RFEs like this. First, the bug has been closed for a while, and no one is going to vote for a closed bug.
> Second, everyone only gets a few votes, so they're going to put them on the most critical issues. Annoyances like this are left to languish.

Agreed.

> Let's look at this RFE a different way. Is there any reason not to implement it?

Beware: In general this is not a very persuasive line of reasoning.

If all RFEs over the last ten years had been evaluated in this way then most of them would've been implemented by now, and the platform would be a horrid, rotting mess of woefully inconsistent spaghetti.

Having said that, I've spent quite a bit of time over the last couple of months hacking Python code for the OpenJDK Mercurial infrastructure, and I've used Python's equivalent lstrip/rstrip functions more than once. They're quite handy actually, especially the rstrip function that takes a string argument and removes any trailing characters present in that string.

So if somebody's going to do this I'd actually recommend adding four methods (needed since Java doesn't have default parameter values):

    String.ltrim()
    String.ltrim(String charsToTrim)
    String.rtrim()
    String.rtrim(String charsToTrim)
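A minimal sketch of the String-argument form under those semantics (illustrative only, written as a static helper; every trailing character that occurs anywhere in charsToTrim is removed, mirroring Python's rstrip):

static String rtrim(String s, String charsToTrim) {
    int len = s.length();
    // Walk back while the last character is in the trim set.
    while (len > 0 && charsToTrim.indexOf(s.charAt(len - 1)) >= 0) {
        len--;
    }
    return (len < s.length()) ? s.substring(0, len) : s;
}

So, for example, rtrim("abc///", "/") yields "abc" and rtrim("line\r\n", "\r\n") yields "line"; ltrim(String) would be the mirror image.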
Of course one can get this behavior today with the replaceFirst method, e.g.,

    s.replaceFirst("[charsToTrim]+$", "")

but that requires compiling the regex and building a matcher, which is awfully heavyweight, especially in the middle of a tight loop.

- Mark

From pdoubleya at gmail.com Sat Nov 17 07:00:58 2007
From: pdoubleya at gmail.com (Patrick Wright)
Date: Sat, 17 Nov 2007 08:00:58 +0100
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: <64efa1ba0711162300y6e2d8b44la299e75ceaabc48a@mail.gmail.com>

Hi Mark

> > Let's look at this RFE a different way. Is there any reason not to implement it?
>
> Beware: In general this is not a very persuasive line of reasoning.
>
> If all RFEs over the last ten years had been evaluated in this way then most of them would've been implemented by now, and the platform would be a horrid, rotting mess of woefully inconsistent spaghetti.

It's nice to see you comment on this issue. I would like to hear sometime (this seems related to governance) how smaller decisions about the JDK, like this one, have been made in the past and will be made in the future. Is it always a matter of responding to an RFE or a bug? Who makes the decision, the respective group owners? Who keeps an eye on the big picture?

These sorts of changes seem to fall below the threshold of a JSR, but there is enough of this "tuning" possible that there should be some process for it as well, in order to achieve both consistency in the result (across the JDK) and a "constant tuning" attitude that avoids API rot.

I also note that in the projects I've worked on (as a contractor, server side, for many years), as the other poster said, it's extremely common for teams to either a) roll their own convenience APIs, or b) use Apache Commons or other friendly libraries. However, it seems to me that we could do better: the community around the JDK could agree on convenience APIs that are perhaps not shipped with the JDK, but are "blessed" as high-quality and recommended as "useful optional libraries", and promoted as such.

Patrick

From forax at univ-mlv.fr Sat Nov 17 14:08:19 2007
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sat, 17 Nov 2007 15:08:19 +0100
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: <473EF5D3.3040700@univ-mlv.fr>

Mark Reinhold wrote:
> So if somebody's going to do this I'd actually recommend adding four methods (needed since Java doesn't have default parameter values):
>
>     String.ltrim()
>     String.ltrim(String charsToTrim)
>     String.rtrim()
>     String.rtrim(String charsToTrim)

We don't have default values, but we have varargs :) so we can mimic Python's lstrip/rstrip using only two methods:

    String.ltrim(char... charsToTrim)
    String.rtrim(char... charsToTrim)

If the array charsToTrim is empty or null, whitespace characters are used.
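As an illustration of how the varargs form collapses the two overloads (a hypothetical helper, not Rémi's code), the empty-or-null fallback might look like:

static String rtrim(String s, char... charsToTrim) {
    // Empty or null varargs means "trim whitespace", per the proposal.
    String chars = (charsToTrim == null || charsToTrim.length == 0)
            ? null : new String(charsToTrim);
    int len = s.length();
    while (len > 0) {
        char c = s.charAt(len - 1);
        if (chars == null ? c > ' ' : chars.indexOf(c) < 0) break;
        len--;
    }
    return s.substring(0, len);
}

One trade-off worth noting: a zero-argument call such as s.rtrim() still allocates an empty char[] at the call site, which the four-method scheme avoids.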
Rémi

From nradov at axolotl.com Mon Nov 19 18:55:04 2007
From: nradov at axolotl.com (Nick Radov)
Date: Mon, 19 Nov 2007 10:55:04 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: 

It seems we have general consensus that the ltrim() and rtrim() methods should be added, and possibly some other related methods as well. Now, how do we go about reopening Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696>? Who has the authority to make decisions on these issues?

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116

From: Mark Reinhold
To: Nick Radov
Cc: core-libs-dev at openjdk.java.net
Date: 11/16/2007 08:16 PM
Subject: Re: String.ltrim() and rtrim() methods RFE

> Beware: In general this is not a very persuasive line of reasoning. [...]

From charlie.hunt at sun.com Thu Nov 15 13:08:22 2007
From: charlie.hunt at sun.com (charlie hunt)
Date: Thu, 15 Nov 2007 07:08:22 -0600
Subject: encoding-agnostic byte[]-based regexp engine...interested?
In-Reply-To: <473BB9F6.8090206@sun.com>
References: <473BB9F6.8090206@sun.com>
Message-ID: <473C44C6.7030003@sun.com>

Hi Charlie,

I'm adding OpenJDK's Java SE core libraries list since that's where Java NIO lives. I doubt anything could be done at the class libraries level, since an API addition / enhancement would likely require JCP activity. But there may be some value in raising awareness at the class libraries level, and I'd like to hear others' reactions on this mailing list.

My initial reaction is that what you are describing sounds like something that could be very useful for a protocol parser. The core of Grizzly is protocol independent, but this might be something useful to offer to those who are implementing the com.sun.grizzly.ProtocolParser interface. ProtocolParser is part of core Grizzly / the Grizzly Framework.

I think some additional exploration / investigation is worthwhile. We are in the process of gathering new feature requests; I think we should add this to that list.

Again, anyone else who has comments / reactions, please feel free to jump in. :-)

charlie ...

Charles Oliver Nutter wrote:
> Oniguruma is a C-based regular expression engine starting to get some attention. The key selling points are its speed and the fact that it can be applied to string content with arbitrary encodings. It will be the default regex engine in Ruby 1.9.
> JRuby 1.1 will ship with a port of Oniguruma dubbed "Joni". For us, the benefit is that we'll finally have a fast regex engine that can work with Ruby's encoding-free byte[]-based strings, where before we had to convert to/from char[] for all regex engines. We expect to see great gains in regex performance with JRuby 1.1 when we release the final version in the Decemberish timeframe.
>
> But it has occurred to me there could be an even more interesting use of Joni: as a regexp engine that could accept NIO bytebuffers directly. Because it just walks byte[], no decoding is necessary. Because it's encoding-agnostic, any arbitrary byte content could be matched. So in theory it could easily be adapted to be a fast NIO bytebuffer regex engine.
>
> Would there be interest in such a thing? I'm sure there are other NIO-related lists that would be appropriate, but Grizzly is the first actual project that springs to mind when I think of NIO, so I thought I'd toss it out there.
>
> - Charlie

--
Charlie Hunt
Java Performance Engineer

From charles.nutter at sun.com Fri Nov 16 06:56:05 2007
From: charles.nutter at sun.com (Charles Oliver Nutter)
Date: Fri, 16 Nov 2007 00:56:05 -0600
Subject: encoding-agnostic byte[]-based regexp engine...interested?
In-Reply-To: <473C44C6.7030003@sun.com>
References: <473BB9F6.8090206@sun.com> <473C44C6.7030003@sun.com>
Message-ID: <473D3F05.4030303@sun.com>

charlie hunt wrote:
> I doubt anything could be done at the class libraries level, since an API addition / enhancement would likely require JCP activity. [...]

I wouldn't expect this to necessarily be included in Java in the future; but this port is rapidly maturing, and having such a thing available in the community could show its value for the future.

> My initial reaction is that what you are describing sounds like something that could be very useful for a protocol parser. [...]

Yes, this is exactly what I was thinking. The ability to do such parsing without having to decode the incoming content could be very useful.

> I think some additional exploration / investigation is worthwhile. We are in the process of gathering new feature requests; I think we should add this to that list.

If you think that's a possibility. At any rate, the repository for the Joni engine is here:

https://svn.codehaus.org/jruby/joni/src/org/joni/

The porter is Marcin Mielczynski, a member of the JRuby team, and all credit and kudos should go his way.

- Charlie
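For contrast with the byte[]-walking approach, this is what the char-based status quo forces today: java.util.regex only matches CharSequences, so NIO data must be decoded first (standard API, shown only to make the decode overhead concrete):

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecodeThenMatch {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("GET /index.html HTTP/1.1".getBytes());
        // The decode step an encoding-agnostic byte[] engine could skip:
        CharSequence chars = Charset.forName("US-ASCII").decode(buf);
        Matcher m = Pattern.compile("GET (\\S+)").matcher(chars);
        if (m.find()) {
            System.out.println(m.group(1)); // prints /index.html
        }
    }
}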
From peter.arrenbrecht at gmail.com Wed Nov 21 08:52:03 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 09:52:03 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
Message-ID: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>

Hi all,

I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. When memory is tight, it might be better to first concatenate the entry lists into one big list, throw away the old table, allocate the new one, and then fill it from the concatenated list. So:

@Override
void resize( int _newCapacity )
{
    try {
        fastResize( _newCapacity ); // This would be the current resize code.
    }
    catch (OutOfMemoryError e) {
        tightResize( _newCapacity );
    }
}

@SuppressWarnings( "unchecked" )
private void tightResize( int newCapacity )
{
    Entry head = joinLists();
    table = new Entry[ newCapacity ];
    threshold = (int) (newCapacity * loadFactor);
    transfer( head, table );
}

@SuppressWarnings( "unchecked" )
private Entry joinLists()
{
    final Entry[] src = table;
    final int n = src.length;
    Entry head = null;
    int i = 0;
    while (i < n && null == head) {
        head = src[ i++ ];
    }
    Entry tail = head;
    assert i >= n || null != tail;
    while (i < n) {
        Entry e = src[ i++ ];
        if (null != e) {
            tail.next = e;
            do {
                tail = e;
                e = e.next;
            } while (null != e);
        }
    }
    return head;
}

@SuppressWarnings( "unchecked" )
private void transfer( Entry head, Entry[] tgt )
{
    int n = capacity();
    while (head != null) {
        Entry e = head;
        head = head.next;
        int i = indexFor( e.hash, n );
        e.next = tgt[ i ];
        tgt[ i ] = e;
    }
}

What do you think?
-peo

From peter.arrenbrecht at gmail.com Wed Nov 21 15:51:28 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 16:51:28 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
Message-ID: <16fff4530711210751k9b00c25rc2a1c3a03b271dc9@mail.gmail.com>

Hi again, I have gone over my code again and (a) discovered a very stupid mistake rendering the desired effect null and void, and (b) developed a test that demos the effect of the improvement. Here's the improved code:

private void tightResize( int newCapacity )
{
    Entry head = joinLists();
    table = null; // free it first
    table = new Entry[ newCapacity ]; // then reallocate
    threshold = (int) (newCapacity * loadFactor);
    transfer( head, table );
}

Below you can find the test code. This shows the problem here on Ubuntu Linux 7.04 with jre 1.6.0:

java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)

The command line for the improved map is:

java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new

and for the old map:

java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap

I have only managed to demo the effect on the server VM. And it is necessary to leave an Object array of size initial*2+something free, rather than just initial+something, which I expected. Maybe that is an effect of the generational collector. Also, there have been spurious cases where the test failed even with the new map. No idea why.

Here is the test code:

public static void main( String[] args )
{
    final int initial = 131072;
    final float loadFactor = 0.5F;
    final HashMap m;
    if (args.length > 0) {
        System.out.println( "Creating better map..." );
        m = new BetterHashMap( initial, loadFactor );
    }
    else {
        System.out.println( "Creating standard map..." );
        m = new HashMap( initial, loadFactor );
    }

    System.out.println( "Priming map (should see no resize here)..." );
    for (int i = 0; i < initial / 2; i++) {
        Integer o = i;
        m.put( o, o );
    }
    Integer o = initial;

    Entry head = blockMemExcept( initial * 2 + initial / 4 );
    System.out.println( "Filled with " + n + " entries." );

    System.out.println( "Adding next element (should see resize here)..." );
    m.put( o, o );

    if (head == null) System.out.println( "Bad." ); // force "head" to remain in scope
    System.out.println( "Done." );
}

/**
 * Done separately so memBlock goes out of scope cleanly, leaving no local stack copies pointing to it.
 */
private static Entry blockMemExcept( int exceptObjs )
{
    System.out.println( "Reserving memory..." );
    Object[] memBlock = new Object[ exceptObjs ];

    System.out.println( "Filling rest of memory..." );
    int i = 0;
    Entry head = null;
    try {
        while (true) {
            head = new Entry( 0, null, null, head );
            i++;
        }
    }
    catch (OutOfMemoryError e) {
        // ignore
    }

    if (memBlock[ 0 ] != null) return null;
    n = i;
    return head;
}
private static int n = 0;

Cheers,
-peo

ps. This all runs on copies of HashMap and AbstractMap in ch.arrenbrecht.java.util.

On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote:
> Hi all,
>
> I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. When memory is tight, it might be better to first concatenate the entry lists into one big list, throw away the old table, allocate the new one, and then fill it from the concatenated list. [...]

From Martin.Buchholz at Sun.COM Wed Nov 21 17:37:37 2007
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Wed, 21 Nov 2007 09:37:37 -0800
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
Message-ID: <47446CE1.30000@sun.com>

Hi Peter,

It's true that under low memory conditions your code would allow execution to continue under some circumstances. However, I'm not sure this would be an improvement to the JDK. Recovery from OOME is fraught with hazards. We do occasionally try, but an application becomes much less reliable once OOMEs start appearing. Perhaps it's better to fail than to pretend that the JDK has been bullet-proofed against OOME. OOME recovery code is rarely executed and hard to test.
The new code would have to be maintained indefinitely, making future maintenance just a little bit harder for the maintainers.

If the hashmap is fully populated, most of the memory is tied up in the Entry objects themselves, not in the table array. Each Entry object should be about 5 words of memory, while there's approximately one word used within the table array. So I don't think we'll see anything close to the factor-of-two max memory saving that we might expect.
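Putting rough numbers on that (an illustrative estimate built from Martin's figures, assuming the default 0.75 load factor and a doubling resize, counting in words):

    Entry objects for n entries   ~ 5n
    old table, n / 0.75 slots     ~ 1.33n
    new table, twice as many      ~ 2.67n

    peak, copy then free (current resize):  5n + 1.33n + 2.67n ~ 9n
    peak, free old table first (proposal):  5n + 2.67n         ~ 7.67n

so the proposal trims the peak by roughly 1.33n / 9n, i.e. about 15%, nowhere near a factor of two.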
I would prefer to see engineering work go into something like auto-reduction of the table array when many elements have been removed, but that's a hard problem.

Martin

Peter Arrenbrecht wrote:
> Hi all,
>
> I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. [...]

From peter.arrenbrecht at gmail.com Wed Nov 21 20:01:38 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 21:01:38 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <47446CE1.30000@sun.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com>
Message-ID: <16fff4530711211201h49fff2f2mf4f6632fd072ba81@mail.gmail.com>

Hi Martin

Thanks for responding so quickly and thoughtfully. I just thought that trying this little bit harder could make the difference when your already big hashmap overflows by just a few entries.

While I agree that testing this code under OOME conditions would be tiresome (it took me quite a while to get the demo right, too), its basic soundness would be very easy to test using direct calls to tightResize(). So a single test showing it's really an improvement over fastResize() in terms of max memory footprint would suffice, no?

However, since you say the scenario doesn't warrant the added maintenance burden, I'm just going to take your word for it. After all, I'm not going to be the one maintaining it, and I've never seen the problem in practice myself, either. ;)

But, more generally speaking, does your "no bullet-proofing" argument mean that in general you don't endorse switching algorithms to try to cope with tight memory? I can see that starting to "bullet-proof" could be a never-ending story. However, I think there is a distinction between what I proposed and some of the wilder schemes one could contemplate (like growing by less than doubling and having to switch to less efficient hash-to-index conversions, or using a TreeMap to hold overflows, etc.). Those would affect the overall behaviour of the HashMap significantly, leading to pervasive code changes. My change only tries a little harder in one very isolated spot. It does this with no significant code complexity, and with no effects on the overall behaviour (other than making it still work in my scenario, of course). Is that kind of "trying harder" still bad? Isn't it kind of similar to what the GC does, too?

Peter

On Nov 21, 2007 6:37 PM, Martin Buchholz wrote:
> Hi Peter,
>
> It's true that under low memory conditions your code would allow execution to continue under some circumstances. However, I'm not sure this would be an improvement to the JDK. Recovery from OOME is fraught with hazards. [...]
); > } > > /** > * Done separately so memBlock goes out of scope cleanly, leaving no > local stack copies pointing > * to it. > */ > private static Entry blockMemExcept( int exceptObjs ) > { > System.out.println( "Reserving memory..." ); > Object[] memBlock = new Object[ exceptObjs ]; > > System.out.println( "Filling rest of memory..." ); > int i = 0; > Entry head = null; > try { > while (true) { > head = new Entry( 0, null, null, head ); > i++; > } > } > catch (OutOfMemoryError e) { > // ignore > } > > if (memBlock[ 0 ] != null) return null; > n = i; > return head; > } > private static int n = 0; > > Cheers, > -peo > > > ps. This all runs on copies of HashMap and AbstractMap in > ch.arrrenbrecht.java.util. > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > > > > From roman at kennke.org Wed Nov 21 20:23:39 2007 From: roman at kennke.org (Roman Kennke) Date: Wed, 21 Nov 2007 21:23:39 +0100 Subject: Proposal: Better HashMap.resize() when memory is tight In-Reply-To: <47446CE1.30000@sun.com> References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com> Message-ID: <1195676619.6742.17.camel@mercury> Hi there, Why not implement such a thing as a separate library/class. After all, Map is an interface which can be implemented in many ways and for many different purposes. I think there are a couple of efforts that go in this direction, for example javolution: http://javolution.org/ Cheers, Roman Am Mittwoch, den 21.11.2007, 09:37 -0800 schrieb Martin Buchholz: > Hi Peter, > > It's true that under low memory conditions your code would > allow execution to continue under some circumstances. > However, I'm not sure this would be an improvement to the JDK. > Recovery from OOME is fraught with hazards. We do occasionally > try, but an application becomes much less reliable once OOMEs > start appearing. 
Perhaps it's better to fail than to pretend > that the JDK has been bullet-proofed against OOME. > OOME recovery code is rarely executed and hard to test. > The new code would have to be maintained indefinitely, > making future maintenance just a little bit harder for > the maintainers. > > If the hashmap is fully populated, most of the memory is tied > up in the Entry objects themselves, not in the table array. > Each Entry object should be about 5 words of memory, while > there's approximately one word used within the table array. > So I don't think we'll see anything close to the factor of > two max memory saving that we might expect. > > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. > > Martin > > Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > E-Mail-Nachricht-Anlage (Attached Message) > > -------- Weitergeleitete Nachricht -------- > > Von: Peter Arrenbrecht > > Antwort an: peter.arrenbrecht at gmail.com > > An: core-libs-dev at openjdk.java.net > > Betreff: Re: Proposal: Better HashMap.resize() when memory is tight > > Datum: Wed, 21 Nov 2007 16:51:28 +0100 > > > > einfaches Textdokument-Anlage (Attached Message) > > Hi again, I have gone over my code again and a) discovered a very > > stupid mistake rendering the desired effect null and void, and b) > > developed a test that demos the effect of the improvement. Here's the > > improved code: > > > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = null; // free it first > > table = new Entry[ newCapacity ]; // then reallocate > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > Below you can find the test code. 
This shows the problem here on > > Ubuntu Linux 7.04 with jre 1.6.0: > > > > java version "1.6.0" > > Java(TM) SE Runtime Environment (build 1.6.0-b105) > > Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode) > > > > The command-line for the improved map is: > > > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new > > > > and for the old map: > > > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap > > > > I have only managed to demo the effect on the server VM. And it is > > necessary to leave an Object array of size initial*2+something free, > > rather than just initial+something, which I expected. Maybe that is an > > effect of the generational collector. Also, there have been spurious > > cases where the test failed even with the new map. No idea why. > > > > Here is the test code: > > > > public static void main( String[] args ) > > { > > final int initial = 131072; > > final float loadFactor = 0.5F; > > final HashMap m; > > if (args.length > 0) { > > System.out.println( "Creating better map..." ); > > m = new BetterHashMap( initial, loadFactor ); > > } > > else { > > System.out.println( "Creating standard map..." ); > > m = new HashMap( initial, loadFactor ); > > } > > > > System.out.println( "Priming map (should see no resize here)..." ); > > for (int i = 0; i < initial / 2; i++) { > > Integer o = i; > > m.put( o, o ); > > } > > Integer o = initial; > > > > Entry head = blockMemExcept( initial * 2 + initial / 4 ); > > System.out.println( "Filled with " + n + " entries." ); > > > > System.out.println( "Adding next element (should see resize here)..." ); > > m.put( o, o ); > > > > if (head == null) System.out.println( "Bad." ); // force "head" to > > remain in scope > > System.out.println( "Done." ); > > } > > > > /** > > * Done separately so memBlock goes out of scope cleanly, leaving no > > local stack copies pointing > > * to it. > > */ > > private static Entry blockMemExcept( int exceptObjs ) > > { > > System.out.println( "Reserving memory..." ); > > Object[] memBlock = new Object[ exceptObjs ]; > > > > System.out.println( "Filling rest of memory..." ); > > int i = 0; > > Entry head = null; > > try { > > while (true) { > > head = new Entry( 0, null, null, head ); > > i++; > > } > > } > > catch (OutOfMemoryError e) { > > // ignore > > } > > > > if (memBlock[ 0 ] != null) return null; > > n = i; > > return head; > > } > > private static int n = 0; > > > > Cheers, > > -peo > > > > > > ps. This all runs on copies of HashMap and AbstractMap in > > ch.arrrenbrecht.java.util. > > > > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > > Hi all, > > > > > > I recently thought about how to resize hashmaps. When looking at the > > > JDK 6 source, I saw that java.util.HashMap always transfers old > > > entries from the old table. When memory is tight, it might be better > > > to first concatenate the entry lists into one big list, throw away the > > > old table, allocate the new one, and then fill it from the > > > concatenated list. So: > > > > > > @Override > > > void resize( int _newCapacity ) > > > { > > > try { > > > fastResize( _newCapacity ); // This would be the current resize code. 
> > > } > > > catch (OutOfMemoryError e) { > > > tightResize( _newCapacity ); > > > } > > > } > > > > > > @SuppressWarnings( "unchecked" ) > > > private void tightResize( int newCapacity ) > > > { > > > Entry head = joinLists(); > > > table = new Entry[ newCapacity ]; > > > threshold = (int) (newCapacity * loadFactor); > > > transfer( head, table ); > > > } > > > > > > @SuppressWarnings("unchecked") > > > private Entry joinLists() > > > { > > > final Entry[] src = table; > > > final int n = src.length; > > > Entry head = null; > > > int i = 0; > > > while (i < n && null == head) { > > > head = src[ i++ ]; > > > } > > > Entry tail = head; > > > assert i >= n || null != tail; > > > while (i < n) { > > > Entry e = src[ i++ ]; > > > if (null != e) { > > > tail.next = e; > > > do { > > > tail = e; > > > e = e.next; > > > } while (null != e); > > > } > > > } > > > return head; > > > } > > > > > > @SuppressWarnings("unchecked") > > > private void transfer( Entry head, Entry[] tgt ) > > > { > > > int n = capacity(); > > > while (head != null) { > > > Entry e = head; > > > head = head.next; > > > int i = indexFor( e.hash, n ); > > > e.next = tgt[ i ]; > > > tgt[ i ] = e; > > > } > > > } > > > > > > > > > What do you think? > > > -peo > > > -- http://kennke.org/blog/ From peter.arrenbrecht at gmail.com Wed Nov 21 20:58:16 2007 From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht) Date: Wed, 21 Nov 2007 21:58:16 +0100 Subject: Proposal: Better HashMap.resize() when memory is tight In-Reply-To: <47446CE1.30000@sun.com> References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com> Message-ID: <16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com> Hi Martin > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. Would you care to elaborate? At first glance, this seems not especially hard to me, in particular since tightResize() would be able to switch to a smaller array without causing a short spike of needing more memory first. But I'm sure I have overlooked something here. -peter On Nov 21, 2007 6:37 PM, Martin Buchholz wrote: > Hi Peter, > > It's true that under low memory conditions your code would > allow execution to continue under some circumstances. > However, I'm not sure this would be an improvement to the JDK. > Recovery from OOME is fraught with hazards. We do occasionally > try, but an application becomes much less reliable once OOMEs > start appearing. Perhaps it's better to fail than to pretend > that the JDK has been bullet-proofed against OOME. > OOME recovery code is rarely executed and hard to test. > The new code would have to be maintained indefinitely, > making future maintenance just a little bit harder for > the maintainers. > > If the hashmap is fully populated, most of the memory is tied > up in the Entry objects themselves, not in the table array. > Each Entry object should be about 5 words of memory, while > there's approximately one word used within the table array. > So I don't think we'll see anything close to the factor of > two max memory saving that we might expect. > > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. > > Martin > > > Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. 
When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > > > ---------- Forwarded message ---------- > From: Peter Arrenbrecht > To: core-libs-dev at openjdk.java.net > Date: Wed, 21 Nov 2007 16:51:28 +0100 > Subject: Re: Proposal: Better HashMap.resize() when memory is tight > Hi again, I have gone over my code again and a) discovered a very > stupid mistake rendering the desired effect null and void, and b) > developed a test that demos the effect of the improvement. Here's the > improved code: > > private void tightResize( int newCapacity ) > { > Entry head = joinLists(); > table = null; // free it first > table = new Entry[ newCapacity ]; // then reallocate > threshold = (int) (newCapacity * loadFactor); > transfer( head, table ); > } > > Below you can find the test code. This shows the problem here on > Ubuntu Linux 7.04 with jre 1.6.0: > > java version "1.6.0" > Java(TM) SE Runtime Environment (build 1.6.0-b105) > Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode) > > The command-line for the improved map is: > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new > > and for the old map: > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap > > I have only managed to demo the effect on the server VM. And it is > necessary to leave an Object array of size initial*2+something free, > rather than just initial+something, which I expected. Maybe that is an > effect of the generational collector. Also, there have been spurious > cases where the test failed even with the new map. No idea why. > > Here is the test code: > > public static void main( String[] args ) > { > final int initial = 131072; > final float loadFactor = 0.5F; > final HashMap m; > if (args.length > 0) { > System.out.println( "Creating better map..." ); > m = new BetterHashMap( initial, loadFactor ); > } > else { > System.out.println( "Creating standard map..." 
); > m = new HashMap( initial, loadFactor ); > } > > System.out.println( "Priming map (should see no resize here)..." ); > for (int i = 0; i < initial / 2; i++) { > Integer o = i; > m.put( o, o ); > } > Integer o = initial; > > Entry head = blockMemExcept( initial * 2 + initial / 4 ); > System.out.println( "Filled with " + n + " entries." ); > > System.out.println( "Adding next element (should see resize here)..." ); > m.put( o, o ); > > if (head == null) System.out.println( "Bad." ); // force "head" to > remain in scope > System.out.println( "Done." ); > } > > /** > * Done separately so memBlock goes out of scope cleanly, leaving no > local stack copies pointing > * to it. > */ > private static Entry blockMemExcept( int exceptObjs ) > { > System.out.println( "Reserving memory..." ); > Object[] memBlock = new Object[ exceptObjs ]; > > System.out.println( "Filling rest of memory..." ); > int i = 0; > Entry head = null; > try { > while (true) { > head = new Entry( 0, null, null, head ); > i++; > } > } > catch (OutOfMemoryError e) { > // ignore > } > > if (memBlock[ 0 ] != null) return null; > n = i; > return head; > } > private static int n = 0; > > Cheers, > -peo > > > ps. This all runs on copies of HashMap and AbstractMap in > ch.arrrenbrecht.java.util. > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? 
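> > -peo

To put rough numbers on Martin's estimate of where the memory actually
goes (a back-of-envelope sketch only; the per-object sizes are assumed,
not measured, and vary by VM and pointer width):

public class ResizeSavingEstimate {

	public static void main( String[] args )
	{
		// Assumed: an Entry costs ~5 words (header, hash, key, value,
		// next); the table array costs ~1 word per slot.
		int n = 131072;          // entries, as in the test above
		double loadFactor = 0.5; // as in the test above
		double entryWords = 5.0 * n;
		double tableWords = n / loadFactor; // capacity >= size / loadFactor
		double share = tableWords / (entryWords + tableWords);
		System.out.printf( "table array is ~%.0f%% of the map footprint%n",
				share * 100 );
		// Prints ~29% here; with the default load factor of 0.75 the
		// share drops to ~21%. Freeing the old table before allocating
		// the new one can recover at most that share, well short of a
		// factor of two.
	}
}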
From peter.arrenbrecht at gmail.com  Wed Nov 21 21:59:11 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 22:59:11 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
	<47446CE1.30000@sun.com>
	<16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com>
Message-ID: <16fff4530711211359q30a93e1cp3443a9b6878f97e9@mail.gmail.com>

On Nov 21, 2007 9:58 PM, Peter Arrenbrecht wrote:
> Hi Martin
>
> > I would prefer to see engineering work go into something
> > like auto-reduction of the table array when many elements
> > have been removed, but that's a hard problem.
>
> Would you care to elaborate? At first glance, this seems not
> especially hard to me, in particular since tightResize() would be able
> to switch to a smaller array without causing a short spike of needing
> more memory first. But I'm sure I have overlooked something here.

Ah, you don't want it to oscillate. Is that it?
-peter

From peter.arrenbrecht at gmail.com  Thu Nov 22 09:33:15 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Thu, 22 Nov 2007 10:33:15 +0100
Subject: Shrinking HashMaps (was Re: Proposal: Better HashMap.resize() when memory is tight)
Message-ID: <16fff4530711220133v1418fed4o4f75f0655fef059f@mail.gmail.com>

Hi Martin

As per your hint, I've taken some time to think about shrinking
hashmaps. As you said, it is hard to find a general solution. I do, in
fact, believe that we should not even try. I'm asking myself: What are
the scenarios where people remove entries on a big scale from large
hashmaps? I for one always end up throwing the maps away, not removing
things from them. So since I suspect these scenarios to be rare, I also
suspect we won't find a large enough subset with compatible
requirements to warrant aiming for a general solution.

Instead, I propose to allow people to implement shrinking hashmaps
themselves on top of HashMap. This could be done by directly extending
HashMap, or by adding a new descendant, java.util.ShrinkableHashMap. I
have attempted the latter (in order to better show the changes) to see
what would be required. The attached classes are just a sketch to
invite further discussion. There is ShrinkableHashMap, which would go
into java.util so it can properly access HashMap, and there are two
demo user classes that implement specific shrinking strategies. (Note:
The whole thing is untested and the demos are contrived. I'd like some
feedback before I take this further.)

Key points of ShrinkableHashMap:

* Expose methods to adjust the capacity (several variants).
* Expose methods to query the current capacity and load factor.
* Expose an override point to be notified when the capacity changes.
* Expose an override point to be notified when the size is reduced
  (requires a change to HashMap).
* Fall back on tightResize() when super.resize() fails. I chose
  tightResize() here because shrinking hashmaps is something people
  might want to do especially when memory is low, so if we can do it
  without needing additional memory, we should.

Since we're talking strategies here, another approach might be to use
explicit strategy interfaces, so that people could supply
implementations of them to (Shrinkable)HashMap. I haven't explored
this yet.

As you can see, this change would open up a fair bit of HashMap's
heretofore non-public interface.
You would know better whether this is warranted, i.e. whether there is
sufficient demand for being able to shrink hashmaps.

Is this a direction to follow, do you think? If not, what would you
suggest?

-Peter

ps. Here's the necessary change to HashMap (required to properly
support removals through the key and entry sets):

@@ -591,7 +591,7 @@ public class HashMap
                 table[i] = next;
             else
                 prev.next = next;
-            e.recordRemoval(this);
+            removed(e);
             return e;
         }
         prev = e;
@@ -624,7 +624,7 @@ public class HashMap
                 table[i] = next;
             else
                 prev.next = next;
-            e.recordRemoval(this);
+            removed(e);
             return e;
         }
         prev = e;
@@ -632,6 +632,13 @@ public class HashMap
         }

         return e;
+    }
+
+    /**
+     * Gives both map and entry descendants a chance to react to entry removals.
+     */
+    void removed(Entry e) {
+        e.recordRemoval(this);
     }

On Nov 21, 2007 10:59 PM, Peter Arrenbrecht wrote:
> On Nov 21, 2007 9:58 PM, Peter Arrenbrecht wrote:
> > Hi Martin
> >
> > > I would prefer to see engineering work go into something
> > > like auto-reduction of the table array when many elements
> > > have been removed, but that's a hard problem.
> >
> > Would you care to elaborate? At first glance, this seems not
> > especially hard to me, in particular since tightResize() would be able
> > to switch to a smaller array without causing a short spike of needing
> > more memory first. But I'm sure I have overlooked something here.
>
> Ah, you don't want it to oscillate. Is that it?
> -peter
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ShrinkableHashMap.java
Type: text/x-java
Size: 4193 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AggressivelyShrinkingHashMapDescendant.java
Type: text/x-java
Size: 730 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SlowlyShrinkingHashMapDescendant.java
Type: text/x-java
Size: 1030 bytes
Desc: not available
URL:
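A rough standalone illustration of the shrink-on-removal strategy
described above, built only on HashMap's public API (this is not the
attached code: the class name and thresholds are invented, and a
rebuild stands in for the in-place resize that an in-package
ShrinkableHashMap could do):

import java.util.HashMap;
import java.util.Map;

/**
 * Illustration only: shrink-on-removal on top of the public Map API.
 * From outside java.util we cannot touch HashMap's table, so we
 * rebuild into a freshly sized map instead.
 */
final class ShrinkingMapSketch<K, V> {
	private Map<K, V> map = new HashMap<K, V>();
	private int peak; // largest size seen since the last rebuild

	public V put( K key, V value ) {
		V old = map.put( key, value );
		peak = Math.max( peak, map.size() );
		return old;
	}

	public V remove( Object key ) {
		V old = map.remove( key );
		// Shrink only well below the last peak, so that alternating
		// put/remove around a boundary does not make the map oscillate.
		if (peak >= 64 && map.size() < peak / 4) {
			map = new HashMap<K, V>( map ); // re-hashed at the smaller size
			peak = map.size();
		}
		return old;
	}

	public V get( Object key ) { return map.get( key ); }
	public int size() { return map.size(); }
}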
" + ((stop - start) / 1000) + " seconds)"); } } Of course, I don't think that this method is used this way in 99% of the cases; honestly, I think very few would pass intentionally a void list to retainAll, but still, the check is harmless and represent a huge improvement if someone needs it. The patch apply to b23. As a final note, I've already signed the SCA. Thanks for looking, Mario -- Lima Software - http://www.limasoftware.net/ GNU Classpath Developer - http://www.classpath.org/ Fedora Ambassador - http://fedoraproject.org/wiki/MarioTorre Jabber: neugens at jabber.org pgp key: http://subkeys.pgp.net/ PGP Key ID: 80F240CF Fingerprint: BA39 9666 94EC 8B73 27FA FC7C 4086 63E3 80F2 40CF Please, support open standards: http://opendocumentfellowship.org/petition/ http://www.nosoftwarepatents.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: copy-on-write-array-list-performance.patch Type: text/x-patch Size: 866 bytes Desc: not available URL: From markus.gaisbauer at gmail.com Fri Nov 23 21:58:45 2007 From: markus.gaisbauer at gmail.com (Markus Gaisbauer) Date: Fri, 23 Nov 2007 22:58:45 +0100 Subject: [PATCH] Performance bug in String(byte[],int,int,Charset) Message-ID: <47474D15.4060504@gmail.com> A bug in java.lang.StringCoding causes a full and unnecessary copy of the byte array given as the first argument. This results in severe slow down of the Constructor if the byte array is big. The attached patch, should fix the problem. Unfortunately I do not (yet) have an official bug id for this, as this seems to take a while (reported 2 weeks ago). To reproduce the problem run the following test program: import java.nio.charset.Charset; public class StringTest { public static void main(String[] args) throws Exception { long before; long after; byte[] data; data = new byte[1024*1024*16]; // 16 megabyte data[0] = 'X'; // warmup new String(data, 0, 1); new String(data, 0, 1, Charset.forName("UTF8")); new String(data, 0, 1, "UTF8"); before = System.nanoTime(); new String(data, 0, 1); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); before = System.nanoTime(); new String(data, 0, 1, Charset.forName("UTF8")); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); before = System.nanoTime(); new String(data, 0, 1, "UTF8"); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); } } -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: StringCoding.diff URL: From i30817 at gmail.com Sat Nov 24 23:18:19 2007 From: i30817 at gmail.com (Paulo Levi) Date: Sat, 24 Nov 2007 23:18:19 +0000 Subject: core-libs-dev Digest, Vol 7, Issue 8 In-Reply-To: References: Message-ID: <212322090711241518t34ed6dbase5c59b28a3e928b9@mail.gmail.com> Could you look in fixing the insertString(int where, String str) or was it replace(int position, String str, int addsize), method in GapContent ? 
From markus.gaisbauer at gmail.com  Fri Nov 23 21:58:45 2007
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Fri, 23 Nov 2007 22:58:45 +0100
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
Message-ID: <47474D15.4060504@gmail.com>

A bug in java.lang.StringCoding causes a full and unnecessary copy of
the byte array given as the first argument. This results in a severe
slowdown of the constructor if the byte array is big.

The attached patch should fix the problem.

Unfortunately I do not (yet) have an official bug id for this, as
getting one seems to take a while (reported 2 weeks ago).

To reproduce the problem run the following test program:

import java.nio.charset.Charset;

public class StringTest {

	public static void main(String[] args) throws Exception {
		long before;
		long after;
		byte[] data;

		data = new byte[1024 * 1024 * 16]; // 16 megabytes
		data[0] = 'X';

		// warmup
		new String(data, 0, 1);
		new String(data, 0, 1, Charset.forName("UTF8"));
		new String(data, 0, 1, "UTF8");

		before = System.nanoTime();
		new String(data, 0, 1);
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");

		before = System.nanoTime();
		new String(data, 0, 1, Charset.forName("UTF8"));
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");

		before = System.nanoTime();
		new String(data, 0, 1, "UTF8");
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");
	}
}
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: StringCoding.diff
URL:

From i30817 at gmail.com  Sat Nov 24 23:18:19 2007
From: i30817 at gmail.com (Paulo Levi)
Date: Sat, 24 Nov 2007 23:18:19 +0000
Subject: core-libs-dev Digest, Vol 7, Issue 8
In-Reply-To:
References:
Message-ID: <212322090711241518t34ed6dbase5c59b28a3e928b9@mail.gmail.com>

Could you look into fixing the insertString(int where, String str)
method (or was it replace(int position, String str, int addSize)?) in
GapContent? Currently it makes a totally unneeded copy that stresses
the garbage collector, where it could use something like this (I
removed the UndoableEdit handling on purpose):

public UndoableEdit insertString(int where, String str)
		throws BadLocationException {
	if (where > length() || where < 0) {
		throw new BadLocationException("Invalid insert", length());
	}
	//char[] s = str.toCharArray();
	//replace(where, 0, s, s.length);
	replace(where, str, str.length());
	return null;
}

protected void replace(int position, String addItems, int addSize) {
	if (addSize == 0) {
		return;
	}
	int end = open(position, addSize);
	//System.arraycopy(addItems, rmSize, array, end, endSize);
	addItems.getChars(0, addSize, (char[]) getArray(), end);
}

Another thing: the default SmallAttributeSet created in
DefaultStyledDocument takes far too much time to be found in a HashMap
(where it lives). I added a small check for the degenerate case where
the attribute count is equal to 0, so that containsAttributes is never
entered. In my application this was a hotspot; I don't remember exactly
where (I think it was setCharacterAttributes of DefaultStyledDocument,
which in turn caused an addEdit to a DefaultDocumentEvent that used a
HashMap, and that edit would be retrieved later).

protected SmallAttributeSet createSmallAttributeSet(AttributeSet a) {
	return new SmallAttributeSet(a) {
		// hashCode of superclass. Redefined to see if WeakHashMap
		// behaves better.
		@Override
		public boolean equals(Object obj) {
			if (obj instanceof AttributeSet) {
				AttributeSet attrs = (AttributeSet) obj;
				return getAttributeCount() == attrs.getAttributeCount()
						&& (getAttributeCount() == 0
								|| containsAttributes(attrs));
			}
			return false;
		}
	};
}
From Alan.Bateman at Sun.COM  Sun Nov 25 14:22:28 2007
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Sun, 25 Nov 2007 14:22:28 +0000
Subject: [Fwd: Re: core-libs-dev Digest, Vol 7, Issue 8]
Message-ID: <47498524.9040105@sun.com>

I assume this was meant for swing-dev.
-------------- next part --------------
An embedded message was scrubbed...
From: Paulo Levi
Subject: Re: core-libs-dev Digest, Vol 7, Issue 8
Date: Sat, 24 Nov 2007 23:18:19 +0000
Size: 7118
URL:

From forax at univ-mlv.fr  Sun Nov 25 15:20:52 2007
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sun, 25 Nov 2007 16:20:52 +0100
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
In-Reply-To: <47474D15.4060504@gmail.com>
References: <47474D15.4060504@gmail.com>
Message-ID: <474992D4.3010908@univ-mlv.fr>

Markus Gaisbauer a écrit :
> A bug in java.lang.StringCoding causes a full and unnecessary copy of
> the byte array given as the first argument.

It's not a bug, it's a feature :) I think this copy is a defensive
copy, to prevent a malicious charset (decoder) from accessing the
underlying buffer. By the way, using clone() seems better than
Arrays.copyOf() here:

byte[] b = ba.clone();

> This results in a severe slowdown of the constructor if the byte
> array is big.

Rémi

> The attached patch should fix the problem.
>
> Unfortunately I do not (yet) have an official bug id for this, as
> getting one seems to take a while (reported 2 weeks ago).
>
> ------------------------------------------------------------------------
>
> Index: StringCoding.java
> ===================================================================
> --- StringCoding.java	(revision 258)
> +++ StringCoding.java	(working copy)
> @@ -193,7 +193,6 @@
>
>      static char[] decode(Charset cs, byte[] ba, int off, int len) {
>          StringDecoder sd = new StringDecoder(cs, cs.name());
> -        byte[] b = Arrays.copyOf(ba, ba.length);
> -        return sd.decode(b, off, len);
> +        return sd.decode(ba, off, len);
>      }
>
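A possible middle ground between the two positions, copying defensively
but only the range that is actually decoded, would look roughly like
this (a sketch against the method shown in the diff, relying on the
same package-private StringDecoder; not necessarily the fix that was
eventually applied):

static char[] decode(Charset cs, byte[] ba, int off, int len) {
	StringDecoder sd = new StringDecoder(cs, cs.name());
	// Defensive copy of only the decoded range: a misbehaving decoder
	// still never sees the caller's array, but the cost is O(len)
	// rather than O(ba.length), i.e. 1 byte instead of 16 MB in the
	// test above.
	byte[] b = Arrays.copyOfRange(ba, off, off + len);
	return sd.decode(b, 0, len);
}

This would keep the security argument intact while removing the
slowdown measured by the test program.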