From nradov at axolotl.com Wed Nov 14 02:21:37 2007
From: nradov at axolotl.com (Nick Radov)
Date: Tue, 13 Nov 2007 18:21:37 -0800
Subject: String.ltrim() and rtrim() methods RFE
Message-ID: 

I would like to reopen discussion of Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696> for possible inclusion in jdk7. It seems to me that RFE was closed prematurely, without proper consideration. Many developers have needed to left- or right-trim a String, and I'm sure the same code has been rewritten thousands of times in applications. It would really help to have these methods in the standard library, and the impact on compiled file size would be minimal. The evaluation reason given for closing the bug was "You can write this yourself and get reasonable performance with a modern VM." Of course we can get reasonable performance, but that isn't really the point. The reason for adding these methods to the standard library is to reduce the amount of redundant, low-level code that application developers have to write. Having to write them ourselves also forces the creation of "StringUtilities" classes with a variety of static methods, which somewhat defeats the purpose of OO design.

If we can get this reopened I would be happy to take care of making the actual code changes. I just requested the Developer role on the jdk project, so hopefully that will be approved soon.

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116
800 West El Camino Real Suite 270 Mountain View CA 94040
Frost and Sullivan Awards | Market Leadership | Business Development Strategy Leadership

The information contained in this e-mail transmission may contain confidential information. It is intended for the use of the addressee. If you are not the intended recipient, any disclosure, copying, or distribution of this information is strictly prohibited. If you receive this message in error, please inform the sender immediately and remove any record of this message.

From Phil.Race at Sun.COM Wed Nov 14 04:01:16 2007
From: Phil.Race at Sun.COM (Phil Race)
Date: Tue, 13 Nov 2007 20:01:16 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: 
References: 
Message-ID: <473A730C.4080408@sun.com>

Nick,
I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open, whereas popular issues quickly accumulate hundreds of votes. Furthermore, only one person has found it important enough to comment in the bug parade comments. So I think you'd need to show that it's needed by a lot more than some low single-digit number of developers.

-phil.

Nick Radov wrote:
> I would like to reopen discussion of Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696> for possible inclusion in jdk7. [...]
From tobrien at discursive.com Fri Nov 16 22:11:05 2007
From: tobrien at discursive.com (Tim O'Brien)
Date: Fri, 16 Nov 2007 17:11:05 -0500
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473A730C.4080408@sun.com>
References: <473A730C.4080408@sun.com>
Message-ID: <7260cf030711161411k3ab9a21nf4231d1fa5a47010@mail.gmail.com>

ltrim and rtrim are exceedingly useful. I've often wondered why Sun didn't bother to include them. It is just another one of those reasons why people tend to say that string manipulation in Java leaves much to be desired. This should take all of two minutes to implement, and has a sub-trivial impact. Nick's observation is true:

> The reason for adding those methods to the standard library is to reduce the amount of redundant, low-level code that application developers have to write. Having to write those methods ourselves in applications also forces the creation of "StringUtilities" classes with a variety of static methods, which somewhat defeats the purpose of OO design.

I've worked on Commons Lang, but I have to tell you that it is an ongoing embarrassment that every Java program ever developed has to either include a set of static methods or reimplement low-level String parsing functions because Sun just doesn't think it is a big idea. If Java had open classes this really wouldn't be such a big deal.
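For concreteness, the static-method workaround Tim describes is what Commons Lang already ships as StringUtils.stripStart and stripEnd. A small usage sketch (Commons Lang 2.x, where a null strip-chars argument means "strip whitespace"):

import org.apache.commons.lang.StringUtils;

public class StripDemo {
    public static void main(String[] args) {
        // Strip from one side only; null means whitespace.
        System.out.println(StringUtils.stripStart("  hello  ", null)); // "hello  "
        System.out.println(StringUtils.stripEnd("  hello  ", null));   // "  hello"
        // Or strip a caller-supplied set of characters.
        System.out.println(StringUtils.stripEnd("config;;;", ";"));    // "config"
    }
}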
On 11/13/07, Phil Race wrote:
> Nick,
> I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open, whereas popular issues quickly accumulate hundreds of votes. [...]

--
------
Tim O'Brien: (847) 863-7045

From nradov at axolotl.com Fri Nov 16 23:19:50 2007
From: nradov at axolotl.com (Nick Radov)
Date: Fri, 16 Nov 2007 15:19:50 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473A730C.4080408@sun.com>
References: <473A730C.4080408@sun.com>
Message-ID: 

Phil,

The bug voting mechanism doesn't really work for trivial RFEs like this. First, the bug has been closed for a while, and no one is going to vote for a closed bug. Second, everyone only gets a few votes, so they're going to put them on the most critical issues. Annoyances like this are left to languish.

Let's look at this RFE a different way. Is there any reason not to implement it? All of the necessary functionality is already present in the trim() method. We just need to break it out into two public ltrim() and rtrim() methods.
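To make "break it out" concrete: trim() already runs one scan from each end, so the split is mostly mechanical. A minimal sketch, written as static helpers purely for illustration (the real change would be instance methods next to trim(), which treats anything <= '\u0020' as whitespace):

static String ltrim(String s) {
    int st = 0;
    final int len = s.length();
    // The same left-hand scan trim() performs.
    while (st < len && s.charAt(st) <= ' ') {
        st++;
    }
    return (st > 0) ? s.substring(st) : s;
}

static String rtrim(String s) {
    int len = s.length();
    // The same right-hand scan trim() performs.
    while (len > 0 && s.charAt(len - 1) <= ' ') {
        len--;
    }
    return (len < s.length()) ? s.substring(0, len) : s;
}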
Microsoft's .NET 3.5 framework has equivalent methods on the System.String class as TrimStart and TrimEnd. Don't we want to keep Java competitive with .NET? <http://msdn2.microsoft.com/en-us/library/system.string_methods(VS.90).aspx>

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116

From: Phil Race
To: Nick Radov
Cc: core-libs-dev at openjdk.java.net
Date: 11/13/2007 07:58 PM
Subject: Re: String.ltrim() and rtrim() methods RFE

> Nick,
> I note this has just two customer records and just a single jdc vote despite having had over eight years to accumulate these whilst it was open [...]

From scolebourne at joda.org Sat Nov 17 00:22:18 2007
From: scolebourne at joda.org (Stephen Colebourne)
Date: Sat, 17 Nov 2007 00:22:18 +0000
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: 
References: <473A730C.4080408@sun.com>
Message-ID: <473E343A.3090506@joda.org>

There is some talk of adding a group of new methods to low-level classes: http://smallwig.blogspot.com/2007/11/minor-api-fixes-for-jdk-7.html

Personally, I would like to see this extended to become a simple JSR where ideas such as ltrim/rtrim and so on can be evaluated properly, i.e. take each JDK class and work through them one by one, considering what needs adding (by evaluating open and closed source libraries, and taking the most common).

Any JSR does need to be very open though, and it needs to be supported by (and funded by) a major company, probably Sun.

Stephen

Nick Radov wrote:
> The bug voting mechanism doesn't really work for trivial RFEs like this. [...]
From freds at jfrog.org Sat Nov 17 01:55:15 2007
From: freds at jfrog.org (freds at jfrog.org)
Date: Fri, 16 Nov 2007 23:55:15 -0200
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <473E343A.3090506@joda.org>
References: <473A730C.4080408@sun.com> <473E343A.3090506@joda.org>
Message-ID: 

The cost/benefit should be evaluated, and for this one the cost looks very low. So...

On 11/16/07, Stephen Colebourne wrote:
> There is some talk of adding a group of new methods to low level classes: http://smallwig.blogspot.com/2007/11/minor-api-fixes-for-jdk-7.html
>
> Personally, I would like to see this extended to become a simple JSR where ideas such as ltrim/rtrim and so on can be evaluated properly. [...]
--
http://freddy33.bglogspot.com/
http://www.jfrog.org/

From mr at sun.com Sat Nov 17 04:19:27 2007
From: mr at sun.com (Mark Reinhold)
Date: Fri, 16 Nov 2007 20:19:27 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: nradov@axolotl.com; Fri, 16 Nov 2007 15:19:50 PST;
Message-ID: <20071117041927.3965AC13@callebaut.niobe.net>

> Date: Fri, 16 Nov 2007 15:19:50 -0800
> From: Nick Radov

> The bug voting mechanism doesn't really work for trivial RFEs like this. First, the bug has been closed for a while, and no one is going to vote for a closed bug.
> Second, everyone only gets a few votes, so they're going to put them on the most critical issues. Annoyances like this are left to languish.

Agreed.

> Let's look at this RFE a different way. Is there any reason not to implement it?

Beware: In general this is not a very persuasive line of reasoning.

If all RFEs over the last ten years had been evaluated in this way then most of them would've been implemented by now, and the platform would be a horrid, rotting mess of woefully inconsistent spaghetti.

Having said that, I've spent quite a bit of time over the last couple of months hacking Python code for the OpenJDK Mercurial infrastructure, and I've used Python's equivalent lstrip/rstrip functions more than once. They're quite handy actually, especially the rstrip function that takes a string argument and removes any trailing characters present in that string.

So if somebody's going to do this I'd actually recommend adding four methods (needed since Java doesn't have default parameter values):

    String.ltrim()
    String.ltrim(String charsToTrim)
    String.rtrim()
    String.rtrim(String charsToTrim)
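A minimal sketch of the String-argument form under those semantics (illustrative only, written as a static helper; every trailing character that occurs anywhere in charsToTrim is removed, mirroring Python's rstrip):

static String rtrim(String s, String charsToTrim) {
    int len = s.length();
    // Walk back while the last character is in the trim set.
    while (len > 0 && charsToTrim.indexOf(s.charAt(len - 1)) >= 0) {
        len--;
    }
    return (len < s.length()) ? s.substring(0, len) : s;
}

So, for example, rtrim("abc///", "/") yields "abc" and rtrim("line\r\n", "\r\n") yields "line"; ltrim(String) would be the mirror image.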
Of course one can get this behavior today with the replaceFirst method, e.g.,

    s.replaceFirst("[charsToTrim]+$", "")

but that requires compiling the regex and building a matcher, which is awfully heavyweight, especially in the middle of a tight loop.

- Mark

From pdoubleya at gmail.com Sat Nov 17 07:00:58 2007
From: pdoubleya at gmail.com (Patrick Wright)
Date: Sat, 17 Nov 2007 08:00:58 +0100
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: <64efa1ba0711162300y6e2d8b44la299e75ceaabc48a@mail.gmail.com>

Hi Mark

> > Let's look at this RFE a different way. Is there any reason not to implement it?
>
> Beware: In general this is not a very persuasive line of reasoning.
>
> If all RFEs over the last ten years had been evaluated in this way then most of them would've been implemented by now, and the platform would be a horrid, rotting mess of woefully inconsistent spaghetti.

It's nice to see you comment on this issue. I would like to hear sometime (this seems related to governance) how smaller decisions about the JDK, like this one, have been made in the past and will be made in the future. Is it always a matter of responding to an RFE or a bug? Who makes the decision, the respective group owners? Who keeps an eye on the big picture?

These sorts of changes seem to fall below the threshold of a JSR, but there is enough of this "tuning" possible that there should be some process for it as well, in order to achieve both consistency in the result (across the JDK) and a "constant tuning" attitude that avoids API rot.

I also note that in the projects I've worked on (as a contractor, server side, for many years), as the other poster said, it's extremely common for teams to either a) roll their own convenience APIs, or b) use Apache Commons or other friendly libraries. However, it seems to me that we could do better: the community around the JDK could agree on convenience APIs that are perhaps not shipped with the JDK, but are "blessed" as high-quality and recommended as "useful optional libraries", and promoted as such.

Patrick

From forax at univ-mlv.fr Sat Nov 17 14:08:19 2007
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sat, 17 Nov 2007 15:08:19 +0100
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: <473EF5D3.3040700@univ-mlv.fr>

Mark Reinhold wrote:
> So if somebody's going to do this I'd actually recommend adding four methods (needed since Java doesn't have default parameter values):
>
>     String.ltrim()
>     String.ltrim(String charsToTrim)
>     String.rtrim()
>     String.rtrim(String charsToTrim)

We don't have default values, but we have varargs :) so we can mimic Python's lstrip/rstrip using only two methods:

    String.ltrim(char... charsToTrim)
    String.rtrim(char... charsToTrim)

If the array charsToTrim is empty or null, whitespace characters are used.
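As an illustration of how the varargs form collapses the two overloads (a hypothetical helper, not Rémi's code), the empty-or-null fallback might look like:

static String rtrim(String s, char... charsToTrim) {
    // Empty or null varargs means "trim whitespace", per the proposal.
    String chars = (charsToTrim == null || charsToTrim.length == 0)
            ? null : new String(charsToTrim);
    int len = s.length();
    while (len > 0) {
        char c = s.charAt(len - 1);
        if (chars == null ? c > ' ' : chars.indexOf(c) < 0) break;
        len--;
    }
    return s.substring(0, len);
}

One trade-off worth noting: a zero-argument call such as s.rtrim() still allocates an empty char[] at the call site, which the four-method scheme avoids.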
Rémi

From nradov at axolotl.com Mon Nov 19 18:55:04 2007
From: nradov at axolotl.com (Nick Radov)
Date: Mon, 19 Nov 2007 10:55:04 -0800
Subject: String.ltrim() and rtrim() methods RFE
In-Reply-To: <20071117041927.3965AC13@callebaut.niobe.net>
References: <20071117041927.3965AC13@callebaut.niobe.net>
Message-ID: 

It seems we have general consensus that the ltrim() and rtrim() methods should be added, and possibly some other related methods as well. Now, how do we go about reopening Bug ID: 4074696 <http://bugs.sun.com/view_bug.do?bug_id=4074696>? Who has the authority to make decisions on these issues?

Nick Radov | Research & Development Manager | Axolotl Corp
www.axolotl.com o: 650.964.1100 x 116

From: Mark Reinhold
To: Nick Radov
Cc: core-libs-dev at openjdk.java.net
Date: 11/16/2007 08:16 PM
Subject: Re: String.ltrim() and rtrim() methods RFE

> Beware: In general this is not a very persuasive line of reasoning. [...]

From charlie.hunt at sun.com Thu Nov 15 13:08:22 2007
From: charlie.hunt at sun.com (charlie hunt)
Date: Thu, 15 Nov 2007 07:08:22 -0600
Subject: encoding-agnostic byte[]-based regexp engine...interested?
In-Reply-To: <473BB9F6.8090206@sun.com>
References: <473BB9F6.8090206@sun.com>
Message-ID: <473C44C6.7030003@sun.com>

Hi Charlie,

I'm adding OpenJDK's Java SE core libraries list since that's where Java NIO lives. I doubt anything could be done at the class libraries level, since an API addition / enhancement would likely require JCP activity. But there may be some value in raising awareness at the class libraries level, and I'd like to hear others' reactions on this mailing list.

My initial reaction is that what you are describing sounds like something that could be very useful for a protocol parser. The core of Grizzly is protocol independent, but this might be something useful to offer to those who are implementing the com.sun.grizzly.ProtocolParser interface. ProtocolParser is part of core Grizzly / the Grizzly Framework.

I think some additional exploration / investigation is worthwhile. We are in the process of gathering new feature requests; I think we should add this to that list.

Again, anyone else who has comments / reactions, please feel free to jump in. :-)

charlie ...

Charles Oliver Nutter wrote:
> Oniguruma is a C-based regular expression engine starting to get some attention. The key selling points are its speed and the fact that it can be applied to string content with arbitrary encodings. It will be the default regex engine in Ruby 1.9.
> JRuby 1.1 will ship with a port of Oniguruma dubbed "Joni". For us, the benefit is that we'll finally have a fast regex engine that can work with Ruby's encoding-free byte[]-based strings, where before we had to convert to/from char[] for all regex engines. We expect to see great gains in regex performance with JRuby 1.1 when we release the final version in the Decemberish timeframe.
>
> But it has occurred to me there could be an even more interesting use of Joni: as a regexp engine that could accept NIO bytebuffers directly. Because it just walks byte[], no decoding is necessary. Because it's encoding-agnostic, any arbitrary byte content could be matched. So in theory it could easily be adapted to be a fast NIO bytebuffer regex engine.
>
> Would there be interest in such a thing? I'm sure there are other NIO-related lists that would be appropriate, but Grizzly is the first actual project that springs to mind when I think of NIO, so I thought I'd toss it out there.
>
> - Charlie

--
Charlie Hunt
Java Performance Engineer

From charles.nutter at sun.com Fri Nov 16 06:56:05 2007
From: charles.nutter at sun.com (Charles Oliver Nutter)
Date: Fri, 16 Nov 2007 00:56:05 -0600
Subject: encoding-agnostic byte[]-based regexp engine...interested?
In-Reply-To: <473C44C6.7030003@sun.com>
References: <473BB9F6.8090206@sun.com> <473C44C6.7030003@sun.com>
Message-ID: <473D3F05.4030303@sun.com>

charlie hunt wrote:
> I doubt anything could be done at the class libraries level, since an API addition / enhancement would likely require JCP activity. [...]

I wouldn't expect this to necessarily be included in Java in the future; but this port is rapidly maturing, and having such a thing available in the community could show its value for the future.

> My initial reaction is that what you are describing sounds like something that could be very useful for a protocol parser. [...]

Yes, this is exactly what I was thinking. The ability to do such parsing without having to decode the incoming content could be very useful.

> I think some additional exploration / investigation is worthwhile. We are in the process of gathering new feature requests; I think we should add this to that list.

If you think that's a possibility. At any rate, the repository for the Joni engine is here:

https://svn.codehaus.org/jruby/joni/src/org/joni/

The porter is Marcin Mielczynski, a member of the JRuby team, and all credit and kudos should go his way.

- Charlie
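For contrast with the byte[]-walking approach, this is what the char-based status quo forces today: java.util.regex only matches CharSequences, so NIO data must be decoded first (standard API, shown only to make the decode overhead concrete):

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecodeThenMatch {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("GET /index.html HTTP/1.1".getBytes());
        // The decode step an encoding-agnostic byte[] engine could skip:
        CharSequence chars = Charset.forName("US-ASCII").decode(buf);
        Matcher m = Pattern.compile("GET (\\S+)").matcher(chars);
        if (m.find()) {
            System.out.println(m.group(1)); // prints /index.html
        }
    }
}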
From peter.arrenbrecht at gmail.com Wed Nov 21 08:52:03 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 09:52:03 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
Message-ID: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>

Hi all,

I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. When memory is tight, it might be better to first concatenate the entry lists into one big list, throw away the old table, allocate the new one, and then fill it from the concatenated list. So:

@Override
void resize( int _newCapacity )
{
    try {
        fastResize( _newCapacity ); // This would be the current resize code.
    }
    catch (OutOfMemoryError e) {
        tightResize( _newCapacity );
    }
}

@SuppressWarnings( "unchecked" )
private void tightResize( int newCapacity )
{
    Entry head = joinLists();
    table = new Entry[ newCapacity ];
    threshold = (int) (newCapacity * loadFactor);
    transfer( head, table );
}

@SuppressWarnings( "unchecked" )
private Entry joinLists()
{
    final Entry[] src = table;
    final int n = src.length;
    Entry head = null;
    int i = 0;
    while (i < n && null == head) {
        head = src[ i++ ];
    }
    Entry tail = head;
    assert i >= n || null != tail;
    while (i < n) {
        Entry e = src[ i++ ];
        if (null != e) {
            tail.next = e;
            do {
                tail = e;
                e = e.next;
            } while (null != e);
        }
    }
    return head;
}

@SuppressWarnings( "unchecked" )
private void transfer( Entry head, Entry[] tgt )
{
    int n = capacity();
    while (head != null) {
        Entry e = head;
        head = head.next;
        int i = indexFor( e.hash, n );
        e.next = tgt[ i ];
        tgt[ i ] = e;
    }
}

What do you think?
-peo

From peter.arrenbrecht at gmail.com Wed Nov 21 15:51:28 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 16:51:28 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
Message-ID: <16fff4530711210751k9b00c25rc2a1c3a03b271dc9@mail.gmail.com>

Hi again, I have gone over my code again and (a) discovered a very stupid mistake rendering the desired effect null and void, and (b) developed a test that demos the effect of the improvement. Here's the improved code:

private void tightResize( int newCapacity )
{
    Entry head = joinLists();
    table = null; // free it first
    table = new Entry[ newCapacity ]; // then reallocate
    threshold = (int) (newCapacity * loadFactor);
    transfer( head, table );
}

Below you can find the test code. This shows the problem here on Ubuntu Linux 7.04 with jre 1.6.0:

java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)

The command line for the improved map is:

java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new

and for the old map:

java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap

I have only managed to demo the effect on the server VM. And it is necessary to leave an Object array of size initial*2+something free, rather than just initial+something, which I expected. Maybe that is an effect of the generational collector. Also, there have been spurious cases where the test failed even with the new map. No idea why.

Here is the test code:

public static void main( String[] args )
{
    final int initial = 131072;
    final float loadFactor = 0.5F;
    final HashMap m;
    if (args.length > 0) {
        System.out.println( "Creating better map..." );
        m = new BetterHashMap( initial, loadFactor );
    }
    else {
        System.out.println( "Creating standard map..." );
        m = new HashMap( initial, loadFactor );
    }

    System.out.println( "Priming map (should see no resize here)..." );
    for (int i = 0; i < initial / 2; i++) {
        Integer o = i;
        m.put( o, o );
    }
    Integer o = initial;

    Entry head = blockMemExcept( initial * 2 + initial / 4 );
    System.out.println( "Filled with " + n + " entries." );

    System.out.println( "Adding next element (should see resize here)..." );
    m.put( o, o );

    if (head == null) System.out.println( "Bad." ); // force "head" to remain in scope
    System.out.println( "Done." );
}

/**
 * Done separately so memBlock goes out of scope cleanly, leaving no local stack copies pointing to it.
 */
private static Entry blockMemExcept( int exceptObjs )
{
    System.out.println( "Reserving memory..." );
    Object[] memBlock = new Object[ exceptObjs ];

    System.out.println( "Filling rest of memory..." );
    int i = 0;
    Entry head = null;
    try {
        while (true) {
            head = new Entry( 0, null, null, head );
            i++;
        }
    }
    catch (OutOfMemoryError e) {
        // ignore
    }

    if (memBlock[ 0 ] != null) return null;
    n = i;
    return head;
}
private static int n = 0;

Cheers,
-peo

ps. This all runs on copies of HashMap and AbstractMap in ch.arrenbrecht.java.util.

On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote:
> Hi all,
>
> I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. When memory is tight, it might be better to first concatenate the entry lists into one big list, throw away the old table, allocate the new one, and then fill it from the concatenated list. [...]

From Martin.Buchholz at Sun.COM Wed Nov 21 17:37:37 2007
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Wed, 21 Nov 2007 09:37:37 -0800
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
Message-ID: <47446CE1.30000@sun.com>

Hi Peter,

It's true that under low memory conditions your code would allow execution to continue under some circumstances. However, I'm not sure this would be an improvement to the JDK. Recovery from OOME is fraught with hazards. We do occasionally try, but an application becomes much less reliable once OOMEs start appearing. Perhaps it's better to fail than to pretend that the JDK has been bullet-proofed against OOME. OOME recovery code is rarely executed and hard to test.
The new code would have to be maintained indefinitely, making future maintenance just a little bit harder for the maintainers.

If the hashmap is fully populated, most of the memory is tied up in the Entry objects themselves, not in the table array. Each Entry object should be about 5 words of memory, while there's approximately one word used within the table array. So I don't think we'll see anything close to the factor-of-two max memory saving that we might expect.
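Putting rough numbers on that (an illustrative estimate built from Martin's figures, assuming the default 0.75 load factor and a doubling resize, counting in words):

    Entry objects for n entries   ~ 5n
    old table, n / 0.75 slots     ~ 1.33n
    new table, twice as many      ~ 2.67n

    peak, copy then free (current resize):  5n + 1.33n + 2.67n ~ 9n
    peak, free old table first (proposal):  5n + 2.67n         ~ 7.67n

so the proposal trims the peak by roughly 1.33n / 9n, i.e. about 15%, nowhere near a factor of two.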
I would prefer to see engineering work go into something like auto-reduction of the table array when many elements have been removed, but that's a hard problem.

Martin

Peter Arrenbrecht wrote:
> Hi all,
>
> I recently thought about how to resize hashmaps. When looking at the JDK 6 source, I saw that java.util.HashMap always transfers old entries from the old table. [...]

From peter.arrenbrecht at gmail.com Wed Nov 21 20:01:38 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 21:01:38 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <47446CE1.30000@sun.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com>
Message-ID: <16fff4530711211201h49fff2f2mf4f6632fd072ba81@mail.gmail.com>

Hi Martin

Thanks for responding so quickly and thoughtfully. I just thought that trying this little bit harder could make the difference when your already big hashmap overflows by just a few entries.

While I agree that testing this code under OOME conditions would be tiresome (it took me quite a while to get the demo right, too), its basic soundness would be very easy to test using direct calls to tightResize(). So a single test showing it's really an improvement over fastResize() in terms of max memory footprint would suffice, no?

However, since you say the scenario doesn't warrant the added maintenance burden, I'm just going to take your word for it. After all, I'm not going to be the one maintaining it, and I've never seen the problem in practice myself, either. ;)

But, more generally speaking, does your "no bullet-proofing" argument mean that in general you don't endorse switching algorithms to try to cope with tight memory? I can see that starting to "bullet-proof" could be a never-ending story. However, I think there is a distinction between what I proposed and some of the wilder schemes one could contemplate (like growing by less than doubling and having to switch to less efficient hash-to-index conversions, or using a TreeMap to hold overflows, etc.). Those would affect the overall behaviour of the HashMap significantly, leading to pervasive code changes. My change only tries a little harder in one very isolated spot. It does this with no significant code complexity, and with no effects on the overall behaviour (other than making it still work in my scenario, of course). Is that kind of "trying harder" still bad? Isn't it kind of similar to what the GC does, too?

Peter

On Nov 21, 2007 6:37 PM, Martin Buchholz wrote:
> Hi Peter,
>
> It's true that under low memory conditions your code would allow execution to continue under some circumstances. However, I'm not sure this would be an improvement to the JDK. Recovery from OOME is fraught with hazards. [...]
); > } > > /** > * Done separately so memBlock goes out of scope cleanly, leaving no > local stack copies pointing > * to it. > */ > private static Entry blockMemExcept( int exceptObjs ) > { > System.out.println( "Reserving memory..." ); > Object[] memBlock = new Object[ exceptObjs ]; > > System.out.println( "Filling rest of memory..." ); > int i = 0; > Entry head = null; > try { > while (true) { > head = new Entry( 0, null, null, head ); > i++; > } > } > catch (OutOfMemoryError e) { > // ignore > } > > if (memBlock[ 0 ] != null) return null; > n = i; > return head; > } > private static int n = 0; > > Cheers, > -peo > > > ps. This all runs on copies of HashMap and AbstractMap in > ch.arrrenbrecht.java.util. > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > > > > From roman at kennke.org Wed Nov 21 20:23:39 2007 From: roman at kennke.org (Roman Kennke) Date: Wed, 21 Nov 2007 21:23:39 +0100 Subject: Proposal: Better HashMap.resize() when memory is tight In-Reply-To: <47446CE1.30000@sun.com> References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com> Message-ID: <1195676619.6742.17.camel@mercury> Hi there, Why not implement such a thing as a separate library/class. After all, Map is an interface which can be implemented in many ways and for many different purposes. I think there are a couple of efforts that go in this direction, for example javolution: http://javolution.org/ Cheers, Roman Am Mittwoch, den 21.11.2007, 09:37 -0800 schrieb Martin Buchholz: > Hi Peter, > > It's true that under low memory conditions your code would > allow execution to continue under some circumstances. > However, I'm not sure this would be an improvement to the JDK. > Recovery from OOME is fraught with hazards. We do occasionally > try, but an application becomes much less reliable once OOMEs > start appearing. 
Perhaps it's better to fail than to pretend > that the JDK has been bullet-proofed against OOME. > OOME recovery code is rarely executed and hard to test. > The new code would have to be maintained indefinitely, > making future maintenance just a little bit harder for > the maintainers. > > If the hashmap is fully populated, most of the memory is tied > up in the Entry objects themselves, not in the table array. > Each Entry object should be about 5 words of memory, while > there's approximately one word used within the table array. > So I don't think we'll see anything close to the factor of > two max memory saving that we might expect. > > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. > > Martin > > Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > E-Mail-Nachricht-Anlage (Attached Message) > > -------- Weitergeleitete Nachricht -------- > > Von: Peter Arrenbrecht > > Antwort an: peter.arrenbrecht at gmail.com > > An: core-libs-dev at openjdk.java.net > > Betreff: Re: Proposal: Better HashMap.resize() when memory is tight > > Datum: Wed, 21 Nov 2007 16:51:28 +0100 > > > > einfaches Textdokument-Anlage (Attached Message) > > Hi again, I have gone over my code again and a) discovered a very > > stupid mistake rendering the desired effect null and void, and b) > > developed a test that demos the effect of the improvement. Here's the > > improved code: > > > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = null; // free it first > > table = new Entry[ newCapacity ]; // then reallocate > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > Below you can find the test code. 
This shows the problem here on > > Ubuntu Linux 7.04 with jre 1.6.0: > > > > java version "1.6.0" > > Java(TM) SE Runtime Environment (build 1.6.0-b105) > > Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode) > > > > The command-line for the improved map is: > > > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new > > > > and for the old map: > > > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap > > > > I have only managed to demo the effect on the server VM. And it is > > necessary to leave an Object array of size initial*2+something free, > > rather than just initial+something, which I expected. Maybe that is an > > effect of the generational collector. Also, there have been spurious > > cases where the test failed even with the new map. No idea why. > > > > Here is the test code: > > > > public static void main( String[] args ) > > { > > final int initial = 131072; > > final float loadFactor = 0.5F; > > final HashMap m; > > if (args.length > 0) { > > System.out.println( "Creating better map..." ); > > m = new BetterHashMap( initial, loadFactor ); > > } > > else { > > System.out.println( "Creating standard map..." ); > > m = new HashMap( initial, loadFactor ); > > } > > > > System.out.println( "Priming map (should see no resize here)..." ); > > for (int i = 0; i < initial / 2; i++) { > > Integer o = i; > > m.put( o, o ); > > } > > Integer o = initial; > > > > Entry head = blockMemExcept( initial * 2 + initial / 4 ); > > System.out.println( "Filled with " + n + " entries." ); > > > > System.out.println( "Adding next element (should see resize here)..." ); > > m.put( o, o ); > > > > if (head == null) System.out.println( "Bad." ); // force "head" to > > remain in scope > > System.out.println( "Done." ); > > } > > > > /** > > * Done separately so memBlock goes out of scope cleanly, leaving no > > local stack copies pointing > > * to it. > > */ > > private static Entry blockMemExcept( int exceptObjs ) > > { > > System.out.println( "Reserving memory..." ); > > Object[] memBlock = new Object[ exceptObjs ]; > > > > System.out.println( "Filling rest of memory..." ); > > int i = 0; > > Entry head = null; > > try { > > while (true) { > > head = new Entry( 0, null, null, head ); > > i++; > > } > > } > > catch (OutOfMemoryError e) { > > // ignore > > } > > > > if (memBlock[ 0 ] != null) return null; > > n = i; > > return head; > > } > > private static int n = 0; > > > > Cheers, > > -peo > > > > > > ps. This all runs on copies of HashMap and AbstractMap in > > ch.arrrenbrecht.java.util. > > > > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > > Hi all, > > > > > > I recently thought about how to resize hashmaps. When looking at the > > > JDK 6 source, I saw that java.util.HashMap always transfers old > > > entries from the old table. When memory is tight, it might be better > > > to first concatenate the entry lists into one big list, throw away the > > > old table, allocate the new one, and then fill it from the > > > concatenated list. So: > > > > > > @Override > > > void resize( int _newCapacity ) > > > { > > > try { > > > fastResize( _newCapacity ); // This would be the current resize code. 
> > > } > > > catch (OutOfMemoryError e) { > > > tightResize( _newCapacity ); > > > } > > > } > > > > > > @SuppressWarnings( "unchecked" ) > > > private void tightResize( int newCapacity ) > > > { > > > Entry head = joinLists(); > > > table = new Entry[ newCapacity ]; > > > threshold = (int) (newCapacity * loadFactor); > > > transfer( head, table ); > > > } > > > > > > @SuppressWarnings("unchecked") > > > private Entry joinLists() > > > { > > > final Entry[] src = table; > > > final int n = src.length; > > > Entry head = null; > > > int i = 0; > > > while (i < n && null == head) { > > > head = src[ i++ ]; > > > } > > > Entry tail = head; > > > assert i >= n || null != tail; > > > while (i < n) { > > > Entry e = src[ i++ ]; > > > if (null != e) { > > > tail.next = e; > > > do { > > > tail = e; > > > e = e.next; > > > } while (null != e); > > > } > > > } > > > return head; > > > } > > > > > > @SuppressWarnings("unchecked") > > > private void transfer( Entry head, Entry[] tgt ) > > > { > > > int n = capacity(); > > > while (head != null) { > > > Entry e = head; > > > head = head.next; > > > int i = indexFor( e.hash, n ); > > > e.next = tgt[ i ]; > > > tgt[ i ] = e; > > > } > > > } > > > > > > > > > What do you think? > > > -peo > > > -- http://kennke.org/blog/ From peter.arrenbrecht at gmail.com Wed Nov 21 20:58:16 2007 From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht) Date: Wed, 21 Nov 2007 21:58:16 +0100 Subject: Proposal: Better HashMap.resize() when memory is tight In-Reply-To: <47446CE1.30000@sun.com> References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com> <47446CE1.30000@sun.com> Message-ID: <16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com> Hi Martin > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. Would you care to elaborate? At first glance, this seems not especially hard to me, in particular since tightResize() would be able to switch to a smaller array without causing a short spike of needing more memory first. But I'm sure I have overlooked something here. -peter On Nov 21, 2007 6:37 PM, Martin Buchholz wrote: > Hi Peter, > > It's true that under low memory conditions your code would > allow execution to continue under some circumstances. > However, I'm not sure this would be an improvement to the JDK. > Recovery from OOME is fraught with hazards. We do occasionally > try, but an application becomes much less reliable once OOMEs > start appearing. Perhaps it's better to fail than to pretend > that the JDK has been bullet-proofed against OOME. > OOME recovery code is rarely executed and hard to test. > The new code would have to be maintained indefinitely, > making future maintenance just a little bit harder for > the maintainers. > > If the hashmap is fully populated, most of the memory is tied > up in the Entry objects themselves, not in the table array. > Each Entry object should be about 5 words of memory, while > there's approximately one word used within the table array. > So I don't think we'll see anything close to the factor of > two max memory saving that we might expect. > > I would prefer to see engineering work go into something > like auto-reduction of the table array when many elements > have been removed, but that's a hard problem. > > Martin > > > Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. 
When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? > > -peo > > > ---------- Forwarded message ---------- > From: Peter Arrenbrecht > To: core-libs-dev at openjdk.java.net > Date: Wed, 21 Nov 2007 16:51:28 +0100 > Subject: Re: Proposal: Better HashMap.resize() when memory is tight > Hi again, I have gone over my code again and a) discovered a very > stupid mistake rendering the desired effect null and void, and b) > developed a test that demos the effect of the improvement. Here's the > improved code: > > private void tightResize( int newCapacity ) > { > Entry head = joinLists(); > table = null; // free it first > table = new Entry[ newCapacity ]; // then reallocate > threshold = (int) (newCapacity * loadFactor); > transfer( head, table ); > } > > Below you can find the test code. This shows the problem here on > Ubuntu Linux 7.04 with jre 1.6.0: > > java version "1.6.0" > Java(TM) SE Runtime Environment (build 1.6.0-b105) > Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode) > > The command-line for the improved map is: > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap new > > and for the old map: > > java -server -Xmx7m -cp bin ch.arrenbrecht.java.util.BetterHashMap > > I have only managed to demo the effect on the server VM. And it is > necessary to leave an Object array of size initial*2+something free, > rather than just initial+something, which I expected. Maybe that is an > effect of the generational collector. Also, there have been spurious > cases where the test failed even with the new map. No idea why. > > Here is the test code: > > public static void main( String[] args ) > { > final int initial = 131072; > final float loadFactor = 0.5F; > final HashMap m; > if (args.length > 0) { > System.out.println( "Creating better map..." ); > m = new BetterHashMap( initial, loadFactor ); > } > else { > System.out.println( "Creating standard map..." 
); > m = new HashMap( initial, loadFactor ); > } > > System.out.println( "Priming map (should see no resize here)..." ); > for (int i = 0; i < initial / 2; i++) { > Integer o = i; > m.put( o, o ); > } > Integer o = initial; > > Entry head = blockMemExcept( initial * 2 + initial / 4 ); > System.out.println( "Filled with " + n + " entries." ); > > System.out.println( "Adding next element (should see resize here)..." ); > m.put( o, o ); > > if (head == null) System.out.println( "Bad." ); // force "head" to > remain in scope > System.out.println( "Done." ); > } > > /** > * Done separately so memBlock goes out of scope cleanly, leaving no > local stack copies pointing > * to it. > */ > private static Entry blockMemExcept( int exceptObjs ) > { > System.out.println( "Reserving memory..." ); > Object[] memBlock = new Object[ exceptObjs ]; > > System.out.println( "Filling rest of memory..." ); > int i = 0; > Entry head = null; > try { > while (true) { > head = new Entry( 0, null, null, head ); > i++; > } > } > catch (OutOfMemoryError e) { > // ignore > } > > if (memBlock[ 0 ] != null) return null; > n = i; > return head; > } > private static int n = 0; > > Cheers, > -peo > > > ps. This all runs on copies of HashMap and AbstractMap in > ch.arrrenbrecht.java.util. > > > On Nov 21, 2007 9:52 AM, Peter Arrenbrecht wrote: > > Hi all, > > > > I recently thought about how to resize hashmaps. When looking at the > > JDK 6 source, I saw that java.util.HashMap always transfers old > > entries from the old table. When memory is tight, it might be better > > to first concatenate the entry lists into one big list, throw away the > > old table, allocate the new one, and then fill it from the > > concatenated list. So: > > > > @Override > > void resize( int _newCapacity ) > > { > > try { > > fastResize( _newCapacity ); // This would be the current resize code. > > } > > catch (OutOfMemoryError e) { > > tightResize( _newCapacity ); > > } > > } > > > > @SuppressWarnings( "unchecked" ) > > private void tightResize( int newCapacity ) > > { > > Entry head = joinLists(); > > table = new Entry[ newCapacity ]; > > threshold = (int) (newCapacity * loadFactor); > > transfer( head, table ); > > } > > > > @SuppressWarnings("unchecked") > > private Entry joinLists() > > { > > final Entry[] src = table; > > final int n = src.length; > > Entry head = null; > > int i = 0; > > while (i < n && null == head) { > > head = src[ i++ ]; > > } > > Entry tail = head; > > assert i >= n || null != tail; > > while (i < n) { > > Entry e = src[ i++ ]; > > if (null != e) { > > tail.next = e; > > do { > > tail = e; > > e = e.next; > > } while (null != e); > > } > > } > > return head; > > } > > > > @SuppressWarnings("unchecked") > > private void transfer( Entry head, Entry[] tgt ) > > { > > int n = capacity(); > > while (head != null) { > > Entry e = head; > > head = head.next; > > int i = indexFor( e.hash, n ); > > e.next = tgt[ i ]; > > tgt[ i ] = e; > > } > > } > > > > > > What do you think? 
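> > -peo

To put rough numbers on Martin's estimate of where the memory actually
goes (a back-of-envelope sketch only; the per-object sizes are assumed,
not measured, and vary by VM and pointer width):

public class ResizeSavingEstimate {

	public static void main( String[] args )
	{
		// Assumed: an Entry costs ~5 words (header, hash, key, value,
		// next); the table array costs ~1 word per slot.
		int n = 131072;          // entries, as in the test above
		double loadFactor = 0.5; // as in the test above
		double entryWords = 5.0 * n;
		double tableWords = n / loadFactor; // capacity >= size / loadFactor
		double share = tableWords / (entryWords + tableWords);
		System.out.printf( "table array is ~%.0f%% of the map footprint%n",
				share * 100 );
		// Prints ~29% here; with the default load factor of 0.75 the
		// share drops to ~21%. Freeing the old table before allocating
		// the new one can recover at most that share, well short of a
		// factor of two.
	}
}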
From peter.arrenbrecht at gmail.com  Wed Nov 21 21:59:11 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Wed, 21 Nov 2007 22:59:11 +0100
Subject: Proposal: Better HashMap.resize() when memory is tight
In-Reply-To: <16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com>
References: <16fff4530711210052t5aa91081tce93513c452a3fdf@mail.gmail.com>
	<47446CE1.30000@sun.com>
	<16fff4530711211258t70f4e417vb533277a1e6372a5@mail.gmail.com>
Message-ID: <16fff4530711211359q30a93e1cp3443a9b6878f97e9@mail.gmail.com>

On Nov 21, 2007 9:58 PM, Peter Arrenbrecht wrote:
> Hi Martin
>
> > I would prefer to see engineering work go into something
> > like auto-reduction of the table array when many elements
> > have been removed, but that's a hard problem.
>
> Would you care to elaborate? At first glance, this seems not
> especially hard to me, in particular since tightResize() would be able
> to switch to a smaller array without causing a short spike of needing
> more memory first. But I'm sure I have overlooked something here.

Ah, you don't want it to oscillate. Is that it?
-peter

From peter.arrenbrecht at gmail.com  Thu Nov 22 09:33:15 2007
From: peter.arrenbrecht at gmail.com (Peter Arrenbrecht)
Date: Thu, 22 Nov 2007 10:33:15 +0100
Subject: Shrinking HashMaps (was Re: Proposal: Better HashMap.resize() when memory is tight)
Message-ID: <16fff4530711220133v1418fed4o4f75f0655fef059f@mail.gmail.com>

Hi Martin

As per your hint, I've taken some time to think about shrinking
hashmaps. As you said, it is hard to find a general solution. I do, in
fact, believe that we should not even try. I'm asking myself: What are
the scenarios where people remove entries on a big scale from large
hashmaps? I for one always end up throwing the maps away, not removing
things from them. So since I suspect these scenarios to be rare, I also
suspect we won't find a large enough subset with compatible
requirements to warrant aiming for a general solution.

Instead, I propose to allow people to implement shrinking hashmaps
themselves on top of HashMap. This could be done by directly extending
HashMap, or by adding a new descendant, java.util.ShrinkableHashMap. I
have attempted the latter (in order to better show the changes) to see
what would be required. The attached classes are just a sketch to
invite further discussion. There is ShrinkableHashMap, which would go
into java.util so it can properly access HashMap, and there are two
demo user classes that implement specific shrinking strategies. (Note:
The whole thing is untested and the demos are contrived. I'd like some
feedback before I take this further.)

Key points of ShrinkableHashMap:

* Expose methods to adjust the capacity (several variants).
* Expose methods to query the current capacity and load factor.
* Expose an override point to be notified when the capacity changes.
* Expose an override point to be notified when the size is reduced
  (requires a change to HashMap).
* Fall back on tightResize() when super.resize() fails. I chose
  tightResize() here because shrinking hashmaps is something people
  might want to do especially when memory is low, so if we can do it
  without needing additional memory, we should.

Since we're talking strategies here, another approach might be to use
explicit strategy interfaces, so that people could supply
implementations of them to (Shrinkable)HashMap. I haven't explored
this yet.

As you can see, this change would open up a fair bit of HashMap's
heretofore non-public interface.
You would know better whether this is warranted, i.e. whether there is
sufficient demand for being able to shrink hashmaps.

Is this a direction to follow, do you think? If not, what would you
suggest?

-Peter

ps. Here's the necessary change to HashMap (required to properly
support removals through the key and entry sets):

@@ -591,7 +591,7 @@ public class HashMap
                 table[i] = next;
             else
                 prev.next = next;
-            e.recordRemoval(this);
+            removed(e);
             return e;
         }
         prev = e;
@@ -624,7 +624,7 @@ public class HashMap
                 table[i] = next;
             else
                 prev.next = next;
-            e.recordRemoval(this);
+            removed(e);
             return e;
         }
         prev = e;
@@ -632,6 +632,13 @@ public class HashMap
         }

         return e;
+    }
+
+    /**
+     * Gives both map and entry descendants a chance to react to entry removals.
+     */
+    void removed(Entry e) {
+        e.recordRemoval(this);
     }

On Nov 21, 2007 10:59 PM, Peter Arrenbrecht wrote:
> On Nov 21, 2007 9:58 PM, Peter Arrenbrecht wrote:
> > Hi Martin
> >
> > > I would prefer to see engineering work go into something
> > > like auto-reduction of the table array when many elements
> > > have been removed, but that's a hard problem.
> >
> > Would you care to elaborate? At first glance, this seems not
> > especially hard to me, in particular since tightResize() would be able
> > to switch to a smaller array without causing a short spike of needing
> > more memory first. But I'm sure I have overlooked something here.
>
> Ah, you don't want it to oscillate. Is that it?
> -peter
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ShrinkableHashMap.java
Type: text/x-java
Size: 4193 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AggressivelyShrinkingHashMapDescendant.java
Type: text/x-java
Size: 730 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SlowlyShrinkingHashMapDescendant.java
Type: text/x-java
Size: 1030 bytes
Desc: not available
URL:
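A rough standalone illustration of the shrink-on-removal strategy
described above, built only on HashMap's public API (this is not the
attached code: the class name and thresholds are invented, and a
rebuild stands in for the in-place resize that an in-package
ShrinkableHashMap could do):

import java.util.HashMap;
import java.util.Map;

/**
 * Illustration only: shrink-on-removal on top of the public Map API.
 * From outside java.util we cannot touch HashMap's table, so we
 * rebuild into a freshly sized map instead.
 */
final class ShrinkingMapSketch<K, V> {
	private Map<K, V> map = new HashMap<K, V>();
	private int peak; // largest size seen since the last rebuild

	public V put( K key, V value ) {
		V old = map.put( key, value );
		peak = Math.max( peak, map.size() );
		return old;
	}

	public V remove( Object key ) {
		V old = map.remove( key );
		// Shrink only well below the last peak, so that alternating
		// put/remove around a boundary does not make the map oscillate.
		if (peak >= 64 && map.size() < peak / 4) {
			map = new HashMap<K, V>( map ); // re-hashed at the smaller size
			peak = map.size();
		}
		return old;
	}

	public V get( Object key ) { return map.get( key ); }
	public int size() { return map.size(); }
}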
" + ((stop - start) / 1000) + " seconds)"); } } Of course, I don't think that this method is used this way in 99% of the cases; honestly, I think very few would pass intentionally a void list to retainAll, but still, the check is harmless and represent a huge improvement if someone needs it. The patch apply to b23. As a final note, I've already signed the SCA. Thanks for looking, Mario -- Lima Software - http://www.limasoftware.net/ GNU Classpath Developer - http://www.classpath.org/ Fedora Ambassador - http://fedoraproject.org/wiki/MarioTorre Jabber: neugens at jabber.org pgp key: http://subkeys.pgp.net/ PGP Key ID: 80F240CF Fingerprint: BA39 9666 94EC 8B73 27FA FC7C 4086 63E3 80F2 40CF Please, support open standards: http://opendocumentfellowship.org/petition/ http://www.nosoftwarepatents.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: copy-on-write-array-list-performance.patch Type: text/x-patch Size: 866 bytes Desc: not available URL: From markus.gaisbauer at gmail.com Fri Nov 23 21:58:45 2007 From: markus.gaisbauer at gmail.com (Markus Gaisbauer) Date: Fri, 23 Nov 2007 22:58:45 +0100 Subject: [PATCH] Performance bug in String(byte[],int,int,Charset) Message-ID: <47474D15.4060504@gmail.com> A bug in java.lang.StringCoding causes a full and unnecessary copy of the byte array given as the first argument. This results in severe slow down of the Constructor if the byte array is big. The attached patch, should fix the problem. Unfortunately I do not (yet) have an official bug id for this, as this seems to take a while (reported 2 weeks ago). To reproduce the problem run the following test program: import java.nio.charset.Charset; public class StringTest { public static void main(String[] args) throws Exception { long before; long after; byte[] data; data = new byte[1024*1024*16]; // 16 megabyte data[0] = 'X'; // warmup new String(data, 0, 1); new String(data, 0, 1, Charset.forName("UTF8")); new String(data, 0, 1, "UTF8"); before = System.nanoTime(); new String(data, 0, 1); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); before = System.nanoTime(); new String(data, 0, 1, Charset.forName("UTF8")); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); before = System.nanoTime(); new String(data, 0, 1, "UTF8"); after = System.nanoTime(); System.out.println((after - before) / 1000000 + "ms"); } } -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: StringCoding.diff URL: From i30817 at gmail.com Sat Nov 24 23:18:19 2007 From: i30817 at gmail.com (Paulo Levi) Date: Sat, 24 Nov 2007 23:18:19 +0000 Subject: core-libs-dev Digest, Vol 7, Issue 8 In-Reply-To: References: Message-ID: <212322090711241518t34ed6dbase5c59b28a3e928b9@mail.gmail.com> Could you look in fixing the insertString(int where, String str) or was it replace(int position, String str, int addsize), method in GapContent ? 
From markus.gaisbauer at gmail.com  Fri Nov 23 21:58:45 2007
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Fri, 23 Nov 2007 22:58:45 +0100
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
Message-ID: <47474D15.4060504@gmail.com>

A bug in java.lang.StringCoding causes a full and unnecessary copy of
the byte array given as the first argument. This results in a severe
slowdown of the constructor if the byte array is big.

The attached patch should fix the problem.

Unfortunately I do not (yet) have an official bug id for this, as
getting one seems to take a while (reported 2 weeks ago).

To reproduce the problem run the following test program:

import java.nio.charset.Charset;

public class StringTest {

	public static void main(String[] args) throws Exception {
		long before;
		long after;
		byte[] data;

		data = new byte[1024 * 1024 * 16]; // 16 megabytes
		data[0] = 'X';

		// warmup
		new String(data, 0, 1);
		new String(data, 0, 1, Charset.forName("UTF8"));
		new String(data, 0, 1, "UTF8");

		before = System.nanoTime();
		new String(data, 0, 1);
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");

		before = System.nanoTime();
		new String(data, 0, 1, Charset.forName("UTF8"));
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");

		before = System.nanoTime();
		new String(data, 0, 1, "UTF8");
		after = System.nanoTime();
		System.out.println((after - before) / 1000000 + "ms");
	}
}
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: StringCoding.diff
URL:

From i30817 at gmail.com  Sat Nov 24 23:18:19 2007
From: i30817 at gmail.com (Paulo Levi)
Date: Sat, 24 Nov 2007 23:18:19 +0000
Subject: core-libs-dev Digest, Vol 7, Issue 8
In-Reply-To:
References:
Message-ID: <212322090711241518t34ed6dbase5c59b28a3e928b9@mail.gmail.com>

Could you look into fixing the insertString(int where, String str)
method (or was it replace(int position, String str, int addSize)?) in
GapContent? Currently it makes a totally unneeded copy that stresses
the garbage collector, where it could use something like this (I
removed the UndoableEdit handling on purpose):

public UndoableEdit insertString(int where, String str)
		throws BadLocationException {
	if (where > length() || where < 0) {
		throw new BadLocationException("Invalid insert", length());
	}
	//char[] s = str.toCharArray();
	//replace(where, 0, s, s.length);
	replace(where, str, str.length());
	return null;
}

protected void replace(int position, String addItems, int addSize) {
	if (addSize == 0) {
		return;
	}
	int end = open(position, addSize);
	//System.arraycopy(addItems, rmSize, array, end, endSize);
	addItems.getChars(0, addSize, (char[]) getArray(), end);
}

Another thing: the default SmallAttributeSet created in
DefaultStyledDocument takes far too much time to be found in a HashMap
(where it lives). I added a small check for the degenerate case where
the attribute count is equal to 0, so that containsAttributes is never
entered. In my application this was a hotspot; I don't remember exactly
where (I think it was setCharacterAttributes of DefaultStyledDocument,
which in turn caused an addEdit to a DefaultDocumentEvent that used a
HashMap, and that edit would be retrieved later).

protected SmallAttributeSet createSmallAttributeSet(AttributeSet a) {
	return new SmallAttributeSet(a) {
		// hashCode of superclass. Redefined to see if WeakHashMap
		// behaves better.
		@Override
		public boolean equals(Object obj) {
			if (obj instanceof AttributeSet) {
				AttributeSet attrs = (AttributeSet) obj;
				return getAttributeCount() == attrs.getAttributeCount()
						&& (getAttributeCount() == 0
								|| containsAttributes(attrs));
			}
			return false;
		}
	};
}
From Alan.Bateman at Sun.COM  Sun Nov 25 14:22:28 2007
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Sun, 25 Nov 2007 14:22:28 +0000
Subject: [Fwd: Re: core-libs-dev Digest, Vol 7, Issue 8]
Message-ID: <47498524.9040105@sun.com>

I assume this was meant for swing-dev.
-------------- next part --------------
An embedded message was scrubbed...
From: Paulo Levi
Subject: Re: core-libs-dev Digest, Vol 7, Issue 8
Date: Sat, 24 Nov 2007 23:18:19 +0000
Size: 7118
URL:

From forax at univ-mlv.fr  Sun Nov 25 15:20:52 2007
From: forax at univ-mlv.fr (Rémi Forax)
Date: Sun, 25 Nov 2007 16:20:52 +0100
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
In-Reply-To: <47474D15.4060504@gmail.com>
References: <47474D15.4060504@gmail.com>
Message-ID: <474992D4.3010908@univ-mlv.fr>

Markus Gaisbauer a écrit :
> A bug in java.lang.StringCoding causes a full and unnecessary copy of
> the byte array given as the first argument.

It's not a bug, it's a feature :) I think this copy is a defensive
copy, to prevent a malicious charset (decoder) from accessing the
underlying buffer. By the way, using clone() seems better than
Arrays.copyOf() here:

byte[] b = ba.clone();

> This results in a severe slowdown of the constructor if the byte
> array is big.

Rémi

> The attached patch should fix the problem.
>
> Unfortunately I do not (yet) have an official bug id for this, as
> getting one seems to take a while (reported 2 weeks ago).
>
> ------------------------------------------------------------------------
>
> Index: StringCoding.java
> ===================================================================
> --- StringCoding.java	(revision 258)
> +++ StringCoding.java	(working copy)
> @@ -193,7 +193,6 @@
>
>      static char[] decode(Charset cs, byte[] ba, int off, int len) {
>          StringDecoder sd = new StringDecoder(cs, cs.name());
> -        byte[] b = Arrays.copyOf(ba, ba.length);
> -        return sd.decode(b, off, len);
> +        return sd.decode(ba, off, len);
>      }
>
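A possible middle ground between the two positions, copying defensively
but only the range that is actually decoded, would look roughly like
this (a sketch against the method shown in the diff, relying on the
same package-private StringDecoder; not necessarily the fix that was
eventually applied):

static char[] decode(Charset cs, byte[] ba, int off, int len) {
	StringDecoder sd = new StringDecoder(cs, cs.name());
	// Defensive copy of only the decoded range: a misbehaving decoder
	// still never sees the caller's array, but the cost is O(len)
	// rather than O(ba.length), i.e. 1 byte instead of 16 MB in the
	// test above.
	byte[] b = Arrays.copyOfRange(ba, off, off + len);
	return sd.decode(b, 0, len);
}

This would keep the security argument intact while removing the
slowdown measured by the test program.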