hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

Thu Nov 15 22:49:34 UTC 2012

Hi,

This change is 6 months old now. I wonder whether Oracle has received any
complaints from users since then. I mean complaints based on real
observations of performance degradation in real code, not just speculation.

Regards, Peter

2012/11/15 Zhong Yu <zhong.j.yu at gmail.com>

> Since this change aims at a minor performance boost, it's not fair
> to defend it by saying that it only incurs minor performance
> penalties.
>
> Java programs are infested with strings, most of which could have used
> a more appropriate type, but it is the insane reality. Any change to
> the behavior of strings should have been backed up by a much more
> thorough analysis.
>
> Every usage of substring() was (hopefully) the result of some
> conscious reasoning about space-time. Even if this change does not
> significantly alter an application's performance, it invalidates all
> the reasoning, that's the worst blow in my book. There's no problem if
> substring() does copying from day one, but 17 years have passed.
>
> Zhong Yu
>
> On Wed, Nov 14, 2012 at 6:58 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
> > Personally, I feel like the concern is a bit overstated:
> >
> > 1) the n in O(n) is likely actually fairly small in practice (at least in
> > what I'd consider sane code)
> > 2) I think a lot of people that worry about perf probably aren't using
> > substring() anyway
> > 3) copying char[] is optimized by jit - this is basically a memcpy()-like
> > call, which modern machines handle well
> > 4) the upside is strings are 8 bytes smaller
> > 5) .NET substring() has always allocated new storage (via an optimized
> > internal VM call) and never shared the char[] and I haven't come across
> > any complaints or seen serious perf problems myself (granted I seldom use
> > substring)
> >
> > So I don't know if this is anything to worry about in practice.
> >
> > Sent from my phone
> >
> > On Nov 14, 2012 5:26 PM, "Zhong Yu" <zhong.j.yu at gmail.com> wrote:
> >>
> >> On 06/03/2012 11:35 PM, Mike Duigou wrote:
> >> > [I trimmed the distribution list]
> >> >
> >> > On Jun 3 2012, at 13:44 , Peter Levart wrote:
> >> >
> >> >> On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
> >> >>> Changeset: 2c773daa825d
> >> >>> Author:    mduigou
> >> >>> Date:      2012-05-17 10:06 -0700
> >> >>> URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
> >> >>>
> >> >>> 6924259: Remove offset and count fields from java.lang.String
> >> >>> Summary: Removes the use of shared character array buffers by String
> >> >>> along with the two fields needed to support the use of shared buffers.
> >> >> Wow, that's quite a change.
> >> > Indeed. It was a long time in development. It is a change which is
> >> > expected to be overall beneficial though and in the general case a
> >> > positive win.
> >>
> >> Wow!
> >>
> >> If the previous behavior of substring() was once a bug, by now it has
> >> become a well known feature. People know about it, and people depend
> >> on it.
> >>
> >> This change is a big surprise. Changing O(1) to O(n) is a breach of
> >> contract. It'll break lots of old code; and meanwhile lots of new code
> >> is still being written based on the old assumption. Once people
> >> learn about the new behavior, they will need to comb through and rewrite
> >> their code.
> >>
> >> The worst part is that the same code performs very differently on
> >> different versions of the JDK. What's a programmer supposed to do if his
> >> code targets JDK6 and above? If the cost of strings is no longer certain,
> >> what else can we believe in?
> >>
> >> Is there any chance in hell to roll it back? Maybe add a new method
> >> for the new behavior?
> >>
> >> Zhong Yu
>
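
The space-time trade-off argued over in this thread can be sketched with two
toy classes. This is only an illustration, not the JDK source: the names
SubstringModels, SharedStr, CopiedStr and retainedChars are hypothetical.
SharedStr models the old layout, where substring() is O(1) but keeps the whole
parent char[] reachable. CopiedStr models the new layout, where substring()
copies the requested range and the offset and count fields (two ints, hence
the 8 bytes saved per String) are gone.

```java
import java.util.Arrays;

/** Hypothetical models of the two String layouts; not the JDK implementation. */
final class SubstringModels {

    /** Old layout: a substring shares the parent's char[] via offset/count. */
    static final class SharedStr {
        final char[] value;
        final int offset; // first used index in value
        final int count;  // number of characters used

        SharedStr(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        /** O(1): no characters are copied. */
        SharedStr substring(int begin, int end) {
            if (begin < 0 || end > count || begin > end) {
                throw new StringIndexOutOfBoundsException();
            }
            return new SharedStr(value, offset + begin, end - begin);
        }

        /** Size of the backing array this object keeps alive. */
        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value, offset, count); }
    }

    /** New layout: every string owns exactly its own characters. */
    static final class CopiedStr {
        final char[] value;

        CopiedStr(char[] value) { this.value = value; }

        /** O(n) in the length of the result: the range is copied. */
        CopiedStr substring(int begin, int end) {
            if (begin < 0 || end > value.length || begin > end) {
                throw new StringIndexOutOfBoundsException();
            }
            return new CopiedStr(Arrays.copyOfRange(value, begin, end));
        }

        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value); }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        SharedStr shared = new SharedStr(big, 0, big.length).substring(10, 15);
        CopiedStr copied = new CopiedStr(big).substring(10, 15);
        System.out.println("shared retains " + shared.retainedChars() + " chars");
        System.out.println("copied retains " + copied.retainedChars() + " chars");
    }
}
```

Under these assumptions, a five-character SharedStr cut from a large buffer
pins the entire buffer in memory, while the CopiedStr pays for a copy up front
and lets the large buffer be collected. Which cost matters more depends on the
workload, which is the question the thread leaves open.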