hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String
Zhong Yu
zhong.j.yu at gmail.com
Thu Nov 15 04:56:47 UTC 2012
Since this change is meant to achieve a minor performance boost, it's
not fair to defend it by saying that it only incurs minor performance
penalties.
Java programs are infested with strings, most of which could have used
a more appropriate type, but that is the insane reality. Any change to
the behavior of strings should have been backed up by a much more
thorough analysis.
Every usage of substring() was (hopefully) the result of some
conscious reasoning about space-time. Even if this change does not
significantly alter an application's performance, it invalidates all
of that reasoning, and that's the worst blow in my book. There would
have been no problem if substring() had copied from day one, but 17
years have passed.
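
To make the space-time reasoning concrete, here is a minimal sketch of
the two strategies being argued about. It is not the JDK source; the
class names SharedStr and CopiedStr and the retainedChars() helper are
hypothetical, made up only for illustration. SharedStr models the old
layout (offset and count fields over a shared char[]), CopiedStr models
the layout after 6924259.

```java
import java.util.Arrays;

public class SubstringSketch {

    /** Hypothetical model of the old layout: a view onto a shared char[]. */
    static final class SharedStr {
        final char[] value;
        final int offset;
        final int count;

        SharedStr(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        static SharedStr of(String s) {
            char[] c = s.toCharArray();
            return new SharedStr(c, 0, c.length);
        }

        /** O(1) time: no copying, but the whole parent array stays reachable. */
        SharedStr substring(int begin, int end) {
            if (begin < 0 || end > count || begin > end) {
                throw new StringIndexOutOfBoundsException();
            }
            return new SharedStr(value, offset + begin, end - begin);
        }

        /** Number of chars kept alive by this instance. */
        int retainedChars() {
            return value.length;
        }

        @Override
        public String toString() {
            return new String(value, offset, count);
        }
    }

    /** Hypothetical model of the new layout: every string owns its char[]. */
    static final class CopiedStr {
        final char[] value;

        CopiedStr(char[] value) {
            this.value = value;
        }

        static CopiedStr of(String s) {
            return new CopiedStr(s.toCharArray());
        }

        /** O(n) time in the substring length: the range is copied. */
        CopiedStr substring(int begin, int end) {
            if (begin < 0 || end > value.length || begin > end) {
                throw new StringIndexOutOfBoundsException();
            }
            return new CopiedStr(Arrays.copyOfRange(value, begin, end));
        }

        /** Number of chars kept alive by this instance. */
        int retainedChars() {
            return value.length;
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    public static void main(String[] args) {
        SharedStr shared = SharedStr.of("hello, world").substring(7, 12);
        CopiedStr copied = CopiedStr.of("hello, world").substring(7, 12);
        System.out.println(shared + " retains " + shared.retainedChars() + " chars");
        System.out.println(copied + " retains " + copied.retainedChars() + " chars");
    }
}
```

Code written against the first model can take many small substrings of
one large string cheaply; code written against the second can keep a
small substring without pinning the large parent array. The same call
site was reasoned about under one model and now runs under the other.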
Zhong Yu
On Wed, Nov 14, 2012 at 6:58 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> Personally, I feel like the concern is a bit overstated:
>
> 1) the n in O(n) is likely actually fairly small in practice (at least in
> what I'd consider sane code)
> 2) I think a lot of people that worry about perf probably aren't using
> substring() anyway
> 3) copying char[] is optimized by jit - this is basically a memcpy()-like
> call, which modern machines handle well
> 4) the upside is strings are 8 bytes smaller
> 5) .NET substring() has always allocated new storage (via an optimized
> internal VM call) and never shared the char[] and I haven't come across any
> complaints or seen serious perf problems myself (granted I seldom use
> substring)
>
> So I don't know if this is anything to worry about in practice.
>
> Sent from my phone
>
> On Nov 14, 2012 5:26 PM, "Zhong Yu" <zhong.j.yu at gmail.com> wrote:
>>
>> On 06/03/2012 11:35 PM, Mike Duigou wrote:
>> > [I trimmed the distribution list]
>> >
>> > On Jun 3 2012, at 13:44 , Peter Levart wrote:
>> >
>> >> On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
>> >>> Changeset: 2c773daa825d
>> >>> Author: mduigou
>> >>> Date: 2012-05-17 10:06 -0700
>> >>> URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
>> >>>
>> >>> 6924259: Remove offset and count fields from java.lang.String
>> >>> Summary: Removes the use of shared character array buffers by String
>> >>> along
>> >>> with the two fields needed to support the use of shared buffers.
>> >> Wow, that's quite a change.
>> > Indeed. It was a long time in development. It is a change which is
>> > expected to be overall beneficial though and in the general case a positive
>> > win.
>>
>> Wow!
>>
>> If the previous behavior of substring() was once a bug, by now it has
>> become a well known feature. People know about it, and people depend
>> on it.
>>
>> This change is a big surprise. Changing O(1) to O(n) is a breach of
>> contract. It'll break lots of old code; and meanwhile lots of new code
>> is still being written based on the old assumption. Once people
>> learn about the new behavior, they will need to comb through and
>> rewrite their code.
>>
>> The worst part is that the same code performs very differently on
>> different versions of the JDK. What's a programmer supposed to do if
>> his code targets JDK6 and above? If the cost of strings is no longer
>> certain, what else can we believe in?
>>
>> Is there any chance in hell to roll it back? Maybe add a new method
>> for the new behavior?
>>
>> Zhong Yu