hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

Mon Jun 4 12:41:14 UTC 2012

On 06/03/2012 11:35 PM, Mike Duigou wrote:
> [I trimmed the distribution list]
>
> On Jun 3 2012, at 13:44 , Peter Levart wrote:
>
>> On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
>>> Changeset: 2c773daa825d
>>> Author:    mduigou
>>> Date:      2012-05-17 10:06 -0700
>>> URL:       http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
>>>
>>> 6924259: Remove offset and count fields from java.lang.String
>>> Summary: Removes the use of shared character array buffers by String along
>>> with the two fields needed to support the use of shared buffers.
>> Wow, that's quite a change.
> Indeed. It was a long time in development. The change is expected to be beneficial overall and a net win in the general case.
>
>> So .substring() is not O(1) any more?
> No. Though with object allocation it probably was only ever roughly O(1) anyway.

The allocation fast path just bumps a pointer, so it's O(1).

The new code has two advantages.
The String object and its array of chars are now co-located in memory
(at least for small/medium strings), so CPU caches are happy.

It also fixes a longstanding memory leak issue,
   see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4513622
which is why some parsers already don't use the substring() trick.
BTW Mike, you can now close this bug.
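
To make the leak concrete, here is a minimal sketch of the two strategies.
SharedStr and CopiedStr are hypothetical stand-ins written for this mail,
not the actual java.lang.String sources: the first models the old layout
(shared char[] plus offset/count), the second models the new copying one.

```java
import java.util.Arrays;

// Hypothetical model classes; names and methods are illustrative only.
final class SubstringSketch {

    // Old layout: substring() shares the parent's array, so it is O(1)
    // but keeps the whole parent array reachable.
    static final class SharedStr {
        final char[] value;
        final int offset;
        final int count;

        SharedStr(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        SharedStr substring(int begin, int end) {
            return new SharedStr(value, offset + begin, end - begin);
        }

        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value, offset, count); }
    }

    // New layout: substring() copies only the requested range, so it is
    // O(n) in the substring length but retains nothing from the parent.
    static final class CopiedStr {
        final char[] value;

        CopiedStr(char[] value) { this.value = value; }

        CopiedStr substring(int begin, int end) {
            return new CopiedStr(Arrays.copyOfRange(value, begin, end));
        }

        int retainedChars() { return value.length; }

        @Override public String toString() { return new String(value); }
    }

    public static void main(String[] args) {
        char[] text = "hello world".toCharArray();
        SharedStr shared = new SharedStr(text, 0, text.length).substring(6, 11);
        CopiedStr copied = new CopiedStr(text).substring(6, 11);
        System.out.println(shared + " retains " + shared.retainedChars() + " chars"); // world retains 11 chars
        System.out.println(copied + " retains " + copied.retainedChars() + " chars"); // world retains 5 chars
    }
}
```

With a small token cut out of a large document, the shared variant pins
the whole document in memory for as long as the token is alive.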

>
>> Doesn't this have an impact on the performance of parsers and such that rely on
>> the performance characteristics of .substring()?
> It does have an impact. We've seen as much as a couple of percent on some benchmarks. Parsers which use substring for extraction are definitely impacted by this change.
>
>> Have you considered then implementing .subSequence() not in terms of just
>> delegating to .substring() but returning a special CharSequence view over the
>> chars of the sub-sequence?
> It does look like String.subSequence() returning a special view rather than a substring would be a good optimization and probably a very good compromise for parser developers. Please create an issue, and if you have the time and expertise, a patch would speed things along (though unfortunately almost certainly too late for inclusion in 7u6).

Given that Integer.parseInt() and Double.parseDouble() take a String and
not a CharSequence, yes, you can create a CharSequence view, but the only
way to use it is to call toString() on it.
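
A rough sketch of what I mean. SubSequenceView is a hypothetical class
made up for this mail, not something in the JDK: the view itself costs
nothing to create, but the characters get copied anyway as soon as you
hand it to an API that wants a String.

```java
// Hypothetical view class; not part of the JDK.
final class CharSequenceViewSketch {

    static final class SubSequenceView implements CharSequence {
        private final CharSequence source;
        private final int start;
        private final int end;

        SubSequenceView(CharSequence source, int start, int end) {
            if (start < 0 || end > source.length() || start > end) {
                throw new IndexOutOfBoundsException();
            }
            this.source = source;
            this.start = start;
            this.end = end;
        }

        @Override public int length() { return end - start; }

        @Override public char charAt(int index) {
            if (index < 0 || index >= length()) {
                throw new IndexOutOfBoundsException();
            }
            return source.charAt(start + index);
        }

        // A sub-view of a view is still O(1): only the bounds change.
        @Override public CharSequence subSequence(int from, int to) {
            if (from < 0 || to > length() || from > to) {
                throw new IndexOutOfBoundsException();
            }
            return new SubSequenceView(source, start + from, start + to);
        }

        // This is where the copy happens.
        @Override public String toString() {
            return source.toString().substring(start, end);
        }
    }

    public static void main(String[] args) {
        String input = "id=12345;";
        CharSequence digits = new SubSequenceView(input, 3, 8); // no copy yet
        // Integer.parseInt(String) needs a String, so the view must be
        // materialized first, which copies the characters.
        int value = Integer.parseInt(digits.toString());
        System.out.println(value); // 12345
    }
}
```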

>
> Mike

cheers,
Rémi