question regarding the java.lang.String design

Christian Wimmer wimmer at ssw.jku.at
Fri Jan 30 12:47:32 PST 2009


Hi,

>> While looking at the java.lang.String implementation, I
>> noticed that it has "offset" and "count" fields. For the
>> offset field, I only found two places which set that field,
>> but I believe they can be got rid of too. The two places are
>> String(StringBuffer buffer) & String(StringBuilder builder).
>>
>> My question is: if String is immutable, why do we need to carry
>> these two fields? String could be more compact without these two
>> fields. The equals method could be implemented more efficiently by
>> just calling java.util.Arrays.equals(v1, v2), which is intrinsified
>> on x86 at least.

You might want to look at a research project we ran over the last few
years, where this was the first step. You can find the references on this
page, in the section "Optimization of Strings": http://wikis.sun.com/display/HotSpotInternals/Publications+JKU
Just eliminating the offset field without other changes is not really
beneficial, because you lose the optimization of methods such as
"substring", which must do an arraycopy when there is no offset field.
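
To make the trade-off concrete, here is a minimal sketch in Java. It is not
the JDK's String code; the class names OffsetString and CompactString and
their methods are hypothetical and exist only for illustration. The first
layout carries offset and count, so substring can share the backing array.
The second drops both fields, so equals can delegate to
java.util.Arrays.equals, but substring has to copy.

```java
import java.util.Arrays;

// Hypothetical layout WITH offset/count: substring shares the backing array.
final class OffsetString {
    private final char[] value;
    private final int offset;
    private final int count;

    private OffsetString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Defensive copy so the instance stays immutable.
    static OffsetString of(char[] chars) {
        return new OffsetString(chars.clone(), 0, chars.length);
    }

    OffsetString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        // Constant time: no characters are copied.
        return new OffsetString(value, offset + begin, end - begin);
    }

    boolean sharesArrayWith(OffsetString other) {
        return value == other.value;
    }

    @Override public String toString() {
        return new String(value, offset, count);
    }
}

// Hypothetical layout WITHOUT offset/count: smaller object, simpler equals,
// but substring must copy the requested range.
final class CompactString {
    private final char[] value;

    private CompactString(char[] owned) {
        this.value = owned;
    }

    static CompactString of(char[] chars) {
        return new CompactString(chars.clone());
    }

    CompactString substring(int begin, int end) {
        if (begin < 0 || end > value.length || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        // Linear time: this is the arraycopy mentioned above.
        return new CompactString(Arrays.copyOfRange(value, begin, end));
    }

    @Override public boolean equals(Object o) {
        return o instanceof CompactString
                && Arrays.equals(value, ((CompactString) o).value);
    }

    @Override public int hashCode() {
        return Arrays.hashCode(value);
    }

    @Override public String toString() {
        return new String(value);
    }
}
```

In the first layout a short substring keeps the whole original array
reachable, which is one reason the choice between the two is a trade-off
rather than a clear win for either side.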

>> Another crazy thought is that we can compact the character array to
>> a byte array if we don't have any characters other than ASCII
>> (which we might indicate with a boolean flag).

That is something we only thought about but never tried, so it would be
interesting to try it out. The only problem is that prototyping such
changes is really time-consuming.
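
As a rough illustration of the idea only (we have not tried it, and this is
not how the VM or the class library would have to do it), the sketch below
stores one byte per character when every character is ASCII and falls back
to a char array otherwise. The class name AsciiCompactString, the boolean
flag, and the payloadBytes helper are all hypothetical.

```java
// Hypothetical string representation: byte[] for pure-ASCII content,
// char[] otherwise, with a boolean flag selecting the active array.
final class AsciiCompactString {
    private final byte[] bytes;   // non-null only when ascii is true
    private final char[] chars;   // non-null only when ascii is false
    private final boolean ascii;

    AsciiCompactString(String s) {
        boolean allAscii = true;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                allAscii = false;
                break;
            }
        }
        ascii = allAscii;
        if (allAscii) {
            byte[] b = new byte[s.length()];
            for (int i = 0; i < b.length; i++) {
                b[i] = (byte) s.charAt(i);   // safe: value is in 0..127
            }
            bytes = b;
            chars = null;
        } else {
            bytes = null;
            chars = s.toCharArray();
        }
    }

    boolean isCompact() {
        return ascii;
    }

    int length() {
        return ascii ? bytes.length : chars.length;
    }

    // Every access pays for a branch on the flag.
    char charAt(int index) {
        return ascii ? (char) bytes[index] : chars[index];
    }

    // Size of the character payload only, ignoring object headers.
    int payloadBytes() {
        return ascii ? bytes.length : 2 * chars.length;
    }
}
```

The flag check in charAt hints at why prototyping is costly: every method
that touches the characters, plus any compiler intrinsics that assume a char
array, would need a second code path.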

Christian



More information about the hotspot-dev mailing list