String concatenation tweaks

Peter Levart peter.levart at gmail.com
Mon Jun 1 13:38:44 UTC 2015


Hi Aleksey,

On 06/01/2015 09:49 AM, Aleksey Shipilev wrote:
> Hi Peter,
>
> On 06/01/2015 12:33 AM, Peter Levart wrote:
>> One way to tackle this is to have a javac option to emit classical
>> StringBuilder-based code and then build the (java.base module at least)
>> with this option. So only other modules and user code would use indy
>> based concatenation.
> If you read my notes about this:
>   http://cr.openjdk.java.net/~shade/scratch/string-concat-indy/notes.txt
>
> You will see the mention of "java.base is exempt from indy string
> concat, otherwise the initialization circularity ensues". Indeed, there
> is a patch that disables indy string concat for java.base:
>   http://cr.openjdk.java.net/~shade/scratch/string-concat-indy/patch-root-1.patch

I must have missed that. Thanks for pointing it out.

>
>> This will also eliminate worries about startup time.
> It would not, because, as I was saying in the notes, the significant
> time is spent dealing with indy infrastructure for every user string
> concat. In other words, a simple smoke test with HelloWorld concating a
> simple string suffers quite a bit.

Is that the overhead of initializing (generating) code for each shape
the first time it is used? If the same shape repeats at some other call
site, then it should already be cached, right?
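
As a minimal sketch of what per-shape caching could look like (this is
not the code from the patch; the class name, the concat() stand-in and
the counter are hypothetical and only there for illustration), a
bootstrap method might key generated handles on the call-site
MethodType, so a repeated shape costs a map lookup instead of another
round of code generation:

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; an assumption-laden sketch, not the patch.
final class ShapeCacheSketch {
    // One generated handle per call-site shape (MethodType).
    private static final ConcurrentHashMap<MethodType, MethodHandle> CACHE =
            new ConcurrentHashMap<>();
    // Counts how many shapes had to be generated (demonstration only).
    static final AtomicInteger GENERATED = new AtomicInteger();

    // Stand-in for the expensive, shape-specific generated code.
    static String concat(Object... args) {
        StringBuilder sb = new StringBuilder();
        for (Object a : args) {
            sb.append(a);
        }
        return sb.toString();
    }

    // Runs only on a cache miss, i.e. once per distinct shape.
    private static MethodHandle generate(MethodType shape) {
        GENERATED.incrementAndGet();
        try {
            MethodHandle generic = MethodHandles.lookup().findStatic(
                    ShapeCacheSketch.class, "concat",
                    MethodType.methodType(String.class, Object[].class));
            // Collect the arguments into Object[], then adapt to the shape.
            return generic.asCollector(Object[].class, shape.parameterCount())
                          .asType(shape);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same parameter list as an invokedynamic bootstrap method.
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                                     MethodType shape) {
        return new ConstantCallSite(
                CACHE.computeIfAbsent(shape, ShapeCacheSketch::generate));
    }

    public static void main(String[] args) throws Throwable {
        MethodType shape =
                MethodType.methodType(String.class, String.class, int.class);
        CallSite first = bootstrap(MethodHandles.lookup(), "concat", shape);
        CallSite second = bootstrap(MethodHandles.lookup(), "concat", shape);
        String s = (String) first.dynamicInvoker().invokeExact("x = ", 42);
        System.out.println(s);
        System.out.println("same target: "
                + (first.getTarget() == second.getTarget()));
        System.out.println("shapes generated: " + GENERATED.get());
    }
}
```

With such a cache the first use of a shape still pays the full
generation cost, which is the part a HelloWorld-style smoke test would
see.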

One thing I noticed after briefly checking the code: javac, as I
understand it, emits the invokedynamic using the compile-time argument
types. The MethodType is constructed from them when the indy is first
invoked, and it gets passed to the bootstrap method. Are the argument
types truncated, or passed as the compiler knows them? Passing more
type info might at first seem beneficial to strategies that could use
it to construct better specializations, but semantically we don't need
all the reference types. StringBuilder.append() overloads only
differentiate among 4 distinct reference types:

     Object, String, StringBuffer, CharSequence

But internally StringBuilder actually dispatches dynamically to 5 
different run-time cases and treats them differently:

     null, Object, String, AbstractStringBuilder, CharSequence

Javac could truncate each reference type to one of the above 5 before
emitting invokedynamic. The null literal could be passed via the Void
parameter type to differentiate it from the other types. Why is
truncating necessary?

- The key space for the cached shapes is reduced: only 8 primitive +
5 reference = 13 types are possible at each argument position.
- Passing truncated reference types to invokedynamic means that the
bootstrap method doesn't have to do the truncation.
- The truncation has to be performed anyway to get rid of custom/user
reference types which, if used in the MethodType key for caching, will
cause Class[Loader] leaks.
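
The truncation itself could be sketched roughly as follows. This is
only an illustration under assumptions: the class and method names are
hypothetical, the mapping runs at run time on a MethodType rather than
inside javac, and AbstractStringBuilder is obtained reflectively because
it is package-private (whether javac could name it in a call-site
descriptor at all is a separate question not settled here). String is
tested before CharSequence because it would otherwise fall into the
CharSequence case.

```java
import java.lang.invoke.MethodType;
import java.nio.CharBuffer;

// Hypothetical names; a sketch of the idea, not javac or JDK code.
final class TruncationSketch {
    // AbstractStringBuilder is package-private, so look it up reflectively.
    static final Class<?> ASB = StringBuilder.class.getSuperclass();

    // Stands in for an arbitrary user-defined class.
    static final class UserType { }

    // Maps one argument type to one of the 8 primitive + 5 reference types.
    // Void marks the null literal, as suggested in the text.
    static Class<?> truncateType(Class<?> c) {
        if (c.isPrimitive() || c == Void.class || c == String.class) {
            return c;
        }
        if (ASB.isAssignableFrom(c)) {
            return ASB;                    // StringBuilder, StringBuffer
        }
        if (CharSequence.class.isAssignableFrom(c)) {
            return CharSequence.class;     // any other CharSequence
        }
        return Object.class;               // everything else, user types too
    }

    // Applies truncateType to every parameter of a call-site shape.
    static MethodType truncateShape(MethodType shape) {
        MethodType result = shape;
        for (int i = 0; i < shape.parameterCount(); i++) {
            result = result.changeParameterType(
                    i, truncateType(shape.parameterType(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        MethodType shape = MethodType.methodType(String.class,
                UserType.class, StringBuffer.class, int.class,
                CharBuffer.class);
        System.out.println(shape + " -> " + truncateShape(shape));
    }
}
```

A cache keyed on the truncated shape never holds a reference to
UserType, so it cannot pin the class loader that defined it.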

What do you think?

Regards, Peter




More information about the compiler-dev mailing list