String concatenation tweaks

John Rose john.r.rose at oracle.com
Tue Jun 2 02:39:03 UTC 2015


On Jun 1, 2015, at 7:52 AM, Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:
> 
> Hi Peter,
> 
>> On 06/01/2015 04:38 PM, Peter Levart wrote:
>>> On 06/01/2015 09:49 AM, Aleksey Shipilev wrote:
>>>> On 06/01/2015 12:33 AM, Peter Levart wrote:
>>>> This will also eliminate worries about startup time.
>>> It would not, because, as I was saying in the notes, the significant
>>> time is spent dealing with indy infrastructure for every user string
>>> concat. In other words, a simple smoke test with HelloWorld concatenating a
>>> simple string suffers quite a bit.
>> 
>> That's the overhead of initializing (generating) code for each new
>> unused shape? If the same shape repeats at some other call-site, then it
>> should already be cached, right?
> 
> I haven't quantified it carefully yet, but a brief look at profiles
> points the finger at actual construction, yes. The cached shapes are not
> showing up in profiles. But again, I should construct a more accurate
> stress test for it.
> 
> 
>> One thing I noticed after briefly checking the code is the following.
>> Javac, as I understand, emits the invokedynamic using compile-time
>> argument types. From them the MethodType is constructed when the indy is
>> invoked the 1st time and gets passed to the bootstrap method. Are the
>> argument types truncated or passed as compiler knows them? 
> 
> Yes, we pass the exact types javac knows about. Sometimes it is
> problematic, since we need to substitute a bottom type (null) with
> something else. Current prototype says it's Object.class.
> 
> I don't fully understand if Void is a good substitute for a bottom type.
> Is it? Asking for a compiler friend here.

Void is the convention used by MH.invoke and it works here too. The only ref of that type is null. 
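
A minimal sketch of what I mean, assuming nothing beyond the public java.lang.invoke API. The class and method names are made up for illustration and are not part of any prototype. It narrows String.valueOf(Object) to a (Void)String handle, which is the shape a call site would get for a null literal argument:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration class; not taken from the prototype.
final class VoidBottomTypeSketch {

    // Returns the text a concat strategy would append for a null literal
    // whose static type was recorded as Void.
    static String appendNullLiteral() {
        try {
            MethodHandle valueOf = MethodHandles.publicLookup().findStatic(
                    String.class, "valueOf",
                    MethodType.methodType(String.class, Object.class));
            // Void -> Object is a plain reference widening, so asType accepts it.
            MethodHandle forNull =
                    valueOf.asType(MethodType.methodType(String.class, Void.class));
            // The only reference that can be passed as a Void is null.
            return (String) forNull.invokeExact((Void) null);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(appendNullLiteral());
    }
}
```

Since no non-null value can ever flow through a Void parameter, a strategy that sees Void at some position is free to fold the text "null" in at link time.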

> 
>> Passing more
>> type info might at first seem beneficial to potential strategies that
>> might use it to construct better specializations, but semantically we
>> don't need all the reference types. StringBuilder.append() overloads
>> only differentiate among 4 distinct reference types:
>> 
>>    Object, String, StringBuffer, CharSequence
>> 
>> But internally StringBuilder actually dispatches dynamically to 5
>> different run-time cases and treats them differently:
>> 
>>    null, Object, String, AbstractStringBuilder, CharSequence
>> 
>> Javac could truncate the reference types to one of the above 5 before
>> emitting invokedynamic. The null literal value could be passed via the
>> Void parameter type to differentiate it from other types. Why is
>> truncating necessary?
>> 
>> - the key space for the cached shapes is reduced this way. There are
>> only 8 primitive + 5 reference = 13 types possible at each argument
>> position this way.
>> - passing truncated reference types to invokedynamic means that the
>> bootstrap method doesn't have to do the truncation.
>> - the truncation has to be performed anyway to get rid of custom/user
>> reference types which, if used in the MethodType key for caching, will
>> cause Class[Loader] leaks.
>> 
>> What do you think?
> 
> I think since we need to get the javac bytecode part future-proof, we
> are better off passing the concrete type info to the indy bootstrap.
> Bootstrap can then decide if it wants to collapse the types back to
> those 4-5 variants, solving both the explosion of shapes, and the class
> leaks.

That sounds reasonable. This is an intended use pattern for indy: capturing accurate static types. 
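
A rough sketch of how a bootstrap could collapse the exact types, under a few assumptions of mine. The class name and the collapsing rules are illustrative only. AbstractStringBuilder is not accessible outside java.lang, so this sketch keeps StringBuilder and StringBuffer as stand-ins for that case:

```java
import java.lang.invoke.MethodType;

// Hypothetical helper; the real prototype may collapse differently.
final class ShapeCollapseSketch {

    // Maps one static argument type to a small, loader-independent set.
    static Class<?> collapse(Class<?> t) {
        if (t.isPrimitive() || t == Void.class || t == String.class) {
            return t;                       // primitives, null literal, String
        }
        if (t == StringBuilder.class || t == StringBuffer.class) {
            return t;                       // stand-ins for AbstractStringBuilder
        }
        if (CharSequence.class.isAssignableFrom(t)) {
            return CharSequence.class;      // any other CharSequence
        }
        return Object.class;                // user types never reach the key
    }

    // Collapses every parameter of the call-site type; the return type is kept.
    static MethodType collapse(MethodType exact) {
        MethodType result = exact;
        for (int i = 0; i < exact.parameterCount(); i++) {
            result = result.changeParameterType(i, collapse(exact.parameterType(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        MethodType exact = MethodType.methodType(String.class,
                String.class, java.util.ArrayList.class, int.class,
                java.nio.CharBuffer.class);
        System.out.println(collapse(exact));
    }
}
```

Because javac still emits the exact types, the rules in collapse(Class) can change later without recompiling anything.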

> (Note to self: the current prototype collapses the types *after* checking
> the cache; need to fix that possible class leak, thanks!)

Yep. A late asType call will fix things up. We can optimize the pattern more if needed. 
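
Roughly like this, as a sketch only. The cache, the simplified collapse rule (keep primitives and String, turn everything else into Object) and the array-collecting concat are all assumptions of mine, not the prototype's code. The point is the order: collapse first, look up the cache with the collapsed key, and only then asType back to the exact call-site type:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical linker sketch; names are illustrative.
final class CollapseBeforeCacheSketch {

    // Keys are collapsed shapes only, so no user class is retained here.
    static final ConcurrentMap<MethodType, MethodHandle> CACHE = new ConcurrentHashMap<>();

    // Stand-in concat strategy: append every part to one StringBuilder.
    static String concat(Object[] parts) {
        StringBuilder sb = new StringBuilder();
        for (Object p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    // Simplified collapse: primitives and String survive, the rest is Object.
    static MethodType collapse(MethodType exact) {
        MethodType t = exact;
        for (int i = 0; i < exact.parameterCount(); i++) {
            Class<?> p = exact.parameterType(i);
            if (!p.isPrimitive() && p != String.class) {
                t = t.changeParameterType(i, Object.class);
            }
        }
        return t;
    }

    // Builds the shared handle for one collapsed shape.
    static MethodHandle generate(MethodType shape) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    CollapseBeforeCacheSketch.class, "concat",
                    MethodType.methodType(String.class, Object[].class));
            return mh.asCollector(Object[].class, shape.parameterCount()).asType(shape);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Collapse, then consult the cache, then a late asType to the exact type.
    static MethodHandle link(MethodType exact) {
        MethodHandle shared = CACHE.computeIfAbsent(collapse(exact),
                CollapseBeforeCacheSketch::generate);
        return shared.asType(exact);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle a = link(MethodType.methodType(String.class,
                String.class, java.util.ArrayList.class));
        MethodHandle b = link(MethodType.methodType(String.class,
                String.class, java.util.HashMap.class));
        System.out.println((String) a.invoke("x=", new java.util.ArrayList<String>()));
        System.out.println((String) b.invoke("y=", new java.util.HashMap<String, String>()));
        System.out.println("cached shapes: " + CACHE.size());
    }
}
```

Both call sites share one cached handle; only the thin asType adapter mentions the user types, and it is reachable only from its own call site.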

> We are not inherently limited to the StringBuilder API for doing the
> concatenation. This compiler improvement actually opens up the way for
> specialized implementations that span more than just the current 4
> reference types.
> 
> Example case: would it make sense to null-check and unbox Integer before
> pushing it on to append() chain? This will set us up for
> OptoStringConcat for new SB().append(String).append(Integer).toString() :)

Yes. You could express this quite directly with a filterArguments or asType transform. If that creates too many LFs (LambdaForms) we can optimize that also. 
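
For instance, something along these lines. This is only a sketch: the two-String concat target and the helper names are hypothetical, and a real strategy would feed an int-taking append rather than go through a String. MethodHandles.filterArguments is the actual API call:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration of filtering an Integer argument.
final class UnboxFilterSketch {

    // Stand-in target: concatenates two already-stringified parts.
    static String concat(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    // Filter: null-check, then unbox and format the primitive value.
    static String intToString(Integer boxed) {
        return boxed == null ? "null" : Integer.toString(boxed.intValue());
    }

    // Resulting handle has type (String, Integer)String.
    static MethodHandle build() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle target = lookup.findStatic(UnboxFilterSketch.class, "concat",
                    MethodType.methodType(String.class, String.class, String.class));
            MethodHandle filter = lookup.findStatic(UnboxFilterSketch.class, "intToString",
                    MethodType.methodType(String.class, Integer.class));
            // Apply the filter to argument position 1 only.
            return MethodHandles.filterArguments(target, 1, filter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String apply(String prefix, Integer value) {
        try {
            return (String) build().invokeExact(prefix, value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("n=", 42));
        System.out.println(apply("n=", null));
    }
}
```

A bare asType from Integer to int would throw NullPointerException on a null argument, which is why the null check sits in the filter here.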

Note to other self:  asType LFs are not well cached at present. Fix. 

– John

> Thanks,
> -Aleksey.
> 
> 
> 
> 


More information about the compiler-dev mailing list