String concatenation tweaks

Aleksey Shipilev aleksey.shipilev at oracle.com
Mon Jun 1 14:52:56 UTC 2015


Hi Peter,

On 06/01/2015 04:38 PM, Peter Levart wrote:
> On 06/01/2015 09:49 AM, Aleksey Shipilev wrote:
>> On 06/01/2015 12:33 AM, Peter Levart wrote:
>>> This will also eliminate worries about startup time.
>> It would not, because, as I was saying in the notes, the significant
>> time is spent dealing with indy infrastructure for every user string
>> concat. In other words, a simple smoke test with HelloWorld concating a
>> simple string suffers quite a bit.
> 
> That's the overhead of initializing (generating) code for each new
> unused shape? If the same shape repeats at some other call-site, then it
> should already be cached, right?

I haven't quantified it carefully yet, but a brief look at the profiles
points at actual construction, yes. The cached shapes are not
showing up in the profiles. But again, I should construct a more
accurate stress test for it.


> One thing I noticed after briefly checking the code is the following.
> Javac, as I understand, emits the invokedynamic using compile-time
> argument types. From them the MethodType is constructed when the indy is
> invoked the 1st time and gets passed to the bootstrap method. Are the
> argument types truncated or passed as compiler knows them? 

Yes, we pass the exact types javac knows about. Sometimes this is
problematic, since we need to substitute a bottom type (null) with
something else. The current prototype uses Object.class for that.

I don't fully understand if Void is a good substitute for a bottom type.
Is it? Asking for a compiler friend here.


> Passing more
> type info might at first seem beneficial to potential strategies that
> might use it to construct better specializations, but semantically we
> don't need all the reference types. StringBuilder.append() overloads
> only differentiate among 4 distinct reference types:
> 
>     Object, String, StringBuffer, CharSequence
> 
> But internally StringBuilder actually dispatches dynamically to 5
> different run-time cases and treats them differently:
> 
>     null, Object, String, AbstractStringBuilder, CharSequence
> 
> Javac could truncate the reference types to one of the above 5 before
> emitting invokedynamic. The null literal value could be passed via the
> Void parameter type to differentiate it from other types. Why is
> truncating necessary?
> 
> - the key space for the cached shapes is reduced this way. There are
> only 8 primitive + 5 reference = 13 types possible at each argument
> position this way.
> - passing truncated reference types to invokedynamic means that the
> bootstrap method doesn't have to do the truncation.
> - the truncation has to be performed anyway to get rid of custom/user
> reference types which, if used in the MethodType key for caching, will
> cause Class[Loader] leaks.
> 
> What do you think?

I think that, since we need the javac bytecode part to be future-proof,
we are better off passing the concrete type info to the indy bootstrap.
The bootstrap can then decide whether it wants to collapse the types
back to those 4-5 variants, solving both the explosion of shapes and the
class leaks.

(Note to self: the current prototype collapses the types *after*
checking the cache; I need to fix that possible class leak, thanks!)
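
To make the ordering concrete, here is a minimal sketch of collapsing
the shape *before* the cache lookup. This is not the prototype's code:
ShapeCollapse, CACHE and lookup() are hypothetical names, the String
value stands in for whatever a real strategy would cache, and the
collapse targets are only the four StringBuilder.append() reference
overloads mentioned above. The null/bottom type case is left out.

```java
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ShapeCollapse {
    // Hypothetical cache: collapsed shape -> per-shape strategy.
    // A String stands in for the generated code here.
    static final Map<MethodType, String> CACHE = new ConcurrentHashMap<>();

    // Collapse one argument type to a small set of JDK-owned classes,
    // so that no user-defined class ends up in a cache key.
    static Class<?> collapse(Class<?> t) {
        if (t.isPrimitive()) return t;
        if (t == String.class) return String.class;
        // StringBuffer is also a CharSequence, so test it first.
        if (t == StringBuffer.class) return StringBuffer.class;
        if (CharSequence.class.isAssignableFrom(t)) return CharSequence.class;
        return Object.class;
    }

    // Collapse every parameter type of the call-site shape.
    static MethodType collapse(MethodType mt) {
        Class<?>[] params = new Class<?>[mt.parameterCount()];
        for (int i = 0; i < params.length; i++) {
            params[i] = collapse(mt.parameterType(i));
        }
        return MethodType.methodType(mt.returnType(), params);
    }

    // Collapse first, then consult the cache with the collapsed key only.
    static String lookup(MethodType exact) {
        MethodType key = collapse(exact);
        return CACHE.computeIfAbsent(key, k -> "strategy for " + k);
    }
}
```

With this ordering, two call sites whose shapes differ only in a user
reference type map to the same key, and the exact MethodType is never
retained by the cache.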

We are not inherently limited to the StringBuilder API for doing the
concatenation. This compiler improvement actually opens up the way for
specialized implementations that cover more than just the current 4
reference types.

Example case: would it make sense to null-check and unbox an Integer
before pushing it onto the append() chain? This would set us up for
OptoStringConcat for new SB().append(String).append(Integer).toString() :)
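
Written out by hand, the idea might look like the sketch below.
UnboxBeforeAppend and concat() are hypothetical names for illustration
only; a real strategy would generate or link something equivalent
rather than use a hand-written method.

```java
final class UnboxBeforeAppend {
    // Hypothetical specialization for the shape (String, Integer) -> String.
    static String concat(String s, Integer boxed) {
        StringBuilder sb = new StringBuilder().append(s);
        if (boxed == null) {
            // Same text that append(Object) produces for a null argument.
            sb.append("null");
        } else {
            int v = boxed;   // explicit unbox after the null check
            sb.append(v);    // resolves to append(int), not append(Object)
        }
        return sb.toString();
    }
}
```

The result is meant to match what s + boxed yields today, including
the null case; the only difference is that the non-null path goes
through append(int).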

Thanks,
-Aleksey.



