RFR (M): 8024635: Caching MethodType's descriptor string improves lambda linkage performance
Christian Thalinger
christian.thalinger at oracle.com
Fri Sep 27 13:12:20 PDT 2013
Looks good.
On Sep 27, 2013, at 5:39 AM, Sergey Kuksenko <sergey.kuksenko at oracle.com> wrote:
>
> Updated version at
> http://cr.openjdk.java.net/~skuksenko/jsr335/8024635/webrev.02/
> according comments:
> - remove "null check" comment near Object.requireNonNull calls
> - distort/(change parameters order) in key-purposed MethodType constructor
> - move count-slot logic from checkPType to checkPTypes.
>
>
>
> On 09/26/2013 11:09 PM, John Rose wrote:
>> (Quick note on format: I have changed the subject line to include the conventional bug number and size estimate. The bug number makes it easier to track in mailers.)
>>
>> On Sep 26, 2013, at 9:22 AM, Sergey Kuksenko <sergey.kuksenko at oracle.com> wrote:
>>
>>> Hi All,
>>>
>>> I updated fix.
>>> You may find it here
>>> http://cr.openjdk.java.net/~skuksenko/jsr335/8024635/webrev.01/
>>>
>>> See my comments and new webrev descition below
>>>
>>> On 09/18/2013 12:59 AM, John Rose wrote:
>>>>>> We can tweak that later if necessary. There are probably only a small number of such strings, so maybe a small leaky map would do the trick as well.
>>>>> In case of small amount of such strings we will get a huge overhead from
>>>>> any kind of map.
>>>> I would expect O(10^1.5) memory references and branches. Depends on what you mean by "huge".
>>> Sure. I think the question to use external map or field may be decided
>>> later when/if it will be needed.
>>> Just some statictics, I've collected on nashorn/octane benchmarks
>>> (statistics collected only for the single(first) benchmark iteration:
>>> mandreel: 514 unique strings, toMethodDescriptorString called 69047 times
>>> box2d: 560 unique strings, 34776 toMethodDescriptorString invocations.
>>
>> Good data; thanks.
>>
>>>
>>>> Would there be any benefit from recording the original string from the constant pool? The JVM path for this is SystemDictionary::find_method_handle_type, which makes an upcall to MethodHandleNatives.findMethodHandleType. Currently only the ptypes and rtype are passed, but the JVM could also convert the internal Symbol to a java.lang.String and pass that also. In that case, MT's created by JVM upcalls would be pre-equipped with descriptor strings.
>>>>
>>>> This is probably worth an experiment, although it requires some JVM work.
>>>
>>> I am definitely sure that we shouldn't do that.
>>> Right now caching desriptor strings is internal decision of MethodType.
>>> Otherwise it will make interfaces more complex.
>>
>> Yes, I agree. Also, it would only benefit the 514 calls which introduce unique strings, whereas your change helps the 69047 calls for descriptor strings.
>>
>>>> I hope you get my overall point: Hand optimizations have a continuing cost, related to obscuring the logical structure of the code. The transforming engineer might make a mistake, and later maintaining engineers might also make mistakes.
>>>>
>>>> https://blogs.oracle.com/jrose/entry/three_software_proverbs
>>>
>>> And it doesn't mean that we should afraid any kind of optimization.
>>> Lucky positions - someone(or something) will optimize it for us. But
>>> sometimes JIT (even smart JIT) is not enough.
>>
>> Sometimes. When that happens we reluctantly hand-optimize our code.
>>
>>> Let's back to the patch.construstors
>>> I decided not change original checkPType/checkRType code, except
>>> explicit Objects.requireNonNull. The problem here that we are creating
>>> new MethodType objects which are already exists, we have to create MT as
>>> a key for searching in the internedTable. Ratio between already exiting
>>> and new MethodTypes a quite huge.
>>> Here is some statistics I've collected on nashorn/octane benchmarks
>>> (statistics collected only for the single(first) benchmark iteration:
>>>
>>> mandreel:
>>> - 1739 unique MethodTypes,
>>> - 878414 MethodType.makeImpl requests;
>>> box2d:
>>> - 1861 unique MethodTypes,
>>> - 527486 MethodType.makeImpl requests;
>>> gameboy:
>>> - 2062 unique MethodTypes,
>>> - 311378 MethodType.makeImpl requests;
>>>
>>> So
>>> 1. MethodType constructor do checkPTypes/checkRType which are frequently
>>> (99%) useless - we already have the same (checked) MethodType in the
>>> internTable.
>>> 2. Less than 1% of created MethodTypes will be stored into internTable,
>>> but reusing that MethodType objects prevent ti from scalar replacement.
>>> (I've heard that Graal may do flow-sensitive scalar replacement, but C2
>>> can't do).
>>>
>>> What I did. I separate constructors:
>>> - for long lived MethodTypes with all checks,
>>> - for short lived MethodTypes used only as key for searching in the
>>> InterTable, no checks are performed.
>>> if we didn't find MethodType in the table we always create a new one
>>> (with checks) that is required in less than 1% cases. And we remove at
>>> least one obstacle for scalar replacement. Unfortunately, scalaer
>>> replacement for expression "internTable.get(new MethodType(rtype,
>>> ptypes))" was not performed, the reason may be evaluated, and hope it
>>> would be easier to achieve scalarization in this case.
>>
>> Brilliant. I love this.
>>
>> The constructor signatures are too similar given the subtle difference between their semantics. Please distort the signature of the extra-sensitive constructor. (Imagine the wrong one accidentally invoked by helpful IDE code completers.) Put in a leading int dummy argument, or reverse the parameters, or some similar hack. (If we had named constructors, we'd hack the name instead.) I admit this is of marginal benefit, since there is only one place where the constructors are called, but it's good style. If there were more places calling the constructor, we'd absolutely want a clear distinction between checked and unchecked.
>>
>> If it still makes sense, I wouldn't mind moving the slot-count logic (long/double adjustment) out of checkPType into checkPTypes. But I think that's moot with this much better change.
>> - slots += checkPtype(ptype);
>> + checkPtype(ptype);
>> + if (ptype == double.class || ptype == long.class) {
>> + slots++;
>> + }
>>
>> Good work. Reviewed. (You need another reviewer, I think.)
>>
>> — John
>>
>
>
> --
> Best regards,
> Sergey Kuksenko
More information about the hotspot-compiler-dev
mailing list