RFR (S) CR 7177472: JSR292: MethodType interning penalizes scalability

Sat Jun 8 20:29:10 UTC 2013

On 06/08/2013 10:25 PM, Peter Levart wrote:
>
> On 06/08/2013 10:17 PM, Peter Levart wrote:
>>
>> On 06/08/2013 12:21 PM, Remi Forax wrote:
>>> On 06/07/2013 12:01 PM, Aleksey Shipilev wrote:
>>>> (posting this to hotspot-dev@ and cc-ing core-libs-dev@, as 
>>>> Christian T.
>>>> suggested offline)
>>>>
>>>> Hi guys,
>>>
>>> Hi Aleksey,
>>>
>>>>
>>>> The fix for scalability problem is here:
>>>> http://cr.openjdk.java.net/~shade/7177472/webrev.00/
>>>
>>> in add, the do/while does something a little weird, if putIfAbsent 
>>> returns null,
>>> interned is equals to elem, there is no need to do a e.get() in that 
>>> case,
>>> I would prefer a code like this:
>>>
>>>             T interned;
>>>             do {
>>>                 expungeStaleElements();
>>>                 WeakEntry<T> e = new WeakEntry<>(elem, stale);
>>>                 WeakEntry<T> exist = map.putIfAbsent(e, e);
>>>                 interned = (exist == null)? elem: exist.get();
>>>             } while (interned == null);
>>>             return interned;
>>
>> Hi Remi, Aleksey,
>>
>> In case the loop retries, there's no need to construct another 
>> WeakEntry...
>>
>>             T interned;
>>              WeakEntry<T> e = new WeakEntry<>(elem, stale);
>>             do {
>>                 expungeStaleElements();
>>                 WeakEntry<T> exist = map.putIfAbsent(e, e);
>>                 interned = (exist == null)? elem: exist.get();
>>             } while (interned == null);
>>             return interned;
>>
>> Regards, Peter
>
> Aleksey,
>
> Also, in a retry-loop you might be spin-waiting for the 
> ReferenceHandler thread to transfer the cleared WeakEntry to the 
> ReferenceQueue. You might want to do a map.replace(e, exist, e) in 
> case exist != null and exist.get() == null to make it independent of 
> ReferenceHandler thread's processing. In this case only a single 
> out-of-loop call to expungeStaleElements() would be enough.

Scrap that. when exist.get() == null, exist.equals(e) == false, so 
putIfAbsent should succeed.

>
> Peter
>
>>
>>>
>>> The cast to WeakEntry in expungeStaleElements is not needed.
>>>
>>> In WeakEntry.equals, the null check is not needed because null 
>>> instanceof WeakEntry returns false,
>>> and you don't need to cast obj.get() to a WeakEntry<T> given you 
>>> only to call equals on entry.get().
>>> So the code should be:
>>>
>>>
>>>            public boolean equals(Object obj) {
>>>              if (!(obj instanceof WeakEntry)) {
>>>                return false;
>>>              }
>>>                Object that = ((WeakEntry<?>)obj).get();
>>>              Object mine = get();
>>>              return (that == null || mine == null)? this == obj: 
>>> mine.equals(that);
>>>          }
>>>
>>>
>>> Otherwise, it looks good.
>>>
>>> Rémi
>>>
>>>>
>>>> Testing:
>>>>    - Linux x86_64 builds OK
>>>>    - Linux x86_64 java/lang/invoke/ jtreg passes OK
>>>>
>>>> Since this is a proof-of-concept patch at this point, I kept the 
>>>> testing
>>>> minimal. Let me know what other testing you think is needed before the
>>>> integration.
>>>>
>>>> Thanks,
>>>> Aleksey.
>>>>
>>>> *** Rationale and details for the issue follow: ***
>>>>
>>>> This issue was in limbo for an year. The performance issue with
>>>> MethodType.methodType factory leads to abysmal scalability on the
>>>> targeted tests. While this can arguably demonstrated on larger
>>>> workloads, the targeted microbenchmarks are showcasing it nicely. This
>>>> issue was blocking the JSR292 API performance testing, because on 
>>>> large
>>>> enough test server, you will inevitably be bitten by this.
>>>>
>>>> If you run the JMH-enabled [1] microbenchmark:
>>>> http://openjdk.java.net/projects/code-tools/jmh/
>>>> http://cr.openjdk.java.net/~shade/7177472/jsr292bench.tar.gz
>>>>
>>>> ...on the current jdk8/jdk8, 2x2 i5 2.0 Ghz, Linux x86_64, you will 
>>>> see
>>>> the scalability is going down. ("testCached" makes sure the 
>>>> reference to
>>>> MethodType is always strongly reachable, "testNew" makes sure the
>>>> reference is mostly non-reachable, prompting cache re-fill, "testOOB"
>>>> does not make any precautions about that, and also does not suffer 
>>>> from
>>>> the static overheads entailed by the reference management in first two
>>>> tests).
>>>>
>>>> 1 thread:
>>>> MethodTypeBench.testCached     78 +-  1 nsec/op
>>>> MethodTypeBench.testNew       120 +-  1 nsec/op
>>>> MethodTypeBench.testOOB        41 +-  1 nsec/op
>>>>
>>>> 4 threads:
>>>> MethodTypeBench.testCached    495 +- 40 nsec/op
>>>> MethodTypeBench.testNew       611 +- 37 nsec/op
>>>> MethodTypeBench.testOOB       377 +- 10 nsec/op
>>>>
>>>> In fact, on larger machines it may be as bad as tens/hundreds of
>>>> microseconds to look up the MethodType. With the patch applied, on the
>>>> same 2x2 machine:
>>>>
>>>> 1 thread:
>>>> MethodTypeBench.testCached     92 +-  1 nsec/op
>>>> MethodTypeBench.testNew       101 +-  1 nsec/op
>>>> MethodTypeBench.testOOB        49 +-  1 nsec/op
>>>>
>>>> 4 threads:
>>>> MethodTypeBench.testCached    129 +-  1 nsec/op
>>>> MethodTypeBench.testNew       142 +- 10 nsec/op
>>>> MethodTypeBench.testOOB        89 +-  3 nsec/op
>>>>
>>>>
>>>> ...which means the scalability is back: the average time to look up 
>>>> the
>>>> MethodType is not growing spectacularly when the thread count go up.
>>>> Again, the effect is astonishing on large machines.
>>>>
>>>> Notice that we took some hit in single-threaded performance, and 
>>>> that is
>>>> due to two reasons: a) WeakEntry allocation in get(); b) CHM. With the
>>>> upcoming CHMv8, this starts to look better:
>>>>
>>>> 1 thread:
>>>> MethodTypeBench.testCached     87 +-  1 nsec/op
>>>> MethodTypeBench.testNew        95 +-  1 nsec/op
>>>> MethodTypeBench.testOOB        52 +-  1 nsec/op
>>>>
>>>> 4 threads:
>>>> MethodTypeBench.testCached    121 +-  2 nsec/op
>>>> MethodTypeBench.testNew       141 +-  1 nsec/op
>>>> MethodTypeBench.testOOB        93 +-  5 nsec/op
>>>>
>>>> ...which seems to indicate this change is future-proof.
>>>
>>
>