RFR (S) CR 7177472: JSR292: MethodType interning penalizes scalability

Sat Jun 8 20:17:28 UTC 2013

On 06/08/2013 12:21 PM, Remi Forax wrote:
> On 06/07/2013 12:01 PM, Aleksey Shipilev wrote:
>> (posting this to hotspot-dev@ and cc-ing core-libs-dev@, as Christian T.
>> suggested offline)
>>
>> Hi guys,
>
> Hi Aleksey,
>
>>
>> The fix for scalability problem is here:
>>    http://cr.openjdk.java.net/~shade/7177472/webrev.00/
>
> in add, the do/while does something a little weird, if putIfAbsent 
> returns null,
> interned is equals to elem, there is no need to do a e.get() in that 
> case,
> I would prefer a code like this:
>
>             T interned;
>             do {
>                 expungeStaleElements();
>                 WeakEntry<T> e = new WeakEntry<>(elem, stale);
>                 WeakEntry<T> exist = map.putIfAbsent(e, e);
>                 interned = (exist == null)? elem: exist.get();
>             } while (interned == null);
>             return interned;

Hi Remi, Aleksey,

In case the loop retries, there's no need to construct another WeakEntry...

             T interned;
              WeakEntry<T> e = new WeakEntry<>(elem, stale);
             do {
                 expungeStaleElements();
                 WeakEntry<T> exist = map.putIfAbsent(e, e);
                 interned = (exist == null)? elem: exist.get();
             } while (interned == null);
             return interned;

Regards, Peter

>
> The cast to WeakEntry in expungeStaleElements is not needed.
>
> In WeakEntry.equals, the null check is not needed because null 
> instanceof WeakEntry returns false,
> and you don't need to cast obj.get() to a WeakEntry<T> given you only 
> to call equals on entry.get().
> So the code should be:
>
>
>            public boolean equals(Object obj) {
>              if (!(obj instanceof WeakEntry)) {
>                return false;
>              }
>                Object that = ((WeakEntry<?>)obj).get();
>              Object mine = get();
>              return (that == null || mine == null)? this == obj: 
> mine.equals(that);
>          }
>
>
> Otherwise, it looks good.
>
> Rémi
>
>>
>> Testing:
>>    - Linux x86_64 builds OK
>>    - Linux x86_64 java/lang/invoke/ jtreg passes OK
>>
>> Since this is a proof-of-concept patch at this point, I kept the testing
>> minimal. Let me know what other testing you think is needed before the
>> integration.
>>
>> Thanks,
>> Aleksey.
>>
>> *** Rationale and details for the issue follow: ***
>>
>> This issue was in limbo for an year. The performance issue with
>> MethodType.methodType factory leads to abysmal scalability on the
>> targeted tests. While this can arguably demonstrated on larger
>> workloads, the targeted microbenchmarks are showcasing it nicely. This
>> issue was blocking the JSR292 API performance testing, because on large
>> enough test server, you will inevitably be bitten by this.
>>
>> If you run the JMH-enabled [1] microbenchmark:
>>   http://openjdk.java.net/projects/code-tools/jmh/
>>   http://cr.openjdk.java.net/~shade/7177472/jsr292bench.tar.gz
>>
>> ...on the current jdk8/jdk8, 2x2 i5 2.0 Ghz, Linux x86_64, you will see
>> the scalability is going down. ("testCached" makes sure the reference to
>> MethodType is always strongly reachable, "testNew" makes sure the
>> reference is mostly non-reachable, prompting cache re-fill, "testOOB"
>> does not make any precautions about that, and also does not suffer from
>> the static overheads entailed by the reference management in first two
>> tests).
>>
>> 1 thread:
>> MethodTypeBench.testCached     78 +-  1 nsec/op
>> MethodTypeBench.testNew       120 +-  1 nsec/op
>> MethodTypeBench.testOOB        41 +-  1 nsec/op
>>
>> 4 threads:
>> MethodTypeBench.testCached    495 +- 40 nsec/op
>> MethodTypeBench.testNew       611 +- 37 nsec/op
>> MethodTypeBench.testOOB       377 +- 10 nsec/op
>>
>> In fact, on larger machines it may be as bad as tens/hundreds of
>> microseconds to look up the MethodType. With the patch applied, on the
>> same 2x2 machine:
>>
>> 1 thread:
>> MethodTypeBench.testCached     92 +-  1 nsec/op
>> MethodTypeBench.testNew       101 +-  1 nsec/op
>> MethodTypeBench.testOOB        49 +-  1 nsec/op
>>
>> 4 threads:
>> MethodTypeBench.testCached    129 +-  1 nsec/op
>> MethodTypeBench.testNew       142 +- 10 nsec/op
>> MethodTypeBench.testOOB        89 +-  3 nsec/op
>>
>>
>> ...which means the scalability is back: the average time to look up the
>> MethodType is not growing spectacularly when the thread count go up.
>> Again, the effect is astonishing on large machines.
>>
>> Notice that we took some hit in single-threaded performance, and that is
>> due to two reasons: a) WeakEntry allocation in get(); b) CHM. With the
>> upcoming CHMv8, this starts to look better:
>>
>> 1 thread:
>> MethodTypeBench.testCached     87 +-  1 nsec/op
>> MethodTypeBench.testNew        95 +-  1 nsec/op
>> MethodTypeBench.testOOB        52 +-  1 nsec/op
>>
>> 4 threads:
>> MethodTypeBench.testCached    121 +-  2 nsec/op
>> MethodTypeBench.testNew       141 +-  1 nsec/op
>> MethodTypeBench.testOOB        93 +-  5 nsec/op
>>
>> ...which seems to indicate this change is future-proof.
>