RFR (S) CR 7177472: JSR292: MethodType interning penalizes scalability

Sat Jun 8 10:21:14 UTC 2013

On 06/07/2013 12:01 PM, Aleksey Shipilev wrote:
> (posting this to hotspot-dev@ and cc-ing core-libs-dev@, as Christian T.
> suggested offline)
>
> Hi guys,

Hi Aleksey,

>
> The fix for scalability problem is here:
>    http://cr.openjdk.java.net/~shade/7177472/webrev.00/

in add, the do/while does something a little weird, if putIfAbsent 
returns null,
interned is equals to elem, there is no need to do a e.get() in that case,
I would prefer a code like this:

             T interned;
             do {
                 expungeStaleElements();
                 WeakEntry<T> e = new WeakEntry<>(elem, stale);
                 WeakEntry<T> exist = map.putIfAbsent(e, e);
                 interned = (exist == null)? elem: exist.get();
             } while (interned == null);
             return interned;

The cast to WeakEntry in expungeStaleElements is not needed.

In WeakEntry.equals, the null check is not needed because null 
instanceof WeakEntry returns false,
and you don't need to cast obj.get() to a WeakEntry<T> given you only to 
call equals on entry.get().
So the code should be:

            public boolean equals(Object obj) {
              if (!(obj instanceof WeakEntry)) {
                return false;
              }
                Object that = ((WeakEntry<?>)obj).get();
              Object mine = get();
              return (that == null || mine == null)? this == obj: mine.equals(that);
          }

Otherwise, it looks good.

Rémi

>
> Testing:
>    - Linux x86_64 builds OK
>    - Linux x86_64 java/lang/invoke/ jtreg passes OK
>
> Since this is a proof-of-concept patch at this point, I kept the testing
> minimal. Let me know what other testing you think is needed before the
> integration.
>
> Thanks,
> Aleksey.
>
> *** Rationale and details for the issue follow: ***
>
> This issue was in limbo for an year. The performance issue with
> MethodType.methodType factory leads to abysmal scalability on the
> targeted tests. While this can arguably demonstrated on larger
> workloads, the targeted microbenchmarks are showcasing it nicely. This
> issue was blocking the JSR292 API performance testing, because on large
> enough test server, you will inevitably be bitten by this.
>
> If you run the JMH-enabled [1] microbenchmark:
>   http://openjdk.java.net/projects/code-tools/jmh/
>   http://cr.openjdk.java.net/~shade/7177472/jsr292bench.tar.gz
>
> ...on the current jdk8/jdk8, 2x2 i5 2.0 Ghz, Linux x86_64, you will see
> the scalability is going down. ("testCached" makes sure the reference to
> MethodType is always strongly reachable, "testNew" makes sure the
> reference is mostly non-reachable, prompting cache re-fill, "testOOB"
> does not make any precautions about that, and also does not suffer from
> the static overheads entailed by the reference management in first two
> tests).
>
> 1 thread:
> MethodTypeBench.testCached     78 +-  1 nsec/op
> MethodTypeBench.testNew       120 +-  1 nsec/op
> MethodTypeBench.testOOB        41 +-  1 nsec/op
>
> 4 threads:
> MethodTypeBench.testCached    495 +- 40 nsec/op
> MethodTypeBench.testNew       611 +- 37 nsec/op
> MethodTypeBench.testOOB       377 +- 10 nsec/op
>
> In fact, on larger machines it may be as bad as tens/hundreds of
> microseconds to look up the MethodType. With the patch applied, on the
> same 2x2 machine:
>
> 1 thread:
> MethodTypeBench.testCached     92 +-  1 nsec/op
> MethodTypeBench.testNew       101 +-  1 nsec/op
> MethodTypeBench.testOOB        49 +-  1 nsec/op
>
> 4 threads:
> MethodTypeBench.testCached    129 +-  1 nsec/op
> MethodTypeBench.testNew       142 +- 10 nsec/op
> MethodTypeBench.testOOB        89 +-  3 nsec/op
>
>
> ...which means the scalability is back: the average time to look up the
> MethodType is not growing spectacularly when the thread count go up.
> Again, the effect is astonishing on large machines.
>
> Notice that we took some hit in single-threaded performance, and that is
> due to two reasons: a) WeakEntry allocation in get(); b) CHM. With the
> upcoming CHMv8, this starts to look better:
>
> 1 thread:
> MethodTypeBench.testCached     87 +-  1 nsec/op
> MethodTypeBench.testNew        95 +-  1 nsec/op
> MethodTypeBench.testOOB        52 +-  1 nsec/op
>
> 4 threads:
> MethodTypeBench.testCached    121 +-  2 nsec/op
> MethodTypeBench.testNew       141 +-  1 nsec/op
> MethodTypeBench.testOOB        93 +-  5 nsec/op
>
> ...which seems to indicate this change is future-proof.