ReentrantLock performance regression between JDK5 and 6/7?

Clemens Eisserer linuxhippy at gmail.com
Thu Aug 11 15:38:18 PDT 2011


Hi Tom,

Thanks for taking a look.

> I believe this was caused by the switch to using lock addl[esp], 0 instead
> of mfence for volatile membars, 6822204.  My review request for that said
> that at the time I didn't measure any performance change for Intel,
> http://cr.openjdk.java.net/~never/6822204.  On your microbenchmark I can
> measure the difference though so I'm going to remeasure derby which
> previously showed the big difference.  We may want to make the lock addl be
> AMD specific.
>

I remember Dave Dice's blog entry about what is, as far as I understand, a
similar issue: http://blogs.oracle.com/dave/resource/NHM-Pipeline-Blog-V2.txt
My hardware is one generation before Nehalem, which could explain the
slowdown.
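
A minimal sketch of how the barrier cost could be looked at separately from
ReentrantLock: it times plain volatile stores, which are the operations that
need the membar discussed above. The class and method names
(VolatileStorePerf, volatileStores, timeMillis) are made up for illustration
and are not from this thread, and whether this loop reproduces the same
slowdown is an assumption, not something that was measured here.

```java
// Hypothetical micro-benchmark; all names are illustrative, not from the thread.
public class VolatileStorePerf {
    // Each store to a volatile field needs a barrier after it on x86.
    // Per 6822204, HotSpot emits that barrier as mfence or as a locked add,
    // depending on the JDK version.
    static volatile int state;

    // Performs n volatile stores, so the barrier is paid n times.
    static void volatileStores(int n) {
        for (int i = 0; i < n; i++) {
            state = i;
        }
    }

    // Runs rounds * 1000 volatile stores and returns elapsed milliseconds.
    static long timeMillis(int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            volatileStores(1000);
        }
        return (System.nanoTime() - start) / 1000000;
    }

    public static void main(String[] args) {
        // Repeat a few times so that later rounds run compiled code.
        for (int run = 0; run < 10; run++) {
            System.out.println("Volatile store bench: " + timeMillis(1000) + " ms");
        }
    }
}
```

Running it with the same JDK builds and flags (-server, -client, -Xint) as the
lock benchmark would show whether the difference follows the volatile store
alone.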

Thanks, Clemens


>
> tom
>
> On Aug 11, 2011, at 11:05 AM, Clemens Eisserer wrote:
>
> > Hi Vitaly,
> >
> > I tried this bench on 6u23 and if I first run that code in a 10k iteration
> > loop and then time the 1M iteration loop I get about 10 ms speedup.  The
> > first loop would trigger JIT compilation (10k is the default threshold I
> > believe) and the second should run without compilation interruption.
> >
> > Can you try the same? Also might be interesting to time it under the
> > interpreter (-Xint).
> >
> > I changed the test case a bit so that it no longer relies on OSR, since
> > lockBench() will hit the compilation threshold after a few runs.
> >
> > I get the following timings for 1M lock/unlock pairs:
> >
> > jdk7-server: 53ms
> > jdk7-client: 62ms
> > jdk7-xint  : 955ms
> >
> > jdk6-server: 52ms
> > jdk6-client: 68ms
> > jdk6-xint  : 1000ms
> >
> > jdk5-server: 40ms
> > jdk5-client: 61ms
> > jdk5-xint  : 832ms
> >
> > So JDK7 is slower than JDK5 in every case; the regression seems to have
> > landed in JDK6 (I was using OpenJDK 6).
> >
> > Should I file a bug-report about this behaviour?
> >
> > Thanks, Clemens
> >
> >
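> > import java.util.concurrent.locks.ReentrantLock;
> >
> > public class LockPerf {
> >     static ReentrantLock lock = new ReentrantLock();
> >
> >     public static void main(String[] args) {
> >         // Runs until killed; later rounds measure JIT-compiled code.
> >         while (true) {
> >             long start2 = System.nanoTime();
> >             // 1000 calls x 1000 iterations = 1M lock/unlock pairs per round
> >             for (int i = 0; i < 1000; i++) {
> >                 lockBench();
> >             }
> >             // Elapsed time in milliseconds
> >             System.out.println("Lock bench: "
> >                     + (System.nanoTime() - start2) / 1000000);
> >         }
> >     }
> >
> >     // Separate method, so it is compiled normally instead of relying on OSR
> >     private static void lockBench() {
> >         for (int i = 0; i < 1000; i++) {
> >             lock.lock();
> >             lock.unlock();
> >         }
> >     }
> > }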
> >
> >
> > On Aug 11, 2011 11:38 AM, "Clemens Eisserer" <linuxhippy at gmail.com>
> > wrote:
> > > Hi Vitaly,
> > >
> > >> Which OS are you using?
> > >>
> > > Linux-3.0 (Fedora 15)
> > >
> > >
> > >> Also, you should use System.nanoTime() for this type of timing as it
> > >> gives you a more precise timer.
> > >>
> > > I tried, but results remained the same: ~53 ms for JDK6/7, ~41 ms for JDK5.
> > > I was using the server compiler both times.
> > >
> > > Thanks, Clemens
> >
>
>
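
The warm-up approach suggested in the quoted thread (run the code first so it
gets compiled, then time a separate pass) can be sketched roughly as follows.
This is only an illustration: the class name LockPerfWarmup, the iteration
parameter and the returned counter are assumptions that are not part of the
original test case, and the warm-up count is simply chosen to be well above the
10k threshold mentioned above.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical variant of LockPerf; names and loop counts are illustrative.
public class LockPerfWarmup {
    static final ReentrantLock lock = new ReentrantLock();

    // Uncontended lock/unlock pairs; returns how many pairs were completed.
    static int lockBench(int iterations) {
        int done = 0;
        for (int i = 0; i < iterations; i++) {
            lock.lock();
            done++;
            lock.unlock();
        }
        return done;
    }

    public static void main(String[] args) {
        // Warm-up pass: enough calls for lockBench() to be JIT-compiled.
        int warm = 0;
        for (int i = 0; i < 20000; i++) {
            warm += lockBench(100);
        }

        // Timed pass: 1000 calls x 1000 iterations = 1M lock/unlock pairs.
        long start = System.nanoTime();
        int pairs = 0;
        for (int i = 0; i < 1000; i++) {
            pairs += lockBench(1000);
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000;

        System.out.println("Warm-up pairs: " + warm);
        System.out.println("Timed pairs: " + pairs + ", elapsed: " + elapsedMs + " ms");
    }
}
```

Returning the counter keeps the loop from looking like dead code, and timing
only the second pass keeps compilation pauses out of the measurement.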


More information about the hotspot-runtime-dev mailing list