ArrayList loop regression?
Tom Rodriguez
tom.rodriguez at oracle.com
Fri Mar 4 12:32:51 PST 2011
On Feb 27, 2011, at 10:16 AM, Rémi Forax wrote:
> A student sent me a micro-benchmark that shows a regression in the way
> HotSpot compiles indexed loops on Linux x64.
>
> Basically jdk7:
> java version "1.7.0-ea"
> Java(TM) SE Runtime Environment (build 1.7.0-ea-b129)
> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b01, mixed mode)
> is really slow compared to jdk6:
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
I'm seeing bimodal behaviour, even with 6u21.
smite ~ % /java/re/jdk/1.6.0_21/latest/binaries/solaris-amd64/bin/java -server -Xbatch ArrayListIteration2
495000000 29346625
495000000 27204499
495000000 19873904
495000000 19784322
495000000 19766402
495000000 19774330
495000000 19826623
495000000 19786293
495000000 21203493
495000000 19809224
495000000 19812661
495000000 19794664
495000000 19752474
495000000 19798351
495000000 19836035
495000000 19734894
495000000 19869816
495000000 19742165
495000000 19724705
495000000 19826919
smite ~ % /java/re/jdk/1.6.0_21/latest/binaries/solaris-amd64/bin/java -server -Xbatch ArrayListIteration2
495000000 26317969
495000000 24463039
495000000 17135392
495000000 17036236
495000000 18216820
495000000 17632804
495000000 17067382
495000000 17020512
495000000 17044797
495000000 17111037
495000000 17056577
495000000 17338273
495000000 17874496
495000000 17690025
495000000 17061152
495000000 17039536
495000000 17097877
495000000 17053703
495000000 17086628
495000000 17308508
That 17 million score is the best I've seen on the particular machine I'm running on. I guess the gap is similar to what you are seeing. I see the same thing with every 6 and 7 release I've tried. I even see the bimodal scores with 1.5.0.
I captured LogCompilation output for a fast and a slow run, and they performed exactly the same compiles with the same inlining, so it's not obvious why one is faster than the other. I just captured PrintNMethods for slow and fast runs and they are _exactly_ the same, including register assignment and code location in memory. So the only thing left is the location of the Java objects in memory. The array is 10 million elements long, which is pretty large. I'm running on an Opteron, which is NUMA, so maybe the memory sometimes ends up on a different board and the latency is higher? The array is so large that it's basically a memory bandwidth test. What kind of machines have you tested on? How does a smaller array iterated more times behave?
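A minimal sketch of the "smaller array iterated more times" variant, for anyone who wants to try it. The attached ArrayListIteration2.java is not reproduced in this message, so everything here is an assumption: the class name, the SIZE and REPEATS values, and the choice of i % 100 as element values (picked only because it reproduces the 495000000 checksum seen in the output above). Treat it as an illustration, not the original benchmark.

```java
import java.util.ArrayList;

// Hypothetical variant of the benchmark: same total number of element
// reads (10 million), but over a list small enough to stay in cache.
class SmallArrayListIteration {
    static final int SIZE = 100000;   // assumed: 100x smaller than the 10M-element list
    static final int REPEATS = 100;   // assumed: 100 passes, so total reads stay at 10M
    static final int RUNS = 20;       // same number of timed runs as in the output above

    // Fill the list with values 0..99 repeating (an assumption about the data).
    static ArrayList<Integer> fill(int size) {
        ArrayList<Integer> list = new ArrayList<Integer>(size);
        for (int i = 0; i < size; i++) {
            list.add(i % 100);
        }
        return list;
    }

    // Plain indexed loop over the list, repeated 'repeats' times.
    static long sumIndexed(ArrayList<Integer> list, int repeats) {
        long sum = 0;
        for (int r = 0; r < repeats; r++) {
            for (int i = 0; i < list.size(); i++) {
                sum += list.get(i);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fill(SIZE);
        for (int run = 0; run < RUNS; run++) {
            long start = System.nanoTime();
            long sum = sumIndexed(list, REPEATS);
            long elapsed = System.nanoTime() - start;
            // Same "checksum elapsed-nanoseconds" layout as the runs above.
            System.out.println(sum + " " + elapsed);
        }
    }
}
```

If the bimodal timings disappear with the small list but come back when SIZE is raised to 10 million and REPEATS dropped to 1, that would point at memory placement rather than the generated code.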
tom
>
> [forax at localhost src]$ java -server -Xbatch ArrayListIteration2
> 495000000 23562949
> 495000000 28475066
> 495000000 10691187
> 495000000 17820582
> 495000000 17910601
> 495000000 17231943
> 495000000 17834819
> 495000000 17444713
> 495000000 17831591
> 495000000 17880947
> 495000000 17405874
> 495000000 17823323
> 495000000 17254145
> 495000000 17815812
> 495000000 17534777
> 495000000 17589371
> 495000000 17830668
> 495000000 17389534
> 495000000 17884535
> 495000000 17484575
>
> [forax at localhost src]$ /usr/jdk/jdk1.6.0_21/bin/java -server -Xbatch ArrayListIteration2
> 495000000 28299043
> 495000000 23543527
> 495000000 11718848
> 495000000 12008201
> 495000000 12821666
> 495000000 11661011
> 495000000 12502804
> 495000000 11763580
> 495000000 11663503
> 495000000 12620776
> 495000000 11649228
> 495000000 12593051
> 495000000 11805152
> 495000000 11702461
> 495000000 12368391
> 495000000 11615139
> 495000000 12511620
> 495000000 11726843
> 495000000 11796660
> 495000000 12379443
>
> I've also tested with an older 32-bit VM, which doesn't show any regression:
> java version "1.7.0-ea"
> Java(TM) SE Runtime Environment (build 1.7.0-ea-b123)
> Java HotSpot(TM) Server VM (build 20.0-b04, mixed mode)
>
> Rémi
>
> <ArrayListIteration2.java>