ArrayList loop regression ?
Rémi Forax
forax at univ-mlv.fr
Fri Mar 4 15:55:19 PST 2011
On 03/04/2011 09:32 PM, Tom Rodriguez wrote:
> On Feb 27, 2011, at 10:16 AM, Rémi Forax wrote:
>
>> A student sent me a micro-benchmark that shows a regression in the way
>> HotSpot compiles indexed loops on Linux x64.
>>
>> Basically jdk7:
>> java version "1.7.0-ea"
>> Java(TM) SE Runtime Environment (build 1.7.0-ea-b129)
>> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b01, mixed mode)
>> is really slow compared to jdk6:
>> java version "1.6.0_21"
>> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
>> Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
> I'm seeing bimodal behaviour, even with 6u21.
>
> smite ~ % /java/re/jdk/1.6.0_21/latest/binaries/solaris-amd64/bin/java -server -Xbatch ArrayListIteration2
> 495000000 29346625
> 495000000 27204499
> 495000000 19873904
> 495000000 19784322
> 495000000 19766402
> 495000000 19774330
> 495000000 19826623
> 495000000 19786293
> 495000000 21203493
> 495000000 19809224
> 495000000 19812661
> 495000000 19794664
> 495000000 19752474
> 495000000 19798351
> 495000000 19836035
> 495000000 19734894
> 495000000 19869816
> 495000000 19742165
> 495000000 19724705
> 495000000 19826919
> smite ~ % /java/re/jdk/1.6.0_21/latest/binaries/solaris-amd64/bin/java -server -Xbatch ArrayListIteration2
> 495000000 26317969
> 495000000 24463039
> 495000000 17135392
> 495000000 17036236
> 495000000 18216820
> 495000000 17632804
> 495000000 17067382
> 495000000 17020512
> 495000000 17044797
> 495000000 17111037
> 495000000 17056577
> 495000000 17338273
> 495000000 17874496
> 495000000 17690025
> 495000000 17061152
> 495000000 17039536
> 495000000 17097877
> 495000000 17053703
> 495000000 17086628
> 495000000 17308508
>
> That 17 million score is the best I've seen on the particular machine I'm running on. I guess the gap is similar to what you are seeing. I see the same thing with every 6 and 7 release I've tried. I even see the bimodal scores with 1.5.0.
>
> I captured LogCompilation output for a fast and a slow run, and they performed exactly the same compiles with the same inlining, so it's not obvious why one is faster than the other. I just captured PrintNMethods output for slow and fast runs and they are _exactly_ the same, including register assignment and code location in memory. So the only thing left is the location of the Java objects in memory. The array is 10 million elements long, which is pretty large. I'm running on an Opteron, which is NUMA, so maybe the memory sometimes ends up on a different board and the latency is higher? The array is so large that it's basically a memory bandwidth test. What kind of machines have you tested on?
Intel(R) Core(TM) i7 CPU Q 840 @ 1.87GHz
> How does a smaller array iterated more times behave?
It works as expected.
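
For readers who want to try the smaller-array variant asked about above, here is a minimal sketch. It is not the ArrayListIteration2 source, which does not appear in this thread. The class name WorkingSetSweep, the i % 100 fill pattern, the list sizes, and the output layout (size, checksum, elapsed nanoseconds) are all assumptions made for illustration. The idea is to keep the total number of element visits constant while shrinking the list, so that a small list stays in cache and a large one has to stream from main memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness, not the benchmark discussed in this thread.
final class WorkingSetSweep {
    // Builds a list of `size` boxed ints holding i % 100 (assumed fill pattern).
    static List<Integer> fill(int size) {
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(i % 100);
        }
        return list;
    }

    // Sums the list with an indexed loop, repeated `passes` times.
    static long sumIndexed(List<Integer> list, int passes) {
        long sum = 0;
        for (int p = 0; p < passes; p++) {
            for (int i = 0; i < list.size(); i++) {
                sum += list.get(i);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        final int totalVisits = 10_000_000;            // same work for every size
        int[] sizes = {10_000, 1_000_000, 10_000_000}; // assumed sweep points
        for (int size : sizes) {
            List<Integer> list = fill(size);
            int passes = totalVisits / size;
            // Several timed runs per size so warm-up and later runs can be compared.
            for (int run = 0; run < 5; run++) {
                long start = System.nanoTime();
                long sum = sumIndexed(list, passes);
                long elapsed = System.nanoTime() - start;
                System.out.println(size + " " + sum + " " + elapsed);
            }
        }
    }
}
```

Because every size performs the same number of additions, the checksum column should be identical across sizes, and only the elapsed time should vary. Running the program several times in separate JVMs is what would expose run-to-run differences of the kind reported above.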
Yet again, I have to patch the abstract von Neumann machine in my head.
BTW, with JDK7, the performance of the loop with an iterator is really
close to that of the loop with an index. It's impressive.
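
The two loop shapes being compared can be sketched as below. This is an illustrative guess at the comparison, not the student's benchmark: the class name IndexVsIterator, the helper names, the list size, and the fill pattern are hypothetical. The timing in main is deliberately crude and only meant to show where each loop would be measured.

```java
import java.util.ArrayList;

// Hypothetical comparison of the two loop shapes; names are illustrative only.
final class IndexVsIterator {
    // Builds a list of `size` boxed ints holding i % 100 (assumed fill pattern).
    static ArrayList<Integer> fill(int size) {
        ArrayList<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(i % 100);
        }
        return list;
    }

    // Indexed loop: size() and get(i) are called on every iteration.
    static long sumIndexed(ArrayList<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Enhanced for loop: the compiler expands this to list.iterator(),
    // hasNext() and next().
    static long sumIterator(ArrayList<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fill(1_000_000); // assumed size
        for (int run = 0; run < 10; run++) {
            long t0 = System.nanoTime();
            long a = sumIndexed(list);
            long t1 = System.nanoTime();
            long b = sumIterator(list);
            long t2 = System.nanoTime();
            System.out.println("indexed " + a + " " + (t1 - t0)
                    + " | iterator " + b + " " + (t2 - t1));
        }
    }
}
```

Both methods must return the same sum, which gives a quick sanity check that the two loops do the same work before their timings are compared.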
> tom
Many thanks Tom.
Rémi