Field access optimisations inside loops (question).
Ulf Zibis
Ulf.Zibis at gmx.de
Fri Jan 8 01:30:35 PST 2010
1. I'm wondering, why there is only moved a byte (MOV8) to the 'value'
variable instead of a int. (what is the element size in list?)
2. I agree, if:
- list is private member in this class,
- list is not reachable by other methods of this class from a concurrent
thread.
For similar problem see: Bug Id 6914095
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6914095>.
-Ulf
Am 08.01.2010 08:52, Dawid Weiss schrieb:
> Hi everyone,
>
> I have been wondering about one thing and failed to find the answer so far.
>
> I observed the following behavior from the HotSpot compiler. Consider
> the following two loops:
>
> 1.
> for (int i = 0; i < list.size(); i++)
> {
> value = list.get(i);
> }
>
> 2.
> final int size = list.size();
> final int [] buffer = list.buffer;
> for (int i = 0; i < size; i++)
> {
> value = buffer[i];
> }
>
> The list variable's class is simple and the implementation of size()
> and get() is trivial, basically:
>
> get == return buffer[index];
> size == return elementsCount;
>
> (no boundary checks, inheritance or anything like this).
>
> What's interesting is that loop (2) is at least TWICE as fast as loop
> (1) (on various CPU architectures, multi and single-core systems). The "value"
> variable is a public static volatile to force the compiler to actually
> read the buffer's content and store is somewhere. Example timings:
>
> testSimpleGetLoop : time.bench: 1.97, round: 0.20 [+- 0.00],
> round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00
> testDirectBufferLoop : time.bench: 0.57, round: 0.06 [+- 0.01],
> round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00
>
> (note the "round" field above, in brackets are standard deviations
> from multiple test runs).
>
> I looked at the assembly dumped by the HotSpot and from it I can
> conclude that in case of loop (1) the generated code always attempts
> to re-read the fields referenced through the list field (it is
> object-scoped, private, final,
> non-volatile, direct class access -- no interfaces). See listing (A)
> below. My understanding of the JMM was that
> in cases such as this one, the compiler can safely assume no side
> effects for the current thread and move field references to registers.
> This is exactly what happens in case (2) (see listing (B) below) --
> the references to
> local variables are moved to regiters, with additional loop unrolling
> performed by the compiler.
>
> My question is if there is anything that prevents the compiler from
> caching the "list.buffer" pointer in
> version (1) of the loop? Note that get and size methods are inlined
> properly, but the field access is still
> performed on every loop iteration.
>
> I'd appreciate a hotspot code pointer where this situation is
> considered or a JVM reference pointer
> which makes the compiler behave as it does. Apologies in advance if
> I'm missing something
> naively simple.
>
> Dawid
>
>
> (A) simple get loop, pseudo-assembly dump (after warmup).
>
> 030 B4: # B11 B5 <- B3 B8 Loop: B4-B8 inner Freq: 220753
> 030 MOV EDX,[EBP + #12] ! Field .buffer
> 033 NullCheck EBP
> 033
> 033 B5: # B12 B6 <- B4 Freq: 220753
> 033 MOV EBP,[EDX + #8]
> 036 NullCheck EDX
> 036
> 036 B6: # B10 B7 <- B5 Freq: 220752
> 036 CMPu EDI,EBP
> 038 Jnb,us B10 P=0.000001 C=-1.000000
> 038
> 03a B7: # B13 B8 <- B6 Freq: 220752
> 03a MOVSX8 EAX,[EDX + #12 + EDI] # byte
> 03f MEMBAR-release ! (empty encoding)
> 03f MOV8 [ECX + precise klass Benchmark:
> 0x048e80e8:Constant:exact *],EAX ! Field Volatile value
> 045 MEMBAR-volatile (unnecessary so empty encoding)
> 045 LOCK ADDL [ESP + #0], 0 ! membar_volatile
> 04a MOV EBP,[EBX + #12] ! Field .list
> 04d MOV EDX,[EBP + #8] # int ! Field .elementsCount
> 050 NullCheck EBP
> 050
> 050 B8: # B4 B9 <- B7 Freq: 220752
> 050 INC EDI
> 051 TSTL #polladdr,EAX ! Safepoint: poll for GC
> 057 CMP EDI,EDX
> 059 Jl,s B4 P=1.000000 C=179200.000000
>
>
> (B) direct buffer access loop (after warmup).
>
> 01a MOV ECX,[ECX + #12] ! Field .list
> 01d MOV EAX,[ECX + #8] # int ! Field .elementsCount
> 020 NullCheck ECX
> 020
> 020 B2: # B12 B3 <- B1 Freq: 0.999999
> 020 TEST EAX,EAX
> 022 Jle B12 P=0.000000 C=1.000000
> 022
> 028 B3: # B4 <- B2 Freq: 0.999999
> 028 MOV EDI,[ECX + #12] ! Field .buffer
> 02b XOR EBX,EBX
> 02d MOV EBP,#360
> 02d
> 032 B4: # B14 B5 <- B3 B6 Loop: B4-B6 inner stride: not constant
> pre of N158 Freq: 1.99999
> 032 MOV EDX,[EDI + #8]
> 035 NullCheck EDI
> 035
> 035 B5: # B13 B6 <- B4 Freq: 1.99999
> 035 CMPu EBX,EDX
> 037 Jnb,u B13 P=0.000001 C=-1.000000
> 037
> 03d B6: # B4 B7 <- B5 Freq: 1.99999
> 03d MOVSX8 ECX,[EDI + #12 + EBX] # byte
> 042 MEMBAR-release ! (empty encoding)
> 042 MOV8 [EBP + precise klass : 0x04161d30:Constant:exact *],ECX
> ! Field Volatile.value
> 048 MEMBAR-volatile (unnecessary so empty encoding)
> 048 LOCK ADDL [ESP + #0], 0 ! membar_volatile
> 04d INC EBX
> 04e CMP EBX,#1
> 051 Jl,s B4 # Loop end P=0.500000 C=334848.000000
>
> 053 B7: # B9 B8 <- B6 Freq: 0.999994
> 053 MOV ESI,EAX
> 055 MIN ESI,EDX
> 05b SUB ESI,EBX
> 05d AND ESI,#-4
> 060 ADD ESI,EBX
> 062 CMP EBX,ESI
> 064 Jge,s B9 P=0.000001 C=-1.000000
> NOP # 10 bytes pad for loops and calls
>
> 070 B8: # B8 B9 <- B7 B8 Loop: B8-B8 inner stride: not constant
> main of N116 Freq: 999993
> 070 MOVSX8 ECX,[EDI + #12 + EBX] # byte
> 075 MEMBAR-release ! (empty encoding)
> 075 MOV8 [EBP + precise klass : 0x04161d30:Constant:exact *],ECX
> ! Field Volatile.value
> 07b MEMBAR-volatile (unnecessary so empty encoding)
> 07b MEMBAR-volatile (unnecessary so empty encoding)
> 07b MOVSX8 ECX,[EDI + #13 + EBX] # byte
> 080 MEMBAR-release ! (empty encoding)
> 080 MOV8 [EBP + precise klass : 0x04161d30:Constant:exact *],ECX
> ! Field Volatile.value
> 086 MEMBAR-volatile (unnecessary so empty encoding)
> 086 MEMBAR-volatile (unnecessary so empty encoding)
>
> ... (loop further unrolled here) ...
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20100108/7831db39/attachment-0001.html
More information about the hotspot-compiler-dev
mailing list