mfence on i686 with volatile?

David Holmes - Sun Microsystems David.Holmes at Sun.COM
Wed Nov 18 16:55:53 PST 2009


Vladimir Kozlov said the following on 11/19/09 06:23:
> 046     MOV    [ESI + precise klass HelloWorld: 
> 0x080f2198:Constant:exact *],EBX ! Field  VolatileHelloWorld.v
> 04c     MEMBAR-volatile (unnecessary so empty encoding)
> 04c     LOCK ADDL [ESP + #0], 0 ! membar_volatile

Just as an aside, but I find the text "unnecessary so empty encoding" very 
confusing. What exactly is it referring to?
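
For anyone following along, the pattern in the quoted output is a volatile store followed by the barrier. Here is a minimal sketch of the kind of test class that would produce it. The class name VolatileSketch is made up, and the loop bound and the constant 333 are only read off the quoted listings, so treat this as a guess at the test rather than the original source:

```java
public class VolatileSketch {
    // Hypothetical fields; the quoted listings refer to fields named v and v2.
    static volatile int v;
    static volatile int v2;

    static void run() {
        // Loop bound assumed from the CMP against 1000000 in the quoted output.
        for (int i = 0; i < 1000000; i++) {
            v += 333;   // volatile load, add, volatile store (barrier expected after the store)
            v2 += v;    // volatile loads of v2 and v, add, volatile store to v2
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println("v = " + v);
    }
}
```

Running something like this with -XX:+PrintOptoAssembly on a fastdebug build, as in the quoted command line, should show whether a lock addl follows each volatile store.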

David

> 
> Vladimir
> 
> Dennis Byrne wrote:
>> Still no luck on this.  There are two "Processor" items under "System
>> Information"/"System Summary" and Cygwin gives me this:
>>
>> $ cat /proc/cpuinfo | grep ^processor | wc -l
>> 8
>>
>> "$jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintOptoAssembly -server
>> -XX:CompileCommand=print,*HelloWorld.run  HelloWorld" yields the
>> following:
>>
>> 01c       CMP    EBP,#1000000
>> 022       Jge    B13  P=0.000000 C=29521.000000
>> 022
>> 028   B2: #    B3 <- B1  Freq: 1
>> 028       MOV    EDX,EBP
>> 02a       INC    EDX
>> 02b       IMUL   EAX,EBP,#333
>> 031       MOV    ECX,#448
>> 036       MOV    EBX,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *]    # int ! Field HelloWorld.v
>> 03c       SUB    EBX,EAX
>> 03e       MOV    ECX,#452
>> 043       MOV    EDI,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *]    # int ! Field HelloWorld.v2
>> 043
>> 049   B3: #    B5 B4 <- B2 B4     Loop: B3-B4 inner stride: not constant pre of N112 Freq: 2
>> 049       MOV    [ESP + #4],EBX
>> 04d       ADD    EAX,EBX
>> 04f       ADD    EDI,EAX
>> 051       MOV    ESI,EBP
>> 053       INC    ESI
>> 054       ADD    EDI,#333
>> 05a       MOV    EBX,#452
>> 05f       MOV    [EBX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EDI ! Field HelloWorld.v2
>> 065       ADD    EAX,#333
>> 06b       MOV    ECX,#448
>> 070       MOV    [ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EAX ! Field HelloWorld.v
>> 076       CMP    ESI,EDX
>> 078       Jge,s  B5    # Loop end  P=0.500000 C=29027.000000
>> 078
>> 07a   B4: #    B3 <- B3  Freq: 1
>> 07a       IMUL   EAX,EBP,#333
>> 080       ADD    EAX,#333
>> 086       MOV    EBP,ESI
>> 088       MOV    EBX,[ESP + #4]
>> 08c       JMP,s  B3
>> 08c
>> 08e   B5: #    B8 B6 <- B3  Freq: 1
>> 08e       MOV    ECX,#1000000
>> 093       SUB    ECX,ESI
>> 095       AND    ECX,#-16
>> 098       ADD    ECX,ESI
>> 09a       CMP    ESI,ECX
>> 09c       Jl,s  B8  P=0.999999 C=-1.000000
>>
>> Dennis Byrne
>>
>> On Wed, Nov 18, 2009 at 8:52 AM, Dennis Byrne <dennisbyrne at apache.org> 
>> wrote:
>>> At this point my best guess is that this has something to do with the
>>> fact that I am running on a Virtual Machine.  cat /proc/cpuinfo
>>> indicates the machine does in fact see two processors.  The output I
>>> get for both -client and -server has no locks and no fences.  I will
>>> run the same class through PrintOptoAssembly outside of VMware Workstation
>>> later in the day in order to build on my case that this has something
>>> to do with VMware.  Thanks for your time on this.
>>>
>>> Dennis Byrne
>>>
>>> On Wed, Nov 18, 2009 at 3:21 AM, David Holmes - Sun Microsystems
>>> <David.Holmes at sun.com> wrote:
>>>> Tim Bell said the following on 11/18/09 18:22:
>>>>> Also - are you running your tests on a single-CPU machine?  I recall
>>>>> HotSpot detects that and will elide any n-way synchronization
>>>>> measures if so...
>>>> Yes, that should have been my first question :-) Thanks Tim. On x86 the
>>>> barriers are elided if running on only a single core/processor.
>>>>
>>>> I just confirmed -client output showing the lock:addl "fences":
>>>>
>>>> David
>>>>
>>>>  ;;  block B2 [8, 31]
>>>>
>>>>  0xca4f72b8: mov    $0xc61f5590,%edx   ;...ba90551f c6
>>>>                                        ;   {oop('HelloWorld')}
>>>>  0xca4f72bd: mov    0x1c0(%edx),%ecx   ;...8b8ac001 0000
>>>>                                        ;*getstatic v
>>>>                                        ; - HelloWorld::run at 8 (line 10)
>>>>  0xca4f72c3: add    $0x14d,%ecx        ;...81c14d01 0000
>>>>  0xca4f72c9: mov    %ecx,0x1c0(%edx)   ;...898ac001 0000
>>>>  0xca4f72cf: lock addl $0x0,(%esp)     ;...f0830424 00
>>>>                                        ;*putstatic v
>>>>                                        ; - HelloWorld::run at 15 (line 10)
>>>>  0xca4f72d4: mov    0x1c4(%edx),%ecx   ;...8b8ac401 0000
>>>>                                        ;*getstatic v2
>>>>                                        ; - HelloWorld::run at 18 (line 11)
>>>>  0xca4f72da: mov    0x1c0(%edx),%edi   ;...8bbac001 0000
>>>>                                        ;*getstatic v
>>>>                                        ; - HelloWorld::run at 21 (line 11)
>>>>  0xca4f72e0: add    %edi,%ecx          ;...03cf
>>>>  0xca4f72e2: mov    %ecx,0x1c4(%edx)   ;...898ac401 0000
>>>>  0xca4f72e8: lock addl $0x0,(%esp)     ;...f0830424 00
>>>>                                        ;*putstatic v2
>>>>                                        ; - HelloWorld::run at 25 (line 11)
>>>>
>>>>
>>>>
>>>>
>>>>>> For x86 we should see a StoreLoad barrier after the assignment to v2:
>>>>>>
>>>>>> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>>>>>>
>>>>>> For -server, PrintOptoAssembly shows our "mfence" replacement, a
>>>>>> lock:addl (which is the implementation of membar-volatile).
>>>>>>
>>>>>> [ Got to go get hsdis myself to test PrintAssembly directly ]
>>>>> HTH-
>>>>>
>>>>> Tim
>>>
>>>
>>> -- 
>>> Dennis Byrne
>>>
>>
>>
>>
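
On the single-processor point raised in the thread: since the barriers are reportedly elided when only one processor is present, it may be worth checking how many processors the JVM itself reports, rather than relying on /proc/cpuinfo under Cygwin. A minimal sketch using the standard Runtime.availableProcessors() call; the class name CpuCount is made up, and this only shows what the Java API reports, not the internal check HotSpot uses when deciding to emit the barrier:

```java
public class CpuCount {
    // Number of processors the running JVM reports as available.
    static int count() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("JVM sees " + count() + " processor(s)");
    }
}
```

If this prints 1 inside the virtual machine, that would be consistent with seeing no lock addl in the compiled output.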


More information about the hotspot-dev mailing list