mfence on i686 with volatile?
Vladimir Kozlov
Vladimir.Kozlov at Sun.COM
Wed Nov 18 17:33:58 PST 2009
When there are several volatile membars in sequence, C2 generates an
instruction only for the last one. The preceding ones are marked as
"unnecessary" and get an empty encoding, but they are still printed
for debugging.
Note: during parsing, after each store to a volatile field (or
variable) C2 generates several volatile membars, one for each
volatile field (or variable). In the test case there were 2 volatile
variables, so C2 generated 2 volatile membars.
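
For readers following along, here is a minimal sketch of a test case of
roughly this shape. It is a reconstruction, not the original source: the
names HelloWorld, run, v and v2 come from the quoted listings, while the
loop bound and the two update statements are assumptions inferred from the
constants and bytecode comments in those listings. The processor-count
line is an addition for checking the single-CPU case mentioned in the
thread.

```java
public class HelloWorld {
    // Two volatile statics, matching the field names in the listings.
    // Everything else in this class is an assumed reconstruction.
    static volatile int v;
    static volatile int v2;

    static void run() {
        // Assumed loop bound, taken from "CMP EBP,#1000000" in the listing.
        for (int i = 0; i < 1000000; i++) {
            v += 333;   // volatile load, add, volatile store
            v2 += v;    // volatile loads of v2 and v, add, volatile store
        }
    }

    public static void main(String[] args) {
        // Reports how many processors the JVM sees; the thread notes that
        // on x86 the fences are elided on a single processor.
        System.out.println("processors: "
                + Runtime.getRuntime().availableProcessors());
        run();
        System.out.println("v = " + v);
    }
}
```

Running it with the flags quoted below on a fastdebug build should show
whether a LOCK ADDL follows each volatile store on a given machine.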
Vladimir
David Holmes - Sun Microsystems wrote:
> Vladimir Kozlov said the following on 11/19/09 06:23:
>> 046 MOV [ESI + precise klass HelloWorld: 0x080f2198:Constant:exact *],EBX ! Field VolatileHelloWorld.v
>> 04c MEMBAR-volatile (unnecessary so empty encoding)
>> 04c LOCK ADDL [ESP + #0], 0 ! membar_volatile
>
> Just as an aside, but I find the text "unnecessary so empty encoding" very
> confusing - what exactly is it referring to?
>
> David
>
>>
>> Vladimir
>>
>> Dennis Byrne wrote:
>>> Still no luck on this. There are two "Processor" items under "System
>>> Information"/"System Summary" and Cygwin gives me this:
>>>
>>> $ cat /proc/cpuinfo | grep ^processor | wc -l
>>> 8
>>>
>>> "$jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintOptoAssembly -server
>>> -XX:CompileCommand=print,*HelloWorld.run HelloWorld" yields the
>>> following:
>>>
>>> 01c CMP EBP,#1000000
>>> 022 Jge B13 P=0.000000 C=29521.000000
>>> 022
>>> 028 B2: # B3 <- B1 Freq: 1
>>> 028 MOV EDX,EBP
>>> 02a INC EDX
>>> 02b IMUL EAX,EBP,#333
>>> 031 MOV ECX,#448
>>> 036 MOV EBX,[ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v
>>> 03c SUB EBX,EAX
>>> 03e MOV ECX,#452
>>> 043 MOV EDI,[ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v2
>>> 043
>>> 049 B3: # B5 B4 <- B2 B4 Loop: B3-B4 inner stride: not constant pre of N112 Freq: 2
>>> 049 MOV [ESP + #4],EBX
>>> 04d ADD EAX,EBX
>>> 04f ADD EDI,EAX
>>> 051 MOV ESI,EBP
>>> 053 INC ESI
>>> 054 ADD EDI,#333
>>> 05a MOV EBX,#452
>>> 05f MOV [EBX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *],EDI ! Field HelloWorld.v2
>>> 065 ADD EAX,#333
>>> 06b MOV ECX,#448
>>> 070 MOV [ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *],EAX ! Field HelloWorld.v
>>> 076 CMP ESI,EDX
>>> 078 Jge,s B5 # Loop end P=0.500000 C=29027.000000
>>> 078
>>> 07a B4: # B3 <- B3 Freq: 1
>>> 07a IMUL EAX,EBP,#333
>>> 080 ADD EAX,#333
>>> 086 MOV EBP,ESI
>>> 088 MOV EBX,[ESP + #4]
>>> 08c JMP,s B3
>>> 08c
>>> 08e B5: # B8 B6 <- B3 Freq: 1
>>> 08e MOV ECX,#1000000
>>> 093 SUB ECX,ESI
>>> 095 AND ECX,#-16
>>> 098 ADD ECX,ESI
>>> 09a CMP ESI,ECX
>>> 09c Jl,s B8 P=0.999999 C=-1.000000
>>>
>>> Dennis Byrne
>>>
>>> On Wed, Nov 18, 2009 at 8:52 AM, Dennis Byrne
>>> <dennisbyrne at apache.org> wrote:
>>>> At this point my best guess is that this has something to do with the
>>>> fact that I am running on a Virtual Machine. cat /proc/cpuinfo
>>>> indicates the machine does in fact see two processors. The output I
>>>> get for both -client and -server has no locks and no fences. I will
>>>> run the same class through PrintOptoAssembly outside of VM Workstation
>>>> later in the day in order to build on my case that this has something
>>>> to do with VMWare. Thanks for your time on this.
>>>>
>>>> Dennis Byrne
>>>>
>>>> On Wed, Nov 18, 2009 at 3:21 AM, David Holmes - Sun Microsystems
>>>> <David.Holmes at sun.com> wrote:
>>>>> Tim Bell said the following on 11/18/09 18:22:
>>>>>> Also - are you running your tests on a single-CPU machine? I recall HotSpot
>>>>>> detects that and will elide any n-way synchronization measures if so...
>>>>> Yes, that should have been my first question :-) Thanks Tim. On x86 this
>>>>> stuff is elided if running on only a single core/processor.
>>>>>
>>>>> I just confirmed -client output showing the lock:addl "fences"
>>>>>
>>>>> David
>>>>>
>>>>> ;; block B2 [8, 31]
>>>>>
>>>>> 0xca4f72b8: mov $0xc61f5590,%edx ;...ba90551f c6
>>>>> ; {oop('HelloWorld')}
>>>>> 0xca4f72bd: mov 0x1c0(%edx),%ecx ;...8b8ac001 0000
>>>>> ;*getstatic v
>>>>> ; - HelloWorld::run at 8 (line 10)
>>>>> 0xca4f72c3: add $0x14d,%ecx ;...81c14d01 0000
>>>>> 0xca4f72c9: mov %ecx,0x1c0(%edx) ;...898ac001 0000
>>>>> 0xca4f72cf: lock addl $0x0,(%esp) ;...f0830424 00
>>>>> ;*putstatic v
>>>>> ; - HelloWorld::run at 15 (line 10)
>>>>> 0xca4f72d4: mov 0x1c4(%edx),%ecx ;...8b8ac401 0000
>>>>> ;*getstatic v2
>>>>> ; - HelloWorld::run at 18 (line 11)
>>>>> 0xca4f72da: mov 0x1c0(%edx),%edi ;...8bbac001 0000
>>>>> ;*getstatic v
>>>>> ; - HelloWorld::run at 21 (line 11)
>>>>> 0xca4f72e0: add %edi,%ecx ;...03cf
>>>>> 0xca4f72e2: mov %ecx,0x1c4(%edx) ;...898ac401 0000
>>>>> 0xca4f72e8: lock addl $0x0,(%esp) ;...f0830424 00
>>>>> ;*putstatic v2
>>>>> ; - HelloWorld::run at 25 (line 11)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>> For x86 we should see a StoreLoad barrier after the assignment to v2:
>>>>>>>
>>>>>>> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>>>>>>>
>>>>>>> For -server, printOptoAssembly shows our "mfence" replacement, a lock:addl
>>>>>>> (which is the implementation of membar-volatile()).
>>>>>>>
>>>>>>> [ Got to go get hsdis myself to test printAssembly directly ]
>>>>>> HTH-
>>>>>>
>>>>>> Tim
>>>>
>>>>
>>>> --
>>>> Dennis Byrne
>>>>
>>>
>>>
>>>