mfence on i686 with volatile?
Vladimir Kozlov
Vladimir.Kozlov at Sun.COM
Wed Nov 18 12:23:35 PST 2009
Dennis,
Run the fastdebug java with the following options:
java -XX:+PrintMiscellaneous -XX:+Verbose -version
The "CPU:" line shows the number of cores the VM sees.
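
For a rough cross-check from Java code, Runtime.availableProcessors() reports the processor count the running JVM sees. This is only a sketch: the class name CpuCount is made up here, and whether this is the same number HotSpot consults when it decides to drop the barriers is an assumption.

```java
// Hypothetical helper, not part of the JDK or of the test case in this thread.
class CpuCount {
    // Number of processors the running JVM reports as available.
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("processors visible to the JVM: " + visibleProcessors());
    }
}
```

If this prints 1, the missing lock:addl would be consistent with the single-processor case.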
On Solaris I see the barrier when I declare the variables as volatile:
046 MOV [ESI + precise klass HelloWorld: 0x080f2198:Constant:exact *],EBX ! Field VolatileHelloWorld.v
04c MEMBAR-volatile (unnecessary so empty encoding)
04c LOCK ADDL [ESP + #0], 0 ! membar_volatile
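
For reference, a test class along these lines should be enough to reproduce this. It is a reconstruction, not the original source: the class name, the static int fields v and v2, the run method, the constant 333 and the 1000000 loop bound are taken from the listings in this thread; the loop shape and the main method are assumed.

```java
// Reconstructed sketch of the test case; the original source was not posted.
class VolatileHelloWorld {
    // Volatile stores are what should be followed by the StoreLoad barrier.
    static volatile int v;
    static volatile int v2;

    static void run() {
        for (int i = 0; i < 1000000; i++) {
            v += 333;   // getstatic v, add 333, putstatic v
            v2 += v;    // getstatic v2, getstatic v, add, putstatic v2 (wraps around)
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println(v + " " + v2);
    }
}
```

Run it with -XX:+PrintOptoAssembly -server -XX:CompileCommand=print,*VolatileHelloWorld.run and look for the LOCK ADDL after each store to v and v2.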
Vladimir
Dennis Byrne wrote:
> Still no luck on this. There are two "Processor" items under "System
> Information"/"System Summary" and Cygwin gives me this:
>
> $ cat /proc/cpuinfo | grep ^processor | wc -l
> 8
>
> "$jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintOptoAssembly -server
> -XX:CompileCommand=print,*HelloWorld.run HelloWorld" yields the
> following:
>
> 01c CMP EBP,#1000000
> 022 Jge B13 P=0.000000 C=29521.000000
> 022
> 028 B2: # B3 <- B1 Freq: 1
> 028 MOV EDX,EBP
> 02a INC EDX
> 02b IMUL EAX,EBP,#333
> 031 MOV ECX,#448
> 036 MOV EBX,[ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v
> 03c SUB EBX,EAX
> 03e MOV ECX,#452
> 043 MOV EDI,[ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v2
> 043
> 049 B3: # B5 B4 <- B2 B4 Loop: B3-B4 inner stride: not constant pre of N112 Freq: 2
> 049 MOV [ESP + #4],EBX
> 04d ADD EAX,EBX
> 04f ADD EDI,EAX
> 051 MOV ESI,EBP
> 053 INC ESI
> 054 ADD EDI,#333
> 05a MOV EBX,#452
> 05f MOV [EBX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *],EDI ! Field HelloWorld.v2
> 065 ADD EAX,#333
> 06b MOV ECX,#448
> 070 MOV [ECX + precise klass HelloWorld: 0x00a2fc30:Constant:exact *],EAX ! Field HelloWorld.v
> 076 CMP ESI,EDX
> 078 Jge,s B5 # Loop end P=0.500000 C=29027.000000
> 078
> 07a B4: # B3 <- B3 Freq: 1
> 07a IMUL EAX,EBP,#333
> 080 ADD EAX,#333
> 086 MOV EBP,ESI
> 088 MOV EBX,[ESP + #4]
> 08c JMP,s B3
> 08c
> 08e B5: # B8 B6 <- B3 Freq: 1
> 08e MOV ECX,#1000000
> 093 SUB ECX,ESI
> 095 AND ECX,#-16
> 098 ADD ECX,ESI
> 09a CMP ESI,ECX
> 09c Jl,s B8 P=0.999999 C=-1.000000
>
> Dennis Byrne
>
> On Wed, Nov 18, 2009 at 8:52 AM, Dennis Byrne <dennisbyrne at apache.org> wrote:
>> At this point my best guess is that this has something to do with the
>> fact that I am running on a Virtual Machine. cat /proc/cpuinfo
>> indicates the machine does in fact see two processors. The output I
>> get for both -client and -server has no locks and no fences. I will
>> run the same class through PrintOptoAssembly outside of VM Workstation
>> later in the day in order to build on my case that this has something
>> to do with VMWare. Thanks for your time on this.
>>
>> Dennis Byrne
>>
>> On Wed, Nov 18, 2009 at 3:21 AM, David Holmes - Sun Microsystems
>> <David.Holmes at sun.com> wrote:
>>> Tim Bell said the following on 11/18/09 18:22:
>>>> Also - are you running your tests on a single-CPU machine? I recall
>>>> HotSpot detects that and will elide any n-way synchronization measures
>>>> if so...
>>> Yes that should have been my first question :-) Thanks Tim. On x86 this
>>> stuff is elided if running on only a single core/processor.
>>>
>>> I just confirmed -client output showing the lock:addl "fences"
>>>
>>> David
>>>
>>> ;; block B2 [8, 31]
>>>
>>> 0xca4f72b8: mov $0xc61f5590,%edx ;...ba90551f c6
>>> ; {oop('HelloWorld')}
>>> 0xca4f72bd: mov 0x1c0(%edx),%ecx ;...8b8ac001 0000
>>> ;*getstatic v
>>> ; - HelloWorld::run at 8 (line 10)
>>> 0xca4f72c3: add $0x14d,%ecx ;...81c14d01 0000
>>> 0xca4f72c9: mov %ecx,0x1c0(%edx) ;...898ac001 0000
>>> 0xca4f72cf: lock addl $0x0,(%esp) ;...f0830424 00
>>> ;*putstatic v
>>> ; - HelloWorld::run at 15 (line 10)
>>> 0xca4f72d4: mov 0x1c4(%edx),%ecx ;...8b8ac401 0000
>>> ;*getstatic v2
>>> ; - HelloWorld::run at 18 (line 11)
>>> 0xca4f72da: mov 0x1c0(%edx),%edi ;...8bbac001 0000
>>> ;*getstatic v
>>> ; - HelloWorld::run at 21 (line 11)
>>> 0xca4f72e0: add %edi,%ecx ;...03cf
>>> 0xca4f72e2: mov %ecx,0x1c4(%edx) ;...898ac401 0000
>>> 0xca4f72e8: lock addl $0x0,(%esp) ;...f0830424 00
>>> ;*putstatic v2
>>> ; - HelloWorld::run at 25 (line 11)
>>>
>>>
>>>
>>>
>>>>> For x86 we should see a StoreLoad barrier after the assignment to v2:
>>>>>
>>>>> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>>>>>
>>>>> For -server, PrintOptoAssembly shows our "mfence" replacement, a lock:addl
>>>>> (which is the implementation of membar-volatile).
>>>>>
>>>>> [ Got to go get hsdis myself to test printAssembly directly ]
>>>> HTH-
>>>>
>>>> Tim
>>
>>
>> --
>> Dennis Byrne
>>
>
>
>