mfence on i686 with volatile?
Dennis Byrne
dennisbyrne at apache.org
Wed Nov 18 12:29:51 PST 2009
Please see below,
C:\work\jdk_x86>jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintMiscellaneous -XX:+Verbose -version
VM option '+PrintMiscellaneous'
VM option '+Verbose'
[SafePoint Polling address: 0x003c0000]
[Memory Serialize Page address: 0x003e0000]
Logical CPUs per core: 1
UseSSE=4
Allocation: PREFETCHNTA 256, 3 lines with step 64 bytes
CPU:total 8 (4 cores per cpu, 1 threads per core) family 6 model 23
stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1
Warning: SDK 1.6 Unsafe.prefetchRead/Write not found.
Warning: SDK 1.7 Unsafe.copyMemory not found.
java version "1.6.0_18-ea-fastdebug"
Java(TM) SE Runtime Environment (build 1.6.0_18-ea-fastdebug-b04)
Java HotSpot(TM) Client VM (build 16.0-b11-fastdebug, mixed mode)
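
For anyone following along without the source: the test class itself never appears in this thread, so here is a minimal sketch of what it presumably looks like, reconstructed from the bytecode indices and constants in the disassembly quoted below (static fields v and v2, an increment of 333, a loop bound of 1000000). The field types, the loop shape and the main method are assumptions, not the original file.

```java
// Hypothetical reconstruction of the test class; not the original source.
class HelloWorld {
    // Declared volatile so that each store should be followed by a
    // StoreLoad barrier when the VM believes it is on a multiprocessor.
    static volatile int v;
    static volatile int v2;

    // Assumed loop shape: the bound 1000000 and the constant 333 are
    // taken from the disassembly in this thread.
    static void run() {
        for (int i = 0; i < 1000000; i++) {
            v += 333;   // getstatic v, add 333, putstatic v
            v2 += v;    // getstatic v2, getstatic v, add, putstatic v2
        }
    }

    public static void main(String[] args) {
        // Processor count as reported through the Java API; a value of 1
        // would be a hint that the barriers are being elided.
        System.out.println("processors: "
                + Runtime.getRuntime().availableProcessors());
        run();
        System.out.println("v=" + v + " v2=" + v2);
    }
}
```

The compound updates are not atomic, but that does not matter for a single-threaded run whose only purpose is to get run() compiled and printed.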
Dennis Byrne
On Wed, Nov 18, 2009 at 2:23 PM, Vladimir Kozlov
<Vladimir.Kozlov at sun.com> wrote:
> Dennis,
>
> run the fastdebug java with the following options
>
> java -XX:+PrintMiscellaneous -XX:+Verbose -version
>
> the "CPU:" line shows the number of cores the VM sees.
>
> On Solaris I see the barrier when I declare the variables as volatile:
>
> 046 MOV [ESI + precise klass HelloWorld: 0x080f2198:Constant:exact
> *],EBX ! Field VolatileHelloWorld.v
> 04c MEMBAR-volatile (unnecessary so empty encoding)
> 04c LOCK ADDL [ESP + #0], 0 ! membar_volatile
>
> Vladimir
>
> Dennis Byrne wrote:
>>
>> Still no luck on this. There are two "Processor" items under "System
>> Information"/"System Summary" and Cygwin gives me this:
>>
>> $ cat /proc/cpuinfo | grep ^processor | wc -l
>> 8
>>
>> "$jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintOptoAssembly -server
>> -XX:CompileCommand=print,*HelloWorld.run HelloWorld" yields the
>> following:
>>
>> 01c CMP EBP,#1000000
>> 022 Jge B13 P=0.000000 C=29521.000000
>> 022
>> 028 B2: # B3 <- B1 Freq: 1
>> 028 MOV EDX,EBP
>> 02a INC EDX
>> 02b IMUL EAX,EBP,#333
>> 031 MOV ECX,#448
>> 036 MOV EBX,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v
>> 03c SUB EBX,EAX
>> 03e MOV ECX,#452
>> 043 MOV EDI,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *] # int ! Field HelloWorld.v2
>> 043
>> 049 B3: # B5 B4 <- B2 B4 Loop: B3-B4 inner stride: not constant pre
>> of N112 Freq: 2
>> 049 MOV [ESP + #4],EBX
>> 04d ADD EAX,EBX
>> 04f ADD EDI,EAX
>> 051 MOV ESI,EBP
>> 053 INC ESI
>> 054 ADD EDI,#333
>> 05a MOV EBX,#452
>> 05f MOV [EBX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EDI ! Field HelloWorld.v2
>> 065 ADD EAX,#333
>> 06b MOV ECX,#448
>> 070 MOV [ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EAX ! Field HelloWorld.v
>> 076 CMP ESI,EDX
>> 078 Jge,s B5 # Loop end P=0.500000 C=29027.000000
>> 078
>> 07a B4: # B3 <- B3 Freq: 1
>> 07a IMUL EAX,EBP,#333
>> 080 ADD EAX,#333
>> 086 MOV EBP,ESI
>> 088 MOV EBX,[ESP + #4]
>> 08c JMP,s B3
>> 08c
>> 08e B5: # B8 B6 <- B3 Freq: 1
>> 08e MOV ECX,#1000000
>> 093 SUB ECX,ESI
>> 095 AND ECX,#-16
>> 098 ADD ECX,ESI
>> 09a CMP ESI,ECX
>> 09c Jl,s B8 P=0.999999 C=-1.000000
>>
>> Dennis Byrne
>>
>> On Wed, Nov 18, 2009 at 8:52 AM, Dennis Byrne <dennisbyrne at apache.org>
>> wrote:
>>>
>>> At this point my best guess is that this has something to do with the
>>> fact that I am running on a Virtual Machine. cat /proc/cpuinfo
>>> indicates the machine does in fact see two processors. The output I
>>> get for both -client and -server has no locks and no fences. I will
>>> run the same class through PrintOptoAssembly outside of VMware Workstation
>>> later in the day in order to build my case that this has something
>>> to do with VMware. Thanks for your time on this.
>>>
>>> Dennis Byrne
>>>
>>> On Wed, Nov 18, 2009 at 3:21 AM, David Holmes - Sun Microsystems
>>> <David.Holmes at sun.com> wrote:
>>>>
>>>> Tim Bell said the following on 11/18/09 18:22:
>>>>>
>>>>> Also - are you running your tests on a single-CPU machine? I recall
>>>>> HotSpot
>>>>> detects that and will elide any n-way synchronization measures if so...
>>>>
>>>> Yes that should have been my first question :-) Thanks Tim. On x86 this
>>>> stuff is elided if running on only a single core/processor.
>>>>
>>>> I just confirmed -client output showing the lock:addl "fences"
>>>>
>>>> David
>>>>
>>>> ;; block B2 [8, 31]
>>>>
>>>> 0xca4f72b8: mov $0xc61f5590,%edx ;...ba90551f c6
>>>> ; {oop('HelloWorld')}
>>>> 0xca4f72bd: mov 0x1c0(%edx),%ecx ;...8b8ac001 0000
>>>> ;*getstatic v
>>>> ; - HelloWorld::run at 8 (line 10)
>>>> 0xca4f72c3: add $0x14d,%ecx ;...81c14d01 0000
>>>> 0xca4f72c9: mov %ecx,0x1c0(%edx) ;...898ac001 0000
>>>> 0xca4f72cf: lock addl $0x0,(%esp) ;...f0830424 00
>>>> ;*putstatic v
>>>> ; - HelloWorld::run at 15 (line 10)
>>>> 0xca4f72d4: mov 0x1c4(%edx),%ecx ;...8b8ac401 0000
>>>> ;*getstatic v2
>>>> ; - HelloWorld::run at 18 (line 11)
>>>> 0xca4f72da: mov 0x1c0(%edx),%edi ;...8bbac001 0000
>>>> ;*getstatic v
>>>> ; - HelloWorld::run at 21 (line 11)
>>>> 0xca4f72e0: add %edi,%ecx ;...03cf
>>>> 0xca4f72e2: mov %ecx,0x1c4(%edx) ;...898ac401 0000
>>>> 0xca4f72e8: lock addl $0x0,(%esp) ;...f0830424 00
>>>> ;*putstatic v2
>>>> ; - HelloWorld::run at 25 (line 11)
>>>>
>>>>
>>>>
>>>>
>>>>>> For x86 we should see a StoreLoad barrier after the assignment to v2:
>>>>>>
>>>>>> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>>>>>>
>>>>>> For -server, PrintOptoAssembly shows our "mfence" replacement, a
>>>>>> lock:addl (which is the implementation of membar-volatile()).
>>>>>>
>>>>>> [ Got to go get hsdis myself to test printAssembly directly ]
>>>>>
>>>>> HTH-
>>>>>
>>>>> Tim
>>>
>>>
>>> --
>>> Dennis Byrne
>>>
>>
>>
>>
>
--
Dennis Byrne