mfence on i686 with volatile?

Dennis Byrne dennisbyrne at apache.org
Wed Nov 18 12:29:51 PST 2009


Please see below:

C:\work\jdk_x86>jdk1.6.0_18\fastdebug\bin\java.exe
-XX:+PrintMiscellaneous -XX:+Verbose -version
VM option '+PrintMiscellaneous'
VM option '+Verbose'
[SafePoint Polling address: 0x003c0000]
[Memory Serialize  Page address: 0x003e0000]
Logical CPUs per core: 1
UseSSE=4
Allocation: PREFETCHNTA 256, 3 lines with step 64 bytes
CPU:total 8 (4 cores per cpu, 1 threads per core) family 6 model 23
stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1

Warning:  SDK 1.6 Unsafe.prefetchRead/Write not found.
Warning:  SDK 1.7 Unsafe.copyMemory not found.
java version "1.6.0_18-ea-fastdebug"
Java(TM) SE Runtime Environment (build 1.6.0_18-ea-fastdebug-b04)
Java HotSpot(TM) Client VM (build 16.0-b11-fastdebug, mixed mode)
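
For context, here is a minimal sketch of the kind of test class under discussion. The original HelloWorld source is not included in this message, so everything in it is an assumption reconstructed from the listings quoted below: the class name, the static volatile fields v and v2, the constant 333 (0x14d in the disassembly) and the loop bound of 1000000 are all inferred, not confirmed.

```java
// Hypothetical reconstruction, not the original test source.
public class VolatileHelloWorld {
    // Each store to a volatile field needs a StoreLoad barrier on x86,
    // which the quoted listings show as "lock addl $0x0,(%esp)".
    static volatile int v;
    static volatile int v2;

    // Loop bound and constant are inferred from the quoted assembly.
    static void run() {
        for (int i = 0; i < 1000000; i++) {
            v += 333;   // getstatic v, add 333, putstatic v
            v2 += v;    // getstatic v2, getstatic v, add, putstatic v2 (wraps around)
        }
    }

    public static void main(String[] args) {
        // The thread notes that the barriers are elided on a single
        // processor, so print what the JVM reports.
        System.out.println("processors=" + Runtime.getRuntime().availableProcessors());
        run();
        System.out.println("v=" + v);
    }
}
```

Running a class like this on a fastdebug build with the PrintOptoAssembly command quoted below should show whether a lock addl follows each volatile store.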

Dennis Byrne

On Wed, Nov 18, 2009 at 2:23 PM, Vladimir Kozlov
<Vladimir.Kozlov at sun.com> wrote:
> Dennis,
>
> run the fastdebug java with the following options:
>
> java -XX:+PrintMiscellaneous -XX:+Verbose -version
>
> The "CPU:" line shows the number of cores the VM sees.
>
> On Solaris I see the barrier when I declare the variables as volatile:
>
> 046     MOV    [ESI + precise klass HelloWorld: 0x080f2198:Constant:exact
> *],EBX ! Field  VolatileHelloWorld.v
> 04c     MEMBAR-volatile (unnecessary so empty encoding)
> 04c     LOCK ADDL [ESP + #0], 0 ! membar_volatile
>
> Vladimir
>
> Dennis Byrne wrote:
>>
>> Still no luck on this.  There are two "Processor" items under "System
>> Information"/"System Summary" and Cygwin gives me this:
>>
>> $ cat /proc/cpuinfo | grep ^processor | wc -l
>> 8
>>
>> "$jdk1.6.0_18\fastdebug\bin\java.exe -XX:+PrintOptoAssembly -server
>> -XX:CompileCommand=print,*HelloWorld.run  HelloWorld" yields the
>> following:
>>
>> 01c     CMP    EBP,#1000000
>> 022     Jge    B13  P=0.000000 C=29521.000000
>> 022
>> 028   B2: #     B3 <- B1  Freq: 1
>> 028     MOV    EDX,EBP
>> 02a     INC    EDX
>> 02b     IMUL   EAX,EBP,#333
>> 031     MOV    ECX,#448
>> 036     MOV    EBX,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *]    # int ! Field HelloWorld.v
>> 03c     SUB    EBX,EAX
>> 03e     MOV    ECX,#452
>> 043     MOV    EDI,[ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *]    # int ! Field HelloWorld.v2
>> 043
>> 049   B3: #     B5 B4 <- B2 B4  Loop: B3-B4 inner stride: not constant pre
>> of N112 Freq: 2
>> 049     MOV    [ESP + #4],EBX
>> 04d     ADD    EAX,EBX
>> 04f     ADD    EDI,EAX
>> 051     MOV    ESI,EBP
>> 053     INC    ESI
>> 054     ADD    EDI,#333
>> 05a     MOV    EBX,#452
>> 05f     MOV    [EBX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EDI ! Field HelloWorld.v2
>> 065     ADD    EAX,#333
>> 06b     MOV    ECX,#448
>> 070     MOV    [ECX + precise klass HelloWorld:
>> 0x00a2fc30:Constant:exact *],EAX ! Field HelloWorld.v
>> 076     CMP    ESI,EDX
>> 078     Jge,s  B5       # Loop end  P=0.500000 C=29027.000000
>> 078
>> 07a   B4: #     B3 <- B3  Freq: 1
>> 07a     IMUL   EAX,EBP,#333
>> 080     ADD    EAX,#333
>> 086     MOV    EBP,ESI
>> 088     MOV    EBX,[ESP + #4]
>> 08c     JMP,s  B3
>> 08c
>> 08e   B5: #     B8 B6 <- B3  Freq: 1
>> 08e     MOV    ECX,#1000000
>> 093     SUB    ECX,ESI
>> 095     AND    ECX,#-16
>> 098     ADD    ECX,ESI
>> 09a     CMP    ESI,ECX
>> 09c     Jl,s  B8  P=0.999999 C=-1.000000
>>
>> Dennis Byrne
>>
>> On Wed, Nov 18, 2009 at 8:52 AM, Dennis Byrne <dennisbyrne at apache.org>
>> wrote:
>>>
>>> At this point my best guess is that this has something to do with the
>>> fact that I am running on a Virtual Machine.  cat /proc/cpuinfo
>>> indicates the machine does in fact see two processors.  The output I
>>> get for both -client and -server has no locks and no fences.  I will
>>> run the same class through PrintOptoAssembly outside of VMware Workstation
>>> later in the day in order to build on my case that this has something
>>> to do with VMware.  Thanks for your time on this.
>>>
>>> Dennis Byrne
>>>
>>> On Wed, Nov 18, 2009 at 3:21 AM, David Holmes - Sun Microsystems
>>> <David.Holmes at sun.com> wrote:
>>>>
>>>> Tim Bell said the following on 11/18/09 18:22:
>>>>>
>>>>> Also - are you running your tests on a single-CPU machine?  I recall
>>>>> HotSpot detects that and will elide any n-way synchronization measures
>>>>> if so...
>>>>
>>>> Yes that should have been my first question :-) Thanks Tim. On x86 this
>>>> stuff is elided if running on only a single core/processor.
>>>>
>>>> I just confirmed -client output showing the lock:addl "fences"
>>>>
>>>> David
>>>>
>>>>  ;;  block B2 [8, 31]
>>>>
>>>>  0xca4f72b8: mov    $0xc61f5590,%edx   ;...ba90551f c6
>>>>                                       ;   {oop('HelloWorld')}
>>>>  0xca4f72bd: mov    0x1c0(%edx),%ecx   ;...8b8ac001 0000
>>>>                                       ;*getstatic v
>>>>                                       ; - HelloWorld::run at 8 (line 10)
>>>>  0xca4f72c3: add    $0x14d,%ecx        ;...81c14d01 0000
>>>>  0xca4f72c9: mov    %ecx,0x1c0(%edx)   ;...898ac001 0000
>>>>  0xca4f72cf: lock addl $0x0,(%esp)     ;...f0830424 00
>>>>                                       ;*putstatic v
>>>>                                       ; - HelloWorld::run at 15 (line 10)
>>>>  0xca4f72d4: mov    0x1c4(%edx),%ecx   ;...8b8ac401 0000
>>>>                                       ;*getstatic v2
>>>>                                       ; - HelloWorld::run at 18 (line 11)
>>>>  0xca4f72da: mov    0x1c0(%edx),%edi   ;...8bbac001 0000
>>>>                                       ;*getstatic v
>>>>                                       ; - HelloWorld::run at 21 (line 11)
>>>>  0xca4f72e0: add    %edi,%ecx          ;...03cf
>>>>  0xca4f72e2: mov    %ecx,0x1c4(%edx)   ;...898ac401 0000
>>>>  0xca4f72e8: lock addl $0x0,(%esp)     ;...f0830424 00
>>>>                                       ;*putstatic v2
>>>>                                       ; - HelloWorld::run at 25 (line 11)
>>>>
>>>>
>>>>
>>>>
>>>>>> For x86 we should see a StoreLoad barrier after the assignment to v2:
>>>>>>
>>>>>> http://gee.cs.oswego.edu/dl/jmm/cookbook.html
>>>>>>
>>>>>> For -server, PrintOptoAssembly shows our "mfence" replacement, a
>>>>>> lock:addl (which is the implementation of membar-volatile).
>>>>>>
>>>>>> [ Got to go get hsdis myself to test PrintAssembly directly ]
>>>>>
>>>>> HTH-
>>>>>
>>>>> Tim
>>>
>>>
>>> --
>>> Dennis Byrne
>>>
>>
>>
>>
>



-- 
Dennis Byrne


More information about the hotspot-dev mailing list