Re: Re: Re: please help me about understanding method "OrderAccess::acquire() and OrderAccess::acquire()"

Thu Oct 20 04:21:52 UTC 2016

Adding back hotspot-dev mailing list ...

On 20/10/2016 1:52 PM, 恶灵骑士 wrote:
>
> Thank you very much, David!
>
> Two more questions.
> 1. A simple one:
> Is there a download link for the OpenJDK 9 source code, like the one for OpenJDK 8?
> I could only find version 8: http://download.java.net/openjdk/jdk8/

I wasn't aware of the jdk8 download link. The main location for the 
sources is the Mercurial repositories:

http://hg.openjdk.java.net/

> 2. About the volatile implementation:
> src/share/vm/c1/c1_LIRGenerator.cpp
> void LIRGenerator::do_StoreField(StoreField* x)
> void LIRGenerator::do_LoadField(LoadField* x)
>
> I added a "cout <<" statement to each of the methods below:
> volatile_field_store
> volatile_field_load
>
> This is my test class:
> class Test extends Thread {
>
>     volatile int keepRunning = 1;
>
>     public void run() {
>         while (keepRunning == 10000000) {
>         }
>         System.out.println(Thread.currentThread().getName() + " Thread terminated.");
>     }
>
>     public static void main(String[] args) {
>         try {
>             Test t = new Test();
>             t.start();
>
>             for (int i = 0; i < 10000000; i++) {
>                 t.keepRunning += 1;
>             }
>         } catch (Exception e) {
>         }
>         System.out.println(Thread.currentThread().getName() + " keepRunning set to false.");
>     }
> }
> Result: the volatile_field_store output is not printed every time. Why?

You added the cout to the C1 compiler code, so it will only show up when 
C1 actually compiles the code - initially your program will execute in the 
interpreter, then eventually the loop will be compiled via 
on-stack-replacement and you will start to see the cout results.

You can try running with -Xcomp to force everything to be compiled on 
startup.

> If I change the 10000000 in
>     for (int i = 0; i < 10000000; i++) {
>         t.keepRunning += 1;
>     }
> to 20,
> then the volatile_field_store output never appears.

With a count of only 20 the loop never becomes hot enough to be compiled.

Cheers,
David

>
> I learned from Google where volatile is implemented,
> and added the cout to verify that it is true,
> but I am confused by the test results.
> Am I wrong somewhere?
>
> Thank you!
>
>
>
> ------------------ Original Message ------------------
> *From:* "David Holmes";<david.holmes at oracle.com>;
> *Sent:* Thursday, October 20, 2016, 11:20 AM
> *To:* "恶灵骑士"<1072213404 at qq.com>; "hotspot-dev
> developers"<hotspot-dev at openjdk.java.net>;
> *Subject:* Re: Re: please help me about understanding method
> "OrderAccess::acquire()and OrderAccess::acquire()"
>
> On 20/10/2016 12:33 PM, 恶灵骑士 wrote:
>>
>> Thank you so much, David!
>> It is my fault for not stating the problem clearly.
>> --------------------------------------------------
>> The openjdk8 code should be:
>> inline void OrderAccess::acquire() {
>>   volatile intptr_t local_dummy;
>> #ifdef AMD64
>>   __asm__ volatile ("movq 0(%%rsp), %0" : "=r" (local_dummy) : : "memory");
>> #else
>>   __asm__ volatile ("movl 0(%%esp),%0" : "=r" (local_dummy) : : "memory");
>> #endif // AMD64
>> }
>>
>> inline void OrderAccess::release() {   // (I copied the wrong name before)
>
> Ah! I should have realized that. :)
>
>>   // Avoid hitting the same cache-line from
>>   // different threads.
>>   volatile jint local_dummy = 0;
>> }
>> --------------------------------------------------
>> 1.
>> Just for OpenJDK 8:
>> You said that "The intent was to produce some code to force a "compiler
>> barrier" so that the acquire() semantics needed on x86 would exist".
>> Are the keywords 'asm volatile' and 'memory' not enough to force
>> a "compiler barrier"?
>> Why is "movq 0(%%rsp), %0" : "=r" (local_dummy) still needed?
>> Also, local_dummy is a volatile variable.
>> If local_dummy had no volatile qualifier, could the code '__asm__ volatile
>> ("movq 0(%%rsp), %0" : "=r" (local_dummy) : : "memory");' still
>> force a "compiler barrier"?
>>
>> And which part of '__asm__ volatile ("movq 0(%%rsp), %0" : "=r"
>> (local_dummy) : : "memory");' forces the "compiler barrier":
>> 'volatile', "movq 0(%%rsp), %0" : "=r" (local_dummy), 'memory',
>> or the three together?
>
> You have to remember that this code has very old origins and that
> compilers have not been very clear when it comes to things like compiler
> barriers and memory ordering, and for a long time we were stuck using
> fairly old compilers. So what we have had in the past is a volatile load
> using asm "memory" to obtain the necessary "compiler barrier" to effect
> the acquire() semantics, and a volatile store to effect the "compiler
> barrier" for the release() semantics.
>
> For JDK 9, with our updated compilers, we have moved to the more direct
> compiler_barrier code for use with gcc:
>
> static inline void compiler_barrier() {
>     __asm__ volatile ("" : : : "memory");
> }
>
> and as you can see no actual store or load is needed any more.
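
To illustrate the compiler_barrier approach quoted above, here is a minimal
sketch, assuming gcc or clang targeting x86. Only compiler_barrier() is taken
from the thread; acquire_sketch, release_sketch, publish, consume, payload and
ready are hypothetical names used for illustration, and mapping release to the
same barrier is an assumption here rather than something stated in this mail.

```cpp
#include <cassert>

// Compiler barrier as quoted in the thread (gcc/clang syntax). The empty asm
// emits no instructions; the "memory" clobber stops the compiler from moving
// memory accesses across this point.
static inline void compiler_barrier() {
    __asm__ volatile ("" : : : "memory");
}

// Hypothetical stand-ins for acquire/release on x86 (assumption: both reduce
// to a compiler barrier because the hardware already keeps the ordering).
static inline void acquire_sketch() { compiler_barrier(); }
static inline void release_sketch() { compiler_barrier(); }

// Hypothetical shared state: a data word and a flag that announces it.
static int payload = 0;
static volatile int ready = 0;

// Writer side: store the data, then the flag. The barrier keeps the compiler
// from sinking the payload store below the flag store.
void publish(int value) {
    payload = value;
    release_sketch();
    ready = 1;
}

// Reader side: load the flag, then the data. The barrier keeps the compiler
// from hoisting the payload load above the flag load.
// Returns -1 if nothing has been published yet.
int consume() {
    if (ready == 0) {
        return -1;
    }
    acquire_sketch();
    return payload;
}
```

The intended use is one thread calling publish() while another polls
consume(). This leans on x86 hardware ordering and on compiler-specific
behaviour, in the same spirit as the code discussed here; portable C++ would
normally use std::atomic instead.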
>
>> -----
>> 2,
>> On x86, is forcing a 'compiler barrier' enough for acquire()?
>> Is a 'hardware barrier' not needed,
>> because x86 already ensures LoadLoad and LoadStore ordering?
>
> x86 has total-store-ordering so no hardware barriers are needed: the
> hardware does not reorder a load with later loads or stores, and all
> stores happen in the written order. As long as the compiler has not
> reordered the program statements, nothing more needs to happen to
> achieve "acquire" semantics.
>
>> ---
>> 3,
>> I am confused about OrderAccess::release().
>> In the windows_x86 version the comment is "// A volatile store has release
>> semantics."
>> In the linux_x86 version the comment is "// Avoid hitting the same
>> cache-line from
>>   // different threads."
>> What is the difference in "volatile jint local_dummy = 0;" between
>> Windows and Linux?
>> For Windows I can find in the C++ documentation
>> that, from VC++ 2005 on, the VC compiler supports volatile acquire/release
>> semantics,
>> but for Linux I cannot find anything about gcc describing volatile
>> acquire/release semantics.
>>
>> How does "volatile jint local_dummy = 0;" work on linux_x86?
>
> As you note VS 2005 explicitly provides acquire/release semantics for
> volatile read/write. So to tell the VS compiler to implement a release()
> operation we simply do:
>
> volatile int dummy = 0;
>
> and it will generate whatever code is needed to achieve release
> semantics (likely no code at all as it can elide the actual write to the
> dummy variable).
>
> For gcc there was no such explicit statement regarding release/acquire.
> So it was based on the observed behaviour and informal commentary on the
> gcc internals. It was found that a volatile write was sufficient to get
> the desired effects.
>
> The comment:
>
> // Avoid hitting the same cache-line from
> // different threads
>
> is unrelated to the acquire/release. In the past these operations wrote
> to a shared static variable. That turned out to be a performance issue
> because of cache contention. So the code was changed to write to a local
> variable instead - and the behaviour of the compiler was verified by
> inspection/observation.
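
The difference between the two styles described above can be sketched as
follows. This is a hypothetical reconstruction for illustration only:
shared_dummy, release_shared_dummy and release_local_dummy are made-up names,
and the older shared-variable form is inferred from the description in this
mail, not copied from the HotSpot sources.

```cpp
#include <cassert>

// Older style (hypothetical reconstruction): every thread that performs a
// release writes to the same static variable, so all cores contend for the
// single cache line holding it.
static volatile int shared_dummy = 0;

inline void release_shared_dummy() {
    shared_dummy = 0;
}

// Style quoted in the thread: each call writes to a variable in its own
// stack frame, so threads do not write to a common cache line. That the
// volatile store also restrains gcc's reordering was established by
// observation, not by a documented guarantee.
inline void release_local_dummy() {
    volatile int local_dummy = 0;
}
```

Both functions perform one volatile store; only the location of the store
differs, which is what the "Avoid hitting the same cache-line" comment is
about.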
>
> Hope that clarifies things.
>
> David
>
>>
>> thank you so so ... much !
>>
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "David Holmes";<david.holmes at oracle.com>;
>> *Sent:* Thursday, October 20, 2016, 6:39 AM
>> *To:* "恶灵骑士"<1072213404 at qq.com>;
>> "hotspot-dev"<hotspot-dev at openjdk.java.net>;
>> *Subject:* Re: please help me about understanding method
>> "OrderAccess::acquire()and OrderAccess::acquire()"
>>
>> On 19/10/2016 10:35 PM, 恶灵骑士 wrote:
>>> src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp
>>> inline void OrderAccess::acquire() {
>>>   volatile intptr_t local_dummy;
>>> #ifdef AMD64
>>>   __asm__ volatile ("movq 0(%%rsp), %0" : "=r" (local_dummy) : : "memory");
>>> #else
>>>   __asm__ volatile ("movl 0(%%esp),%0" : "=r" (local_dummy) : : "memory");
>>> #endif // AMD64
>>> }
>>>
>>>
>>> inline void OrderAccess::acquire() {   ----------  should be release()
>>>   // Avoid hitting the same cache-line from
>>>   // different threads.
>>>   volatile jint local_dummy = 0;
>>> }
>>
>> As Kim stated these are old implementations. The intent was to produce
>> some code to force a "compiler barrier" so that the acquire() semantics
>> needed on x86 would exist - which is just a compiler barrier. The new
>> code relies on a more direct gcc technique:
>>
>> // A compiler barrier, forcing the C++ compiler to invalidate all memory
>> // assumptions
>> static inline void compiler_barrier() {
>>    __asm__ volatile ("" : : : "memory");
>> }
>> inline void OrderAccess::acquire()    { compiler_barrier(); }
>>
>>>
>>> I have a few questions:
>>> 1. Does gcc support acquire/release semantics for the C++ keyword 'volatile'?
>>>
>>>
>>> If the answer to question 1 is 'yes', then I can understand the
>>> implementation of method 'OrderAccess::acquire()'.
>>
>> Not sure exactly what you mean, but we do not rely on any C++ memory
>> model operations in the hotspot code - acquire/release semantics - we
>> just use volatile to flag variables that should not be optimized and use
>> OrderAccess operations to explicitly enforce any memory ordering
>> requirements.
>>
>>>
>>> 2. About the part '__asm__ volatile ("movq 0(%%rsp), %0" : "=r"
>>> (local_dummy) : : "memory");':
>>> the 'volatile' prevents compiler optimization,
>>> the 'memory' ensures the compiler does no reordering,
>>
>> Basically yes that was the intent. The implementation has changed over
>> time.
>>
>>>
>>> Then what about "movq 0(%%rsp), %0" : "=r" (local_dummy)? What is the
>>> effect of this part? And local_dummy was declared 'volatile'; is that
>>> necessary?
>>
>> That was to do the actual assignment to which the volatile and memory
>> would apply. This part is no longer necessary.
>>
>> David
>>
>>>
>>> thank you so so much!
>>>
