RFR (S): 8029957: PPC64 (part 213): cppInterpreter: memory ordering for object initialization
David Holmes
david.holmes at oracle.com
Mon Jan 20 18:49:01 PST 2014
On 20/01/2014 11:41 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> I understand your arguments and basically agree with them.
> If the serialization page does not work on PPC, your solution
> 1) is best.
>
> But I'm not sure why there should be a link between TSO and whether
> the serialization page trick works. Second depends, as you say,
> on the OS implementation.
My limited understanding is that on RMO-based systems the requirements for:
"Synchronization based on page-protections - mprotect()"
as described in:
http://home.comcast.net/~pjbishop/Dave/Asymmetric-Dekker-Synchronization.txt
may not always hold. Dave Dice would need to provide more details if needed.
Cheers,
David
> I would assume that the write to the serialization page causes
> the OS to generate a new TLB entry or the like, involving the
> use of a ptwsync instruction which would be fine. But we are
> investigating this.
>
> We are also experimenting with a small regression test to find
> out whether the serialization page ever fails.
>
> Best regards,
> Goetz.
>
>
>
>
>
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Montag, 20. Januar 2014 03:45
> To: Lindenmaier, Goetz; Vladimir Kozlov
> Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev Source Developers'
> Subject: Re: RFR (S): 8029957: PPC64 (part 213): cppInterpreter: memory ordering for object initialization
>
> On 17/01/2014 11:30 PM, Lindenmaier, Goetz wrote:
>> Hi,
>>
>> I had a look at the first part of this issue: Whether StoreStore
>> is necessary in the interpreter. Let's for now assume the serialization
>> page mechanism works on PPC.
>>
>> In the state transition leaving the VM state, which is executed in the
>> destructor, ThreadStateTransition::transition() is called, which executes
>> if (UseMembar) {
>> OrderAccess::fence();
>> } else {
>> os::write_memory_serialize_page(thread);
>> }
>>
>> os:: write_memory_serialize_page() can not be considered a proper
>> MemBar, as it only serializes if another thread poisoned the page.
>> Thus it does not qualify to order the initialization and the publishing
>> of the object.
>>
>> You are right, if UseMembar is true, the StoreStore in the interpreter
>> is superfluous. We could guard the StoreStores in the interpreter by
>> !UseMembar.
>
> My understanding, from our existing non-TSO system ports, is that the
> present assumption is that either:
>
> a) you have a TSO system, in which case you are probably using the
> serialization page, but you don't need any barrier to enforce ordering
> anyway; or
>
> b) you don't have a TSO system, you are using UseMembar==true and so you
> get a full fence inserted that enforces the ordering anyway.
>
> So the ordering requirements are satisfied by piggy-backing on the
> UseMembar setting that comes from the thread state transition code,
> which forms part of the "runtime entry" code. That's not to say that you
> will necessarily find this applied consistently in all places where it
> might be applied - nor will you necessarily find that this is common
> knowledge amongst VM engineers.
>
> Technically the storeStore barriers could be conditional on !UseMembar
> but that is redundant in the current usage.
>
>> But then again, one is to order the publishing of the thread states,
>> the other to enforce some Java semantics. I don't know whether everybody
>> who changes in one place is aware of both issues. But if you want to,
>> I'll add a !UseMembar in the interpreter.
>
> Here are my preferred options in order:
>
> 1. Set UseMembar==true on PPC64 and drop these new storeStore barriers -
> rely on the piggy-backing effect.
>
> 2. Conditionalize the new storeStore barriers on !UseMembar. This
> unfortunately penalizes all platforms with a runtime check.
>
> 3. Add the storeStores unconditionally. This penalizes platforms that
> set UseMembar==true as we will now get two fences at runtime.
>
> I know we're talking about the interpreter here so performance is not
> exactly critical, but still ...
>
>> Maybe it would be a good idea
>> to document the double use in interfaceSupport.cpp, too. And maybe
>> add an assertion of some kind.
>
> interfaceSupport doesn't know that other code piggy-backs on the fact
> state-transitions have full fences when UseMembar is true. If it is
> documented anywhere it should be in the interpreter (and any other
> places that makes the same assumption) - something like:
>
> // On non-TSO systems there can be additional ordering constraints
> // between Java-level actions (such as allocation and constructor
> // invocation) that in principle need explicit memory barriers.
> // However, on many non-TSO systems the thread-state transition logic
> // in the IRT_ENTRY code will insert a full fence due to the use of
> // UseMembar==true, which provides the necessary ordering guarantees.
>
>>
>> We're digging into the other issue currenty, whether the serialization page
>> works on ppc. We understand your concerns and have no simple answer to
>> it right now. At least, in our VM and in the port there are no known problems
>> with the state transitions.
>
> Even if the memory serialization page does not work, in a guaranteed
> sense, on PPC-AIX, it is extremely unlikely that testing would expose
> this. Also note that the memory serialization page behaviour is more a
> function of the OS - so it may be that AIX is different to linux in that
> regard.
>
> Cheers,
> David
> -----
>
>> Best regards,
>> Goetz.
>>
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Donnerstag, 16. Januar 2014 19:16
>> To: David Holmes; Lindenmaier, Goetz
>> Cc: 'ppc-aix-port-dev at openjdk.java.net'; 'hotspot-dev Source Developers'
>> Subject: Re: RFR (S): 8029957: PPC64 (part 213): cppInterpreter: memory ordering for object initialization
>>
>> Changes are in C++ Interpreter so it does not affect Oracle VM.
>> But David has point here. I would like to hear the explanation too.
>>
>> BTW, I see that for ppc64:
>>
>> src/cpu/ppc/vm//globals_ppc.hpp:define_pd_global(bool, UseMembar, false);
>>
>> as result write_memory_serialize_page() is used in
>> ThreadStateTransition::transition().
>>
>> Is it not enough on PPC64?
>>
>> Thanks,
>> Vladimir
>>
>> On 1/15/14 9:30 PM, David Holmes wrote:
>>> Can I get some response on this please - specifically the redundancy wrt
>>> IRT_ENTRY actions.
>>>
>>> Thanks,
>>> David
>>>
>>> On 20/12/2013 11:55 AM, David Holmes wrote:
>>>> Still catching up ...
>>>>
>>>> On 11/12/2013 9:46 PM, Lindenmaier, Goetz wrote:
>>>>> Hi,
>>>>>
>>>>> this change adds StoreStore barriers after object initialization and
>>>>> after constructor calls in the C++ interpreter. This assures no
>>>>> uninitialized
>>>>> objects or final fields are visible.
>>>>> http://cr.openjdk.java.net/~goetz/webrevs/8029957-0-moci/
>>>>
>>>> The InterpreterRuntime calls are all IRT_ENTRY points which will utilize
>>>> thread state transitions that already include a full "fence" so the
>>>> storestore barriers are redundant in those cases.
>>>>
>>>> The fastpath _new storestore seems okay.
>>>>
>>>> I don't know how handle_return gets used to know if it is reasonable or
>>>> not.
>>>>
>>>> I was trying, unsuccessfully, to examine the same code in the
>>>> templateInterpreter to see how it handles these cases as it naturally
>>>> has the same object-initialization-safety requirements (though these can
>>>> be handled in a number of different ways other than an unconditional
>>>> storestore barrier at the end of the initialization and construction
>>>> phases.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> Please review and test this change.
>>>>>
>>>>> Best regards,
>>>>> Goetz.
>>>>>
More information about the ppc-aix-port-dev
mailing list