RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

David Holmes david.holmes at oracle.com
Fri Oct 14 06:15:43 UTC 2016

On 11/10/2016 12:30 AM, Hiroshi H Horii wrote:
> Hi Thomas, David, and all,
>> I think you intended to modify cmpxchg_pre_membar not
>> cmpxchg_post_membar!
> The previous patch will change only behavior of cmpxchg_pre_membar. But

No it changed the post-membar:


   inline void cmpxchg_post_membar(cmpxchg_memory_order order) {
!   if (order == memory_order_conservative) {
       __asm__ __volatile__ (
         /* fence */

but I did get confused in what I wrote previously. Given the release 
must come before the store the pre barrier must be the one that does 
that - as per latest code.

> the patch is not good to be reviewed (it was not obvious) and Martin
> suggested me to use lwsync rather than sync.
> I created a new webrev. This webrev includes all points that David and
> Thomas pointed also.
> http://cr.openjdk.java.net/~horii/8154736/webrev.05/
> With this change, callers of copy_to_survivor_space can safely touch
> fields of returned obj because OrderAccess::acquire() is called in
> copy_to_survivor_space when CAS fails.

So the intent is that the acquire() pairs with the release() semantics 
of the cmpxchg store that succeeded in the other thread. That makes 
sense, though I really have to question the trade-off in code complexity 
and understandability against any performance gain due to a slight 
reduction in the barrier strengths. Do you have any metrics on this 
latest version?

>> Changes in shared code must be algorithmically correct on all platforms.
>> Not just "it will work fine today".
>> Given all then work being done to add missing barriers, removing them
>> must come with a detailed analysis establishing the safety of doing so.
>> And I am not seeing that here.
> The latest codes in the repository are missing some calls of
> OrderAccess::acquire() before touching fileds of new_obj or
> o->forwardee() in PSPromotionManager::copy_and_push_safe_barrier and
> copy_to_survivor_space respectivey. I believe, this webrev correct them,
> also.
> Some methods call forwardee(). However, they don't toruch fields of
> forwardee while copying survived objects to a survivor space.
>   PSMarkSweepDecorator::compact()
>   PSPromotionManager::process_array_chunk()
>   PSPromotionManager::claim_or_forward_internal_depth()

Focusing on the code for now ...


In cmpxchg_post_membar I think it is preferable to maintain the existing 
default of a full fence if not specifically a "release" or "relaxed" 
operation ie:

  inline void cmpxchg_post_membar(cmpxchg_memory_order order) {
    if (order == memory_order_release) {
      // no post membar
    } else if (order != memory_order_relaxed) {
      __asm__ __volatile__ (
        /* fence */

as is done for the pre-membar. If nothing else pre-membar and 
post-membar should be consistent in their approach.



507     // call acquire for reading fields of obj in callers

May I suggest:

// acquire() by cas loser is needed to pair with 'release' of cas winner
// so we can safely access data (eg. fields of obj)



258       // call acquire for reading fields of new_obj in callers
264     // call acquire for reading fields of new_obj in callers

Same as above.

  264     // call acquire for reading fields of new_obj in callers
  265     OrderAccess::acquire();

If I'm reading this right this is the else part of:

  119   // The same test as "o->is_forwarded()"
  120   if (!test_mark->is_marked()) {

and it less clear what the acquire() is pairing with, but presumably it 
is still the release of a successful cas_forward_to. But given the 
isolation of this code from the modified cas operation I have to wonder 
about performance again - how often will we take this path with the new 
barrier, compared to the paths with the now modified cas operations?


Overall the way this proposal has been presented does not instill me 
with great confidence about its correctness, or performance benefit:

1. Replace strong cas with a relaxed cas
Issue: accessing obj fields in logging statement may not be safe.
2. Remove access to obj fields in logging statements
Issue: callee access to obj fields is also unsafe
3. Return NULL on failed cas paths
Issue: changes overall semantics and correctness of returning NULL is 
very unclear.
4. Go back to previous code.
Issue: callee access to obj fields is also unsafe
Suggestion from Kim: fully relaxed seems unsafe but using "release" 
semantics might be okay [this hypothesis warranted detailed discussion 
but there was none]
5. replace relaxed cas with release cas
6. Add in acquire() on cas-losing paths, and general access to forwardee

Aside: we can presumably add back the logging statement in the losing 
path now we have the acquire() in place.

So far the only justification for making these changes to the GC code 
come from the April discussion [1] where it was stated simply that:

"We've looked at the proposed changes and we are pretty sure that the
cmpxchg done during copy_to_survivor_space in the parallel GC doesn't
require the full fence/acquire semantics." [Martin & Volker]

Reading back through all the emails, including the ones in April, I 
_think_ part of the reasoning here is that we're not doing a CAS that 
publishes a new object that was just created, but that we have 
previously created that object using a full CAS and are now only 
updating the markword of another object with a forwarding pointer. The 
second cas would not need full fence semantics as the other object is 
already visible. However I am not a GC expert and other comments by GC 
folk suggest that is not in fact the case, or at least is not 
necessarily always the case. So I can not establish that what is being 
proposed is correct.

I think the GC experts need to have a discussion to resolve things to 
their mutual satisfaction.




> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo

More information about the hotspot-runtime-dev mailing list