enhancement of cmpxchg and copy_to_survivor for ppc64

Carsten Varming varming at gmail.com
Mon Apr 18 13:01:47 UTC 2016


Dear Hiroshi and David,

I took a brief look at the patch and the idea looks good to me, in fact, a
bunch of the garbage collectors could benefit from similar changes (G1 and
ParNew both use oopDesc::forward_to_atomic which does not call
cas_set_mark, and they could both benefit from the weaker semantics).

An important question is: Should the shared parts of hotspot move towards
weaker memory models? If yes, then everybody should review code assuming
the weaker semantics. If no, then there really isn't room for patches like
this one :(.

The patch needs some work (IMHO). The name cas_set_mark must be renamed to
reflect the weaker semantics, and the ifdefs in the shared code should be
removed. All platforms should be changed as the Atomic API is extended, and
platforms with weaker memory models should implement the weaker semantics.
Finally, oopDesc::forward_to_atomic should use the weaker semantics,
preferable by calling cas_set_mark instead of calling into Atomic directly.

BTW. what is going on in oopDesc::forward_to_atomic? If curMark != oldMark,
then curMark must have been marked (a similar fact is checked by
"assert(is_forwarded())"
(modulo a short race)), oldMark is set to curMark and then
oldMark->is_marked is checked in the look condition. Why is there a loop?
It seems much simpler if the assert is changed to a
"guarantee(curMark->is_marked)" and the loop is changed to an if statement.
Oh, and an assert is probably sufficient.

My 2 cents,
Carsten

On Mon, Apr 18, 2016 at 12:38 AM, David Holmes <david.holmes at oracle.com>
wrote:

> Hi Hiroshi,
>
> On 18/04/2016 12:15 PM, Hiroshi H Horii wrote:
>
>> Hi David,
>>
>> Thank you for your replying and sorry that I didn't append my diff file
>> when the discussion was forwarded to this mailing list. I appended my
>> original diff file (hg diff -g) to this mail.
>>
>>  > It is fine for ppc to have variations of cmpxchg with different memory
>>  > barrier semantics, but the shared API must not be affected as there is
>> a
>>  > requirement that the basic form of this operation provide "full
>>  > bi-directional fence" semantics. Note that these semantics are not in
>>  > place to fulfill Java Memory Model requirements, but are an internal
>>  > contract in hotspot.
>>
>> Sure. Probably, it is better for me to modify my patch because it
>> changes the internal contract. I will create a new patch that adds new
>> cmpxchg functions for ppc.
>>
>
> I think this is only usable from PPC specific code, not from the shared
> code as per your original patch. The oopDesc::cas_set_mark may be written
> to expect the full bi-directional fence that is required by the atomic.hpp
> contract. If we break that contract we would have to prove correctness
> along all code paths using that code - well the ppc64 folk would have to do
> that :). But I would object to the platform-specific code in the shared
> file - sorry.
>
> Thanks,
> David
>
>  > Also can you get someone to host the webrev
>>  > for you on cr.openjdk.java.net? Or else include the diff in the bug
>> report.
>>
>> I will ask someone to create webrev after my next patch is created.
>>
>>
>>
>> Regards,
>> Hiroshi
>> -----------------------
>> Hiroshi Horii, Ph.D.
>> IBM Research - Tokyo
>>
>>
>> David Holmes <david.holmes at oracle.com> wrote on 04/16/2016 16:43:20:
>>
>>  > From: David Holmes <david.holmes at oracle.com>
>>  > To: Christian Thalinger <christian.thalinger at oracle.com>, Hiroshi H
>>  > Horii/Japan/IBM at IBMJP
>>  > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>, ppc-aix-port-
>>  > dev at openjdk.java.net, hotspot-runtime-dev at openjdk.java.net
>>  > Date: 04/16/2016 16:46
>>  > Subject: Re: enhancement of cmpxchg and copy_to_survivor for ppc64
>>
>>  >
>>  > Hi Hiroshi,
>>  >
>>  > As the diff file does not survive the mail process I can't see the
>>  > actual proposed changes. There doesn't seem to be a bug for this so
>>  > could you please file one. Also can you get someone to host the webrev
>>  > for you on cr.openjdk.java.net? Or else include the diff in the bug
>> report.
>>  >
>>  > It is fine for ppc to have variations of cmpxchg with different memory
>>  > barrier semantics, but the shared API must not be affected as there is
>> a
>>  > requirement that the basic form of this operation provide "full
>>  > bi-directional fence" semantics. Note that these semantics are not in
>>  > place to fulfill Java Memory Model requirements, but are an internal
>>  > contract in hotspot.
>>  >
>>  > Thanks,
>>  > David
>>  >
>>  > On 12/04/2016 3:59 AM, Christian Thalinger wrote:
>>  > > [This should be on hotspot-runtime-dev.  BCC’ing
>> hotspot-compiler-dev.]
>>  > >
>>  > >> On Apr 8, 2016, at 12:53 AM, Hiroshi H Horii <HORII at jp.ibm.com>
>> wrote:
>>  > >>
>>  > >> Dear all:
>>  > >>
>>  > >> Can I please request reviews for the following change?
>>  > >> This change was created for JDK 9 and ppc64.
>>  > >>
>>  > >> Description:
>>  > >> This change adds options of compare-and-exchange for POWER
>> architecture.
>>  > >> As described in atomic_linux_ppc.inline.hpp, the current
>> implementation of
>>  > >> cmpxchg is fence_cmpxchg_acquire. This implementation is useful for
>>  > >> general purposes because twice calls of sync before and after
>> cmpxchg will
>>  > >> keep consistency. However, they sometimes cause overheads because
>>  > >> sync instructions are very expensive in the current POWER chip
>> design.
>>  > >> With this change, callers can explicitly specify to run fence and
>>  > acquire with
>>  > >> two additional bool parameters. Because their default values are
>> "true",
>>  > >> it is not necessary to modify existing cmpxchg calls.
>>  > >>
>>  > >> In addition, with the new parameters of cmpxchg, this change
>> improves
>>  > >> performance of copy_to_survivor in the parallel GC.
>>  > >> copy_to_survivor changes forward pointers by using cmpxchg. This
>>  > >> operation doesn't require any sync instructions, in my
>> understanding.
>>  > >> A pointer is changed at most once in a GC and when cmpxchg fails,
>>  > >> the latest pointer is available for the caller.
>>  > >>
>>  > >> When I evaluated SPECjbb2013 (slightly customized because obsolete
>> grizzly
>>  > >> doesn't support new version format of Java 9), pause time of young
>> GC was
>>  > >> reduced from 10% to 20%.
>>  > >>
>>  > >> Summary of source code changes:
>>  > >>
>>  > >> * src/share/vm/runtime/atomic.hpp
>>  > >> * src/share/vm/runtime/atomic.cpp
>>  > >> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>>  > >>         - Add two arguments of fence and acquire to cmpxchg only
>> for PPC64.
>>  > >>           Though cmpxchg in atomic_linux_ppc.inline.hpp has some
>> branches,
>>  > >>           they are reduced while inlining to callers.
>>  > >>
>>  > >> * src/share/vm/oops/oop.inline.hpp
>>  > >>        - Changed cas_set_mark to call cmpxchg without fence and
>> acquire.
>>  > >>           cas_set_mark is called only by cas_forward_to that is
>>  > called only by
>>  > >>           copy_to_survivor_space and oop_promotion_failed in
>>  > >>           psPromotionManager.
>>  > >>
>>  > >> Code change:
>>  > >>
>>  > >>     Please see an attached diff file that was generated with "hg
>> diff -g"
>>  > >>     under the latest hotspot directory.
>>  > >>
>>  > >> Passed test:
>>  > >>      SPECjbb2013 (customized)
>>  > >>
>>  > >> * I believe some other cmpxchg will be optimized by reducing fence
>>  > >>    or acquire because twice calls of sync are too conservative
>> toimplement
>>  > >>    Java memory model.
>>  > >>
>>  > >>
>>  > >>
>>  > >> Regards,
>>  > >> Hiroshi
>>  > >> -----------------------
>>  > >> Hiroshi Horii, Ph.D.
>>  > >> IBM Research - Tokyo
>>  > >>
>>  > >
>>  >
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20160418/de7f0c48/attachment.html>


More information about the ppc-aix-port-dev mailing list