enhancement of cmpxchg and copy_to_survivor for ppc64
David Holmes
david.holmes at oracle.com
Tue Apr 19 00:03:39 UTC 2016
Hi Carsten,
On 18/04/2016 11:01 PM, Carsten Varming wrote:
> Dear Hiroshi and David,
>
> I took a brief look at the patch and the idea looks good to me, in fact,
> a bunch of the garbage collectors could benefit from similar changes (G1
> and ParNew both use oopDesc::forward_to_atomic which does not call
> cas_set_mark, and they could both benefit from the weaker semantics).
>
> An important question is: Should the shared parts of hotspot move
> towards weaker memory models? If yes, then everybody should review code
> assuming the weaker semantics. If no, then there really isn't room for
> patches like this one :(.
The shared code should be written in a form that is correct on the
weakest memory model we support - always using the appropriate Atomic
and/or OrderAccess operations. On individual platforms those operations
then reduce - sometimes to no-ops - based on the platform semantics.
But that also requires we have a comprehensive set of Atomic and/or
OrderAccess operations to choose from, so that we don't define
algorithms that are unnecessarily excessive in their use of memory
barriers. That said, establishing that a given algorithm+data-structure
can be coded in a more relaxed form is a significant challenge in terms
of "proving" correctness and adequate testing. I expect conservatism to
rule in many places.
More on Volker's response ...
Thanks,
David
> The patch needs some work (IMHO). The name cas_set_mark must be renamed
> to reflect the weaker semantics, and the ifdefs in the shared code
> should be removed. All platforms should be changed as the Atomic API is
> extended, and platforms with weaker memory models should implement the
> weaker semantics. Finally, oopDesc::forward_to_atomic should use the
> weaker semantics, preferable by calling cas_set_mark instead of calling
> into Atomic directly.
>
> BTW. what is going on in oopDesc::forward_to_atomic? If curMark !=
> oldMark, then curMark must have been marked (a similar fact is checked
> by "assert(is_forwarded())" (modulo a short race)), oldMark is set to
> curMark and then oldMark->is_marked is checked in the look condition.
> Why is there a loop? It seems much simpler if the assert is changed to a
> "guarantee(curMark->is_marked)" and the loop is changed to an if
> statement. Oh, and an assert is probably sufficient.
>
> My 2 cents,
> Carsten
>
> On Mon, Apr 18, 2016 at 12:38 AM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
> Hi Hiroshi,
>
> On 18/04/2016 12:15 PM, Hiroshi H Horii wrote:
>
> Hi David,
>
> Thank you for your replying and sorry that I didn't append my
> diff file
> when the discussion was forwarded to this mailing list. I
> appended my
> original diff file (hg diff -g) to this mail.
>
> > It is fine for ppc to have variations of cmpxchg with
> different memory
> > barrier semantics, but the shared API must not be affected
> as there is a
> > requirement that the basic form of this operation provide "full
> > bi-directional fence" semantics. Note that these semantics
> are not in
> > place to fulfill Java Memory Model requirements, but are an
> internal
> > contract in hotspot.
>
> Sure. Probably, it is better for me to modify my patch because it
> changes the internal contract. I will create a new patch that
> adds new
> cmpxchg functions for ppc.
>
>
> I think this is only usable from PPC specific code, not from the
> shared code as per your original patch. The oopDesc::cas_set_mark
> may be written to expect the full bi-directional fence that is
> required by the atomic.hpp contract. If we break that contract we
> would have to prove correctness along all code paths using that code
> - well the ppc64 folk would have to do that :). But I would object
> to the platform-specific code in the shared file - sorry.
>
> Thanks,
> David
>
> > Also can you get someone to host the webrev
> > for you on cr.openjdk.java.net <http://cr.openjdk.java.net>?
> Or else include the diff in the bug
> report.
>
> I will ask someone to create webrev after my next patch is created.
>
>
>
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
>
> David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote on 04/16/2016 16:43:20:
>
> > From: David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> > To: Christian Thalinger <christian.thalinger at oracle.com
> <mailto:christian.thalinger at oracle.com>>, Hiroshi H
> > Horii/Japan/IBM at IBMJP
> > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com
> <mailto:Tim_Ellison at uk.ibm.com>>, ppc-aix-port-
> > dev at openjdk.java.net <mailto:dev at openjdk.java.net>,
> hotspot-runtime-dev at openjdk.java.net
> <mailto:hotspot-runtime-dev at openjdk.java.net>
> > Date: 04/16/2016 16:46
> > Subject: Re: enhancement of cmpxchg and copy_to_survivor for
> ppc64
>
> >
> > Hi Hiroshi,
> >
> > As the diff file does not survive the mail process I can't
> see the
> > actual proposed changes. There doesn't seem to be a bug for
> this so
> > could you please file one. Also can you get someone to host
> the webrev
> > for you on cr.openjdk.java.net <http://cr.openjdk.java.net>?
> Or else include the diff in the bug
> report.
> >
> > It is fine for ppc to have variations of cmpxchg with
> different memory
> > barrier semantics, but the shared API must not be affected
> as there is a
> > requirement that the basic form of this operation provide "full
> > bi-directional fence" semantics. Note that these semantics
> are not in
> > place to fulfill Java Memory Model requirements, but are an
> internal
> > contract in hotspot.
> >
> > Thanks,
> > David
> >
> > On 12/04/2016 3:59 AM, Christian Thalinger wrote:
> > > [This should be on hotspot-runtime-dev. BCC’ing
> hotspot-compiler-dev.]
> > >
> > >> On Apr 8, 2016, at 12:53 AM, Hiroshi H Horii
> <HORII at jp.ibm.com <mailto:HORII at jp.ibm.com>> wrote:
> > >>
> > >> Dear all:
> > >>
> > >> Can I please request reviews for the following change?
> > >> This change was created for JDK 9 and ppc64.
> > >>
> > >> Description:
> > >> This change adds options of compare-and-exchange for POWER
> architecture.
> > >> As described in atomic_linux_ppc.inline.hpp, the current
> implementation of
> > >> cmpxchg is fence_cmpxchg_acquire. This implementation is
> useful for
> > >> general purposes because twice calls of sync before and after
> cmpxchg will
> > >> keep consistency. However, they sometimes cause overheads
> because
> > >> sync instructions are very expensive in the current POWER
> chip design.
> > >> With this change, callers can explicitly specify to run
> fence and
> > acquire with
> > >> two additional bool parameters. Because their default
> values are
> "true",
> > >> it is not necessary to modify existing cmpxchg calls.
> > >>
> > >> In addition, with the new parameters of cmpxchg, this
> change improves
> > >> performance of copy_to_survivor in the parallel GC.
> > >> copy_to_survivor changes forward pointers by using
> cmpxchg. This
> > >> operation doesn't require any sync instructions, in my
> understanding.
> > >> A pointer is changed at most once in a GC and when
> cmpxchg fails,
> > >> the latest pointer is available for the caller.
> > >>
> > >> When I evaluated SPECjbb2013 (slightly customized because
> obsolete
> grizzly
> > >> doesn't support new version format of Java 9), pause time
> of young
> GC was
> > >> reduced from 10% to 20%.
> > >>
> > >> Summary of source code changes:
> > >>
> > >> * src/share/vm/runtime/atomic.hpp
> > >> * src/share/vm/runtime/atomic.cpp
> > >> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
> > >> - Add two arguments of fence and acquire to
> cmpxchg only
> for PPC64.
> > >> Though cmpxchg in atomic_linux_ppc.inline.hpp
> has some
> branches,
> > >> they are reduced while inlining to callers.
> > >>
> > >> * src/share/vm/oops/oop.inline.hpp
> > >> - Changed cas_set_mark to call cmpxchg without
> fence and
> acquire.
> > >> cas_set_mark is called only by cas_forward_to
> that is
> > called only by
> > >> copy_to_survivor_space and oop_promotion_failed in
> > >> psPromotionManager.
> > >>
> > >> Code change:
> > >>
> > >> Please see an attached diff file that was generated
> with "hg
> diff -g"
> > >> under the latest hotspot directory.
> > >>
> > >> Passed test:
> > >> SPECjbb2013 (customized)
> > >>
> > >> * I believe some other cmpxchg will be optimized by
> reducing fence
> > >> or acquire because twice calls of sync are too
> conservative
> toimplement
> > >> Java memory model.
> > >>
> > >>
> > >>
> > >> Regards,
> > >> Hiroshi
> > >> -----------------------
> > >> Hiroshi Horii, Ph.D.
> > >> IBM Research - Tokyo
> > >>
> > >
> >
>
>
More information about the ppc-aix-port-dev
mailing list