RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
David Holmes
david.holmes at oracle.com
Fri Sep 30 11:12:27 UTC 2016
On 30/09/2016 8:17 PM, Hiroshi H Horii wrote:
> Dear David, and Dan,
>
> Thank you for your comments.
>
>> In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:
>> 266 the log line reads data from the forwardee even when the CAS
>> fails. I believe those reads will be unsafe without barriers after
>> the copy of the content of the object.
>> hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288
>> same problem as in line 266
>
> Can we use o->size() or new_obj_size instead of new_obj->size()?
>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>
> In my understanding, while copying objects to a survivor space, if a
> thread creates a new_obj and sets a pointer with CAS, the other threads
> can touch the new_obj after the thread calls push_contents(new_obj)
> (Line: 239). In push_contents, OrderAccess::release_store is called
> before the object is pushed as a task onto the work-stealing deque
> (taskqueue.inline.hpp). If another thread then reads the task, the whole
> copy of new_obj is safe for it to read.
I'm not familiar with the larger picture of the GC protocols here, but
just looking at this code fragment in isolation: if the CAS fails we read
o->forwardee() to set new_obj. That in itself is fine because we're
reading the field that we were testing with the CAS. But we could then
dereference new_obj before the thread that won the CAS calls
push_contents; and even if it is after push_contents, we have not done an
acquire to pair with the release-store in push_contents.
So I'm really not seeing how we can use a barrier-less CAS here.
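
To make the concern concrete, here is a minimal sketch in standard C++
(std::atomic), not HotSpot code. Obj and forward are hypothetical names,
and the acq_rel/acquire orderings are one assumed way to obtain the
pairing, not what the webrev does. It shows that the thread losing the
CAS may only dereference the winner's copy if its failed CAS (or a later
load) acquires what the winner released.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a heap object; this is not HotSpot's oopDesc.
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};
  std::size_t size = 0;
};

// Intended to be called by several GC worker threads for the same object.
// Each caller brings its own private copy and tries to install it.
// Returns whichever copy ended up installed (ours or the winner's).
Obj* forward(Obj* o, Obj* my_copy) {
  my_copy->size = o->size;  // plain store: stands in for copying the contents

  Obj* expected = nullptr;
  if (o->forwardee.compare_exchange_strong(
          expected, my_copy,
          std::memory_order_acq_rel,    // success: publish the copied contents
          std::memory_order_acquire)) { // failure: observe the winner's contents
    return my_copy;
  }
  // The CAS failed, so 'expected' now holds the winner's copy. The acquire
  // on the failure path is what makes reading expected->size well ordered.
  return expected;
}
```

With memory_order_relaxed for both orderings the loser would still get
the right pointer, but nothing would order its read of size after the
winner's plain store.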
David
-----
>
> Thank you again for your help. I may be misunderstanding or missing
> something critical. Any comments or objections are always appreciated.
>
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
>
> David Holmes <david.holmes at oracle.com> wrote on 09/30/2016 07:16:16:
>
>> From: David Holmes <david.holmes at oracle.com>
>> To: Carsten Varming <varming at gmail.com>, Hiroshi H Horii/Japan/IBM at IBMJP
>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>,
>> "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>,
>> "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-dev at openjdk.java.net>,
>> "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>,
>> hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net>
>> Date: 09/30/2016 07:17
>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> copy_to_survivor for ppc64
>>
>> On 30/09/2016 12:47 AM, Carsten Varming wrote:
>> > Dear Hiroshi,
>> >
>> > In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266
>> > the log line reads data from the forwardee even when the CAS fails. I
>> > believe those reads will be unsafe without barriers after the copy of
>> > the content of the object.
>>
>> I find it extremely hard to reason about a barrier-less cmpxchg in
>> general.
>>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>>
>> David
>> -----
>>
>> > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288 same
>> > problem as in line 266
>> >
>> > I would argue that the logging should only happen if the thread
>> > successfully copied the object and CAS failures should be logged
>> > separately without reading data from the forwardee.
>> >
>> > BTW, unrelated to your change: It seems like the logging in line 266
>> > should be guarded by something like "if (log_develop_is_enabled(Trace,
>> > gc, scavenge))" like the logging in line 288.
>> >
>> > Carsten
>> >
>> > On Thu, Sep 29, 2016 at 8:00 AM, Hiroshi H Horii <HORII at jp.ibm.com> wrote:
>> >
>> > Hi all,
>> >
>> > Can I please request reviews for a change for 8154736 that improves
>> > copy_to_survivor performance on ppc64 and aarch64?
>> > If possible, I would like to include this change in jdk9.
>> >
>> > 8154736 includes two changes, cmpxchg and copy_to_survivor, and the
>> > former was resolved as 8155949.
>> > Now, I would like to ask for a review of the remaining
>> > copy_to_survivor change.
>> >
>> > webrev:
>> > http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.01/
>> > JIRA: https://bugs.openjdk.java.net/browse/JDK-8154736
>> >
>> > I tested this change with SPECjbb2013. I also re-checked that a relaxed
>> > cmpxchg can be used for changing forwarding pointers. However, because
>> > this change is sensitive, we need more reviews, not only from
>> > compiler-dev but also from gc-dev.
>> >
>> > Regards,
>> > Hiroshi
>> > -----------------------
>> > Hiroshi Horii, Ph.D.
>> > IBM Research - Tokyo
>> >
>> >
>> >
>> >
>> > From: David Holmes <david.holmes at oracle.com>
>> > To: "Doerr, Martin" <martin.doerr at sap.com>,
>> > Hiroshi H Horii/Japan/IBM at IBMJP
>> > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>,
>> > "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>,
>> > "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>,
>> > "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-dev at openjdk.java.net>
>> > Date: 05/10/2016 19:31
>> > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> > copy_to_survivor for ppc64
>> >
>> >
>> >
>> > On 10/05/2016 7:41 PM, Doerr, Martin wrote:
>> > > Hi David,
>> > >
>> > > thank you very much for testing the other platforms.
>> > >
>> > > Here's an updated webrev:
>> > > http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/
>> >
>> > Thanks. Second test run on its way.
>> >
>> > David
>> > -----
>> >
>> > > Best regards,
>> > > Martin
>> > >
>> > > -----Original Message-----
>> > > From: hotspot-runtime-dev
>> > > [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of
>> > > David Holmes
>> > > Sent: Tuesday, 10 May 2016 11:11
>> > > To: Hiroshi H Horii <HORII at jp.ibm.com>
>> > > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>;
>> > > ppc-aix-port-dev at openjdk.java.net;
>> > > hotspot-gc-dev at openjdk.java.net;
>> > > hotspot-runtime-dev at openjdk.java.net
>> > > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> > > copy_to_survivor for ppc64
>> > >
>> > > The fix seems incomplete for solaris:
>> > >
>> > > make/Main.gmk:232: recipe for target 'hotspot' failed
>> > >
>> > > "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> > > line 124: Error: Too many arguments in call to
>> > > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> > >
>> > > "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> > > line 128: Error: Too many arguments in call to
>> > > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> > >
>> > > David
>> > >
>> > > On 10/05/2016 5:34 PM, David Holmes wrote:
>> > >> Hi Hiroshi,
>> > >>
>> > >> On 6/05/2016 8:11 PM, Hiroshi H Horii wrote:
>> > >>> Hi David,
>> > >>>
>> > >>> Thank you for your comments.
>> > >>>
>> > >>> As Martin suggested, I would like to split this proposal into
>> > >>> - relaxing the memory order of cmpxchg
>> > >>> - improving copy_to_survivor with the relaxed cmpxchg
>> > >>> and discuss the former first.
>> > >>>
>> > >>> Martin kindly created a new webrev that includes the cmpxchg change.
>> > >>> http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/
>> > >>> He has already tested it on AIX, linuxx86_64, linuxppc64le and
>> > >>> darwinintel64.
>> > >>> (Please tell me if I need to send a new mail for this RFR)
>> > >>
>> > >> Please do as it will be simpler to track that way.
>> > >>
>> > >>>> What I would prefer to see is an additional memory_order value
>> > >>>> (such as memory_order_ignored) which is the default for all methods
>> > >>>> declared to take a memory_order parameter.
>> > >>>
>> > >>> We added a simple enum to specify the memory order in atomic.hpp,
>> > >>> as follows.
>> > >>>
>> > >>> typedef enum cmpxchg_cmpxchg_memory_order {
>> > >>>   memory_order_relaxed,
>> > >>>   memory_order_conservative
>> > >>> } cmpxchg_memory_order;
>> > >>>
>> > >>> All of the cmpxchg functions take a cmpxchg_memory_order argument
>> > >>> with a default value of memory_order_conservative, which keeps the
>> > >>> semantics of the existing cmpxchg and requires no change for the
>> > >>> existing callers. If you think "memory_order_ignored" is better than
>> > >>> "memory_order_conservative", I will be happy to modify this change.
>> > >>> (I just thought that "ignored" may resemble "relaxed" and may confuse
>> > >>> people who are familiar with C++11's memory semantics.
>> > >>> I would like to know the thoughts of native speakers.)
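
The shape of that API can be sketched in portable C++ as follows. This is
an illustration only: the sketch namespace and this cmpxchg wrapper are
hypothetical, and the conservative path uses a seq_cst CAS merely as a
stand-in, since elsewhere in this thread the existing HotSpot cmpxchg is
described as stricter than memory_order_seq_cst.

```cpp
#include <atomic>

namespace sketch {

// Two-value enum as quoted above (names kept, typedef form dropped).
enum cmpxchg_memory_order {
  memory_order_relaxed,
  memory_order_conservative
};

// Hypothetical wrapper, not HotSpot's Atomic::cmpxchg.
// Returns the value that was in *dest before the operation.
inline long cmpxchg(long exchange_value, std::atomic<long>* dest,
                    long compare_value,
                    cmpxchg_memory_order order = memory_order_conservative) {
  // Assumption: map "conservative" to the strongest standard ordering.
  const std::memory_order mo = (order == memory_order_relaxed)
                                   ? std::memory_order_relaxed
                                   : std::memory_order_seq_cst;
  long observed = compare_value;
  // On failure, compare_exchange_strong writes the current value into
  // 'observed'; on success 'observed' still equals the old value.
  dest->compare_exchange_strong(observed, exchange_value, mo);
  return observed;
}

}  // namespace sketch
```

Existing callers keep writing cmpxchg(new_value, &field, old_value) and get
the default; only a caller that has been reviewed for it would pass
sketch::memory_order_relaxed explicitly.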
>> > >>
>> > >> That is fine by me. I don't think "ignored" would be confused with
>> > >> "relaxed", but "conservative" is fine.
>> > >>
>> > >> I will run the patch through our internal build system while you
>> > >> prepare the updated RFR. My only concern is "unused argument" warnings
>> > >> from the compiler. :)
>> > >>
>> > >> We are quickly running into a hard deadline with Feature Complete
>> > >> however - possibly less than 24 hours - for hotspot changes. If this
>> > >> doesn't get in in time I will see if I can shepherd it through the
>> > >> approval process.
>> > >>
>> > >> Thanks,
>> > >> David
>> > >>
>> > >>
>> > >>> Regards,
>> > >>> Hiroshi
>> > >>> -----------------------
>> > >>> Hiroshi Horii, Ph.D.
>> > >>> IBM Research - Tokyo
>> > >>>
>> > >>>
>> > >>> David Holmes <david.holmes at oracle.com> wrote on 05/04/2016 14:55:29:
>> > >>>
>> > >>>> From: David Holmes <david.holmes at oracle.com>
>> > >>>> To: Hiroshi H Horii/Japan/IBM at IBMJP
>> > >>>> Cc: hotspot-gc-dev at openjdk.java.net,
>> > >>>> hotspot-runtime-dev at openjdk.java.net,
>> > >>>> ppc-aix-port-dev at openjdk.java.net,
>> > >>>> Tim Ellison <Tim_Ellison at uk.ibm.com>,
>> > >>>> Volker Simonis <volker.simonis at gmail.com>,
>> > >>>> "Doerr, Martin" <martin.doerr at sap.com>,
>> > >>>> "Lindenmaier, Goetz" <goetz.lindenmaier at sap.com>
>> > >>>> Date: 05/04/2016 14:57
>> > >>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> > >>>> copy_to_survivor for ppc64
>> > >>>>
>> > >>>> Hi Hiroshi,
>> > >>>>
>> > >>>> Sorry for the delay on getting back to this.
>> > >>>>
>> > >>>> On 25/04/2016 5:09 PM, Hiroshi H Horii wrote:
>> > >>>>> Hi David,
>> > >>>>>
>> > >>>>> Thank you for your comments and questions.
>> > >>>>>
>> > >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> > >>>>>> memory_order_seq_cst?
>> > >>>>>
>> > >>>>> This is a very good question.
>> > >>>>>
>> > >>>>> I guess cmpxchg needs a more conservative constraint for memory
>> > >>>>> ordering than C++11, to add a sync after a compare-and-exchange
>> > >>>>> operation.
>> > >>>>>
>> > >>>>> Could someone give comments or thoughts?
>> > >>>>
>> > >>>> I don't want to comment on the comparison with C++11. What I would
>> > >>>> prefer to see is an additional memory_order value (such as
>> > >>>> memory_order_ignored) which is the default for all methods declared
>> > >>>> to take a memory_order parameter. That way existing implementations
>> > >>>> are clearly ignoring the memory_order attribute and there is no
>> > >>>> potential for confusion as to whether the existing implementations
>> > >>>> equate to memory_order_seq_cst or not.
>> > >>>>
>> > >>>> That said, I'm not sure it makes sense to add the memory_order
>> > >>>> parameter to all methods with "cas" in their name, e.g.
>> > >>>> oopDesc::cas_set_mark, oopDesc::cas_forward_to, unless those methods
>> > >>>> can sensibly be called with any value for memory_order - which seems
>> > >>>> highly unlikely. Perhaps those methods should identify the weakest
>> > >>>> form of memory_order they support and that should be hard-wired into
>> > >>>> them?
>> > >>>>
>> > >>>> Thanks,
>> > >>>> David
>> > >>>>
>> > >>>>> memory_order_seq_cst is defined as
>> > >>>>> "Any operation with this memory order is both an acquire operation
>> > >>>>> and a release operation, plus a single total order exists in which
>> > >>>>> all threads observe all modifications (see below) in the same order."
>> > >>>>> (http://en.cppreference.com/w/cpp/atomic/memory_order)
>> > >>>>>
>> > >>>>> In my environment, g++ and xlc generate the following assembly on
>> > >>>>> ppc64le (interestingly, they generate the same assembly for any
>> > >>>>> memory_order).
>> > >>>>>
>> > >>>>> g++ (4.9.2)
>> > >>>>> 100008a4: ac 04 00 7c sync
>> > >>>>> 100008a8: 28 50 20 7d lwarx r9,0,r10
>> > >>>>> 100008ac: 00 18 09 7c cmpw r9,r3
>> > >>>>> 100008b0: 0c 00 c2 40 bne- 100008bc
>> > >>>>> 100008b4: 2d 51 80 7c stwcx. r4,0,r10
>> > >>>>> 100008b8: f0 ff c2 40 bne- 100008a8
>> > >>>>> 100008bc: 2c 01 00 4c isync
>> > >>>>>
>> > >>>>> xlc (13.1.3)
>> > >>>>> 10000888: ac 04 00 7c sync
>> > >>>>> 1000088c: 28 28 c0 7c lwarx r6,0,r5
>> > >>>>> 10000890: 40 00 26 7c cmpld r6,r0
>> > >>>>> 10000894: 0c 00 82 40 bne 100008a0
>> > >>>>> 10000898: 2d 29 80 7c stwcx. r4,0,r5
>> > >>>>> 1000089c: f0 ff e2 40 bne+ 1000088c
>> > >>>>> 100008a0: 2c 01 00 4c isync
>> > >>>>>
>> > >>>>> On the other hand, the current OpenJDK generates the following
>> > >>>>> assembly.
>> > >>>>>
>> > >>>>> 508: ac 04 00 7c sync
>> > >>>>> 50c: 00 00 5c e9 ld r10,0(r28)
>> > >>>>> 510: 00 50 3b 7c cmpd r27,r10
>> > >>>>> 514: 1c 00 c2 40 bne- 530
>> > >>>>> 518: a8 40 5c 7d ldarx r10,r28,r8
>> > >>>>> 51c: 00 50 3b 7c cmpd r27,r10
>> > >>>>> 520: 10 00 c2 40 bne- 530
>> > >>>>> 524: ad 41 3c 7d stdcx. r9,r28,r8
>> > >>>>> 528: f0 ff c2 40 bne- 518
>> > >>>>> 52c: ac 04 00 7c sync
>> > >>>>> 530: 00 50 bb 7f ...
>> > >>>>>
>> > >>>>> Though we can ignore 50c-514 (because they are a duplicated guard
>> > >>>>> condition), the last sync instruction (52c) makes cmpxchg stricter
>> > >>>>> than memory_order_seq_cst.
>> > >>>>>
>> > >>>>> In some cases, the last sync is necessary: when this thread must be
>> > >>>>> able to read all of the changes made by the other threads while it
>> > >>>>> executes 508 to 530 (the compare-and-exchange sequence).
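
In C++ terms, the difference between the listings can be sketched roughly
as follows. This is an approximation in portable std::atomic code, not the
HotSpot implementation, and cas_seq_cst and cas_fence_both_sides are
hypothetical names. The comments assume that a seq_cst fence is compiled
to sync on ppc64, which is typical but compiler-dependent.

```cpp
#include <atomic>

// Roughly what the g++ and xlc listings implement: one seq_cst CAS
// (sync before the lwarx/stwcx. loop, isync after it).
inline bool cas_seq_cst(std::atomic<long>& dest, long expected, long desired) {
  return dest.compare_exchange_strong(expected, desired,
                                      std::memory_order_seq_cst);
}

// Approximation of the OpenJDK listing: full fence, bare CAS, full fence.
inline bool cas_fence_both_sides(std::atomic<long>& dest, long expected,
                                 long desired) {
  std::atomic_thread_fence(std::memory_order_seq_cst);  // leading sync (508)
  const bool ok = dest.compare_exchange_strong(expected, desired,
                                               std::memory_order_relaxed);
  if (ok) {
    // In the listing both compare-failure branches jump to 530, past the
    // sync at 52c, so the trailing fence is only executed on success.
    std::atomic_thread_fence(std::memory_order_seq_cst);  // trailing sync (52c)
  }
  return ok;
}
```

A relaxed cmpxchg, as proposed for the forwarding pointer, would be the
middle statement alone, with neither fence.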
>> > >>>>>
>> > >>>>>> 2. Has there been a discussion already, establishing that the
>> > >>>> modified
>> > >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> > >>>>>> postulating that and based on what evidence?
>> > >>>>>
>> > >>>>> Volker and his colleagues have investigated the current GC code,
>> > >>>>> as described here:
>> > >>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019079.html
>> > >>>>> However, I believe we need comments from other GC experts before
>> > >>>>> changing the shared code.
>> > >>>>>
>> > >>>>> Regards,
>> > >>>>> Hiroshi
>> > >>>>> -----------------------
>> > >>>>> Hiroshi Horii, Ph.D.
>> > >>>>> IBM Research - Tokyo
>> > >>>>>
>> > >>>>>
>> > >>>>> David Holmes <david.holmes at oracle.com> wrote on 04/22/2016 21:57:07:
>> > >>>>>
>> > >>>>>> From: David Holmes <david.holmes at oracle.com>
>> > >>>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP,
>> > >>>>>> hotspot-runtime-dev at openjdk.java.net,
>> > >>>>>> hotspot-gc-dev at openjdk.java.net
>> > >>>>>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>,
>> > >>>>>> ppc-aix-port-dev at openjdk.java.net
>> > >>>>>> Date: 04/22/2016 21:58
>> > >>>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> > >>>>>> copy_to_survivor for ppc64
>> > >>>>>>
>> > >>>>>> Hi Hiroshi,
>> > >>>>>>
>> > >>>>>> Two initial questions:
>> > >>>>>>
>> > >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> > >>>>>> memory_order_seq_cst?
>> > >>>>>>
>> > >>>>>> 2. Has there been a discussion already, establishing that the
>> > >>>> modified
>> > >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> > >>>>>> postulating that and based on what evidence?
>> > >>>>>>
>> > >>>>>> Missing memory barriers have caused very difficult to track down
>> > >>>>>> bugs in the past - very rare race conditions. So any relaxation here
>> > >>>>>> has to be done with extreme confidence.
>> > >>>>>>
>> > >>>>>> Thanks,
>> > >>>>>> David
>> > >>>>>>
>> > >>>>>> On 22/04/2016 10:28 PM, Hiroshi H Horii wrote:
>> > >>>>>>> Dear all:
>> > >>>>>>>
>> > >>>>>>> Can I please request reviews for the following change?
>> > >>>>>>>
>> > >>>>>>> Code change:
>> > >>>>>>> http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/
>> > >>>>>>> (I created it initially and Martin enhanced it considerably)
>> > >>>>>>>
>> > >>>>>>> This change follows the discussion started from this mail.
>> > >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/018960.html
>> > >>>>>>>
>> > >>>>>>> Description:
>> > >>>>>>> This change provides a relaxed compare-and-exchange by introducing
>> > >>>>>>> semantics similar to the C++ atomic memory operators, via enum
>> > >>>>>>> memory_order.
>> > >>>>>>> As described in atomic_linux_ppc.inline.hpp, the current
>> > >>>>>>> implementation of cmpxchg is fence_cmpxchg_acquire. This
>> > >>>>>>> implementation is useful for general purposes because the two sync
>> > >>>>>>> calls, before and after cmpxchg, provide strict consistency.
>> > >>>>>>> However, they sometimes cause overhead because sync instructions
>> > >>>>>>> are very expensive in the current POWER chip design.
>> > >>>>>>> In addition, on other platforms, such as aarch64, these strict
>> > >>>>>>> semantics may cause some overhead (according to Andrew's mail).
>> > >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html
>> > >>>>>>>
>> > >>>>>>> With this change, callers can explicitly specify memory ordering
>> > >>>>>>> constraints for cmpxchg with an additional parameter,
>> > >>>>>>> memory_order order.
>> > >>>>>>>
>> > >>>>>>> typedef enum memory_order {
>> > >>>>>>> memory_order_relaxed,
>> > >>>>>>> memory_order_consume,
>> > >>>>>>> memory_order_acquire,
>> > >>>>>>> memory_order_release,
>> > >>>>>>> memory_order_acq_rel,
>> > >>>>>>> memory_order_seq_cst
>> > >>>>>>> } memory_order;
>> > >>>>>>>
>> > >>>>>>> Because the default value of the parameter is memory_order_seq_cst,
>> > >>>>>>> existing code can use the same cmpxchg semantics without any
>> > >>>>>>> modification. The relaxed cmpxchg is implemented only on ppc
>> > >>>>>>> in this changeset. Therefore, the behavior on the other platforms
>> > >>>>>>> will not be changed by this changeset.
>> > >>>>>>>
>> > >>>>>>> In addition, with the new parameter of cmpxchg, this change improves
>> > >>>>>>> the performance of copy_to_survivor in the parallel GC.
>> > >>>>>>> copy_to_survivor changes forward pointers by using cmpxchg. This
>> > >>>>>>> operation doesn't require any sync instructions. A pointer is changed
>> > >>>>>>> at most once in a GC and, when cmpxchg fails, the latest pointer is
>> > >>>>>>> available to the caller. cas_set_mark and cas_forward_to are extended
>> > >>>>>>> with an additional memory_order parameter, like cmpxchg, and
>> > >>>>>>> copy_to_survivor uses memory_order_relaxed to modify the forward
>> > >>>>>>> pointers.
>> > >>>>>>>
>> > >>>>>>> Summary of source code changes:
>> > >>>>>>>
>> > >>>>>>> * src/share/vm/runtime/atomic.hpp
>> > >>>>>>>   - Defines enum memory_order and adds a parameter to cmpxchg.
>> > >>>>>>>
>> > >>>>>>> * src/share/vm/runtime/atomic.cpp
>> > >>>>>>> * src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>> > >>>>>>> * src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp
>> > >>>>>>> * src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp
>> > >>>>>>> * src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp
>> > >>>>>>> * src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp
>> > >>>>>>> * src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp
>> > >>>>>>> * src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp
>> > >>>>>>> * src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp
>> > >>>>>>> * src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
>> > >>>>>>>   - Added a parameter for each cmpxchg function to follow
>> > >>>>>>>     the change of atomic.hpp. Their implementations are not
>> > >>>>>>>     changed.
>> > >>>>>>>
>> > >>>>>>> * src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp
>> > >>>>>>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>> > >>>>>>>   - Added a parameter for each cmpxchg function to follow
>> > >>>>>>>     the change of atomic.hpp. In addition, implementations
>> > >>>>>>>     are changed corresponding to the specified memory_order.
>> > >>>>>>>
>> > >>>>>>> * src/share/vm/oops/oop.hpp
>> > >>>>>>> * src/share/vm/oops/oop.inline.hpp
>> > >>>>>>>   - Add a memory_order parameter to use relaxed cmpxchg in
>> > >>>>>>>     cas_set_mark and cas_forward_to.
>> > >>>>>>>
>> > >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.cpp
>> > >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.inline.hpp
>> > >>>>>>>
>> > >>>>>>> Martin tested this changeset on linuxx86_64, linuxppc64le and
>> > >>>>>>> darwinintel64.
>> > >>>>>>> Though more time is needed to test on the other platforms, we would
>> > >>>>>>> like to ask for reviews and start discussion on this changeset.
>> > >>>>>>> I also tested this changeset with SPECjbb2013 and confirmed that GC
>> > >>>>>>> pause time is reduced.
>> > >>>>>>>
>> > >>>>>>> Regards,
>> > >>>>>>> Hiroshi
>> > >>>>>>> -----------------------
>> > >>>>>>> Hiroshi Horii, Ph.D.
>> > >>>>>>> IBM Research - Tokyo
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>
>> > >>>>>
>> > >>>>
>> > >>>
>> >
>> >
>> >
>> >
>> >
>> >
>>
>