RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64

David Holmes david.holmes at oracle.com
Fri Sep 30 11:12:27 UTC 2016


On 30/09/2016 8:17 PM, Hiroshi H Horii wrote:
> Dear David, and Dan,
>
> Thank you for your comments.
>
>> In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:
>> 266 the log line reads data from the forwardee even when the CAS
>> fails. I believe those reads will be unsafe without barriers after
>> the copy of the content of the object.
>> hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288
>> same problem as in line 266
>
> Can we use o->size() or new_obj_size instead of new_obj->size()?
>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>
> In my understanding, while copying objects to a survivor space, if a
> thread creates a new_obj and sets a pointer with CAS, the other threads
> can touch the new_obj after the thread calls push_contents(new_obj)
> (Line: 239). In push_contents, OrderAccess::release_store is called
> before pushing the object as a task into a deque of workstealing
> (taskqueue.inline.hpp). If the other thread reads the task, all of copy
> for new_obj is safe.

I'm not familiar with the larger picture of the GC protocols here, but 
just looking at this code fragment in isolation if the CAS fails we read 
o->forwardee() to set new_obj. That in itself is fine because we're 
reading the field that we were testing with the CAS. But we could then 
deference new_obj before the thread that won the CAS calls 
push_contents; and even if it is after push_contents we have not done an 
acquire to pair with the release-store in push_contents.

So I'm really not seeing how we can use a barrier-less CAS here.

David
-----

>
> Thank you for your helps again. I may be misunderstanding or missing
> something critical. Any comments and claims are always appreciated.
>
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
>
> David Holmes <david.holmes at oracle.com> wrote on 09/30/2016 07:16:16:
>
>> From: David Holmes <david.holmes at oracle.com>
>> To: Carsten Varming <varming at gmail.com>, Hiroshi H Horii/Japan/IBM at IBMJP
>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>, "ppc-aix-port-
>> dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>, "hotspot-
>> runtime-dev at openjdk.java.net" <hotspot-runtime-
>> dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-
>> gc-dev at openjdk.java.net>, hotspot-compiler-dev <hotspot-compiler-
>> dev-bounces at openjdk.java.net>
>> Date: 09/30/2016 07:17
>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> copy_to_survivor for ppc64
>>
>> On 30/09/2016 12:47 AM, Carsten Varming wrote:
>> > Dear Hiroshi,
>> >
>> > In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266
>> > the log line reads data from the forwardee even when the CAS fails. I
>> > believe those reads will be unsafe without barriers after the copy of
>> > the content of the object.
>>
>> I find it extremely hard to reason about a barrier-less cmpxchg in
> general.
>>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>>
>> David
>> -----
>>
>> > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288 same
>> > problem as in line 266
>> >
>> > I would argue that the logging should only happen if the thread
>> > successfully copied the object and CAS failures should be logged
>> > separately without reading data from the forwardee.
>> >
>> > BTW, unrelated to your change: It seems like the logging in line 266
>> > should be guarded by something like "if (log_develop_is_enabled(Trace,
>> > gc, scavenge)" like the logging in line 288.
>> >
>> > Carsten
>> >
>> > On Thu, Sep 29, 2016 at 8:00 AM, Hiroshi H Horii <HORII at jp.ibm.com
>> > <mailto:HORII at jp.ibm.com>> wrote:
>> >
>> >     Hi all,
>> >
>> >     Can I please request reviews for a change for 8154736 that improve
>> >     copy_to_survivor performance of ppc64 and aarch64?
>> >     If possible, I would like to include this change into jdk9.
>> >
>> >     8154736 includes two changes, cmpxchg and copy_to_suvivor, and the
>> >     former
>> >     was resolved as 8155949.
>> >     Now, I would like to ask a review for the remaining, copy_to_suvivor
>> >     change.
>> >
>> >     webrev:
>> >
> http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.01/
>> >
> <http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.01/>
>> >     JIRA: https://bugs.openjdk.java.net/browse/JDK-8154736
>> >     <https://bugs.openjdk.java.net/browse/JDK-8154736>
>> >
>> >     I tested this change with SPECjbb2013. Also, I re-check that relaxed
>> >     cmpxchg is available for changing forwarding pointers. However,
> because
>> >     this change is sensitive, we need more reviews not only from
>> >     compiler-dev,
>> >     but also from gc-dev.
>> >
>> >     Regards,
>> >     Hiroshi
>> >     -----------------------
>> >     Hiroshi Horii, Ph.D.
>> >     IBM Research - Tokyo
>> >
>> >
>> >
>> >
>> >     From:   David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>>
>> >     To:     "Doerr, Martin" <martin.doerr at sap.com
>> >     <mailto:martin.doerr at sap.com>>, Hiroshi H
>> >     Horii/Japan/IBM at IBMJP
>> >     Cc:     Tim Ellison <Tim_Ellison at uk.ibm.com
>> >     <mailto:Tim_Ellison at uk.ibm.com>>,
>> >     "ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>"
>> >     <ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>>,
>> >     "hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>"
>> >     <hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>>,
>> >     "hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>"
>> >     <hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>>
>> >     Date:   05/10/2016 19:31
>> >     Subject:        Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     copy_to_survivor for ppc64
>> >
>> >
>> >
>> >     On 10/05/2016 7:41 PM, Doerr, Martin wrote:
>> >     > Hi David,
>> >     >
>> >     > thank you very much for testing the other platforms.
>> >     >
>> >     > Here's an updated webrev:
>> >     > http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/
>> >     <http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/>
>> >
>> >     Thanks. Second test run on its way.
>> >
>> >     David
>> >     -----
>> >
>> >     > Best regards,
>> >     > Martin
>> >     >
>> >     > -----Original Message-----
>> >     > From: hotspot-runtime-dev [
>> >     mailto:hotspot-runtime-dev-bounces at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev-bounces at openjdk.java.net>] On Behalf Of
>> >     David
>> >     Holmes
>> >     > Sent: Dienstag, 10. Mai 2016 11:11
>> >     > To: Hiroshi H Horii <HORII at jp.ibm.com <mailto:HORII at jp.ibm.com>>
>> >     > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com
>> >     <mailto:Tim_Ellison at uk.ibm.com>>;
>> >     ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>;
>> >     hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>;
>> >     hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>
>> >     > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     copy_to_survivor for ppc64
>> >     >
>> >     > The fix seems incomplete for solaris:
>> >     >
>> >     > make/Main.gmk:232: recipe for target 'hotspot' failed
>> >     >
>> >     "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/
>> solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> >     > line 124: Error: Too many arguments in call to
>> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> >     >
>> >     "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/
>> solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> >     > line 128: Error: Too many arguments in call to
>> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> >     >
>> >     > David
>> >     >
>> >     > On 10/05/2016 5:34 PM, David Holmes wrote:
>> >     >> Hi Hiroshi,
>> >     >>
>> >     >> On 6/05/2016 8:11 PM, Hiroshi H Horii wrote:
>> >     >>> Hi David,
>> >     >>>
>> >     >>> Thank you for your comments.
>> >     >>>
>> >     >>> As Martin suggested me, I would like to separate this
> proposal to
>> >     >>>   - relaxing memory order of cmpxchg
>> >     >>>   - improvement of copy_to_survivior with relaxed cmpxchg
>> >     >>> and discuss the former first.
>> >     >>>
>> >     >>> Martin thankfully created a new webrev that include a change of
>> >     cmpxchg.
>> >     >>>
>> >     http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/
>> >     <http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/>
>> >     >>> He has already tested it with AIX, linuxx86_64, linuxppc64le and
>> >     >>> darwinintel64.
>> >     >>> (Please tell me if I need to send a new mail for this PFR)
>> >     >>
>> >     >> Please do as it will be simpler to track that way.
>> >     >>
>> >     >>>> What I would prefer to see is an additional memory_order value
>> >     (such
>> >     as
>> >     >>>> memory_order_ignored) which is the default for all methods
> declared
>> >     to
>> >     >>>> take a memory_order parameter.
>> >     >>>
>> >     >>> We added simple enum to specify memory order in atomic.hpp as
>> >     follows.
>> >     >>>
>> >     >>> typedef enum cmpxchg_cmpxchg_memory_order {
>> >     >>>   memory_order_relaxed,
>> >     >>>   memory_order_conservative
>> >     >>> } cmpxchg_memory_order;
>> >     >>>
>> >     >>> All of cmpxchg functions have an argument of
> cmpxchg_memory_order
>> >     >>> with a default value memory_order_conservative that uses the
> same
>> >     >>> semantics with the existing cmpxchg and requires no change
> for the
>> >     >>> existing
>> >     >>> callers. If you think "memory_order_ignored" is better than
>> >     >>> "memory_order_conservative", I will be happy to modify this
> change.
>> >     >>> (I just thought, "ignored" may resemble "relaxed" and may make
>> >     >>> people who are familiar with C++11's memory semantics confused.
>> >     >>> I would like to know thoughts of native speakers.)
>> >     >>
>> >     >> That is fine by me. I don't think "ignored" would be confused
> with
>> >     >> "relaxed", but "conservative" is fine.
>> >     >>
>> >     >> I will run the patch through our internal build system while you
>> >     prepare
>> >     >> the updated RFR. My only concern is "unused argument" warnings
>> >     from the
>> >     >> compiler. :)
>> >     >>
>> >     >> We are quickly running into a hard deadline with Feature Complete
>> >     >> however - possibly less than 24 hours - for hotspot changes.
> If this
>> >     >> doesn't get in in time I will see if I can shepherd it
> through the
>> >     >> approval process.
>> >     >>
>> >     >> Thanks,
>> >     >> David
>> >     >>
>> >     >>
>> >     >>> Regards,
>> >     >>> Hiroshi
>> >     >>> -----------------------
>> >     >>> Hiroshi Horii, Ph.D.
>> >     >>> IBM Research - Tokyo
>> >     >>>
>> >     >>>
>> >     >>> David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>> wrote on 05/04/2016 14:55:29:
>> >     >>>
>> >     >>>> From: David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>>
>> >     >>>> To: Hiroshi H Horii/Japan/IBM at IBMJP
>> >     >>>> Cc: hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>, hotspot-runtime-
>> >     >>>> dev at openjdk.java.net <mailto:dev at openjdk.java.net>,
>> >     ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>, Tim Ellison
>> >     >>>> <Tim_Ellison at uk.ibm.com <mailto:Tim_Ellison at uk.ibm.com>>,
>> >     Volker Simonis <volker.simonis at gmail.com
>> >     <mailto:volker.simonis at gmail.com>>,
>> >     >>>> "Doerr, Martin" <martin.doerr at sap.com
>> >     <mailto:martin.doerr at sap.com>>, "Lindenmaier, Goetz"
>> >     >>>> <goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>>
>> >     >>>> Date: 05/04/2016 14:57
>> >     >>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     >>>> copy_to_survivor for ppc64
>> >     >>>>
>> >     >>>> Hi Hiroshi,
>> >     >>>>
>> >     >>>> Sorry for the delay on getting back to this.
>> >     >>>>
>> >     >>>> On 25/04/2016 5:09 PM, Hiroshi H Horii wrote:
>> >     >>>>> Hi David,
>> >     >>>>>
>> >     >>>>> Thank you for your comments and questions.
>> >     >>>>>
>> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> >     >>>>>> memory_order_seq_cst?
>> >     >>>>>
>> >     >>>>> This is very good question..
>> >     >>>>>
>> >     >>>>> I guess, cmpxchg needs a more conservative constraint for
> memory
>> >     >>> ordering
>> >     >>>>> than C++11, to add sync after a compare-and-exchange
> operation.
>> >     >>>>>
>> >     >>>>> Could someone give comments or thoughts?
>> >     >>>>
>> >     >>>> I don't want to comment on the comparison with C++11. What
> I would
>> >     >>>> prefer to see is an additional memory_order value (such as
>> >     >>>> memory_order_ignored) which is the default for all methods
> declared
>> >     to
>> >     >>>> take a memory_order parameter. That way existing
>> >     implementations are
>> >     >>>> clearly ignoring the memory_order attribute and there is no
>> >     potential
>> >     >>>> for confusion as to whether the existing implementations
> equate to
>> >     >>>> memory_order_seq_cst or not.
>> >     >>>>
>> >     >>>> That said, I'm not sure it makes sense to add the memory_order
>> >     parameter
>> >     >>>> to all methods with "cas" in their name, e.g.
>> >     oopDesc::cas_set_mark,
>> >     >>>> oopDesc::cas_forward_to, unless those methods can sensibly be
>> >     called
>> >     >>>> with any value for memory_order - which seems highly unlikely.
>> >     Perhaps
>> >     >>>> those methods should identify the weakest form of
> memory_order they
>> >     >>>> support and that should be hard-wired into them?
>> >     >>>>
>> >     >>>> Thanks,
>> >     >>>> David
>> >     >>>>
>> >     >>>>> memory_order_seq_cst is defined as
>> >     >>>>>     "Any operation with this memory order is both an acquire
>> >     >>> operation and
>> >     >>>>>      a release operation, plus a single total order exists in
>> >     which
>> >     >>>> all
>> >     >>>>> threads
>> >     >>>>>      observe all modifications (see below) in the same order."
>> >     >>>>> (http://en.cppreference.com/w/cpp/atomic/memory_order
>> >     <http://en.cppreference.com/w/cpp/atomic/memory_order>)
>> >     >>>>>
>> >     >>>>> In my environment, g++ and xlc generate following
> assemblies on
>> >     >>>> ppc64le.
>> >     >>>>> (interestingly, they generates the same assemblies for any
>> >     >>>> memory_order)
>> >     >>>>>
>> >     >>>>> g++ (4.9.2)
>> >     >>>>>     100008a4:   ac 04 00 7c     sync
>> >     >>>>>     100008a8:   28 50 20 7d     lwarx   r9,0,r10
>> >     >>>>>     100008ac:   00 18 09 7c     cmpw    r9,r3
>> >     >>>>>     100008b0:   0c 00 c2 40     bne-    100008bc
>> >     >>>>>     100008b4:   2d 51 80 7c     stwcx.  r4,0,r10
>> >     >>>>>     100008b8:   f0 ff c2 40     bne-    100008a8
>> >     >>>>>     100008bc:   2c 01 00 4c     isync
>> >     >>>>>
>> >     >>>>> xlc (13.1.3)
>> >     >>>>>     10000888:   ac 04 00 7c     sync
>> >     >>>>>     1000088c:   28 28 c0 7c     lwarx   r6,0,r5
>> >     >>>>>     10000890:   40 00 26 7c     cmpld   r6,r0
>> >     >>>>>     10000894:   0c 00 82 40     bne     100008a0
>> >     >>>>>     10000898:   2d 29 80 7c     stwcx.  r4,0,r5
>> >     >>>>>     1000089c:   f0 ff e2 40     bne+    1000088c
>> >     >>>>>     100008a0:   2c 01 00 4c     isync
>> >     >>>>>
>> >     >>>>> On the other hand, the current OpenJDK generates following
>> >     assemblies.
>> >     >>>>>
>> >     >>>>>     508:   ac 04 00 7c     sync
>> >     >>>>>     50c:   00 00 5c e9     ld      r10,0(r28)
>> >     >>>>>     510:   00 50 3b 7c     cmpd    r27,r10
>> >     >>>>>     514:   1c 00 c2 40     bne-    530
>> >     >>>>>     518:   a8 40 5c 7d     ldarx   r10,r28,r8
>> >     >>>>>     51c:   00 50 3b 7c     cmpd    r27,r10
>> >     >>>>>     520:   10 00 c2 40     bne-    530
>> >     >>>>>     524:   ad 41 3c 7d     stdcx.  r9,r28,r8
>> >     >>>>>     528:   f0 ff c2 40     bne-    518
>> >     >>>>>     52c:   ac 04 00 7c     sync
>> >     >>>>>     530:   00 50 bb 7f     ...
>> >     >>>>>
>> >     >>>>> Though we can ignore 50c-514 (because they are a
> duplicated guard
>> >     >>>>> condition),
>> >     >>>>> the last sync instruction (52c) makes cmpxchg more strict than
>> >     >>>>> memory_order_seq_cst.
>> >     >>>>>
>> >     >>>>> In some cases, the last sync is necessary when this thread
> must be
>> >     >>>> able
>> >     >>>>> to read
>> >     >>>>> all of the changes in the other threads while executing from
>> >     508 to
>> >     >>>> 530
>> >     >>>>> (that processes compare-and-exchange).
>> >     >>>>>
>> >     >>>>>> 2. Has there been a discussion already, establishing that the
>> >     >>>> modified
>> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> >     >>>>>> postulating that and based on what evidence?
>> >     >>>>>
>> >     >>>>> Volker and his colleagues have investigated the current GC
> codes
>> >     >>>>> according to this.
>> >     >>>>>
>> >     http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-
>> >     <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016->
>> >     >>>> April/019079.html
>> >     >>>>> However, I believe, we need comments of other GC expertsto
> change
>> >     >>>>> the shared codes.
>> >     >>>>>
>> >     >>>>> Regards,
>> >     >>>>> Hiroshi
>> >     >>>>> -----------------------
>> >     >>>>> Hiroshi Horii, Ph.D.
>> >     >>>>> IBM Research - Tokyo
>> >     >>>>>
>> >     >>>>>
>> >     >>>>> David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>> wrote on 04/22/2016 21:57:07:
>> >     >>>>>
>> >     >>>>>> From: David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>>
>> >     >>>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-runtime-
>> >     >>>>>> dev at openjdk.java.net <mailto:dev at openjdk.java.net>,
>> >     hotspot-gc-dev at openjdk.java.net
> <mailto:hotspot-gc-dev at openjdk.java.net>
>> >     >>>>>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com
>> >     <mailto:Tim_Ellison at uk.ibm.com>>,
>> >     >>>>> ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>
>> >     >>>>>> Date: 04/22/2016 21:58
>> >     >>>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     >>>>>> copy_to_survivor for ppc64
>> >     >>>>>>
>> >     >>>>>> Hi Hiroshi,
>> >     >>>>>>
>> >     >>>>>> Two initial questions:
>> >     >>>>>>
>> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> >     >>>>>> memory_order_seq_cst?
>> >     >>>>>>
>> >     >>>>>> 2. Has there been a discussion already, establishing that the
>> >     >>>> modified
>> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> >     >>>>>> postulating that and based on what evidence?
>> >     >>>>>>
>> >     >>>>>> Missing memory barriers have caused very difficult to
> track down
>> >     >>> bugs in
>> >     >>>>>> the past - very rare race conditions. So any relaxation
> here has
>> >     >>>> to be
>> >     >>>>>> done with extreme confidence.
>> >     >>>>>>
>> >     >>>>>> Thanks,
>> >     >>>>>> David
>> >     >>>>>>
>> >     >>>>>> On 22/04/2016 10:28 PM, Hiroshi H Horii wrote:
>> >     >>>>>>> Dear all:
>> >     >>>>>>>
>> >     >>>>>>> Can I please request reviews for the following change?
>> >     >>>>>>>
>> >     >>>>>>> Code change:
>> >     >>>>>>>
>> >     >>>
>> >
> http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/
>> >
> <http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/>
>> >     >>>>>>> (I initially created and Martin enhanced so much)
>> >     >>>>>>>
>> >     >>>>>>> This change follows the discussion started from this mail.
>> >     >>>>>>>
>> >     http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-
>> >     <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016->
>> >     >>>>>> April/018960.html
>> >     >>>>>>>
>> >     >>>>>>> Description:
>> >     >>>>>>> This change provides relaxed compare-and-exchange by
> introducing
>> >     >>>>>>> similar semantics of C++ atomic memory operators, enum
>> >     >>>> memory_order.
>> >     >>>>>>> As described in atomic_linux_ppc.inline.hpp, the current
>> >     >>>>> implementation of
>> >     >>>>>>> cmpxchg is fence_cmpxchg_acquire. This implementation is
> useful
>> >     for
>> >     >>>>>>> general purposes because twice calls of sync before and
> after
>> >     >>>>> cmpxchg will
>> >     >>>>>>> provide strict consistency. However, they sometimes cause
>> >     overheads
>> >     >>>>>>> because
>> >     >>>>>>> sync instructions are very expensive in the current
> POWER chip
>> >     >>> design.
>> >     >>>>>>> In addition, for the other platforms, such as aarch64, this
>> >     strict
>> >     >>>>>>> semantics
>> >     >>>>>>> may cause some overheads (according to the Andrew's mail).
>> >     >>>>>>>
>> >     http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-
>> >     <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016->
>> >     >>>>>> April/019073.html
>> >     >>>>>>>
>> >     >>>>>>> With this change, callers can explicitly specify
> constraints of
>> >     >>> memory
>> >     >>>>>>> ordering
>> >     >>>>>>> for cmpxchg with an additional parameter, memory_order
> order.
>> >     >>>>>>>
>> >     >>>>>>> typedef enum memory_order {
>> >     >>>>>>>    memory_order_relaxed,
>> >     >>>>>>>    memory_order_consume,
>> >     >>>>>>>    memory_order_acquire,
>> >     >>>>>>>    memory_order_release,
>> >     >>>>>>>    memory_order_acq_rel,
>> >     >>>>>>>    memory_order_seq_cst
>> >     >>>>>>> } memory_order;
>> >     >>>>>>>
>> >     >>>>>>> Because the default value of the parameter is
>> >     memory_order_seq_cst,
>> >     >>>>>>> existing codes can use the same semantics of cmpxchg
> without any
>> >     >>>>>>> modification. The relaxed cmpxchg is implemented only on ppc
>> >     >>>>>>> in this changeset. Therefore, the behavior on the other
>> >     platforms
>> >     >>> will
>> >     >>>>>>> not be changed with this changeset.
>> >     >>>>>>>
>> >     >>>>>>> In addition, with the new parameter of cmpxchg, this change
>> >     >>>> improves
>> >     >>>>>>> performance of copy_to_survivor in the parallel GC.
>> >     >>>>>>> copy_to_survivor changes forward pointers by using
> cmpxchg. This
>> >     >>>>>>> operation doesn't require any sync instructions.  A
> pointer is
>> >     >>> changed
>> >     >>>>>>> at most once in a GC and when cmpxchg fails, the latest
>> >     pointer is
>> >     >>>>>>> available for the caller. cas_set_mark and
> cas_forward_to are
>> >     >>> extended
>> >     >>>>>>> with an additional memory_order parameter as cmpxchg and
>> >     >>>>> copy_to_survivor
>> >     >>>>>>> uses memory_order_relaxed to modify the forward pointers.
>> >     >>>>>>>
>> >     >>>>>>> Summary of source code changes:
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/runtime/atomic.hpp
>> >     >>>>>>>       - Defines enum memory_order and adds a parameter to
>> >     cmpxchg.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/runtime/atomic.cpp
>> >     >>>>>>> * src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp
>> >     >>>>>>> *
> src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp
>> >     >>>>>>> *
> src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp
>> >     >>>>>>> * src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
>> >     >>>>>>>       - Added a parameter for each cmpxchg function to
> follow
>> >     >>>>>>>          the change of atomic.hpp. Their implementations
> are not
>> >     >>>>> changed.
>> >     >>>>>>>
>> >     >>>>>>> * src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>> >     >>>>>>>       - Added a parameter for each cmpxchg function to
> follow
>> >     >>>>>>>          the change of atomic.hpp. In addition,
> implementations
>> >     >>>>>>>          are changed corresponding to the specified
>> >     memory_order.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/oops/oop.hpp
>> >     >>>>>>> * src/share/vm/oops/oop.inline.hpp
>> >     >>>>>>>       - Add a memory_order parameter to use relaxed
> cmpxchg in
>> >     >>>>>>>          cas_set_mark and cas_forward_to.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.cpp
>> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.inline.hpp
>> >     >>>>>>>
>> >     >>>>>>> Martin tested this changeset  on linuxx86_64,
> linuxppc64le and
>> >     >>>>>>> darwinintel64.
>> >     >>>>>>> Though more time is needed to test on the other platform, we
>> >     would
>> >     >>>>> like to
>> >     >>>>>>> ask
>> >     >>>>>>> reviews and start discussion on this changeset.
>> >     >>>>>>> I also tested this changeset with SPECjbb2013 and
> confirmed that
>> >     gc
>> >     >>>>> pause
>> >     >>>>>>> time
>> >     >>>>>>> is reduced.
>> >     >>>>>>>
>> >     >>>>>>> Regards,
>> >     >>>>>>> Hiroshi
>> >     >>>>>>> -----------------------
>> >     >>>>>>> Hiroshi Horii, Ph.D.
>> >     >>>>>>> IBM Research - Tokyo
>> >     >>>>>>>
>> >     >>>>>>>
>> >     >>>>>>
>> >     >>>>>
>> >     >>>>
>> >     >>>
>> >
>> >
>> >
>> >
>> >
>> >
>>
>


More information about the hotspot-compiler-dev mailing list