<font size=2 face="sans-serif">Dear all:</font><br><br><font size=2 face="sans-serif">Can I please request reviews for the
following change?</font><br><br><font size=2 face="sans-serif">Code change:</font><br><a href=http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/><font size=2 color=blue face="sans-serif">http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/</font></a><br><font size=2 face="sans-serif">(I created the initial version, and Martin
enhanced it substantially.)</font><br><br><font size=2 face="sans-serif">This change follows up on the discussion started
in this mail:</font><br><a href="http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/018960.html"><font size=2 color=blue face="sans-serif">http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/018960.html</font></a><br><br><font size=2 face="sans-serif">Description:</font><br><font size=2 face="sans-serif">This change provides relaxed compare-and-exchange
by introducing</font><br><font size=2 face="sans-serif">semantics similar to the C++ atomic memory
orderings, enum memory_order.</font><br><font size=2 face="sans-serif">As described in atomic_linux_ppc.inline.hpp,
the current implementation of</font><br><font size=2 face="sans-serif">cmpxchg is fence_cmpxchg_acquire. This
implementation is useful for</font><br><font size=2 face="sans-serif">general purposes because the two sync instructions
issued before and after cmpxchg</font><br><font size=2 face="sans-serif">provide strict consistency. However,
they sometimes cause overhead because</font><br><font size=2 face="sans-serif">sync instructions are very expensive
in the current POWER chip design.</font><br><font size=2 face="sans-serif">In addition, on other platforms
such as aarch64, these strict semantics</font><br><font size=2 face="sans-serif">may cause some overhead (according
to Andrew's mail).</font><br><a href="http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html"><font size=2 color=blue face="sans-serif">http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html</font></a><br><br><font size=2 face="sans-serif">With this change, callers can explicitly
specify the memory ordering constraints</font><br><font size=2 face="sans-serif">for cmpxchg with an additional parameter,
memory_order order.</font><br><br><font size=2 face="sans-serif">typedef enum memory_order {</font><br><font size=2 face="sans-serif"> memory_order_relaxed,</font><br><font size=2 face="sans-serif"> memory_order_consume,</font><br><font size=2 face="sans-serif"> memory_order_acquire,</font><br><font size=2 face="sans-serif"> memory_order_release,</font><br><font size=2 face="sans-serif"> memory_order_acq_rel,</font><br><font size=2 face="sans-serif"> memory_order_seq_cst</font><br><font size=2 face="sans-serif">} memory_order;</font><br><br><font size=2 face="sans-serif">Because the default value of the parameter
is memory_order_seq_cst,</font><br><font size=2 face="sans-serif">existing code keeps the same semantics
of cmpxchg without any</font><br><font size=2 face="sans-serif">modification. The relaxed cmpxchg is
implemented only on ppc</font><br><font size=2 face="sans-serif">in this changeset. Therefore, the behavior
on the other platforms is</font><br><font size=2 face="sans-serif">not changed by this changeset.</font><br><br><font size=2 face="sans-serif">In addition, with the new parameter
of cmpxchg, this change improves</font><br><font size=2 face="sans-serif">the performance of copy_to_survivor in the
parallel GC.</font><br><font size=2 face="sans-serif">copy_to_survivor changes forwarding pointers
by using cmpxchg. This</font><br><font size=2 face="sans-serif">operation doesn't require any sync instructions.
A pointer is changed</font><br><font size=2 face="sans-serif">at most once in a GC, and when cmpxchg
fails, the latest pointer is</font><br><font size=2 face="sans-serif">available to the caller. cas_set_mark
and cas_forward_to are extended</font><br><font size=2 face="sans-serif">with an additional memory_order parameter,
as cmpxchg is, and copy_to_survivor</font><br><font size=2 face="sans-serif">uses memory_order_relaxed to modify
the forwarding pointers.</font><br><br><font size=2 face="sans-serif">Summary of source code changes:</font><br><br><font size=2 face="sans-serif">* src/share/vm/runtime/atomic.hpp </font><br><font size=2 face="sans-serif"> - Defines enum memory_order
and adds a parameter to cmpxchg.</font><br><br><font size=2 face="sans-serif">* src/share/vm/runtime/atomic.cpp</font><br><font size=2 face="sans-serif">* src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp</font><br><font size=2 face="sans-serif"> - Adds a parameter
to each cmpxchg function to match</font><br><font size=2 face="sans-serif"> the change
in atomic.hpp. Their implementations are not changed.</font><br><br><font size=2 face="sans-serif">* src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp</font><br><font size=2 face="sans-serif">* src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp</font><br><font size=2 face="sans-serif"> - Adds a parameter
to each cmpxchg function to match</font><br><font size=2 face="sans-serif"> the change
in atomic.hpp. In addition, the implementations</font><br><font size=2 face="sans-serif"> now depend
on the specified memory_order.</font><br><br><font size=2 face="sans-serif">* src/share/vm/oops/oop.hpp</font><br><font size=2 face="sans-serif">* src/share/vm/oops/oop.inline.hpp</font><br><font size=2 face="sans-serif"> - Adds a memory_order
parameter to cas_set_mark and</font><br><font size=2 face="sans-serif"> cas_forward_to
to allow relaxed cmpxchg.</font><br><br><font size=2 face="sans-serif">* src/share/vm/gc/parallel/psPromotionManager.cpp</font><br><font size=2 face="sans-serif">* src/share/vm/gc/parallel/psPromotionManager.inline.hpp</font><br><br><font size=2 face="sans-serif">Martin tested this changeset on
linuxx86_64, linuxppc64le and darwinintel64.</font><br><font size=2 face="sans-serif">Though more time is needed to test on
the other platforms, we would like to ask for</font><br><font size=2 face="sans-serif">reviews and start the discussion on this
changeset.</font><br><font size=2 face="sans-serif">I also tested this changeset with SPECjbb2013
and confirmed that the GC pause time</font><br><font size=2 face="sans-serif">is reduced.</font><br><br><font size=2 face="sans-serif">Regards,<br>Hiroshi<br>-----------------------<br>Hiroshi Horii, Ph.D.<br>IBM Research - Tokyo<br></font><BR>
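<br><font size=2 face="sans-serif">For readers who want a concrete picture of the idea described in this mail, here is a minimal sketch. It is not the HotSpot code from the webrev: it assumes C++11 std::atomic as a stand-in for the platform-specific cmpxchg, and the names cmpxchg_sketch, ObjSketch and forward_sketch are hypothetical. It shows only two points: a memory ordering parameter that defaults to the strictest ordering, so existing callers are unaffected, and a forwarding-pointer update that passes the relaxed ordering and learns the winning pointer when the exchange fails.</font><br>

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the cmpxchg described in the mail; this is NOT
// HotSpot's Atomic::cmpxchg. It returns the value observed at dest, which
// equals compare_value exactly when the exchange succeeded.
// The ordering defaults to seq_cst, so callers that do not pass an ordering
// keep the strict behavior.
inline std::intptr_t cmpxchg_sketch(
    std::intptr_t exchange_value,
    std::atomic<std::intptr_t>* dest,
    std::intptr_t compare_value,
    std::memory_order order = std::memory_order_seq_cst) {
  std::intptr_t observed = compare_value;
  // On failure, compare_exchange_strong stores the current value of *dest
  // into observed; on success, observed still holds compare_value.
  dest->compare_exchange_strong(observed, exchange_value, order);
  return observed;
}

// Hypothetical object header for illustration: 0 means "not forwarded yet".
struct ObjSketch {
  std::atomic<std::intptr_t> forwardee{0};
};

// Tries to install new_copy as the forwarding pointer with relaxed ordering.
// Returns the forwarding pointer that is installed afterwards: new_copy if
// this caller won, otherwise the pointer installed by the earlier winner.
inline std::intptr_t forward_sketch(ObjSketch* obj, std::intptr_t new_copy) {
  std::intptr_t old = cmpxchg_sketch(new_copy, &obj->forwardee, 0,
                                     std::memory_order_relaxed);
  return old == 0 ? new_copy : old;
}
```

<font size=2 face="sans-serif">In this sketch, the first call to forward_sketch on an object returns its own new_copy, and any later call returns the pointer installed by the first one. The sketch covers only the pointer update itself. Whether relaxed ordering is sufficient for everything else a GC thread reads through that pointer is the question this review request raises.</font><br>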