enhancement of cmpxchg and copy_to_survivor for ppc64

charlie hunt charlie.hunt at oracle.com
Mon Apr 11 18:19:30 UTC 2016


FYI, SPECjbb2013 is obsolete in favor of SPECjbb2015.  SPECjbb2015 should run fine with JDK 9 in the default configuration with grizzly as the transport. I have run it on JDK 9 SPARC and JDK 9 x86/x64 platforms.

hths,

charlie

> On Apr 11, 2016, at 12:59 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
> [This should be on hotspot-runtime-dev.  BCC’ing hotspot-compiler-dev.]
> 
>> On Apr 8, 2016, at 12:53 AM, Hiroshi H Horii <HORII at jp.ibm.com> wrote:
>> 
>> Dear all:
>> 
>> Can I please request reviews for the following change?
>> This change was created for JDK 9 and ppc64.
>> 
>> Description:
>> This change adds compare-and-exchange options for the POWER architecture.
>> As described in atomic_linux_ppc.inline.hpp, the current implementation of
>> cmpxchg is fence_cmpxchg_acquire. This implementation is suitable for
>> general use because the sync instructions issued before and after cmpxchg
>> preserve consistency. However, they can add significant overhead because
>> sync instructions are very expensive in the current POWER chip design.
>> With this change, callers can explicitly specify whether to run the fence and
>> the acquire via two additional bool parameters. Because both default to "true",
>> existing cmpxchg calls do not need to be modified.
>> 
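The interface described above can be illustrated with a minimal portable C++ sketch. It is not the attached patch and not HotSpot code: the name cmpxchg_sketch, the use of std::atomic, and the mapping of the two bool parameters onto std::atomic_thread_fence calls are all assumptions made for illustration. It only shows how defaulted parameters leave existing call sites unchanged while letting a caller opt out of the barriers.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the proposed interface (not the HotSpot API).
// Returns the value observed at dest before the operation, which equals
// compare_value exactly when the exchange succeeded.
inline intptr_t cmpxchg_sketch(intptr_t exchange_value,
                               std::atomic<intptr_t>* dest,
                               intptr_t compare_value,
                               bool fence = true,
                               bool acquire = true) {
  if (fence) {
    // Assumption: models the barrier issued before the exchange.
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }
  intptr_t observed = compare_value;
  // The exchange itself carries no ordering; on failure, observed is
  // overwritten with the value currently stored at dest.
  dest->compare_exchange_strong(observed, exchange_value,
                                std::memory_order_relaxed,
                                std::memory_order_relaxed);
  if (acquire) {
    // Assumption: models the barrier issued after the exchange.
    std::atomic_thread_fence(std::memory_order_acquire);
  }
  return observed;
}
```

A call such as cmpxchg_sketch(new_value, &word, old_value) keeps both barriers, while cmpxchg_sketch(new_value, &word, old_value, false, false) performs only the bare exchange.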
>> In addition, using the new cmpxchg parameters, this change improves the
>> performance of copy_to_survivor in the parallel GC.
>> copy_to_survivor updates forwarding pointers by using cmpxchg. This
>> operation doesn't require any sync instructions, in my understanding.
>> A pointer is changed at most once in a GC, and when cmpxchg fails,
>> the latest pointer is available to the caller.
>> 
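The forwarding argument above can be sketched as follows. This is an illustrative model only: ObjSketch and forward_to_sketch are hypothetical names, and a plain atomic pointer field stands in for the mark word. Whether relaxed ordering is really sufficient for this use is the question the review request raises; the sketch only shows that a failed exchange already hands the winning pointer back to the caller.

```cpp
#include <atomic>

// Hypothetical object model: forwardee stands in for the mark word and
// is null until some GC thread installs a forwarding pointer.
struct ObjSketch {
  std::atomic<ObjSketch*> forwardee{nullptr};
};

// Tries to install my_copy as the forwarding pointer of obj using an
// exchange with no fence and no acquire. Returns the copy that won.
inline ObjSketch* forward_to_sketch(ObjSketch* obj, ObjSketch* my_copy) {
  ObjSketch* expected = nullptr;
  if (obj->forwardee.compare_exchange_strong(expected, my_copy,
                                             std::memory_order_relaxed,
                                             std::memory_order_relaxed)) {
    return my_copy;  // this thread installed the pointer
  }
  // Another thread won the race; the failed exchange stored that
  // thread's pointer in expected, so no second load is needed.
  return expected;
}
```

Because the pointer changes at most once per collection in this model, every later caller sees the same winner regardless of how many threads race.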
>> When I evaluated SPECjbb2013 (slightly customized because the obsolete grizzly
>> doesn't support the new version format of Java 9), the pause time of young GC
>> was reduced by 10% to 20%.
>> 
>> Summary of source code changes:
>> 
>> * src/share/vm/runtime/atomic.hpp
>> * src/share/vm/runtime/atomic.cpp
>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>>       - Add two arguments, fence and acquire, to cmpxchg for PPC64 only.
>>         Though cmpxchg in atomic_linux_ppc.inline.hpp has some branches,
>>         they are eliminated when it is inlined into callers.
>> 
>> * src/share/vm/oops/oop.inline.hpp
>>      - Change cas_set_mark to call cmpxchg without fence and acquire.
>>         cas_set_mark is called only by cas_forward_to, which is called only by
>>         copy_to_survivor_space and oop_promotion_failed in
>>         psPromotionManager.
>> 
>> Code change:
>> 
>>   Please see the attached diff file, which was generated with "hg diff -g"
>>   under the latest hotspot directory.
>> 
>> Passed test:
>>    SPECjbb2013 (customized)
>> 
>> * I believe some other cmpxchg calls can be optimized by dropping the fence
>>  or the acquire, because issuing sync twice is more conservative than the
>>  Java memory model requires.
>> 
>> 
>> 
>> Regards,
>> Hiroshi
>> -----------------------
>> Hiroshi Horii, Ph.D.
>> IBM Research - Tokyo
>> 
> 


