RTM disabled for Linux on PPC64 LE

Doerr, Martin martin.doerr at sap.com
Mon Feb 22 10:35:10 UTC 2016


Hi Gustavo,

I think the change should get contributed. I have opened a bug for it which is the first thing we need: JDK-8150353

Can you create and upload a webrev, please?

The hg change comment should be:
8150353: PPC64LE: Support RTM on linux
Reviewed-by: mdoerr

When the webrev is there, please send out a request for review with the headline:
RFR(M) 8150353: PPC64LE: Support RTM on linux

Information about how to do this and about the review process can be found here:
http://openjdk.java.net/guide/webrevHelp.html
http://openjdk.java.net/guide/
http://openjdk.java.net/guide/codeReview.html

If you have questions or problems feel free to contact us.

Btw., do you think the big endian linux kernel will also contain the syscall change?
If not, I suggest setting INCLUDE_RTM_OPT to 1 only on AIX and PPC64LE in globalDefinitions_ppc.hpp:
#if defined(COMPILER2) && (defined(AIX) || defined(VM_LITTLE_ENDIAN))
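
A minimal sketch of how that guard behaves, written so it compiles on its own. COMPILER2, AIX and VM_LITTLE_ENDIAN would normally come from the HotSpot build; defining two of them by hand, the fallback to 0, and the helper rtm_opt_included() are assumptions made only for illustration.

```cpp
// Stand-ins for macros the HotSpot build would normally pass in
// (assumption: a C2 build for a little-endian target).
#define COMPILER2 1
#define VM_LITTLE_ENDIAN 1

// Guard as suggested: RTM support is compiled in only for C2 on AIX
// or on a little-endian target, so big-endian Linux stays excluded.
#if defined(COMPILER2) && (defined(AIX) || defined(VM_LITTLE_ENDIAN))
#define INCLUDE_RTM_OPT 1
#endif

// In this sketch, anything not covered by the guard falls back to 0.
#ifndef INCLUDE_RTM_OPT
#define INCLUDE_RTM_OPT 0
#endif

// Hypothetical helper, only here to make the result observable.
constexpr bool rtm_opt_included() { return INCLUDE_RTM_OPT != 0; }
```

Removing the VM_LITTLE_ENDIAN stand-in makes rtm_opt_included() return false, which is the intended outcome for a big-endian Linux build.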

Best regards,
Martin

-----Original Message-----
From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com] 
Sent: Friday, 19 February 2016 22:35
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
Cc: Breno Leitao <brenohl at br.ibm.com>
Subject: Re: RTM disabled for Linux on PPC64 LE

Hi Martin,

I can't afford SPECjbb2005 for now, since it's paid. Instead I'm using
the SPECjvm2008 suite.

Thanks for bringing up the problem on C2's scratch buffer. Indeed, I've
got a core dump when I combined +UseRTMLocking, +UseRTMForStackLocks,
and +UseRTMDeopt (http://goo.gl/Sc5Ekp).

I've experimented a little with the MAX_inst_size value and found that
at least doubling it is sufficient to solve the problem:

# HG changeset patch
# User gromero
# Date 1455916590 7200
#      Fri Feb 19 19:16:30 2016 -0200
# Node ID 721c2e526fa7ee5e46b0ab7219e2acac90c4239b
# Parent  a83242700c91e294886d23c89061c1916682836c
Fix C2 scratch buffer too small

diff --git a/src/share/vm/opto/compile.hpp b/src/share/vm/opto/compile.hpp
--- a/src/share/vm/opto/compile.hpp
+++ b/src/share/vm/opto/compile.hpp
@@ -1118,7 +1118,7 @@
   bool           in_scratch_emit_size() const   { return _in_scratch_emit_size;     }

   enum ScratchBufferBlob {
-    MAX_inst_size       = 1024,
+    MAX_inst_size       = 2048,
     MAX_locs_size       = 128, // number of relocInfo elements
     MAX_const_size      = 128,
     MAX_stubs_size      = 128


Do you think we can fix it upstream and enable the RTM for Linux on
ppc64le? Any guidelines on it?

BTW, I'm still thinking through your comments about biased, RTM and
classic locking.

Best regards,
--
Gustavo Romero

On 16-02-2016 11:33, Doerr, Martin wrote:
> Hi Gustavo,
> 
> thanks for the information and for working on this topic.
> 
> I have used SPEC jbb2005 to test and benchmark RTM on PPC64. It has worked even with the old linux kernel to some extent.
> 
> There are currently the following problems:
> C2's scratch buffer seems to be too small if you enable all options:
> -XX:+UnlockExperimentalVMOptions -XX:+UseRTMLocking -XX:+UseRTMForStackLocks -XX:+UseRTMDeopt
> I guess we need to increase MAX_inst_size in ScratchBufferBlob (compile.hpp). I haven't had time to try it yet.
> 
> The following issue is important for performance work:
> RTM does not work with BiasedLocking. The latter gets switched off if RTM is activated which has a large performance impact (especially in jbb2005).
> I would disable it for a reference measurement:
> -XX:-UseBiasedLocking
> 
> Unfortunately, RTM was slower than BiasedLocking but faster than the reference (without both) which tells me that there's room for improvement.
> There are basically 3 classes of locks:
> 1. no contention
> 2. contention on lock, low contention on data
> 3. high contention on data
> 
> I believe the optimal treatment for the cases would be:
> 1. Biased Locking
> 2. Transactional Memory
> 3. classical locking with lock inflating
> 
> I think it would be good if the JVM could optimize for all these cases in the future. But that would add additional complexity and code size.
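
The pairing of lock classes and treatments listed above can be written down as a small lookup. This is only a sketch of the idea: the enum and function names are hypothetical and do not exist in HotSpot, and nothing here says how a JVM would detect which class a lock falls into.

```cpp
// Hypothetical classification of a lock, following the three classes above.
enum class Contention {
  None,      // 1. no contention
  LockOnly,  // 2. contention on lock, low contention on data
  Data       // 3. high contention on data
};

// Hypothetical names for the three treatments proposed in the thread.
enum class Strategy {
  BiasedLocking,
  TransactionalMemory,
  InflatedLock  // classical locking with lock inflating
};

// Map each contention class to the treatment suggested for it.
constexpr Strategy preferred_strategy(Contention c) {
  return c == Contention::None     ? Strategy::BiasedLocking
       : c == Contention::LockOnly ? Strategy::TransactionalMemory
                                   : Strategy::InflatedLock;
}
```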
> 
> Best regards,
>  Martin
> 
> 
> -----Original Message-----
> From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com] 
> Sent: Monday, 15 February 2016 15:23
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
> Cc: Breno Leitao <brenohl at br.ibm.com>
> Subject: Re: RTM disabled for Linux on PPC64 LE
> 
> Hello Martin,
> 
> Thank you for your reply.
> 
> The problematic behavior of syscalls has been addressed since kernel 4.2
> (already present in, for instance, Ubuntu 15.10 and 16.04):
> https://goo.gl/d80xAJ
> 
> I'm taking a closer look at the RTM tests and I'll make additional
> experiments as you suggested.
> 
> So far I enabled RTM for Linux on ppc64le and there is no regression in
> the RTM test suite. I'm using kernel 4.2.0.
> 
> The following patch was applied to
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot, 5d17092b6917+ tip, and I
> used the (major + minor) version to enable RTM as you said:
> 
> # HG changeset patch
> # User gromero
> # Date 1455540780 7200
> #      Mon Feb 15 10:53:00 2016 -0200
> # Node ID 0e9540f2156c4c4d7d8215eb89109ff81be82f58
> # Parent  5d17092b691720d71f06360fb0cc183fe2877faa
> Enable RTM for Linux on PPC64 LE
> 
> Enable RTM for Linux kernel version equal or above 4.2, since the
> problematic behavior of performing a syscall from within transaction
> which could lead to unpredictable results has been addressed. Please,
> refer to https://goo.gl/fi4tjC
> 
> diff --git a/src/cpu/ppc/vm/globalDefinitions_ppc.hpp b/src/cpu/ppc/vm/globalDefinitions_ppc.hpp
> --- a/src/cpu/ppc/vm/globalDefinitions_ppc.hpp
> +++ b/src/cpu/ppc/vm/globalDefinitions_ppc.hpp
> @@ -52,4 +52,9 @@
>  #define INCLUDE_RTM_OPT 1
>  #endif
> 
> +// Enable RTM experimental support for Linux.
> +#if defined(COMPILER2) && defined(linux)
> +#define INCLUDE_RTM_OPT 1
> +#endif
> +
>  #endif // CPU_PPC_VM_GLOBALDEFINITIONS_PPC_HPP
> diff --git a/src/cpu/ppc/vm/vm_version_ppc.cpp b/src/cpu/ppc/vm/vm_version_ppc.cpp
> --- a/src/cpu/ppc/vm/vm_version_ppc.cpp
> +++ b/src/cpu/ppc/vm/vm_version_ppc.cpp
> @@ -255,7 +255,12 @@
>      }
>  #endif
>  #ifdef linux
> -    // TODO: check kernel version (we currently have too old versions only)
> +    // At least Linux kernel 4.2, as the problematic behavior of syscalls
> +    // being called from within a transaction has been addressed.
> +    // Please, refer to commit 4b4fadba057c1af7689fc8fa182b13baL7
> +    if (os::Linux::os_version() >= 0x040200) {
> +      os_too_old = false;
> +    }
>  #endif
>      if (os_too_old) {
>        vm_exit_during_initialization("RTM is not supported on this OS version.");
> diff --git a/src/os/linux/vm/os_linux.cpp b/src/os/linux/vm/os_linux.cpp
> --- a/src/os/linux/vm/os_linux.cpp
> +++ b/src/os/linux/vm/os_linux.cpp
> @@ -135,6 +135,7 @@
>  int os::Linux::_page_size = -1;
>  const int os::Linux::_vm_default_page_size = (8 * K);
>  bool os::Linux::_supports_fast_thread_cpu_time = false;
> +uint32_t os::Linux::_os_version = 0;
>  const char * os::Linux::_glibc_version = NULL;
>  const char * os::Linux::_libpthread_version = NULL;
>  pthread_condattr_t os::Linux::_condattr[1];
> @@ -4332,6 +4333,31 @@
>    return (tp.tv_sec * NANOSECS_PER_SEC) + tp.tv_nsec;
>  }
> 
> +void os::Linux::initialize_os_info() {
> +  assert(_os_version == 0, "OS info already initialized");
> +
> +  struct utsname _uname;
> +
> +  uint32_t major;
> +  uint32_t minor;
> +  uint32_t fix;
> +
> +  uname(&_uname); // Not sure yet how to bail out if ret == -1
> +  sscanf(_uname.release,"%d.%d.%d", &major,
> +                                    &minor,
> +                                    &fix   );
> +
> +  _os_version = (major << 16) |
> +                (minor << 8 ) |
> +                (fix   << 0 ) ;
> +}
> +
> +uint32_t os::Linux::os_version() {
> +  assert(_os_version != 0, "not initialized");
> +  return _os_version;
> +}
> +
> +
>  /////
>  // glibc on Linux platform uses non-documented flag
>  // to indicate, that some special sort of signal
> @@ -4552,6 +4578,8 @@
>    }
>    init_page_sizes((size_t) Linux::page_size());
> 
> +  Linux::initialize_os_info();
> +
>    Linux::initialize_system_info();
> 
>    // main_thread points to the aboriginal thread
> diff --git a/src/os/linux/vm/os_linux.hpp b/src/os/linux/vm/os_linux.hpp
> --- a/src/os/linux/vm/os_linux.hpp
> +++ b/src/os/linux/vm/os_linux.hpp
> @@ -56,6 +56,12 @@
> 
>    static GrowableArray<int>* _cpu_to_node;
> 
> +  // 0x00AABBCC
> +  // AA, Major Version
> +  // BB, Minor Version
> +  // CC, Fix   Version
> +  static uint32_t _os_version;
> +
>   protected:
> 
>    static julong _physical_memory;
> @@ -198,6 +204,9 @@
> 
>    static jlong fast_thread_cpu_time(clockid_t clockid);
> 
> +  static void initialize_os_info();
> +  static uint32_t os_version();
> +
>    // pthread_cond clock suppport
>   private:
>    static pthread_condattr_t _condattr[1];
> 
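
The version handling in the patch above leaves two points open: what to do when uname() fails, and what happens when sscanf() fills fewer than three fields (the patch also scans with %d into uint32_t variables). The following is a minimal standalone sketch of one way to handle both, not the code from the patch. The function names are hypothetical, and returning 0 for "unknown" is an assumption chosen so that a failed lookup is treated like a too-old kernel.

```cpp
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/utsname.h>

// Clamp a field to 8 bits so it cannot spill into its neighbour.
uint32_t clamp8(unsigned v) { return v > 0xFFu ? 0xFFu : v; }

// Pack a version triple as 0x00AABBCC (AA major, BB minor, CC fix).
uint32_t pack_version(unsigned major, unsigned minor, unsigned fix) {
  return (clamp8(major) << 16) | (clamp8(minor) << 8) | clamp8(fix);
}

// Parse a release string such as "4.2.0-27-generic".
// Returns 0 when major and minor cannot both be read; a missing fix
// level is treated as 0.
uint32_t parse_kernel_release(const char* release) {
  assert(release != nullptr);
  unsigned major = 0, minor = 0, fix = 0;
  // %u matches the unsigned targets; sscanf reports how many fields it filled.
  int fields = sscanf(release, "%u.%u.%u", &major, &minor, &fix);
  if (fields < 2) {
    return 0;
  }
  return pack_version(major, minor, fix);
}

// Query the running kernel; 0 means the version could not be determined.
uint32_t current_kernel_version() {
  struct utsname info;
  if (uname(&info) < 0) {
    return 0;
  }
  return parse_kernel_release(info.release);
}
```

With this encoding the check stays a plain integer comparison, for example current_kernel_version() >= 0x040200, and a return value of 0 fails it.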
> Should I use any test suite besides the jtreg suite already present
> in the Hotspot forest?
> 
> 
> Best Regards,
> Gustavo
> 
> On 12-02-2016 12:52, Doerr, Martin wrote:
>> Hi Gustavo,
>>
>> the reason why we disabled RTM for linux on PPC64 (big or little endian) was the problematic behavior of syscalls.
>> The old version of the document
>> www.kernel.org/doc/Documentation/powerpc/transactional_memory.txt
>> said:
>> “Performing syscalls from within transaction is not recommended, and can lead to unpredictable results.”
>>
>> Transactions need to either pass completely or roll back completely without disturbing side effects of partially executed syscalls.
>> We rely on the kernel to abort transactions if necessary.
>>
>> The document has changed and it may possibly work with a new linux kernel.
>> However, we don't have such a new kernel, yet. So we can't test it at the moment.
>> I don't know which kernel version exactly contains the change. I guess this exact version number (major + minor) should be used for enabling RTM.
>>
>> I haven't looked into the tests, yet. There may be a need for additional adaptations and fixes.
>>
>> We would appreciate it if you ran experiments and/or made contributions.
>>
>> Thanks and best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Gustavo Romero
>> Sent: Friday, 12 February 2016 14:45
>> To: hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
>> Subject: RTM disabled for Linux on PPC64 LE
>> Importance: High
>>
>> Hi,
>> As of now (tip 1922:be58b02c11f9, jdk9/jdk9 repo), the Hotspot build for Linux on ppc64le fails due to a simple uninitialized variable error:
>>
>> hotspot/src/share/vm/ci/ciMethodData.hpp:585:100:       error: ‘data’ may be used uninitialized in this function
>> hotspot/src/cpu/ppc/vm/c1_LIRAssembler_ppc.cpp:2408:78: error: ‘md’   may be used uninitialized in this function
>>
>> So this straightforward patch solves the issue:
>> diff -r 534c50395957 src/cpu/ppc/vm/c1_LIRAssembler_ppc.cpp
>> --- a/src/cpu/ppc/vm/c1_LIRAssembler_ppc.cpp	Thu Jan 28 15:42:23 2016 -0800
>> +++ b/src/cpu/ppc/vm/c1_LIRAssembler_ppc.cpp	Mon Feb 08 17:13:14 2016 -0200
>> @@ -2321,8 +2321,8 @@
>>      if (reg_conflict) { obj = dst; }
>>    }
>>
>> -  ciMethodData* md;
>> -  ciProfileData* data;
>> +  ciMethodData* md = NULL;
>> +  ciProfileData* data = NULL;
>>    int mdo_offset_bias = 0;
>>    if (should_profile) {
>>      ciMethod* method = op->profiled_method();
>>
>> However, after the build, I realized that RTM is still disabled for Linux on ppc64le, so 25 tests in the compiler/rtm suite fail:
>>
>> http://hastebin.com/raw/ohoxiwaqih
>>
>> Hence after applying the following patches that enable RTM for Linux on ppc64le:
>>
>> diff -r 266fa9bb5297 src/cpu/ppc/vm/vm_version_ppc.cpp
>> --- a/src/cpu/ppc/vm/vm_version_ppc.cpp	Thu Feb 04 16:48:39 2016 -0800
>> +++ b/src/cpu/ppc/vm/vm_version_ppc.cpp	Fri Feb 12 10:55:46 2016 -0200
>> @@ -255,7 +255,9 @@
>>      }
>>  #endif
>>  #ifdef linux
>> -    // TODO: check kernel version (we currently have too old versions only)
>> +    if (os::Linux::os_version() >= 4) { // at least Linux kernel version 4
>> +      os_too_old = false;
>> +    }
>>  #endif
>>      if (os_too_old) {
>>        vm_exit_during_initialization("RTM is not supported on this OS version.");
>>
>>
>> diff -r 266fa9bb5297 src/os/linux/vm/os_linux.cpp
>> --- a/src/os/linux/vm/os_linux.cpp	Thu Feb 04 16:48:39 2016 -0800
>> +++ b/src/os/linux/vm/os_linux.cpp	Fri Feb 12 10:58:10 2016 -0200
>> @@ -135,6 +135,7 @@
>>  int os::Linux::_page_size = -1;
>>  const int os::Linux::_vm_default_page_size = (8 * K);
>>  bool os::Linux::_supports_fast_thread_cpu_time = false;
>> +uint32_t os::Linux::_os_version = 0;
>>  const char * os::Linux::_glibc_version = NULL;
>>  const char * os::Linux::_libpthread_version = NULL;
>>  pthread_condattr_t os::Linux::_condattr[1];
>> @@ -4332,6 +4333,21 @@
>>    return (tp.tv_sec * NANOSECS_PER_SEC) + tp.tv_nsec;
>>  }
>>
>> +void os::Linux::initialize_os_info() {
>> +  assert(_os_version == 0, "OS info already initialized");
>> +
>> +  struct utsname _uname;
>> +
>> +  uname(&_uname); // Not sure yet how deal if ret == -1
>> +  _os_version = atoi(_uname.release);
>> +}
>> +
>> +uint32_t os::Linux::os_version() {
>> +  assert(_os_version != 0, "not initialized");
>> +  return _os_version;
>> +}
>> +
>> +
>>  /////
>>  // glibc on Linux platform uses non-documented flag
>>  // to indicate, that some special sort of signal
>> @@ -4553,6 +4569,7 @@
>>    init_page_sizes((size_t) Linux::page_size());
>>
>>    Linux::initialize_system_info();
>> +  Linux::initialize_os_info();
>>
>>    // main_thread points to the aboriginal thread
>>    Linux::_main_thread = pthread_self();
>>
>>
>> diff -r 266fa9bb5297 src/os/linux/vm/os_linux.hpp
>> --- a/src/os/linux/vm/os_linux.hpp	Thu Feb 04 16:48:39 2016 -0800
>> +++ b/src/os/linux/vm/os_linux.hpp	Fri Feb 12 10:59:01 2016 -0200
>> @@ -55,7 +55,7 @@
>>    static bool _supports_fast_thread_cpu_time;
>>
>>    static GrowableArray<int>* _cpu_to_node;
>> -
>> +  static uint32_t _os_version;
>>   protected:
>>
>>    static julong _physical_memory;
>> @@ -198,6 +198,9 @@
>>
>>    static jlong fast_thread_cpu_time(clockid_t clockid);
>>
>> +  static void initialize_os_info();
>> +  static uint32_t os_version();
>> +
>>    // pthread_cond clock suppport
>>   private:
>>    static pthread_condattr_t _condattr[1];
>>
>>
>> 23 tests are now passing: http://hastebin.com/raw/oyicagusod
>>
>> Is there a reason to keep RTM disabled for Linux on ppc64le for now? Could somebody explain what is currently missing in the PPC64 LE RTM implementation in order to make all RTM tests pass?
>>
>> Thank you.
>>
>> Regards,
>> --
>> Gustavo Romero
>>
> 


