From david.holmes at oracle.com Thu May 4 01:13:25 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 4 May 2017 11:13:25 +1000 Subject: jdk10/jdk10 is broken on 32-bit linux Message-ID: <1810b41c-28cd-1c00-4e07-939286428348@oracle.com> With the latest changes in jdk10/jdk10 we are seeing failures of all jtreg agentvm mode tests on 32-bit linux binaries due to socket connection failures. And also some OSX failures. This seems to be have been caused by: 8165437: Evaluate the use of gettimeofday in Networking code http://hg.openjdk.java.net/jdk10/jdk10/jdk/rev/7cdde79d6a46 due to a truncation issue through using long instead of jlong. I've notified net-dev and will file a P1 bug to either have this fixed or backed out. David ----- From john_platts at hotmail.com Fri May 5 03:04:04 2017 From: john_platts at hotmail.com (John Platts) Date: Fri, 5 May 2017 03:04:04 +0000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms Message-ID: The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI invocation API expect ANSI strings on Windows platforms instead of Unicode-encoded strings. This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in the ANSI encoding on Windows platforms. There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the char16_t type. This change will make it possible to define jchar character and string literals on Windows platforms and on non-Windows platforms that support the C11 or C++11 standard. The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: #define JCHAR_LITERAL(x) L ## x The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: #define JCHAR_LITERAL(x) u ## x Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: typedef struct JavaVMUnicodeOption { const jchar *optionString; /* the option as a string in UTF-16 encoding */ void *extraInfo; } JavaVMUnicodeOption; typedef struct JavaVMUnicodeInitArgs { jint version; jint nOptions; JavaVMUnicodeOption *options; jboolean ignoreUnrecognized; } JavaVMUnicodeInitArgs; jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); jint JNI_GetDefaultJavaVMInitArgs(void *args); The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead of WinMain on Windows platforms. This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be launched with arguments that contain Unicode characters that are not in the platform-default encoding. All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. 
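To make the proposal concrete, here is a rough sketch of what a launcher built against these declarations might look like. JNI_CreateJavaVMUnicode, JavaVMUnicodeOption, JavaVMUnicodeInitArgs and JCHAR_LITERAL are the hypothetical additions proposed above, not existing jni.h exports, and the class-path value is only an example of an option that cannot survive a round trip through the ANSI code page:

#include <jni.h>

int wmain(int argc, wchar_t *argv[]) {
    JavaVM *vm;
    void *env;

    JavaVMUnicodeOption options[1];
    /* example option containing a character outside the ANSI code page */
    options[0].optionString = JCHAR_LITERAL("-Djava.class.path=C:\\\u00DCbungen\\app.jar");
    options[0].extraInfo = NULL;

    JavaVMUnicodeInitArgs vm_args;
    vm_args.version = JNI_VERSION_1_8;
    vm_args.nOptions = 1;
    vm_args.options = options;
    vm_args.ignoreUnrecognized = JNI_FALSE;

    if (JNI_CreateJavaVMUnicode(&vm, &env, &vm_args) != JNI_OK) {
        return 1;
    }
    /* ... find and invoke the main class through (JNIEnv *) env ... */
    vm->DestroyJavaVM();
    return 0;
}

Because wmain already delivers argv as UTF-16, no lossy conversion through the platform code page is needed at any point.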
Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments that are passed to the JVM. From david.holmes at oracle.com Fri May 5 04:07:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 May 2017 14:07:22 +1000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms In-Reply-To: References: Message-ID: Hi John, The JNI is defined to use Modified UTF-8 format for strings, so any Unicode character should be handled if passed in in the right format. Updating the JNI specification and implementation to accept UTF-16 directly would be a major undertaking. Is the issue here that you want a tool, like the java launcher, to accept arbitrary Unicode strings in a end-user friendly manner and then have it perform the modified UTF-8 conversion when invoking the VM? Can you give a concrete example of what you would like to be able to pass as arguments to the JVM? Thanks, David On 5/05/2017 1:04 PM, John Platts wrote: > The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI invocation API expect ANSI strings on Windows platforms instead of Unicode-encoded strings. This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in the ANSI encoding on Windows platforms. > > > There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. > > > jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the char16_t type. This change will make it possible to define jchar character and string literals on Windows platforms and on non-Windows platforms that support the C11 or C++11 standard. > > > The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: > > #define JCHAR_LITERAL(x) L ## x > > > The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: > > #define JCHAR_LITERAL(x) u ## x > > > Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: > > typedef struct JavaVMUnicodeOption { > const jchar *optionString; /* the option as a string in UTF-16 encoding */ > void *extraInfo; > } JavaVMUnicodeOption; > > typedef struct JavaVMUnicodeInitArgs { > jint version; > jint nOptions; > JavaVMUnicodeOption *options; > jboolean ignoreUnrecognized; > } JavaVMUnicodeInitArgs; > > jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); > jint JNI_GetDefaultJavaVMInitArgs(void *args); > > The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead of WinMain on Windows platforms. 
This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be launched with arguments that contain Unicode characters that are not in the platform-default encoding. > > All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments that are passed to the JVM. > From aph at redhat.com Fri May 5 17:10:17 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 5 May 2017 18:10:17 +0100 Subject: RFR: 8179701: AArch64: Reinstate FP as an allocatable register Message-ID: <4b03ad5f-3b42-9111-a77b-7cfc6b340f1c@redhat.com> http://cr.openjdk.java.net/~aph/8179701/ OK? Andrew. From joe.darcy at oracle.com Fri May 5 22:35:57 2017 From: joe.darcy at oracle.com (joe darcy) Date: Fri, 5 May 2017 15:35:57 -0700 Subject: Coming soon: CSR review for JDK 10 API and interface changes Message-ID: Hello, As has been in the works recently [1], the "Compatibility & Specification Review" process (CSR) is coming to JDK 10 soon. The CSR process is a replacement for the long-running JDK-internal CCC process. A sampling of JDK 9 CCC requests have been screened and imported to a temporary CCC project in JBS: https://bugs.openjdk.java.net/issues/?jql=project%20%3D%20ccc More detail on the imported issues can be found in the CSR discussion list [2] and the CSR wiki page discusses motivation for the process along with other supporting information. [3] Please look over the imported issues to get sense for what the CSR is looking for. Assuming no show-stopper issues are found with the CSR issue type, I'd like to start using the CSR process to review JDK 10 API and other interfaces changes shortly after May 12, 2017. Thanks, -Joe [1] http://mail.openjdk.java.net/pipermail/gb-discuss/2017-January/000320.html [2] http://mail.openjdk.java.net/pipermail/csr-discuss/2017-May/000025.html [3] https://wiki.openjdk.java.net/display/csr/Main From david.holmes at oracle.com Mon May 8 00:47:08 2017 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 May 2017 10:47:08 +1000 Subject: Add support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs on Windows platforms In-Reply-To: References: Message-ID: Added back jdk10-dev as a bcc. Added hotspot-dev and core-libs-dev (for launcher) for follow up discussions. Hi John, On 8/05/2017 10:33 AM, John Platts wrote: > I actually did a search through the code that implements > JNI_CreateJavaVM, and I found that the conversion of the strings is done > using java_lang_String::create_from_platform_dependent_str, which > converts from the platform-default encoding to Unicode. In the case of > Windows-based platforms, the conversion is done based on the ANSI > character encoding instead of UTF-8 or Modified UTF-8. > > > The platform encoding detection logic on Windows is implemented > java_props_md.c, which can be found at > jdk/src/windows/native/java/lang/java_props_md.c in releases prior to > JDK 9 and at src/java.base/windows/native/libjava/java_props_md.c in JDK > 9 and later. The encoding used for command-line arguments passed into > the JNI invocation API is Cp1252 for English locales on Windows > platforms, and not Modified UTF-8 or UTF-8. 
> > > The documentation found > at http://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html also > states that the strings passed into JNI_CreateJavaVM are in the > platform-default encoding. Thanks for the additional details. I assume you are referring to: typedef struct JavaVMOption { char *optionString; /* the option as a string in the default platform encoding */ that comment should not form part of the specification as it is non-normative text. If the intent is truly to use the platform default encoding and not UTF-8 then that should be very clearly spelt out in the spec! That said, the implementation is following this so it is a limitation. I suspect this is historical. > A version of JNI_CreateJavaVM that takes UTF-16-encoded strings should > be added to the JNI Invocation API. The java.exe launchers and javaw.exe > launchers should also be updated to use the UTF-16 version of the > JNI_CreateJavaVM function on Windows platforms and to use wmain and > wWinMain instead of main and WinMain. Why versions for UTF-16 instead of the missing UTF-8 variants? As I said the whole spec is intended to be based around UTF-8 so we would not want to throw in just a couple of UTF-16 based usages. Thanks, David > > A few files in HotSpot would need to be changed in order to implement > the UTF-16 version of JNI_CreateJavaVM, but the change would improve > consistency across different locales on Windows platforms and allow > arguments that contain Unicode characters that are not available in the > platform-default encoding to be passed into the JVM on the command line. > > > The UTF-16-based version of JNI_CreateJavaVM also makes it easier to > allocate string objects that contain non-ASCII characters as the strings > are already in UTF-16 format, at least in cases where the strings > contain Unicode characters that are not in Latin-1 or on VMs that do not > support compact Latin-1 strings. > > > The UTF-16-based version of JNI_CreateJavaVM should probably be > implemented as a separate function so that the solution could be > backported to JDK 8 and JDK 9 updates and so that backwards > compatibility with the current JNI_CreateJavaVM implementation is > maintained. > > > Here is what the new UTF-16-based API might look like: > > typedef struct JavaVMInitArgs_UTF16 { > jint version; > jint nOptions; > JavaVMOptionUTF16 *options; > jboolean ignoreUnrecognized; > } JavaVMInitArgs; > > > typedef struct JavaVMOption_UTF16 { > char *optionString; /* the option as a string in the default > platform encoding */ > void *extraInfo; > } JavaVMOptionUTF16; > > /* vm_args is an pointer to a JavaVMInitArgs_UTF16 structure */ > > jint JNI_CreateJavaVM_UTF16(JavaVM **p_vm, void **p_env, void *vm_args); > > > /* vm_args is a pointer to a JavaVMInitArgs_UTF16 structure */ > > jint JNI_GetDefaultJavaVMInitArgs_UTF16(void *vm_args); > > ------------------------------------------------------------------------ > *From:* David Holmes > *Sent:* Thursday, May 4, 2017 11:07 PM > *To:* John Platts; jdk10-dev at openjdk.java.net > *Subject:* Re: Add support for Unicode versions of JNI_CreateJavaVM and > JNI_GetDefaultJavaVMInitArgs on Windows platforms > > Hi John, > > The JNI is defined to use Modified UTF-8 format for strings, so any > Unicode character should be handled if passed in in the right format. > Updating the JNI specification and implementation to accept UTF-16 > directly would be a major undertaking. 
> > Is the issue here that you want a tool, like the java launcher, to > accept arbitrary Unicode strings in a end-user friendly manner and then > have it perform the modified UTF-8 conversion when invoking the VM? > > Can you give a concrete example of what you would like to be able to > pass as arguments to the JVM? > > Thanks, > David > > On 5/05/2017 1:04 PM, John Platts wrote: >> The JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods in the JNI invocation API expect ANSI strings on Windows platforms instead of Unicode-encoded strings. This is an issue on Windows-based platforms since some of the option strings that are passed into JNI_CreateJavaVM might contain Unicode characters that are not in > the ANSI encoding on Windows platforms. >> >> >> There is support for UTF-16 literals on Windows platforms with wchar_t and wide character literals prefixed with the L prefix, and on platforms that support C11 and C++11 with char16_t and UTF-16 character literals that are prefixed with the u prefix. >> >> >> jchar is currently defined to be a typedef for unsigned short on all platforms, but char16_t is a separate type and not a typedef for unsigned short or jchar in C++11 and later. jchar should be changed to be a typedef for wchar_t on Windows platforms and to be a typedef for char16_t on non-Windows platforms that support the > char16_t type. This change will make it possible to define jchar > character and string literals on Windows platforms and on non-Windows > platforms that support the C11 or C++11 standard. >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on Windows: >> >> #define JCHAR_LITERAL(x) L ## x >> >> >> The JCHAR_LITERAL macro should be added to the JNI header and defined as follows on non-Windows platforms: >> >> #define JCHAR_LITERAL(x) u ## x >> >> >> Here is how the Unicode version of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs could be defined: >> >> typedef struct JavaVMUnicodeOption { >> const jchar *optionString; /* the option as a string in UTF-16 encoding */ >> void *extraInfo; >> } JavaVMUnicodeOption; >> >> typedef struct JavaVMUnicodeInitArgs { >> jint version; >> jint nOptions; >> JavaVMUnicodeOption *options; >> jboolean ignoreUnrecognized; >> } JavaVMUnicodeInitArgs; >> >> jint JNI_CreateJavaVMUnicode(JavaVM **pvm, void **penv, void *args); >> jint JNI_GetDefaultJavaVMInitArgs(void *args); >> >> The java.exe wrapper should use wmain instead of main on Windows platforms, and the javaw.exe wrapper should use wWinMain instead of WinMain on Windows platforms. This change, along with the support for Unicode-enabled version of the JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs methods, would allow the JVM to be > launched with arguments that contain Unicode characters that are not in > the platform-default encoding. >> >> All of the Windows platforms that Java SE 10 and later VMs would be supported on do support Unicode. Adding support for Unicode versions of JNI_CreateJavaVM and JNI_GetDefaultJavaVMInitArgs will allow Unicode characters that are not in the platform-default encoding on Windows platforms to be supported in command-line arguments > that are passed to the JVM. 
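As a reference point for the conversion being discussed (the launcher taking the UTF-16 argv that wmain already provides and producing char-based option strings), the Win32 side is small. The sketch below uses WideCharToMultiByte and produces standard UTF-8, which is not quite JNI's modified UTF-8: U+0000 and supplementary characters are encoded differently and would still need special handling. utf16_to_utf8 is only an illustrative helper name.

#include <windows.h>
#include <string>

/* Convert one UTF-16 argument to UTF-8. Assumes the argument has no
   embedded NUL; supplementary characters come out as plain 4-byte UTF-8
   rather than the surrogate-pair form used by modified UTF-8. */
static std::string utf16_to_utf8(const wchar_t *ws) {
    int n = WideCharToMultiByte(CP_UTF8, 0, ws, -1, NULL, 0, NULL, NULL);
    if (n <= 0) {
        return std::string();
    }
    std::string out(n, '\0');
    WideCharToMultiByte(CP_UTF8, 0, ws, -1, &out[0], n, NULL, NULL);
    out.resize(n - 1);   // drop the terminating NUL included in n
    return out;
}

Which char-based entry point then consumes the result, the modified UTF-8 form the JNI specification intends or a new UTF-8 variant, is exactly the design question raised above.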
>> From rwestrel at redhat.com Tue May 9 09:04:09 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 09 May 2017 11:04:09 +0200 Subject: RFR: 8179701: AArch64: Reinstate FP as an allocatable register In-Reply-To: <4b03ad5f-3b42-9111-a77b-7cfc6b340f1c@redhat.com> References: <4b03ad5f-3b42-9111-a77b-7cfc6b340f1c@redhat.com> Message-ID: > http://cr.openjdk.java.net/~aph/8179701/ > > OK? Looks good. Roland. From aph at redhat.com Tue May 9 17:20:20 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 9 May 2017 18:20:20 +0100 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn Message-ID: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> [My apologies to everyone: apparently I have to ask about JDK 10 as well.] I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. Andrew co-wrote the original AArch64 port. He is the author of 15 patches in the HotSpot sources, but that does not reflect the true extent of his contribution because he is the author of 341 patches in the aarch64-port project which I merged into OpenJDK. His considerable expertise, particularly with the C2 compiler, will be of great value to the project. Votes are due by 23 May, 2017. Only current Open|JDK 10 Reviewers [1] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Three-Vote Consensus voting instructions, see [2]. Andrew Haley. [1] http://openjdk.java.net/census [2] http://openjdk.java.net/projects/#reviewer-vote From claes.redestad at oracle.com Tue May 9 17:23:17 2017 From: claes.redestad at oracle.com (Claes Redestad) Date: Tue, 9 May 2017 19:23:17 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <0f9254e8-892c-c6ee-6982-278ce8d5d52f@oracle.com> Vote: yes On 2017-05-09 19:20, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From ashipile at redhat.com Tue May 9 17:27:10 2017 From: ashipile at redhat.com (Aleksey Shipilev) Date: Tue, 9 May 2017 19:27:10 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <92ac3209-652b-cf28-068f-24dbb0ce288d@redhat.com> Vote: yes -Aleksey On 05/09/2017 07:20 PM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. 
He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote > From mandy.chung at oracle.com Tue May 9 17:27:48 2017 From: mandy.chung at oracle.com (Mandy Chung) Date: Tue, 9 May 2017 10:27:48 -0700 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes Mandy From zgu at redhat.com Tue May 9 17:28:51 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 9 May 2017 13:28:51 -0400 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <0f9254e8-892c-c6ee-6982-278ce8d5d52f@oracle.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> <0f9254e8-892c-c6ee-6982-278ce8d5d52f@oracle.com> Message-ID: <69635cc2-e441-41b8-9787-bee3a8ad42d8@redhat.com> Vote: yes -Zhengyu On 05/09/2017 01:23 PM, Claes Redestad wrote: > Vote: yes > > On 2017-05-09 19:20, Andrew Haley wrote: >> [My apologies to everyone: apparently I have to ask about JDK 10 as >> well.] >> >> I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. >> >> Andrew co-wrote the original AArch64 port. He is the author of 15 >> patches in the HotSpot sources, but that does not reflect the true >> extent of his contribution because he is the author of 341 patches in >> the aarch64-port project which I merged into OpenJDK. His >> considerable expertise, particularly with the C2 compiler, will be of >> great value to the project. >> >> Votes are due by 23 May, 2017. >> >> Only current Open|JDK 10 Reviewers [1] are eligible to vote >> on this nomination. Votes must be cast in the open by replying >> to this mailing list. >> >> For Three-Vote Consensus voting instructions, see [2]. >> >> Andrew Haley. >> >> >> [1] http://openjdk.java.net/census >> [2] http://openjdk.java.net/projects/#reviewer-vote > From coleen.phillimore at oracle.com Tue May 9 17:29:58 2017 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 9 May 2017 13:29:58 -0400 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <9cfc2ff1-2cdb-4505-203a-8d49f7819c7c@oracle.com> Vote: yes On 5/9/17 1:20 PM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. 
Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From vladimir.kozlov at oracle.com Tue May 9 17:30:06 2017 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 9 May 2017 10:30:06 -0700 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes On 5/9/17 10:20 AM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote > From tobias.hartmann at oracle.com Tue May 9 17:30:48 2017 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 9 May 2017 19:30:48 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <86f2d2f8-5be1-7e81-472c-03b1a36a9680@oracle.com> Vote: yes Best regards, Tobias On 09.05.2017 19:20, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote > From paul.sandoz at oracle.com Tue May 9 17:33:46 2017 From: paul.sandoz at oracle.com (Paul Sandoz) Date: Tue, 9 May 2017 10:33:46 -0700 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <7A49A671-A2A6-497E-8C16-0F81C1B206BC@oracle.com> Vote: yes Paul. 
From Roger.Riggs at Oracle.com Tue May 9 17:55:18 2017 From: Roger.Riggs at Oracle.com (Roger Riggs) Date: Tue, 9 May 2017 13:55:18 -0400 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes On 5/9/2017 1:20 PM, Andrew Haley wrote: > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. From daniel.fuchs at oracle.com Tue May 9 18:27:00 2017 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 9 May 2017 19:27:00 +0100 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <4ec6070d-1b2f-2ea6-f74e-2d788ddf0a68@oracle.com> Vote: yes -- daniel On 09/05/17 18:20, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. From david.holmes at oracle.com Tue May 9 21:21:11 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 May 2017 07:21:11 +1000 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes David On 10/05/2017 3:20 AM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote > From serguei.spitsyn at oracle.com Tue May 9 22:28:59 2017 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 9 May 2017 15:28:59 -0700 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes From kim.barrett at oracle.com Wed May 10 07:21:37 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 10 May 2017 03:21:37 -0400 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <2BB55CAD-830B-4EC8-90B3-B36E75AE90E9@oracle.com> vote: yes > On May 9, 2017, at 1:20 PM, Andrew Haley wrote: > > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. 
His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From thomas.stuefe at gmail.com Wed May 10 08:57:11 2017 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 10 May 2017 10:57:11 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes On Tue, May 9, 2017 at 7:20 PM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote > From omajid at redhat.com Wed May 10 14:45:04 2017 From: omajid at redhat.com (Omair Majid) Date: Wed, 10 May 2017 10:45:04 -0400 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <20170510144504.GC10767@redhat.com> Vote: Yes * Andrew Haley [2017-05-09 13:21]: > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. Thanks, Omair -- PGP Key: 66484681 (http://pgp.mit.edu/) Fingerprint = F072 555B 0A17 3957 4E95 0056 F286 F14F 6648 4681 From vladimir.x.ivanov at oracle.com Wed May 10 15:06:40 2017 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Wed, 10 May 2017 18:06:40 +0300 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <8025768d-d24d-22bc-a5e3-cd1bcbad380c@oracle.com> Vote: yes Best regards, Vladimir Ivanov On 5/9/17 8:20 PM, Andrew Haley wrote: > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. From rwestrel at redhat.com Thu May 11 07:07:24 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Thu, 11 May 2017 09:07:24 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes Roland. 
From peter.levart at gmail.com Thu May 11 21:13:16 2017 From: peter.levart at gmail.com (Peter Levart) Date: Thu, 11 May 2017 23:13:16 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes Regards, Peter On 05/09/2017 07:20 PM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From christian.tornqvist at oracle.com Fri May 12 16:47:04 2017 From: christian.tornqvist at oracle.com (Christian Tornqvist) Date: Fri, 12 May 2017 09:47:04 -0700 Subject: RFR(XS): 8180304 - Add tests to ProblemList that fails on Windows when running with subst or different drive than source code is on. Message-ID: <32E07ACF-9025-4978-AEC3-0BCDA12EE921@oracle.com> Hi everyone, Please review this small change that adds a number of JDK and Langtools tests to ProblemList. They all fail on Windows when running with a jtreg workdir that is either on a drive that has been created using subst or when the source code for the tests are on a different drive than the workdir. Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8180304/ Thanks, Christian From george.triantafillou at oracle.com Fri May 12 16:49:34 2017 From: george.triantafillou at oracle.com (George Triantafillou) Date: Fri, 12 May 2017 12:49:34 -0400 Subject: RFR(XS): 8180304 - Add tests to ProblemList that fails on Windows when running with subst or different drive than source code is on. In-Reply-To: <32E07ACF-9025-4978-AEC3-0BCDA12EE921@oracle.com> References: <32E07ACF-9025-4978-AEC3-0BCDA12EE921@oracle.com> Message-ID: Hi Christian, Looks good. -George On 5/12/2017 12:47 PM, Christian Tornqvist wrote: > Hi everyone, > > Please review this small change that adds a number of JDK and Langtools tests to ProblemList. They all fail on Windows when running with a jtreg workdir that is either on a drive that has been created using subst or when the source code for the tests are on a different drive than the workdir. > > Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8180304/ > > Thanks, > Christian From kumar.x.srinivasan at oracle.com Fri May 12 18:54:54 2017 From: kumar.x.srinivasan at oracle.com (Kumar Srinivasan) Date: Fri, 12 May 2017 11:54:54 -0700 Subject: RFR(XS): 8180304 - Add tests to ProblemList that fails on Windows when running with subst or different drive than source code is on. In-Reply-To: <32E07ACF-9025-4978-AEC3-0BCDA12EE921@oracle.com> References: <32E07ACF-9025-4978-AEC3-0BCDA12EE921@oracle.com> Message-ID: <591604FE.4040304@oracle.com> Looks good. Kumar > Hi everyone, > > Please review this small change that adds a number of JDK and Langtools tests to ProblemList. 
They all fail on Windows when running with a jtreg workdir that is either on a drive that has been created using subst or when the source code for the tests are on a different drive than the workdir. > > Webrev: http://cr.openjdk.java.net/~ctornqvi/webrev/8180304/ > > Thanks, > Christian From chihiro.ito at oracle.com Thu May 18 06:22:09 2017 From: chihiro.ito at oracle.com (chihiro ito) Date: Thu, 18 May 2017 15:22:09 +0900 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation Message-ID: <591D3D91.9050901@oracle.com> Hi all, I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation option. Logs which applied this is following. Could you possibly review for this following small change? If review is ok, please commit this as cito. Sample Log: [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': size=120036Kb used=13Kb max_used=13Kb free=120022Kb [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': size=120032Kb used=85Kb max_used=85Kb free=119946Kb [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb Source: diff --git a/src/share/vm/compiler/compileBroker.cpp b/src/share/vm/compiler/compileBroker.cpp --- a/src/share/vm/compiler/compileBroker.cpp +++ b/src/share/vm/compiler/compileBroker.cpp @@ -1726,6 +1726,34 @@ tty->print("%s", s.as_string()); } +// wrapper for CodeCache::print_summary() using outputStream +static void codecache_print(outputStream* out, bool detailed) +{ + ResourceMark rm; + stringStream s; + + // Dump code cache into a buffer + { + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); + CodeCache::print_summary(&s, detailed); + } + + char* summary = s.as_string(); + char* cr_pos; + + do { + cr_pos = strchr(summary, '\n'); + if (cr_pos != NULL) { + *cr_pos = '\0'; + } + if (strlen(summary)!=0) { + out->print_cr("%s", summary); + } + + summary = cr_pos + 1; + } while (cr_pos != NULL); +} + void CompileBroker::post_compile(CompilerThread* thread, CompileTask* task, EventCompilation& event, bool success, ciEnv* ci_env) { if (success) { @@ -1939,6 +1967,10 @@ tty->print_cr("time: %d inlined: %d bytes", (int)time.milliseconds(), task->num_inlined_bytecodes()); } + Log(compilation, codecache) log; + if (log.is_debug()) + codecache_print(log.debug_stream(), /* detailed= */ false); + if (PrintCodeCacheOnCompilation) codecache_print(/* detailed= */ false); Regards, Chihiro -- Chihiro Ito | Principal Consultant | +81.90.6148.8815 Oracle Consultant ORACLE Japan | Akasaka Center Bldg. | Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN Oracle is committed to developing practices and products that help protect the environment From david.holmes at oracle.com Thu May 18 06:40:22 2017 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 May 2017 16:40:22 +1000 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: <591D3D91.9050901@oracle.com> References: <591D3D91.9050901@oracle.com> Message-ID: <27b3260b-9ba9-134d-76cf-83e7212d1ca0@oracle.com> Hi Chihiro, Reviews take place on the mailing list for the area the code change relates to - in this case it looks like hotspot-compiler-dev at opnejdk.java.net. Please send your RFR over there. Thanks, David On 18/05/2017 4:22 PM, chihiro ito wrote: > Hi all, > > I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation > option. Logs which applied this is following. > Could you possibly review for this following small change? If review is > ok, please commit this as cito. 
> > Sample Log: > [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': > size=120036Kb used=13Kb max_used=13Kb free=120022Kb > [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': > size=120032Kb used=85Kb max_used=85Kb free=119946Kb > [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': > size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb > > Source: > diff --git a/src/share/vm/compiler/compileBroker.cpp > b/src/share/vm/compiler/compileBroker.cpp > --- a/src/share/vm/compiler/compileBroker.cpp > +++ b/src/share/vm/compiler/compileBroker.cpp > @@ -1726,6 +1726,34 @@ > tty->print("%s", s.as_string()); > } > > +// wrapper for CodeCache::print_summary() using outputStream > +static void codecache_print(outputStream* out, bool detailed) > +{ > + ResourceMark rm; > + stringStream s; > + > + // Dump code cache into a buffer > + { > + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); > + CodeCache::print_summary(&s, detailed); > + } > + > + char* summary = s.as_string(); > + char* cr_pos; > + > + do { > + cr_pos = strchr(summary, '\n'); > + if (cr_pos != NULL) { > + *cr_pos = '\0'; > + } > + if (strlen(summary)!=0) { > + out->print_cr("%s", summary); > + } > + > + summary = cr_pos + 1; > + } while (cr_pos != NULL); > +} > + > void CompileBroker::post_compile(CompilerThread* thread, CompileTask* > task, EventCompilation& event, bool success, ciEnv* ci_env) { > > if (success) { > @@ -1939,6 +1967,10 @@ > tty->print_cr("time: %d inlined: %d bytes", > (int)time.milliseconds(), task->num_inlined_bytecodes()); > } > > + Log(compilation, codecache) log; > + if (log.is_debug()) > + codecache_print(log.debug_stream(), /* detailed= */ false); > + > if (PrintCodeCacheOnCompilation) > codecache_print(/* detailed= */ false); > > > Regards, > Chihiro > > From chihiro.ito at oracle.com Thu May 18 12:53:56 2017 From: chihiro.ito at oracle.com (chihiro ito) Date: Thu, 18 May 2017 21:53:56 +0900 Subject: RFR: Apply UL to PrintCodeCacheOnCompilation In-Reply-To: <27b3260b-9ba9-134d-76cf-83e7212d1ca0@oracle.com> References: <591D3D91.9050901@oracle.com> <27b3260b-9ba9-134d-76cf-83e7212d1ca0@oracle.com> Message-ID: <591D9964.40208@oracle.com> Hi David Thank you for your reply. I try to send to hotspot-compiler-dev. Regards, Chihiro On 2017/05/18 15:40, David Holmes wrote: > Hi Chihiro, > > Reviews take place on the mailing list for the area the code change > relates to - in this case it looks like > hotspot-compiler-dev at opnejdk.java.net. Please send your RFR over there. > > Thanks, > David > > On 18/05/2017 4:22 PM, chihiro ito wrote: >> Hi all, >> >> I apply Unified JVM Logging to log of PrintCodeCacheOnCompilation >> option. Logs which applied this is following. >> Could you possibly review for this following small change? If review is >> ok, please commit this as cito. 
>> >> Sample Log: >> [1.370s][debug][compilation,codecache] CodeHeap 'non-profiled nmethods': >> size=120036Kb used=13Kb max_used=13Kb free=120022Kb >> [1.372s][debug][compilation,codecache] CodeHeap 'profiled nmethods': >> size=120032Kb used=85Kb max_used=85Kb free=119946Kb >> [1.372s][debug][compilation,codecache] CodeHeap 'non-nmethods': >> size=5692Kb used=2648Kb max_used=2655Kb free=3043Kb >> >> Source: >> diff --git a/src/share/vm/compiler/compileBroker.cpp >> b/src/share/vm/compiler/compileBroker.cpp >> --- a/src/share/vm/compiler/compileBroker.cpp >> +++ b/src/share/vm/compiler/compileBroker.cpp >> @@ -1726,6 +1726,34 @@ >> tty->print("%s", s.as_string()); >> } >> >> +// wrapper for CodeCache::print_summary() using outputStream >> +static void codecache_print(outputStream* out, bool detailed) >> +{ >> + ResourceMark rm; >> + stringStream s; >> + >> + // Dump code cache into a buffer >> + { >> + MutexLockerEx mu(CodeCache_lock, Mutex::_no_safepoint_check_flag); >> + CodeCache::print_summary(&s, detailed); >> + } >> + >> + char* summary = s.as_string(); >> + char* cr_pos; >> + >> + do { >> + cr_pos = strchr(summary, '\n'); >> + if (cr_pos != NULL) { >> + *cr_pos = '\0'; >> + } >> + if (strlen(summary)!=0) { >> + out->print_cr("%s", summary); >> + } >> + >> + summary = cr_pos + 1; >> + } while (cr_pos != NULL); >> +} >> + >> void CompileBroker::post_compile(CompilerThread* thread, CompileTask* >> task, EventCompilation& event, bool success, ciEnv* ci_env) { >> >> if (success) { >> @@ -1939,6 +1967,10 @@ >> tty->print_cr("time: %d inlined: %d bytes", >> (int)time.milliseconds(), task->num_inlined_bytecodes()); >> } >> >> + Log(compilation, codecache) log; >> + if (log.is_debug()) >> + codecache_print(log.debug_stream(), /* detailed= */ false); >> + >> if (PrintCodeCacheOnCompilation) >> codecache_print(/* detailed= */ false); >> >> >> Regards, >> Chihiro >> >> -- Chihiro Ito | Principal Consultant | +81.90.6148.8815 Oracle Consultant ORACLE Japan | Akasaka Center Bldg. | Motoakasaka 1-3-13 | 1070051 Minato-ku, Tokyo, JAPAN Oracle is committed to developing practices and products that help protect the environment From christoph.langer at sap.com Thu May 18 13:41:02 2017 From: christoph.langer at sap.com (Langer, Christoph) Date: Thu, 18 May 2017 13:41:02 +0000 Subject: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: <67a2157d0ab34eb49240e5dbaece5970@sap.com> Vote: Yes > -----Original Message----- > From: jdk10-dev [mailto:jdk10-dev-bounces at openjdk.java.net] On Behalf > Of Andrew Haley > Sent: Dienstag, 9. Mai 2017 19:20 > To: jdk10-dev at openjdk.java.net > Subject: CFV: New JDK 10 Reviewer: Andrew Dinn > > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. 
> > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From volker.simonis at gmail.com Thu May 18 13:54:00 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 18 May 2017 15:54:00 +0200 Subject: CFV: New JDK 10 Reviewer: Andrew Dinn In-Reply-To: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> References: <177a88db-ad04-21da-2fb9-3b9564e3863f@redhat.com> Message-ID: Vote: yes On Tue, May 9, 2017 at 7:20 PM, Andrew Haley wrote: > [My apologies to everyone: apparently I have to ask about JDK 10 as > well.] > > I hereby nominate Andrew Dinn (adinn) to JDK 10 Reviewer. > > Andrew co-wrote the original AArch64 port. He is the author of 15 > patches in the HotSpot sources, but that does not reflect the true > extent of his contribution because he is the author of 341 patches in > the aarch64-port project which I merged into OpenJDK. His > considerable expertise, particularly with the C2 compiler, will be of > great value to the project. > > Votes are due by 23 May, 2017. > > Only current Open|JDK 10 Reviewers [1] are eligible to vote > on this nomination. Votes must be cast in the open by replying > to this mailing list. > > For Three-Vote Consensus voting instructions, see [2]. > > Andrew Haley. > > > [1] http://openjdk.java.net/census > [2] http://openjdk.java.net/projects/#reviewer-vote From joe.darcy at oracle.com Wed May 24 01:00:07 2017 From: joe.darcy at oracle.com (Joseph D. Darcy) Date: Tue, 23 May 2017 18:00:07 -0700 Subject: CSR issue type now available in JDK project of JBS for compatibility and specification review of JDK 10 changes Message-ID: <5924DB17.80202@oracle.com> Hello, As previewed recently [1], the "Compatibility & Specification Review" process (CSR) is now available for JDK 10 changes. To create a CSR request, from a bug in JBS with a fixVersion of JDK 10 select More -> Create CSR and then fill in the pre-populated outline in the description field and set the other fields as appropriate. If you have questions about the process, please first read through the material on the CSR wiki page: https://wiki.openjdk.java.net/display/csr/Main including the FAQ and then send me email if the question is not already covered. Finding an example request for a similar JDK 9 change https://bugs.openjdk.java.net/issues/?jql=project%20%3D%20ccc may provide guidance or a template to follow for a JDK 10 change. I expect to refine the CSR FAQ and other documentation in the coming months as we start using the new process. I also expect some adjustments to the process will be made as we break it in. Thanks, -Joe [1] http://mail.openjdk.java.net/pipermail/jdk10-dev/2017-May/000193.html From adinn at redhat.com Thu May 25 13:12:53 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 14:12:53 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException Message-ID: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> The following webrev fixes a race condition that is present in jdk10 and also jdk9 and jdk8. It is caused by a misplaced volatile keyword that faild to ensure correct ordering of writes by the compiler. Reviews welcome. http://cr.openjdk.java.net/~adinn/8181085/webrev.00/ Backporting: This same fix is required in jdk9 and jdk8. Testing: The reproducer posted with the original issue manifests the NPE reliably on jdk8. 
It does not manifest on jdk9/10 but that is only thanks to changes introduced into the resolution process in jdk9 which change the timing of execution. However, without this fix the out-of-order write problem is still present in jdk9/10, as can be seen by eyeballing the compiled code for ConstantPoolCacheEntry::set_direct_or_vtable_call. The patch has been validated on jdk8 by running the reproducer. It stops any resulting NPEs. The code for ConstantPoolCacheEntry::set_direct_or_vtable_call on jdk8-10 has been eyeballed to ensure that post-patch the assignments now occur in the correct order. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Thu May 25 13:30:26 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 May 2017 14:30:26 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> Message-ID: <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> On 25/05/17 14:12, Andrew Dinn wrote: > The following webrev fixes a race condition that is present in jdk10 and > also jdk9 and jdk8. It is caused by a misplaced volatile keyword that > faild to ensure correct ordering of writes by the compiler. Reviews welcome. Can you explain why we don't need a memory fence there? Andrew. From adinn at redhat.com Thu May 25 13:57:32 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 14:57:32 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> Message-ID: On 25/05/17 14:30, Andrew Haley wrote: > On 25/05/17 14:12, Andrew Dinn wrote: >> The following webrev fixes a race condition that is present in jdk10 and >> also jdk9 and jdk8. It is caused by a misplaced volatile keyword that >> faild to ensure correct ordering of writes by the compiler. Reviews welcome. > > Can you explain why we don't need a memory fence there? We do need one and we have one. The assignments executed in the relevant method in cpCache.cpp (i.e. ConstantPoolCacheEntry::set_direct_or_vtable_call) are . . . set_f1(method()); . . . set_bytecode_1(invoke_code); . . . If you look at the definition of these two methods they are void set_f1(Metadata* f1) { Metadata* existing_f1 = (Metadata*)_f1; // read once assert(existing_f1 == NULL || existing_f1 == f1, "illegal field change"); _f1 = f1; } and void ConstantPoolCacheEntry::set_bytecode_1(Bytecodes::Code code) { #ifdef ASSERT // Read once. volatile Bytecodes::Code c = bytecode_1(); assert(c == 0 || c == code || code == 0, "update must be consistent"); #endif // Need to flush pending stores here before bytecode is written. OrderAccess::release_store_ptr(&_indices, _indices | ((u_char)code << bytecode_1_shift)); On x86 the release_store_ptr operation just reduces to an assignment of volatile field _indices. That alone doesn't stop the compiler re-ordering it before the assignment of f1. Making both fields volatile does stop them being re-ordered. 
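Reduced to a stand-alone fragment (purely illustrative, not the HotSpot code itself), the hazard is the following: C++ only orders volatile accesses with respect to each other, so the compiler is free to sink the plain store below the later volatile store unless a compiler barrier sits between them or both stores are volatile.

#include <stdint.h>

struct Entry {
    void              *f1;       // plain field, standing in for _f1
    volatile intptr_t  indices;  // publication word, standing in for _indices
};

void publish(Entry *e, void *method, intptr_t bits) {
    e->f1 = method;                   // plain store: nothing forbids moving it
                                      // below the volatile store...
    // asm volatile("" ::: "memory"); // ...unless a compiler-only barrier (the
    //                                // usual GCC idiom) or a release fence is
    //                                // placed here
    e->indices = bits;                // volatile store that readers test first
}

A reader that observes the new indices value while f1 still holds its old NULL is exactly the spurious NullPointerException reported in the bug.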
regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Thu May 25 14:07:45 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 May 2017 15:07:45 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> Message-ID: <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> On 25/05/17 14:57, Andrew Dinn wrote: > On x86 the release_store_ptr operation just reduces to an assignment of > volatile field _indices. That alone doesn't stop the compiler > re-ordering it before the assignment of f1. Making both fields volatile > does stop them being re-ordered. Please bear with me. We have to set f1 and then bytecode_1. We do not want the store to bytecode_1 to move before the store to f1. OrderAccess::release_store_ptr() should be strong enough to guarantee that, regardless of whether f1 is volatile or not. If it's not, there should be a compiler fence in release_store_ptr(). Andrew. From adinn at redhat.com Thu May 25 14:25:06 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 15:25:06 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> Message-ID: On 25/05/17 15:07, Andrew Haley wrote: > On 25/05/17 14:57, Andrew Dinn wrote: >> On x86 the release_store_ptr operation just reduces to an assignment of >> volatile field _indices. That alone doesn't stop the compiler >> re-ordering it before the assignment of f1. Making both fields volatile >> does stop them being re-ordered. > > Please bear with me. We have to set f1 and then bytecode_1. We do not > want the store to bytecode_1 to move before the store to f1. > > OrderAccess::release_store_ptr() should be strong enough to guarantee that, > regardless of whether f1 is volatile or not. > If it's not, there should be a compiler fence in release_store_ptr(). On a weak architecture like AArch64 OrderAccess::release_store_ptr() will be translated to an ordered write. That will ensure that order of generated store instructions and order of memory system visibility for those stores reflect source order. On x86 OrderAccess::release_store_ptr() reduces to a simple write. That's because TCO means that there is no need to do anything in order to ensure that /memory visibility/ order respects instruction generation/execution order. However, on x86 there most definitely /is/ a need to ensure that the compiler generates these store instructions in source order. That's why both fields need to be volatile. A C++ compiler may not re-order volatile writes. Yes, C++ volatile pretty much sux doesn't it! regards, Andrew Dinn ----------- Of course I'm respectable. I'm old. Politicians, ugly buildings, and whores all get respectable if they last long enough. --John Huston in "Chinatown." ---------------------------------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Thu May 25 14:38:55 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 May 2017 15:38:55 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> Message-ID: <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> On 25/05/17 15:25, Andrew Dinn wrote: > On 25/05/17 15:07, Andrew Haley wrote: >> On 25/05/17 14:57, Andrew Dinn wrote: >>> On x86 the release_store_ptr operation just reduces to an assignment of >>> volatile field _indices. That alone doesn't stop the compiler >>> re-ordering it before the assignment of f1. Making both fields volatile >>> does stop them being re-ordered. >> >> Please bear with me. We have to set f1 and then bytecode_1. We do not >> want the store to bytecode_1 to move before the store to f1. >> >> OrderAccess::release_store_ptr() should be strong enough to guarantee that, >> regardless of whether f1 is volatile or not. >> If it's not, there should be a compiler fence in release_store_ptr(). > > On a weak architecture like AArch64 OrderAccess::release_store_ptr() > will be translated to an ordered write. That will ensure that order of > generated store instructions and order of memory system visibility for > those stores reflect source order. > > On x86 OrderAccess::release_store_ptr() reduces to a simple write. > That's because TCO means that there is no need to do anything in order > to ensure that /memory visibility/ order respects instruction > generation/execution order. > > However, on x86 there most definitely /is/ a need to ensure that the > compiler generates these store instructions in source order. That's why > both fields need to be volatile. A C++ compiler may not re-order > volatile writes. Well, that's wrong. The bug is in OrderAccess::release_store_ptr(), which must not allow this reordering. Put a proper release barrier in there before the store, and all will be well: __atomic_thread_fence(__ATOMIC_RELEASE); There's really no need to make both fields volatile. And to do so leaves a lurking bug for any other unsuspecting user of release_store_ptr(). Andrew. From adinn at redhat.com Thu May 25 15:03:24 2017 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 25 May 2017 16:03:24 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> Message-ID: <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> On 25/05/17 15:38, Andrew Haley wrote: > Well, that's wrong. The bug is in OrderAccess::release_store_ptr(), > which must not allow this reordering. Put a proper release barrier > in there before the store, and all will be well: > > __atomic_thread_fence(__ATOMIC_RELEASE); > > There's really no need to make both fields volatile. And to do so > leaves a lurking bug for any other unsuspecting user of > release_store_ptr(). Oops. Apologies for this but I misread the gdb output when I ran this on jdk10. The re-ordering of the store instructions is not happening in jdk10 or jdk9. 
It does happen on jdk8 (that probably explains why the reproducer only manifests the NPE in jdk8 :-). The jdk10 implementation of release_Store_ptr has indeed already been reworked to insert a compiler barrier (using "asm volatile memory") but not a memory store barrier. I retract this patch. The problem with jdk8 still exists. It probably needs fixing by backporting the changes to the store_release etc. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Thu May 25 15:33:27 2017 From: aph at redhat.com (Andrew Haley) Date: Thu, 25 May 2017 16:33:27 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> Message-ID: <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> On 25/05/17 16:03, Andrew Dinn wrote: > The jdk10 implementation of release_Store_ptr has indeed already been > reworked to insert a compiler barrier (using "asm volatile memory") but > not a memory store barrier. Cool. An asm volatile memory barrier is a bit stronger than is perhaps needed, but it almost certainly will make no difference, and is compatible with old releases of GCC. Andrew. From paul.hohensee at gmail.com Thu May 25 20:29:52 2017 From: paul.hohensee at gmail.com (Paul Hohensee) Date: Thu, 25 May 2017 13:29:52 -0700 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> Message-ID: I don't know that you want to retract the patch. There's still a bug here imo that your patch fixes. The pointer formal parameter types for all orderAccess methods are volatile in order to force the compiler, at the point of method invocation, to order the memory access through the pointer within the method with respect to other volatile accesses. Accesses in the caller won't be ordered by the compiler with respect to the access in the orderAccess method unless the caller accesses are also volatile. And that's the bug. If we want the compiler to not reorder accesses to _f1, _f1 must be declared volatile. volatile MethodData* _f1; says that the MethodData is volatile (i.e., all accesses to parts of the MethodData object are volatile), which isn't what we want if we're intent on ordering with respect to accesses to _f1. MethodData* volatile _f1; is the way to do that. Thanks, Paul On Thu, May 25, 2017 at 8:33 AM, Andrew Haley wrote: > On 25/05/17 16:03, Andrew Dinn wrote: > > The jdk10 implementation of release_Store_ptr has indeed already been > > reworked to insert a compiler barrier (using "asm volatile memory") but > > not a memory store barrier. > > Cool. 
An asm volatile memory barrier is a bit stronger than is > perhaps needed, but it almost certainly will make no difference, > and is compatible with old releases of GCC. > > Andrew. > From david.holmes at oracle.com Fri May 26 02:20:09 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 12:20:09 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> Message-ID: On 26/05/2017 6:29 AM, Paul Hohensee wrote: > I don't know that you want to retract the patch. There's still a bug here > imo that your patch fixes. I agree. This is a common error when dealing with pointer variables, especially when looking at surrounding usage on non-pointer variables. We need the _f1 pointer to be volatile, not the thing to which the _f1 pointer points (well it's possible we may need both, I haven't dived that deep). Any variable passed to an OrderAccess, or Atomic, function should be volatile to minimise the chances the C compiler will do something unexpected with it. I don't even know what to make of the vmStructs.cpp existing code! David > The pointer formal parameter types for all orderAccess methods are volatile > in order to force the compiler, at the point of method invocation, to order > the memory access through the pointer within the method with respect to > other volatile accesses. Accesses in the caller won't be ordered by the > compiler with respect to the access in the orderAccess method unless the > caller accesses are also volatile. And that's the bug. > > If we want the compiler to not reorder accesses to _f1, _f1 must be > declared volatile. > > volatile MethodData* _f1; > > says that the MethodData is volatile (i.e., all accesses to parts of the > MethodData object are volatile), which isn't what we want if we're intent > on ordering with respect to accesses to _f1. > > MethodData* volatile _f1; > > is the way to do that. > > Thanks, > > Paul > > On Thu, May 25, 2017 at 8:33 AM, Andrew Haley wrote: > >> On 25/05/17 16:03, Andrew Dinn wrote: >>> The jdk10 implementation of release_Store_ptr has indeed already been >>> reworked to insert a compiler barrier (using "asm volatile memory") but >>> not a memory store barrier. >> >> Cool. An asm volatile memory barrier is a bit stronger than is >> perhaps needed, but it almost certainly will make no difference, >> and is compatible with old releases of GCC. >> >> Andrew. >> From adinn at redhat.com Fri May 26 08:11:27 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 May 2017 09:11:27 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> Message-ID: <464bf18f-8d57-e8a7-f3a2-5ebbd4a993c1@redhat.com> On 26/05/17 03:20, David Holmes wrote: > On 26/05/2017 6:29 AM, Paul Hohensee wrote: >> I don't know that you want to retract the patch. 
There's still a bug here >> imo that your patch fixes. > > I agree. This is a common error when dealing with pointer variables, > especially when looking at surrounding usage on non-pointer variables. > We need the _f1 pointer to be volatile, not the thing to which the _f1 > pointer points (well it's possible we may need both, I haven't dived > that deep). > > Any variable passed to an OrderAccess, or Atomic, function should be > volatile to minimise the chances the C compiler will do something > unexpected with it. > > I don't even know what to make of the vmStructs.cpp existing code! Hmm, well this is a conundrum then. One piece of advice from the two of you and another from Andrew Haley (who is 'technically' my boss --- i.e. he's the technical lead for my team :-). Your view is precisely what I originally assumed was at play here i.e. that where successive writes to fields must be seen by other threads in the correct order that is to be achieved on x86 by making both fields volatile. This guarantees sequencing of generated store instructions by the compiler in accordance with source order and, hence, because x86 is TCO, visibility of those store instructions in that same order. Of course, that is inadequate on weak-memory models like ppc and AArch64. So, to make your suggestion work properly for all architectures the second write (at least, if not the first) also needs to be implemented using a call to store_release. That will definitely ensure that the first write is visible before the second on all architectures. It has been our hope (Andrew's and mine) since we completed the AArch64 port that all pairs of stores which require ordering do indeed employ a store_release (we have had to correct a few cases over the last few years). Andrew's belief seems to be that your model is error prone and is fixed more correctly by introducing a memory and/or compiler barrier into the implementation of release_store. If instead release_store is used consistently whenever the second of a pair of writes needs to be guaranteed to be visible after the first then it will provide the desired outcome. This belief seems indeed to be backed up by the changes made to the jdk9 code base quite some while back (the ones I failed to notice). The relevant commit is 7143664: Clean up OrderAccess implementations and usage (n.b. I believe the author is one D Holmes :-) I think Andrew's view is probably sound (and not just because he is my boss). Since we must use release_store everywhere we want visibility of writes to be ordered then also requiring both the fields involved to be volatile is redundant. Given what little C++ volatile declarations do achieve it might be wiser not to be using volatile declarations at all. We would certainly start finding missing store_release calls quicker ;-) regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From aph at redhat.com Fri May 26 08:26:01 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 26 May 2017 09:26:01 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> Message-ID: <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> On 26/05/17 03:20, David Holmes wrote: > Any variable passed to an OrderAccess, or Atomic, function should be > volatile to minimise the chances the C compiler will do something > unexpected with it. That's not much more than paranoia, IMO. If the barriers are strong enough it'll be fine. The problem was, I suppose, with old compilers which didn't handle memory barriers properly, but we should be moving towards standard ways of doing these things. Standard atomics have been available since C++11 (I think) and GCC has had support since long before then. Maybe in the JDK10 timeframe we can look at upgrading the compilers for all platforms. Andrew. From david.holmes at oracle.com Fri May 26 09:20:19 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 19:20:19 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> Message-ID: <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> Hi Andrew, On 26/05/2017 6:26 PM, Andrew Haley wrote: > On 26/05/17 03:20, David Holmes wrote: >> Any variable passed to an OrderAccess, or Atomic, function should be >> volatile to minimise the chances the C compiler will do something >> unexpected with it. > > That's not much more than paranoia, IMO. If the barriers are strong > enough it'll be fine. The problem was, I suppose, with old compilers > which didn't handle memory barriers properly, but we should be moving > towards standard ways of doing these things. Standard atomics have > been available since C++11 (I think) and GCC has had support since long > before then. The issue isn't just the barriers that might be involved inside orderAccess methods. If these variables are being used in racy lock-free code then they should be marked volatile to ensure other compiler optimizations don't interfere. Perhaps that is paranoia, but I'd rather a little harmless paranoia than try to debug what might otherwise go wrong. Regardless of anything else the declaration(s) of _f1 are "wrong" under our existing approach to lock-free code. Fixing those declarations may or may not make any difference to the observed spurious NPE problem. The backport of the improved compiler_barrier is a separate issue. > Maybe in the JDK10 timeframe we can look at upgrading the compilers > for all platforms. I have no doubt we will upgrade compilers, but whether we try to use C++11 features/APIs is a different matter. 
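As a point of reference, here is a minimal sketch of what the same publish/consume pattern could look like written directly against C++11 std::atomic (illustrative only: the names are invented and this is not how the HotSpot sources are currently written):

    #include <atomic>
    #include <cassert>

    struct Entry {
      void* f1;                   // payload published by the resolver
      std::atomic<int> bytecode;  // 0 means "not yet resolved"
    };

    void publish(Entry* e, void* method) {
      e->f1 = method;                                   // plain store
      e->bytecode.store(1, std::memory_order_release);  // release: the f1
                                                        // store may not be
                                                        // reordered after it
    }

    void consume(Entry* e) {
      if (e->bytecode.load(std::memory_order_acquire) != 0) {
        assert(e->f1 != nullptr);  // guaranteed by the release/acquire pair
      }
    }

The release/acquire pair constrains both the compiler and the hardware in one portable construct, which is the attraction of moving to the standard facilities.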
IIRC there are already some open RFEs to look into this. Thanks, David > Andrew. > From adinn at redhat.com Fri May 26 09:35:27 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 May 2017 10:35:27 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> Message-ID: <6021f2b1-4b0a-5229-aaa1-57d608f4a49c@redhat.com> Hi David, On 26/05/17 10:20, David Holmes wrote: > Hi Andrew, > > On 26/05/2017 6:26 PM, Andrew Haley wrote: >> On 26/05/17 03:20, David Holmes wrote: >>> Any variable passed to an OrderAccess, or Atomic, function should be >>> volatile to minimise the chances the C compiler will do something >>> unexpected with it. >> >> That's not much more than paranoia, IMO. If the barriers are strong >> enough it'll be fine. The problem was, I suppose, with old compilers >> which didn't handle memory barriers properly, but we should be moving >> towards standard ways of doing these things. Standard atomics have >> been available since C++11 (I think) and GCC has had support since long >> before then. > > The issue isn't just the barriers that might be involved inside > orderAccess methods. If these variables are being used in racy lock-free > code then they should be marked volatile to ensure other compiler > optimizations don't interfere. Perhaps that is paranoia, but I'd rather > a little harmless paranoia than try to debug what might otherwise go wrong. I don't understand what you are suggesting here. How is such racy, lock-free code ever going to work on architectures with weak memory models? > ... regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From david.holmes at oracle.com Fri May 26 09:40:40 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 19:40:40 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <6021f2b1-4b0a-5229-aaa1-57d608f4a49c@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <6021f2b1-4b0a-5229-aaa1-57d608f4a49c@redhat.com> Message-ID: <77ef6b38-b5a4-8922-4250-57ce13c4dda8@oracle.com> On 26/05/2017 7:35 PM, Andrew Dinn wrote: > Hi David, > > On 26/05/17 10:20, David Holmes wrote: >> Hi Andrew, >> >> On 26/05/2017 6:26 PM, Andrew Haley wrote: >>> On 26/05/17 03:20, David Holmes wrote: >>>> Any variable passed to an OrderAccess, or Atomic, function should be >>>> volatile to minimise the chances the C compiler will do something >>>> unexpected with it. >>> >>> That's not much more than paranoia, IMO. 
If the barriers are strong >>> enough it'll be fine. The problem was, I suppose, with old compilers >>> which didn't handle memory barriers properly, but we should be moving >>> towards standard ways of doing these things. Standard atomics have >>> been available since C++11 (I think) and GCC has had support since long >>> before then. >> >> The issue isn't just the barriers that might be involved inside >> orderAccess methods. If these variables are being used in racy lock-free >> code then they should be marked volatile to ensure other compiler >> optimizations don't interfere. Perhaps that is paranoia, but I'd rather >> a little harmless paranoia than try to debug what might otherwise go wrong. > > I don't understand what you are suggesting here. How is such racy, > lock-free code ever going to work on architectures with weak memory models? By using load-acquire/store-release and atomic operations - that's how you write lock-free algorithms. David >> ... > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From david.holmes at oracle.com Fri May 26 09:43:31 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 19:43:31 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <464bf18f-8d57-e8a7-f3a2-5ebbd4a993c1@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <464bf18f-8d57-e8a7-f3a2-5ebbd4a993c1@redhat.com> Message-ID: <5dfd5bdc-1019-cd5c-f3c6-6e9e86c527d5@oracle.com> One important correction: > 7143664: Clean up OrderAccess implementations and usage > > (n.b. I believe the author is one D Holmes No that work was done by Erik Osterlund (before he joined Oracle). I was only the sponsor. 7143664: Clean up OrderAccess implementations and usage Summary: Clarify and correct the abstract model for memory barriers provided by the orderAccess class. Refactor the implementations using template specialization to allow the bulk of the code to be shared, with platform specific customizations applied as needed. Reviewed-by: acorn, dcubed, dholmes, dlong, goetz, kbarrett, sgehwolf Contributed-by: Erik Osterlund Cheers, David ----- On 26/05/2017 6:11 PM, Andrew Dinn wrote: > On 26/05/17 03:20, David Holmes wrote: >> On 26/05/2017 6:29 AM, Paul Hohensee wrote: >>> I don't know that you want to retract the patch. There's still a bug here >>> imo that your patch fixes. >> >> I agree. This is a common error when dealing with pointer variables, >> especially when looking at surrounding usage on non-pointer variables. >> We need the _f1 pointer to be volatile, not the thing to which the _f1 >> pointer points (well it's possible we may need both, I haven't dived >> that deep). >> >> Any variable passed to an OrderAccess, or Atomic, function should be >> volatile to minimise the chances the C compiler will do something >> unexpected with it. >> >> I don't even know what to make of the vmStructs.cpp existing code! > > Hmm, well this is a conundrum then. One piece of advice from the two of > you and another from Andrew Haley (who is 'technically' my boss --- i.e. 
> he's the technical lead for my team :-). > > Your view is precisely what I originally assumed was at play here i.e. > that where successive writes to fields must be seen by other threads in > the correct order that is to be achieved on x86 by making both fields > volatile. This guarantees sequencing of generated store instructions by > the compiler in accordance with source order and, hence, because x86 is > TCO, visibility of those store instructions in that same order. > > Of course, that is inadequate on weak-memory models like ppc and > AArch64. So, to make your suggestion work properly for all architectures > the second write (at least, if not the first) also needs to be > implemented using a call to store_release. That will definitely ensure > that the first write is visible before the second on all architectures. > It has been our hope (Andrew's and mine) since we completed the AArch64 > port that all pairs of stores which require ordering do indeed employ a > store_release (we have had to correct a few cases over the last few years). > > Andrew's belief seems to be that your model is error prone and is fixed > more correctly by introducing a memory and/or compiler barrier into the > implementation of release_store. If instead release_store is used > consistently whenever the second of a pair of writes needs to be > guaranteed to be visible after the first then it will provide the > desired outcome. This belief seems indeed to be backed up by the changes > made to the jdk9 code base quite some while back (the ones I failed to > notice). The relevant commit is > > 7143664: Clean up OrderAccess implementations and usage > > (n.b. I believe the author is one D Holmes :-) > > I think Andrew's view is probably sound (and not just because he is my > boss). Since we must use release_store everywhere we want visibility of > writes to be ordered then also requiring both the fields involved to be > volatile is redundant. Given what little C++ volatile declarations do > achieve it might be wiser not to be using volatile declarations at all. > We would certainly start finding missing store_release calls quicker ;-) > > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From adinn at redhat.com Fri May 26 09:48:59 2017 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 26 May 2017 10:48:59 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <77ef6b38-b5a4-8922-4250-57ce13c4dda8@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <6021f2b1-4b0a-5229-aaa1-57d608f4a49c@redhat.com> <77ef6b38-b5a4-8922-4250-57ce13c4dda8@oracle.com> Message-ID: On 26/05/17 10:40, David Holmes wrote: > On 26/05/2017 7:35 PM, Andrew Dinn wrote: >> Hi David, >> >> On 26/05/17 10:20, David Holmes wrote: >>> Hi Andrew, >>> >>> On 26/05/2017 6:26 PM, Andrew Haley wrote: >>>> On 26/05/17 03:20, David Holmes wrote: >>>>> Any variable passed to an OrderAccess, or Atomic, function should be >>>>> volatile to minimise the chances the C compiler will do something >>>>> unexpected with it. >>>> >>>> That's not much more than paranoia, IMO. If the barriers are strong >>>> enough it'll be fine. The problem was, I suppose, with old compilers >>>> which didn't handle memory barriers properly, but we should be moving >>>> towards standard ways of doing these things. Standard atomics have >>>> been available since C++11 (I think) and GCC has had support since long >>>> before then. >>> >>> The issue isn't just the barriers that might be involved inside >>> orderAccess methods. If these variables are being used in racy lock-free >>> code then they should be marked volatile to ensure other compiler >>> optimizations don't interfere. Perhaps that is paranoia, but I'd rather >>> a little harmless paranoia than try to debug what might otherwise go >>> wrong. >> >> I don't understand what you are suggesting here. How is such racy, >> lock-free code ever going to work on architectures with weak memory >> models? > > By using load-acquire/store-release and atomic operations - that's how > you write lock-free algorithms. Now I'm even more confused. Surely, the implementations of load-acquire/store-release and atomic operations themselves guarantee that 'other compiler optimizations don't interfere'. Why doesn't that make the volatile declarations redundant? regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander From david.holmes at oracle.com Fri May 26 11:02:27 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 21:02:27 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <6021f2b1-4b0a-5229-aaa1-57d608f4a49c@redhat.com> <77ef6b38-b5a4-8922-4250-57ce13c4dda8@oracle.com> Message-ID: <57be32fc-be71-b13c-0cd3-8fbb899df688@oracle.com> On 26/05/2017 7:48 PM, Andrew Dinn wrote: > On 26/05/17 10:40, David Holmes wrote: >> On 26/05/2017 7:35 PM, Andrew Dinn wrote: >>> Hi David, >>> >>> On 26/05/17 10:20, David Holmes wrote: >>>> Hi Andrew, >>>> >>>> On 26/05/2017 6:26 PM, Andrew Haley wrote: >>>>> On 26/05/17 03:20, David Holmes wrote: >>>>>> Any variable passed to an OrderAccess, or Atomic, function should be >>>>>> volatile to minimise the chances the C compiler will do something >>>>>> unexpected with it. >>>>> >>>>> That's not much more than paranoia, IMO. If the barriers are strong >>>>> enough it'll be fine. The problem was, I suppose, with old compilers >>>>> which didn't handle memory barriers properly, but we should be moving >>>>> towards standard ways of doing these things. Standard atomics have >>>>> been available since C++11 (I think) and GCC has had support since long >>>>> before then. >>>> >>>> The issue isn't just the barriers that might be involved inside >>>> orderAccess methods. If these variables are being used in racy lock-free >>>> code then they should be marked volatile to ensure other compiler >>>> optimizations don't interfere. Perhaps that is paranoia, but I'd rather >>>> a little harmless paranoia than try to debug what might otherwise go >>>> wrong. >>> >>> I don't understand what you are suggesting here. How is such racy, >>> lock-free code ever going to work on architectures with weak memory >>> models? >> >> By using load-acquire/store-release and atomic operations - that's how >> you write lock-free algorithms. > > Now I'm even more confused. Surely, the implementations of > load-acquire/store-release and atomic operations themselves guarantee > that 'other compiler optimizations don't interfere'. Why doesn't that > make the volatile declarations redundant? Good question. Perhaps with the right implementation it does. But for the last 15+ years as far as I am aware the general "wisdom" has been that use of C/C++ volatile was a necessary, but nowhere near sufficient, condition when writing such algorithms with 'hand-crafted' memory barriers and atomic instructions that are outside the C/C++ language, and the compiler. Cheers, David > regards, > > > Andrew Dinn > ----------- > Senior Principal Software Engineer > Red Hat UK Ltd > Registered in England and Wales under Company Registration No. 
03798903 > Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander > From aph at redhat.com Fri May 26 12:35:29 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 26 May 2017 13:35:29 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> Message-ID: <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> On 26/05/17 10:20, David Holmes wrote: > Hi Andrew, > > On 26/05/2017 6:26 PM, Andrew Haley wrote: >> On 26/05/17 03:20, David Holmes wrote: >>> Any variable passed to an OrderAccess, or Atomic, function should be >>> volatile to minimise the chances the C compiler will do something >>> unexpected with it. >> >> That's not much more than paranoia, IMO. If the barriers are strong >> enough it'll be fine. The problem was, I suppose, with old compilers >> which didn't handle memory barriers properly, but we should be moving >> towards standard ways of doing these things. Standard atomics have >> been available since C++11 (I think) and GCC has had support since long >> before then. > > The issue isn't just the barriers that might be involved inside > orderAccess methods. If these variables are being used in racy > lock-free code then they should be marked volatile to ensure other > compiler optimizations don't interfere. Perhaps that is paranoia, > but I'd rather a little harmless paranoia than try to debug what > might otherwise go wrong. I'm always leery of this kind of reasoning because the hardware I most care about has a very weakly-ordered memory system and will reorder everything in the absence of synchronization. If it is actually necessary to use volatile on a TSO machine to get multi-thread ordering then it is almost certainly incorrect code, because volatile is not sufficient to do what is needed on non-TSO hardware. So, if you "fix" code on a TSO machine by using volatile, you are making work for me because I'll have to debug it on a non-TSO machine. Fix it in a portable way by using the correct primitives and it's correct everywhere, it's easier to reason about, and you lost nothing. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From paul.hohensee at gmail.com Fri May 26 13:47:00 2017 From: paul.hohensee at gmail.com (Paul Hohensee) Date: Fri, 26 May 2017 06:47:00 -0700 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: What David said, and a little history. 
orderAccess was originally written (by me, though not as well as Erik's rewrite) in order to support ia64, which also has a very weakly ordered memory system. The idea is that there are two sources of potential reordering, the first by the C++ compilers and the second by the hardware. Using the volatile specifier consistently blocks the C++ compilers from reordering, and the orderAccess methods block the hardware from reordering. The idea was to minimize the number of required hardware barriers (which can be quite expensive), so the model allows for code that need only prevent compiler reordering. Another way to put it is that it allows for the minimal use of hardware barriers. An alternative would be to use only orderAccess methods to access data that require ordering. The reason that works is because the formal parameter types on the orderAccess methods' pointer formals are marked volatile, thus preventing the C++ compilers from, say, inlining orderAccess methods and reordering accesses derived from them. I'm not a ppc memory ordering expert, but from the discussion it seems to me that there are two bugs, one fixed by amending the ppc implementation of release_store_ptr and the other by marking _f1 volatile. Thanks, Paul On Fri, May 26, 2017 at 5:35 AM, Andrew Haley wrote: > On 26/05/17 10:20, David Holmes wrote: > > Hi Andrew, > > > > On 26/05/2017 6:26 PM, Andrew Haley wrote: > >> On 26/05/17 03:20, David Holmes wrote: > >>> Any variable passed to an OrderAccess, or Atomic, function should be > >>> volatile to minimise the chances the C compiler will do something > >>> unexpected with it. > >> > >> That's not much more than paranoia, IMO. If the barriers are strong > >> enough it'll be fine. The problem was, I suppose, with old compilers > >> which didn't handle memory barriers properly, but we should be moving > >> towards standard ways of doing these things. Standard atomics have > >> been available since C++11 (I think) and GCC has had support since long > >> before then. > > > > The issue isn't just the barriers that might be involved inside > > orderAccess methods. If these variables are being used in racy > > lock-free code then they should be marked volatile to ensure other > > compiler optimizations don't interfere. Perhaps that is paranoia, > > but I'd rather a little harmless paranoia than try to debug what > > might otherwise go wrong. > > I'm always leery of this kind of reasoning because the hardware I most > care about has a very weakly-ordered memory system and will reorder > everything in the absence of synchronization. If it is actually > necessary to use volatile on a TSO machine to get multi-thread > ordering then it is almost certainly incorrect code, because volatile > is not sufficient to do what is needed on non-TSO hardware. > > So, if you "fix" code on a TSO machine by using volatile, you are > making work for me because I'll have to debug it on a non-TSO machine. > Fix it in a portable way by using the correct primitives and it's > correct everywhere, it's easier to reason about, and you lost nothing. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. 
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From david.holmes at oracle.com Fri May 26 13:57:32 2017 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 May 2017 23:57:32 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: On 26/05/2017 10:35 PM, Andrew Haley wrote: > On 26/05/17 10:20, David Holmes wrote: >> Hi Andrew, >> >> On 26/05/2017 6:26 PM, Andrew Haley wrote: >>> On 26/05/17 03:20, David Holmes wrote: >>>> Any variable passed to an OrderAccess, or Atomic, function should be >>>> volatile to minimise the chances the C compiler will do something >>>> unexpected with it. >>> >>> That's not much more than paranoia, IMO. If the barriers are strong >>> enough it'll be fine. The problem was, I suppose, with old compilers >>> which didn't handle memory barriers properly, but we should be moving >>> towards standard ways of doing these things. Standard atomics have >>> been available since C++11 (I think) and GCC has had support since long >>> before then. >> >> The issue isn't just the barriers that might be involved inside >> orderAccess methods. If these variables are being used in racy >> lock-free code then they should be marked volatile to ensure other >> compiler optimizations don't interfere. Perhaps that is paranoia, >> but I'd rather a little harmless paranoia than try to debug what >> might otherwise go wrong. > > I'm always leery of this kind of reasoning because the hardware I most > care about has a very weakly-ordered memory system and will reorder > everything in the absence of synchronization. If it is actually > necessary to use volatile on a TSO machine to get multi-thread > ordering then it is almost certainly incorrect code, because volatile > is not sufficient to do what is needed on non-TSO hardware. > > So, if you "fix" code on a TSO machine by using volatile, you are > making work for me because I'll have to debug it on a non-TSO machine. No we do not "fix" the code by adding volatile. We as rule mark all variables involved as "volatile" because it is the only thing we can do to tell the compiler that there are things going on it is not aware of. In addition we use barriers and atomic instructions to be correct on every platform strong or weakly ordered - at least that is the intent. Now it may be that if your compiler is truly multi-thread aware and fence aware and atomic aware, and you use all those things directly that you don't need to also use "volatile". But the JVM does not at this time exist in that world. David > Fix it in a portable way by using the correct primitives and it's > correct everywhere, it's easier to reason about, and you lost nothing. 
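To spell out where the volatile qualifier has to sit for that convention to work (an invented field name, purely for illustration; the real declarations live in cpCache.hpp):

    class Metadata;

    struct Example {
      // "pointer to volatile Metadata": qualifies accesses made through the
      // pointer, i.e. to the Metadata object itself - not what is wanted here.
      volatile Metadata* f1_object_volatile;

      // "volatile pointer to Metadata": qualifies accesses to the pointer
      // field itself, so the compiler may not reorder or elide its loads and
      // stores relative to other volatile accesses - this is the intent.
      Metadata* volatile f1;
    };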
> From paul.hohensee at gmail.com Fri May 26 14:09:06 2017 From: paul.hohensee at gmail.com (Paul Hohensee) Date: Fri, 26 May 2017 07:09:06 -0700 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: Note that the current model doesn't prevent one from using orderAccess for all accesses to orderable data, so one can use that model if desired. On Fri, May 26, 2017 at 6:47 AM, Paul Hohensee wrote: > What David said, and a little history. > > orderAccess was originally written (by me, though not as well as Erik's > rewrite) in order to support ia64, which also has a very weakly ordered > memory system. The idea is that there are two sources of potential > reordering, the first by the C++ compilers and the second by the hardware. > Using the volatile specifier consistently blocks the C++ compilers from > reordering, and the orderAccess methods block the hardware from reordering. > The idea was to minimize the number of required hardware barriers (which > can be quite expensive), so the model allows for code that need only > prevent compiler reordering. Another way to put it is that it allows for > the minimal use of hardware barriers. > > An alternative would be to use only orderAccess methods to access data > that require ordering. The reason that works is because the formal > parameter types on the orderAccess methods' pointer formals are marked > volatile, thus preventing the C++ compilers from, say, inlining orderAccess > methods and reordering accesses derived from them. > > I'm not a ppc memory ordering expert, but from the discussion it seems to > me that there are two bugs, one fixed by amending the ppc implementation of > release_store_ptr and the other by marking _f1 volatile. > > Thanks, > > Paul > > On Fri, May 26, 2017 at 5:35 AM, Andrew Haley wrote: > >> On 26/05/17 10:20, David Holmes wrote: >> > Hi Andrew, >> > >> > On 26/05/2017 6:26 PM, Andrew Haley wrote: >> >> On 26/05/17 03:20, David Holmes wrote: >> >>> Any variable passed to an OrderAccess, or Atomic, function should be >> >>> volatile to minimise the chances the C compiler will do something >> >>> unexpected with it. >> >> >> >> That's not much more than paranoia, IMO. If the barriers are strong >> >> enough it'll be fine. The problem was, I suppose, with old compilers >> >> which didn't handle memory barriers properly, but we should be moving >> >> towards standard ways of doing these things. Standard atomics have >> >> been available since C++11 (I think) and GCC has had support since long >> >> before then. >> > >> > The issue isn't just the barriers that might be involved inside >> > orderAccess methods. If these variables are being used in racy >> > lock-free code then they should be marked volatile to ensure other >> > compiler optimizations don't interfere. Perhaps that is paranoia, >> > but I'd rather a little harmless paranoia than try to debug what >> > might otherwise go wrong. 
>> >> I'm always leery of this kind of reasoning because the hardware I most >> care about has a very weakly-ordered memory system and will reorder >> everything in the absence of synchronization. If it is actually >> necessary to use volatile on a TSO machine to get multi-thread >> ordering then it is almost certainly incorrect code, because volatile >> is not sufficient to do what is needed on non-TSO hardware. >> >> So, if you "fix" code on a TSO machine by using volatile, you are >> making work for me because I'll have to debug it on a non-TSO machine. >> Fix it in a portable way by using the correct primitives and it's >> correct everywhere, it's easier to reason about, and you lost nothing. >> >> -- >> Andrew Haley >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 >> > > From volker.simonis at gmail.com Fri May 26 16:03:10 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 26 May 2017 16:03:10 +0000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: Volatile not only prevents reordering by the compiler. It also prevents other, otherwise legal transformations/optimizations (like for example reloading a variable [1]) which have to be prevented in order to write correct, lock free programs. So I think declaring the variables involved in such algorithms volatile is currently still necessary. Regards, Volker [1] RFR(XS): JDK-8129440 G1 crash during concurrent root region scan ( http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013928.html) Paul Hohensee schrieb am Fr. 26. Mai 2017 um 17:09: > Note that the current model doesn't prevent one from using orderAccess for > all accesses to orderable data, so one can use that model if desired. > > On Fri, May 26, 2017 at 6:47 AM, Paul Hohensee > wrote: > > > What David said, and a little history. > > > > orderAccess was originally written (by me, though not as well as Erik's > > rewrite) in order to support ia64, which also has a very weakly ordered > > memory system. The idea is that there are two sources of potential > > reordering, the first by the C++ compilers and the second by the > hardware. > > Using the volatile specifier consistently blocks the C++ compilers from > > reordering, and the orderAccess methods block the hardware from > reordering. > > The idea was to minimize the number of required hardware barriers (which > > can be quite expensive), so the model allows for code that need only > > prevent compiler reordering. Another way to put it is that it allows for > > the minimal use of hardware barriers. > > > > An alternative would be to use only orderAccess methods to access data > > that require ordering. The reason that works is because the formal > > parameter types on the orderAccess methods' pointer formals are marked > > volatile, thus preventing the C++ compilers from, say, inlining > orderAccess > > methods and reordering accesses derived from them. 
> > > > I'm not a ppc memory ordering expert, but from the discussion it seems to > > me that there are two bugs, one fixed by amending the ppc implementation > of > > release_store_ptr and the other by marking _f1 volatile. > > > > Thanks, > > > > Paul > > > > On Fri, May 26, 2017 at 5:35 AM, Andrew Haley wrote: > > > >> On 26/05/17 10:20, David Holmes wrote: > >> > Hi Andrew, > >> > > >> > On 26/05/2017 6:26 PM, Andrew Haley wrote: > >> >> On 26/05/17 03:20, David Holmes wrote: > >> >>> Any variable passed to an OrderAccess, or Atomic, function should be > >> >>> volatile to minimise the chances the C compiler will do something > >> >>> unexpected with it. > >> >> > >> >> That's not much more than paranoia, IMO. If the barriers are strong > >> >> enough it'll be fine. The problem was, I suppose, with old compilers > >> >> which didn't handle memory barriers properly, but we should be moving > >> >> towards standard ways of doing these things. Standard atomics have > >> >> been available since C++11 (I think) and GCC has had support since > long > >> >> before then. > >> > > >> > The issue isn't just the barriers that might be involved inside > >> > orderAccess methods. If these variables are being used in racy > >> > lock-free code then they should be marked volatile to ensure other > >> > compiler optimizations don't interfere. Perhaps that is paranoia, > >> > but I'd rather a little harmless paranoia than try to debug what > >> > might otherwise go wrong. > >> > >> I'm always leery of this kind of reasoning because the hardware I most > >> care about has a very weakly-ordered memory system and will reorder > >> everything in the absence of synchronization. If it is actually > >> necessary to use volatile on a TSO machine to get multi-thread > >> ordering then it is almost certainly incorrect code, because volatile > >> is not sufficient to do what is needed on non-TSO hardware. > >> > >> So, if you "fix" code on a TSO machine by using volatile, you are > >> making work for me because I'll have to debug it on a non-TSO machine. > >> Fix it in a portable way by using the correct primitives and it's > >> correct everywhere, it's easier to reason about, and you lost nothing. > >> > >> -- > >> Andrew Haley > >> Java Platform Lead Engineer > >> Red Hat UK Ltd. > >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > >> > > > > > From aph at redhat.com Fri May 26 16:09:33 2017 From: aph at redhat.com (Andrew Haley) Date: Fri, 26 May 2017 17:09:33 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> On 26/05/17 17:03, Volker Simonis wrote: > Volatile not only prevents reordering by the compiler. It also > prevents other, otherwise legal transformations/optimizations (like > for example reloading a variable [1]) which have to be prevented in > order to write correct, lock free programs. Yes, but so do compiler barriers. 
> So I think declaring the variables involved in such algorithms > volatile is currently still necessary. IMO, only if compiler barriers don't work; and that implies broken compilers. But from the responses I've seen, the assumption is that the compilers used to build HotSpot are broken. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From kim.barrett at oracle.com Sat May 27 02:44:47 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 May 2017 22:44:47 -0400 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> Message-ID: > On May 26, 2017, at 12:09 PM, Andrew Haley wrote: > > On 26/05/17 17:03, Volker Simonis wrote: > >> Volatile not only prevents reordering by the compiler. It also >> prevents other, otherwise legal transformations/optimizations (like >> for example reloading a variable [1]) which have to be prevented in >> order to write correct, lock free programs. > > Yes, but so do compiler barriers. > >> So I think declaring the variables involved in such algorithms >> volatile is currently still necessary. > > IMO, only if compiler barriers don't work; and that implies broken > compilers. But from the responses I've seen, the assumption is that > the compilers used to build HotSpot are broken. Compiler barriers don't work if they aren't present. And for TCO systems, that problem exists in jdk8. It was Erik O's jdk9 changes that introduced compiler barriers. Before then, in code like the following: x = new_x; OrderAccess::release_store(&y, new_y); on TCO systems, the compiler was free to move the store of x after the store of y if x is not volatile, because there is no compile barrier in the release_store. Old compilers tended to treat volatile accesses as stronger constraints than required by the standard. Newer compilers, not so much. Hence the sprinkling of volatile pixie dust. It might be worthwhile backporting the compile barriers to jdk8. It's a separate question whether y needs to be volatile. In that snippet, strictly speaking, it doesn't, as the release_store parameter takes care of that. However, there's been a sort of informal use of volatile declarations to flag such variables are "interesting" and as a sort of marker for future std::atomic<> or the like. 
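A minimal sketch of the jdk8-era hazard described here, together with the kind of compiler-only barrier the jdk9 rework added (GCC-style inline asm; the helper names are invented and this is a simplification, not the actual OrderAccess sources):

    // jdk8-style: the release store is just a store through a volatile
    // pointer. Under TSO the hardware preserves the order, but nothing stops
    // the compiler from sinking a preceding non-volatile store below it.
    static inline void release_store_old(volatile int* p, int v) {
      *p = v;
    }

    // jdk9-style: a compiler barrier before the store forbids the compiler
    // from moving any memory access across it.
    static inline void release_store_new(volatile int* p, int v) {
      __asm__ volatile ("" : : : "memory");  // compiler-only barrier
      *p = v;
    }

    int x;           // plain, non-volatile
    volatile int y;

    void writer(int new_x, int new_y) {
      x = new_x;
      release_store_old(&y, new_y);  // the store of x may be compiled after
                                     // this; release_store_new() would pin
                                     // the compile-time order
    }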
From kim.barrett at oracle.com Sat May 27 02:45:23 2017 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 26 May 2017 22:45:23 -0400 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: > On May 26, 2017, at 12:03 PM, Volker Simonis wrote: > > Volatile not only prevents reordering by the compiler. It also prevents > other, otherwise legal transformations/optimizations (like for example > reloading a variable [1]) which have to be prevented in order to write > correct, lock free programs. > > So I think declaring the variables involved in such algorithms volatile is > currently still necessary. Seems like the thing to do would be to use Atomic::load instead of a bare reference. From aph at redhat.com Sat May 27 06:44:54 2017 From: aph at redhat.com (Andrew Haley) Date: Sat, 27 May 2017 07:44:54 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> Message-ID: On 27/05/17 03:44, Kim Barrett wrote: > It might be worthwhile backporting the compile barriers to jdk8. Certainly, IMO. They're necessary for correctness. Standard C++ now treats all code with data races as undefined behaviour, and we've got to get used to that. The more we tell the C++ compiler about what we want, the more it can optimize and the faster our JVMs will be. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From volker.simonis at gmail.com Sat May 27 08:27:54 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Sat, 27 May 2017 10:27:54 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> Message-ID: On Sat, May 27, 2017 at 4:45 AM, Kim Barrett wrote: >> On May 26, 2017, at 12:03 PM, Volker Simonis wrote: >> >> Volatile not only prevents reordering by the compiler. 
It also prevents >> other, otherwise legal transformations/optimizations (like for example >> reloading a variable [1]) which have to be prevented in order to write >> correct, lock free programs. >> >> So I think declaring the variables involved in such algorithms volatile is >> currently still necessary. > > Seems like the thing to do would be to use Atomic::load instead of a > bare reference. > Yes, but Atomic::load is not overloaded for oop/narrowOop and the naming doesn't really express what we want to achieve. The proposed fix was to change 'oopDesc::load_heap_oop()' such that it casts its plain pointer argument into a 'pointer to volatile' argument. Unfortunately, I've just realized, that this fix (i.e. JDK-8129440 [2]) was never pushed which I think is bad (we have it in our SAP JVM since long time). The comment on 'oopDesc::load_heap_oop()' clearly states that it is "Called by GC to check for null before decoding". This obviously can not work reliably if the oop is reloaded a second time after the null check (and before the decoding). I don't see how a compiler barrier could help here because this is not a question of reordering. [2] https://bugs.openjdk.java.net/browse/JDK-8129440 From volker.simonis at gmail.com Sat May 27 09:10:33 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Sat, 27 May 2017 11:10:33 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <678f73ae-0993-f5ef-e3dc-6a6940dd0a0c@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> Message-ID: On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: > On 26/05/17 17:03, Volker Simonis wrote: > >> Volatile not only prevents reordering by the compiler. It also >> prevents other, otherwise legal transformations/optimizations (like >> for example reloading a variable [1]) which have to be prevented in >> order to write correct, lock free programs. > > Yes, but so do compiler barriers. > Please correct me if I'm wrong, but I thought "compiler barriers" are to prevent reordering by the compiler. However, this is a question of optimization. If you have two subsequent loads from the same address, the compiler is free to do only the first load and keep the value in a register if the address is not pointing to a volatile value. This is one of the well known semantics of volatile. But there's another, less known 'optimization' which is possible, if an address is not pointing to a volatile value. If there's just a single load, the compiler is free to reload that value a second time later on (instead of spilling it to the stack or to another register). And that was exactly the problem with JDK-8129440 [2]: static inline oop load_heap_oop(oop* p) { return *p; } ... template inline void G1RootRegionScanClosure::do_oop_nv(T* p) { // 1. load 'heap_oop' from 'p' T heap_oop = oopDesc::load_heap_oop(p); if (!oopDesc::is_null(heap_oop)) { // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! 
oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); _cm->grayRoot(obj, hr); } } How would a compiler barrier help here? How would it look like and where would it have to be placed to? I think this problem can currently only be solved reliably by declaring the loaded value 'volatile'. Regards, Volker [2] https://bugs.openjdk.java.net/browse/JDK-8129440 >> So I think declaring the variables involved in such algorithms >> volatile is currently still necessary. > > IMO, only if compiler barriers don't work; and that implies broken > compilers. But from the responses I've seen, the assumption is that > the compilers used to build HotSpot are broken. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Sun May 28 08:45:19 2017 From: aph at redhat.com (Andrew Haley) Date: Sun, 28 May 2017 09:45:19 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> Message-ID: On 27/05/17 10:10, Volker Simonis wrote: > On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >> On 26/05/17 17:03, Volker Simonis wrote: >> >>> Volatile not only prevents reordering by the compiler. It also >>> prevents other, otherwise legal transformations/optimizations (like >>> for example reloading a variable [1]) which have to be prevented in >>> order to write correct, lock free programs. >> >> Yes, but so do compiler barriers. > > Please correct me if I'm wrong, but I thought "compiler barriers" are > to prevent reordering by the compiler. However, this is a question of > optimization. If you have two subsequent loads from the same address, > the compiler is free to do only the first load and keep the value in a > register if the address is not pointing to a volatile value. No it isn't: that is precisely what a compiler barrier prevents. A compiler barrier (from the POV of the compiler) clobbers all of the memory state. Neither reads nor writes may move past a compiler barrier. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Mon May 29 12:20:26 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Mon, 29 May 2017 14:20:26 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> Message-ID: <592C120A.1080908@oracle.com> Hi Andrew, I just thought I'd put my opinions in here as I see I have been mentioned a few times already. First of all, I find using the volatile keyword on things that are involved in lock-free protocols meaningful from a readability point of view. It allows the reader of the code to see care is needed here. About the compiler barriers - you are right. Volatile should indeed not be necessary if the compiler barriers do everything right. The compiler should not reorder things and it should not prevent reloading. On windows we rely on the deprecated _ReadWriteBarrier(). According to MSDN, it guarantees: "The _ReadWriteBarrier intrinsic limits the compiler optimizations that can remove or reorder memory accesses across the point of the call." This should cut it. The GCC memory clobber is defined as: "The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler." This seems to only guarantee values will not be re-ordered. But in the documentation for ExtendedAsm it also states: "You will also want to add the volatile keyword if the memory affected is not listed in the inputs or outputs of the asm, as the `memory' clobber does not count as a side-effect of the asm." and "The volatile keyword indicates that the instruction has important side-effects. GCC will not delete a volatile asm if it is reachable. (The instruction can still be deleted if GCC can prove that control-flow will never reach the location of the instruction.) Note that even a volatile asm instruction can be moved relative to other code, including across jump instructions." This is a bit vague, but seems to suggest that by making the asm statement volatile and having a memory clobber, it definitely will not reload variables. About not re-ordering non-volatile accesses, it shouldn't but it is not quite clearly stated. I have never observed such a re-ordering across a volatile memory clobber. But the semantics seem a bit vague. As for clang, the closest to a definition of what it does I have seen is: "A clobber constraint is indicated by a ?~? prefix. A clobber does not consume an input operand, nor generate an output. Clobbers cannot use any of the general constraint code letters ? 
they may use only explicit register constraints, e.g. ?~{eax}?. The one exception is that a clobber string of ?~{memory}? indicates that the assembly writes to arbitrary undeclared memory locations ? not only the memory pointed to by a declared indirect output." Apart from sweeping statements saying clang inline assembly is largely compatible and working similar to GCC, I have not seen clear guarantees. And then there are more compilers. As a conclusion, by using volatile in addition to OrderAccess you rely on standardized compiler semantics (at least for volatile-to-volatile re-orderings and re-loading, but not for volatile-to-nonvolatile, but that's another can of worms), and regrettably if you rely on OrderAccess memory model doing what it says it will do, then it should indeed work without volatile, but to make that work, OrderAccess relies on non-standardized compiler-specific barriers. In practice it should work well on all our supported compilers without volatile. And if it didn't, it would indeed be a bug in OrderAccess that needs to be solved in OrderAccess. Personally though, I am a helmet-on-synchronization kind of person, so I would take precaution anyway and use volatile whenever possible, because 1) it makes the code more readable, and 2) it provides one extra layer of safety that is more standardized. It seems that over the years it has happened multiple times that we assumed OrderAccess is bullet proof, and then realized that it wasn't and observed a crash that would never have happened if the code was written in a helmet-on-synchronization way. At least that's how I feel about it. Now one might argue that by using C++11 atomics that are standardized, all these problems would go away as we would rely in standardized primitives and then just trust the compiler. But then there could arise problems when the C++ compiler decides to be less conservative than we want, e.g. by not doing fence in sequentially consistent loads to optimize for non-multiple copy atomic CPUs arguing that IRIW issues that violate sequential consistency are non-issues in practice. That makes those loads "almost" sequentially consistent, which might be good enough. But it feels good to have a choice here to be more conservative. To have the synchronization helmet on. Meta summary: 1) Current OrderAccess without volatile: - should work, but relies on compiler-specific not standardized and sometimes poorly documented compiler barriers. 2) Current OrderAccess with volatile: - relies on standardized volatile semantics to guarantee compiler reordering and reloading issues do not occur. 3) C++11 Atomic backend for OrderAccess - relies on standardized semantics to guarantee compiler and hardware reordering issues - nevertheless isn't always flawless, and when it isn't, it gets painful Hope this sheds some light on the trade-offs. Thanks, /Erik On 2017-05-28 10:45, Andrew Haley wrote: > On 27/05/17 10:10, Volker Simonis wrote: >> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>> On 26/05/17 17:03, Volker Simonis wrote: >>> >>>> Volatile not only prevents reordering by the compiler. It also >>>> prevents other, otherwise legal transformations/optimizations (like >>>> for example reloading a variable [1]) which have to be prevented in >>>> order to write correct, lock free programs. >>> Yes, but so do compiler barriers. >> Please correct me if I'm wrong, but I thought "compiler barriers" are >> to prevent reordering by the compiler. However, this is a question of >> optimization. 
If you have two subsequent loads from the same address, >> the compiler is free to do only the first load and keep the value in a >> register if the address is not pointing to a volatile value. > No it isn't: that is precisely what a compiler barrier prevents. A > compiler barrier (from the POV of the compiler) clobbers all of > the memory state. Neither reads nor writes may move past a compiler > barrier. > From volker.simonis at gmail.com Mon May 29 17:02:25 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Mon, 29 May 2017 19:02:25 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <592C120A.1080908@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: Hi Erik, thanks for the nice summary. Just for the sake of completeness, here's the corresponding documentation for the xlc compiler barrier [1]. It kind of implements the gcc syntax, but the wording is slightly different: "Add memory to the list of clobbered registers if assembler instructions can change a memory location in an unpredictable fashion. The memory clobber ensures that the data used after the completion of the assembly statement is valid and synchronized. However, the memory clobber can result in many unnecessary reloads, reducing the benefits of hardware prefetching. Thus, the memory clobber can impose a performance penalty and should be used with caution." We haven't used it until now, so I can not say if it really does what it is supposed to do. I'm also concerned about the performance warning. It seems like the "unnecessary reloads" can really hurt on architectures like ppc which have much more registers than x86. Declaring a memory location 'volatile' seems much more simple and light-weight in order to achieve the desired effect. So I tend to agree with you and David that we should proceed to mark things with 'volatile'. Sorry for constantly "spamming" this thread with another problem (i.e. JDK-8129440 [2]) but I still think that it is related and important. In its current state, the way how "load_heap_oop()" and its application works is broken. And this is not because of a problem in OrderAccess, but because of missing compiler barriers: static inline oop load_heap_oop(oop* p) { return *p; } ... template inline void G1RootRegionScanClosure::do_oop_nv(T* p) { // 1. load 'heap_oop' from 'p' T heap_oop = oopDesc::load_heap_oop(p); if (!oopDesc::is_null(heap_oop)) { // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); _cm->grayRoot(obj, hr); } } Notice that we don't need memory barriers here - all we need is to prevent the compiler from loading the oop (i.e. 'heap_oop') a second time. After Andrews explanation (thanks for that!) 
and Martin's examples from Google, I think we could fix this by rewriting 'load_heap_oop()' (and friends) as follows: static inline oop load_heap_oop(oop* p) { oop o = *p; __asm__ volatile ("" : : : "memory"); return o; } In order to make this consistent across all platforms, we would probably have to introduce a new, public "compiler barrier" function in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we don't currently seem to have a cross-platform concept for "compiler-only barriers"). But I'm still not convinced that it would be better than simply writing (and that's the way how we've actually solved it internally): static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } Declaring that single memory location to be 'volatile' seems to be a much more local change compared to globally "clobbering" all the memory. And it doesn't rely on a the compilers providing a compiler barrier. It does however rely on the compiler doing the "right thing" for volatile - but after all what has been said here so far, that seems more likely? The problem may also depend on the specific compiler/cpu combination. For ppc64, both gcc (on linux) and xlc (on aix), do the right thing for volatile variables - they don't insert any memory barriers (i.e. no instructions) but just access the corresponding variables as if there was a compiler barrier. This is exactly what we currently want in HotSpot, because fine-grained control of memory barriers is controlled by the use of OrderAccess (and OrderAccess implies "compiler barrier", at least after the latest fixes). Any thoughts? Should we introduce a cross-platform, "compiler-only barrier" or should we stick to using "volatile" for such cases? Regards, Volker [1] https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html [2] https://bugs.openjdk.java.net/browse/JDK-8129440 On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund wrote: > Hi Andrew, > > I just thought I'd put my opinions in here as I see I have been mentioned a > few times already. > > First of all, I find using the volatile keyword on things that are involved > in lock-free protocols meaningful from a readability point of view. It > allows the reader of the code to see care is needed here. > > About the compiler barriers - you are right. Volatile should indeed not be > necessary if the compiler barriers do everything right. The compiler should > not reorder things and it should not prevent reloading. > > On windows we rely on the deprecated _ReadWriteBarrier(). According to MSDN, > it guarantees: > > "The _ReadWriteBarrier intrinsic limits the compiler optimizations that can > remove or reorder memory accesses across the point of the call." > > This should cut it. > > The GCC memory clobber is defined as: > > "The "memory" clobber tells the compiler that the assembly code performs > memory reads or writes to items other than those listed in the input and > output operands (for example, accessing the memory pointed to by one of the > input parameters). To ensure memory contains correct values, GCC may need to > flush specific register values to memory before executing the asm. Further, > the compiler does not assume that any values read from memory before an asm > remain unchanged after that asm; it reloads them as needed. Using the > "memory" clobber effectively forms a read/write memory barrier for the > compiler." > > This seems to only guarantee values will not be re-ordered. 
But in the > documentation for ExtendedAsm it also states: > > "You will also want to add the volatile keyword if the memory affected is > not listed in the inputs or outputs of the asm, as the `memory' clobber does > not count as a side-effect of the asm." > > and > > "The volatile keyword indicates that the instruction has important > side-effects. GCC will not delete a volatile asm if it is reachable. (The > instruction can still be deleted if GCC can prove that control-flow will > never reach the location of the instruction.) Note that even a volatile asm > instruction can be moved relative to other code, including across jump > instructions." > > This is a bit vague, but seems to suggest that by making the asm statement > volatile and having a memory clobber, it definitely will not reload > variables. About not re-ordering non-volatile accesses, it shouldn't but it > is not quite clearly stated. I have never observed such a re-ordering across > a volatile memory clobber. But the semantics seem a bit vague. > > As for clang, the closest to a definition of what it does I have seen is: > > "A clobber constraint is indicated by a ?~? prefix. A clobber does not > consume an input operand, nor generate an output. Clobbers cannot use any of > the general constraint code letters ? they may use only explicit register > constraints, e.g. ?~{eax}?. The one exception is that a clobber string of > ?~{memory}? indicates that the assembly writes to arbitrary undeclared > memory locations ? not only the memory pointed to by a declared indirect > output." > > Apart from sweeping statements saying clang inline assembly is largely > compatible and working similar to GCC, I have not seen clear guarantees. And > then there are more compilers. > > As a conclusion, by using volatile in addition to OrderAccess you rely on > standardized compiler semantics (at least for volatile-to-volatile > re-orderings and re-loading, but not for volatile-to-nonvolatile, but that's > another can of worms), and regrettably if you rely on OrderAccess memory > model doing what it says it will do, then it should indeed work without > volatile, but to make that work, OrderAccess relies on non-standardized > compiler-specific barriers. In practice it should work well on all our > supported compilers without volatile. And if it didn't, it would indeed be a > bug in OrderAccess that needs to be solved in OrderAccess. > > Personally though, I am a helmet-on-synchronization kind of person, so I > would take precaution anyway and use volatile whenever possible, because 1) > it makes the code more readable, and 2) it provides one extra layer of > safety that is more standardized. It seems that over the years it has > happened multiple times that we assumed OrderAccess is bullet proof, and > then realized that it wasn't and observed a crash that would never have > happened if the code was written in a helmet-on-synchronization way. At > least that's how I feel about it. > > Now one might argue that by using C++11 atomics that are standardized, all > these problems would go away as we would rely in standardized primitives and > then just trust the compiler. But then there could arise problems when the > C++ compiler decides to be less conservative than we want, e.g. by not doing > fence in sequentially consistent loads to optimize for non-multiple copy > atomic CPUs arguing that IRIW issues that violate sequential consistency are > non-issues in practice. 
That makes those loads "almost" sequentially > consistent, which might be good enough. But it feels good to have a choice > here to be more conservative. To have the synchronization helmet on. > > Meta summary: > 1) Current OrderAccess without volatile: > - should work, but relies on compiler-specific not standardized and > sometimes poorly documented compiler barriers. > > 2) Current OrderAccess with volatile: > - relies on standardized volatile semantics to guarantee compiler > reordering and reloading issues do not occur. > > 3) C++11 Atomic backend for OrderAccess > - relies on standardized semantics to guarantee compiler and hardware > reordering issues > - nevertheless isn't always flawless, and when it isn't, it gets painful > > Hope this sheds some light on the trade-offs. > > Thanks, > /Erik > > > On 2017-05-28 10:45, Andrew Haley wrote: >> >> On 27/05/17 10:10, Volker Simonis wrote: >>> >>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>>> >>>> On 26/05/17 17:03, Volker Simonis wrote: >>>> >>>>> Volatile not only prevents reordering by the compiler. It also >>>>> prevents other, otherwise legal transformations/optimizations (like >>>>> for example reloading a variable [1]) which have to be prevented in >>>>> order to write correct, lock free programs. >>>> >>>> Yes, but so do compiler barriers. >>> >>> Please correct me if I'm wrong, but I thought "compiler barriers" are >>> to prevent reordering by the compiler. However, this is a question of >>> optimization. If you have two subsequent loads from the same address, >>> the compiler is free to do only the first load and keep the value in a >>> register if the address is not pointing to a volatile value. >> >> No it isn't: that is precisely what a compiler barrier prevents. A >> compiler barrier (from the POV of the compiler) clobbers all of >> the memory state. Neither reads nor writes may move past a compiler >> barrier. >> > From erik.osterlund at oracle.com Mon May 29 17:56:22 2017 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Mon, 29 May 2017 19:56:22 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: Hi Volker, Thank you for filling in more compiler info. If there is a choice between providing a new compiler barrier interface and defining its semantics vs using existing volatile semantics, then volatile semantics seems better to me. Also, my new Access API allows you to do BasicAccess::load_oop(addr) to perform load_heap_oop and load_decode_heap_oop with volatile semantics. Sounds like that would help here. Thanks, /Erik > On 29 May 2017, at 19:02, Volker Simonis wrote: > > Hi Erik, > > thanks for the nice summary. Just for the sake of completeness, here's > the corresponding documentation for the xlc compiler barrier [1]. It > kind of implements the gcc syntax, but the wording is slightly > different: > > "Add memory to the list of clobbered registers if assembler > instructions can change a memory location in an unpredictable fashion. 
> The memory clobber ensures that the data used after the completion of > the assembly statement is valid and synchronized. > However, the memory clobber can result in many unnecessary reloads, > reducing the benefits of hardware prefetching. Thus, the memory > clobber can impose a performance penalty and should be used with > caution." > > We haven't used it until now, so I can not say if it really does what > it is supposed to do. I'm also concerned about the performance > warning. It seems like the "unnecessary reloads" can really hurt on > architectures like ppc which have much more registers than x86. > Declaring a memory location 'volatile' seems much more simple and > light-weight in order to achieve the desired effect. So I tend to > agree with you and David that we should proceed to mark things with > 'volatile'. > > Sorry for constantly "spamming" this thread with another problem (i.e. > JDK-8129440 [2]) but I still think that it is related and important. > In its current state, the way how "load_heap_oop()" and its > application works is broken. And this is not because of a problem in > OrderAccess, but because of missing compiler barriers: > > static inline oop load_heap_oop(oop* p) { return *p; } > ... > template > inline void G1RootRegionScanClosure::do_oop_nv(T* p) { > // 1. load 'heap_oop' from 'p' > T heap_oop = oopDesc::load_heap_oop(p); > if (!oopDesc::is_null(heap_oop)) { > // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! > oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); > HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); > _cm->grayRoot(obj, hr); > } > } > > Notice that we don't need memory barriers here - all we need is to > prevent the compiler from loading the oop (i.e. 'heap_oop') a second > time. After Andrews explanation (thanks for that!) and Martin's > examples from Google, I think we could fix this by rewriting > 'load_heap_oop()' (and friends) as follows: > > static inline oop load_heap_oop(oop* p) { > oop o = *p; > __asm__ volatile ("" : : : "memory"); > return o; > } > > In order to make this consistent across all platforms, we would > probably have to introduce a new, public "compiler barrier" function > in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we > don't currently seem to have a cross-platform concept for > "compiler-only barriers"). But I'm still not convinced that it would > be better than simply writing (and that's the way how we've actually > solved it internally): > > static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } > > Declaring that single memory location to be 'volatile' seems to be a > much more local change compared to globally "clobbering" all the > memory. And it doesn't rely on a the compilers providing a compiler > barrier. It does however rely on the compiler doing the "right thing" > for volatile - but after all what has been said here so far, that > seems more likely? > > The problem may also depend on the specific compiler/cpu combination. > For ppc64, both gcc (on linux) and xlc (on aix), do the right thing > for volatile variables - they don't insert any memory barriers (i.e. > no instructions) but just access the corresponding variables as if > there was a compiler barrier. This is exactly what we currently want > in HotSpot, because fine-grained control of memory barriers is > controlled by the use of OrderAccess (and OrderAccess implies > "compiler barrier", at least after the latest fixes). > > Any thoughts? 
Should we introduce a cross-platform, "compiler-only > barrier" or should we stick to using "volatile" for such cases? > > Regards, > Volker > > [1] https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html > [2] https://bugs.openjdk.java.net/browse/JDK-8129440 > > On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund > wrote: >> Hi Andrew, >> >> I just thought I'd put my opinions in here as I see I have been mentioned a >> few times already. >> >> First of all, I find using the volatile keyword on things that are involved >> in lock-free protocols meaningful from a readability point of view. It >> allows the reader of the code to see care is needed here. >> >> About the compiler barriers - you are right. Volatile should indeed not be >> necessary if the compiler barriers do everything right. The compiler should >> not reorder things and it should not prevent reloading. >> >> On windows we rely on the deprecated _ReadWriteBarrier(). According to MSDN, >> it guarantees: >> >> "The _ReadWriteBarrier intrinsic limits the compiler optimizations that can >> remove or reorder memory accesses across the point of the call." >> >> This should cut it. >> >> The GCC memory clobber is defined as: >> >> "The "memory" clobber tells the compiler that the assembly code performs >> memory reads or writes to items other than those listed in the input and >> output operands (for example, accessing the memory pointed to by one of the >> input parameters). To ensure memory contains correct values, GCC may need to >> flush specific register values to memory before executing the asm. Further, >> the compiler does not assume that any values read from memory before an asm >> remain unchanged after that asm; it reloads them as needed. Using the >> "memory" clobber effectively forms a read/write memory barrier for the >> compiler." >> >> This seems to only guarantee values will not be re-ordered. But in the >> documentation for ExtendedAsm it also states: >> >> "You will also want to add the volatile keyword if the memory affected is >> not listed in the inputs or outputs of the asm, as the `memory' clobber does >> not count as a side-effect of the asm." >> >> and >> >> "The volatile keyword indicates that the instruction has important >> side-effects. GCC will not delete a volatile asm if it is reachable. (The >> instruction can still be deleted if GCC can prove that control-flow will >> never reach the location of the instruction.) Note that even a volatile asm >> instruction can be moved relative to other code, including across jump >> instructions." >> >> This is a bit vague, but seems to suggest that by making the asm statement >> volatile and having a memory clobber, it definitely will not reload >> variables. About not re-ordering non-volatile accesses, it shouldn't but it >> is not quite clearly stated. I have never observed such a re-ordering across >> a volatile memory clobber. But the semantics seem a bit vague. >> >> As for clang, the closest to a definition of what it does I have seen is: >> >> "A clobber constraint is indicated by a ?~? prefix. A clobber does not >> consume an input operand, nor generate an output. Clobbers cannot use any of >> the general constraint code letters ? they may use only explicit register >> constraints, e.g. ?~{eax}?. The one exception is that a clobber string of >> ?~{memory}? indicates that the assembly writes to arbitrary undeclared >> memory locations ? not only the memory pointed to by a declared indirect >> output." 
>> >> Apart from sweeping statements saying clang inline assembly is largely >> compatible and working similar to GCC, I have not seen clear guarantees. And >> then there are more compilers. >> >> As a conclusion, by using volatile in addition to OrderAccess you rely on >> standardized compiler semantics (at least for volatile-to-volatile >> re-orderings and re-loading, but not for volatile-to-nonvolatile, but that's >> another can of worms), and regrettably if you rely on OrderAccess memory >> model doing what it says it will do, then it should indeed work without >> volatile, but to make that work, OrderAccess relies on non-standardized >> compiler-specific barriers. In practice it should work well on all our >> supported compilers without volatile. And if it didn't, it would indeed be a >> bug in OrderAccess that needs to be solved in OrderAccess. >> >> Personally though, I am a helmet-on-synchronization kind of person, so I >> would take precaution anyway and use volatile whenever possible, because 1) >> it makes the code more readable, and 2) it provides one extra layer of >> safety that is more standardized. It seems that over the years it has >> happened multiple times that we assumed OrderAccess is bullet proof, and >> then realized that it wasn't and observed a crash that would never have >> happened if the code was written in a helmet-on-synchronization way. At >> least that's how I feel about it. >> >> Now one might argue that by using C++11 atomics that are standardized, all >> these problems would go away as we would rely in standardized primitives and >> then just trust the compiler. But then there could arise problems when the >> C++ compiler decides to be less conservative than we want, e.g. by not doing >> fence in sequentially consistent loads to optimize for non-multiple copy >> atomic CPUs arguing that IRIW issues that violate sequential consistency are >> non-issues in practice. That makes those loads "almost" sequentially >> consistent, which might be good enough. But it feels good to have a choice >> here to be more conservative. To have the synchronization helmet on. >> >> Meta summary: >> 1) Current OrderAccess without volatile: >> - should work, but relies on compiler-specific not standardized and >> sometimes poorly documented compiler barriers. >> >> 2) Current OrderAccess with volatile: >> - relies on standardized volatile semantics to guarantee compiler >> reordering and reloading issues do not occur. >> >> 3) C++11 Atomic backend for OrderAccess >> - relies on standardized semantics to guarantee compiler and hardware >> reordering issues >> - nevertheless isn't always flawless, and when it isn't, it gets painful >> >> Hope this sheds some light on the trade-offs. >> >> Thanks, >> /Erik >> >> >>> On 2017-05-28 10:45, Andrew Haley wrote: >>> >>>> On 27/05/17 10:10, Volker Simonis wrote: >>>> >>>>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>>>> >>>>>> On 26/05/17 17:03, Volker Simonis wrote: >>>>>> >>>>>> Volatile not only prevents reordering by the compiler. It also >>>>>> prevents other, otherwise legal transformations/optimizations (like >>>>>> for example reloading a variable [1]) which have to be prevented in >>>>>> order to write correct, lock free programs. >>>>> >>>>> Yes, but so do compiler barriers. >>>> >>>> Please correct me if I'm wrong, but I thought "compiler barriers" are >>>> to prevent reordering by the compiler. However, this is a question of >>>> optimization. 
If you have two subsequent loads from the same address, >>>> the compiler is free to do only the first load and keep the value in a >>>> register if the address is not pointing to a volatile value. >>> >>> No it isn't: that is precisely what a compiler barrier prevents. A >>> compiler barrier (from the POV of the compiler) clobbers all of >>> the memory state. Neither reads nor writes may move past a compiler >>> barrier. >>> >> From david.holmes at oracle.com Mon May 29 20:55:36 2017 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 May 2017 06:55:36 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: <561661eb-b463-726a-1d32-84ef5f32af13@oracle.com> On 30/05/2017 3:02 AM, Volker Simonis wrote: > Sorry for constantly "spamming" this thread with another problem (i.e. > JDK-8129440 [2]) but I still think that it is related and important. > In its current state, the way how "load_heap_oop()" and its > application works is broken. And this is not because of a problem in > OrderAccess, but because of missing compiler barriers: > > static inline oop load_heap_oop(oop* p) { return *p; } > ... > template > inline void G1RootRegionScanClosure::do_oop_nv(T* p) { > // 1. load 'heap_oop' from 'p' > T heap_oop = oopDesc::load_heap_oop(p); > if (!oopDesc::is_null(heap_oop)) { > // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! > oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); Do you mean that the compiler has not stashed heap_oop somewhere and re-executes oopDesc::load_heap_oop(p) again? That would be quite nasty I think in general as it breaks any logic that wants to read a non-local variable once to get it into a local and reuse that knowing that it won't change even if the real variable does! David ----- > HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); > _cm->grayRoot(obj, hr); > } > } > > Notice that we don't need memory barriers here - all we need is to > prevent the compiler from loading the oop (i.e. 'heap_oop') a second > time. After Andrews explanation (thanks for that!) and Martin's > examples from Google, I think we could fix this by rewriting > 'load_heap_oop()' (and friends) as follows: > > static inline oop load_heap_oop(oop* p) { > oop o = *p; > __asm__ volatile ("" : : : "memory"); > return o; > } > > In order to make this consistent across all platforms, we would > probably have to introduce a new, public "compiler barrier" function > in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we > don't currently seem to have a cross-platform concept for > "compiler-only barriers"). But I'm still not convinced that it would > be better than simply writing (and that's the way how we've actually > solved it internally): > > static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } > > Declaring that single memory location to be 'volatile' seems to be a > much more local change compared to globally "clobbering" all the > memory. And it doesn't rely on a the compilers providing a compiler > barrier. 
It does however rely on the compiler doing the "right thing" > for volatile - but after all what has been said here so far, that > seems more likely? > > The problem may also depend on the specific compiler/cpu combination. > For ppc64, both gcc (on linux) and xlc (on aix), do the right thing > for volatile variables - they don't insert any memory barriers (i.e. > no instructions) but just access the corresponding variables as if > there was a compiler barrier. This is exactly what we currently want > in HotSpot, because fine-grained control of memory barriers is > controlled by the use of OrderAccess (and OrderAccess implies > "compiler barrier", at least after the latest fixes). > > Any thoughts? Should we introduce a cross-platform, "compiler-only > barrier" or should we stick to using "volatile" for such cases? > > Regards, > Volker > > [1] https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html > [2] https://bugs.openjdk.java.net/browse/JDK-8129440 > > On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund > wrote: >> Hi Andrew, >> >> I just thought I'd put my opinions in here as I see I have been mentioned a >> few times already. >> >> First of all, I find using the volatile keyword on things that are involved >> in lock-free protocols meaningful from a readability point of view. It >> allows the reader of the code to see care is needed here. >> >> About the compiler barriers - you are right. Volatile should indeed not be >> necessary if the compiler barriers do everything right. The compiler should >> not reorder things and it should not prevent reloading. >> >> On windows we rely on the deprecated _ReadWriteBarrier(). According to MSDN, >> it guarantees: >> >> "The _ReadWriteBarrier intrinsic limits the compiler optimizations that can >> remove or reorder memory accesses across the point of the call." >> >> This should cut it. >> >> The GCC memory clobber is defined as: >> >> "The "memory" clobber tells the compiler that the assembly code performs >> memory reads or writes to items other than those listed in the input and >> output operands (for example, accessing the memory pointed to by one of the >> input parameters). To ensure memory contains correct values, GCC may need to >> flush specific register values to memory before executing the asm. Further, >> the compiler does not assume that any values read from memory before an asm >> remain unchanged after that asm; it reloads them as needed. Using the >> "memory" clobber effectively forms a read/write memory barrier for the >> compiler." >> >> This seems to only guarantee values will not be re-ordered. But in the >> documentation for ExtendedAsm it also states: >> >> "You will also want to add the volatile keyword if the memory affected is >> not listed in the inputs or outputs of the asm, as the `memory' clobber does >> not count as a side-effect of the asm." >> >> and >> >> "The volatile keyword indicates that the instruction has important >> side-effects. GCC will not delete a volatile asm if it is reachable. (The >> instruction can still be deleted if GCC can prove that control-flow will >> never reach the location of the instruction.) Note that even a volatile asm >> instruction can be moved relative to other code, including across jump >> instructions." >> >> This is a bit vague, but seems to suggest that by making the asm statement >> volatile and having a memory clobber, it definitely will not reload >> variables. 
About not re-ordering non-volatile accesses, it shouldn't but it >> is not quite clearly stated. I have never observed such a re-ordering across >> a volatile memory clobber. But the semantics seem a bit vague. >> >> As for clang, the closest to a definition of what it does I have seen is: >> >> "A clobber constraint is indicated by a ?~? prefix. A clobber does not >> consume an input operand, nor generate an output. Clobbers cannot use any of >> the general constraint code letters ? they may use only explicit register >> constraints, e.g. ?~{eax}?. The one exception is that a clobber string of >> ?~{memory}? indicates that the assembly writes to arbitrary undeclared >> memory locations ? not only the memory pointed to by a declared indirect >> output." >> >> Apart from sweeping statements saying clang inline assembly is largely >> compatible and working similar to GCC, I have not seen clear guarantees. And >> then there are more compilers. >> >> As a conclusion, by using volatile in addition to OrderAccess you rely on >> standardized compiler semantics (at least for volatile-to-volatile >> re-orderings and re-loading, but not for volatile-to-nonvolatile, but that's >> another can of worms), and regrettably if you rely on OrderAccess memory >> model doing what it says it will do, then it should indeed work without >> volatile, but to make that work, OrderAccess relies on non-standardized >> compiler-specific barriers. In practice it should work well on all our >> supported compilers without volatile. And if it didn't, it would indeed be a >> bug in OrderAccess that needs to be solved in OrderAccess. >> >> Personally though, I am a helmet-on-synchronization kind of person, so I >> would take precaution anyway and use volatile whenever possible, because 1) >> it makes the code more readable, and 2) it provides one extra layer of >> safety that is more standardized. It seems that over the years it has >> happened multiple times that we assumed OrderAccess is bullet proof, and >> then realized that it wasn't and observed a crash that would never have >> happened if the code was written in a helmet-on-synchronization way. At >> least that's how I feel about it. >> >> Now one might argue that by using C++11 atomics that are standardized, all >> these problems would go away as we would rely in standardized primitives and >> then just trust the compiler. But then there could arise problems when the >> C++ compiler decides to be less conservative than we want, e.g. by not doing >> fence in sequentially consistent loads to optimize for non-multiple copy >> atomic CPUs arguing that IRIW issues that violate sequential consistency are >> non-issues in practice. That makes those loads "almost" sequentially >> consistent, which might be good enough. But it feels good to have a choice >> here to be more conservative. To have the synchronization helmet on. >> >> Meta summary: >> 1) Current OrderAccess without volatile: >> - should work, but relies on compiler-specific not standardized and >> sometimes poorly documented compiler barriers. >> >> 2) Current OrderAccess with volatile: >> - relies on standardized volatile semantics to guarantee compiler >> reordering and reloading issues do not occur. >> >> 3) C++11 Atomic backend for OrderAccess >> - relies on standardized semantics to guarantee compiler and hardware >> reordering issues >> - nevertheless isn't always flawless, and when it isn't, it gets painful >> >> Hope this sheds some light on the trade-offs. 
>> >> Thanks, >> /Erik >> >> >> On 2017-05-28 10:45, Andrew Haley wrote: >>> >>> On 27/05/17 10:10, Volker Simonis wrote: >>>> >>>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>>>> >>>>> On 26/05/17 17:03, Volker Simonis wrote: >>>>> >>>>>> Volatile not only prevents reordering by the compiler. It also >>>>>> prevents other, otherwise legal transformations/optimizations (like >>>>>> for example reloading a variable [1]) which have to be prevented in >>>>>> order to write correct, lock free programs. >>>>> >>>>> Yes, but so do compiler barriers. >>>> >>>> Please correct me if I'm wrong, but I thought "compiler barriers" are >>>> to prevent reordering by the compiler. However, this is a question of >>>> optimization. If you have two subsequent loads from the same address, >>>> the compiler is free to do only the first load and keep the value in a >>>> register if the address is not pointing to a volatile value. >>> >>> No it isn't: that is precisely what a compiler barrier prevents. A >>> compiler barrier (from the POV of the compiler) clobbers all of >>> the memory state. Neither reads nor writes may move past a compiler >>> barrier. >>> >> From aph at redhat.com Tue May 30 08:50:36 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 30 May 2017 09:50:36 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <592C120A.1080908@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> On 29/05/17 13:20, Erik ?sterlund wrote: > As a conclusion, by using volatile in addition to OrderAccess you > rely on standardized compiler semantics (at least for > volatile-to-volatile re-orderings and re-loading, but not for > volatile-to-nonvolatile, but that's another can of worms), and > regrettably if you rely on OrderAccess memory model doing what it > says it will do, then it should indeed work without volatile, but to > make that work, OrderAccess relies on non-standardized > compiler-specific barriers. In practice it should work well on all > our supported compilers without volatile. And if it didn't, it would > indeed be a bug in OrderAccess that needs to be solved in > OrderAccess. Right. And, target by target, we can rework OrderAccess to use real C++11 atomics, and everything will be better. Eventually. It's important that we do so because racy accesses are undefined behaviour in C++11. (And, arguably, before that, but I'm not going to go there.) > Personally though, I am a helmet-on-synchronization kind of person, > so I would take precaution anyway and use volatile whenever > possible, because 1) it makes the code more readable, and 2) it > provides one extra layer of safety that is more standardized. It > seems that over the years it has happened multiple times that we > assumed OrderAccess is bullet proof, and then realized that it > wasn't and observed a crash that would never have happened if the > code was written in a helmet-on-synchronization way. At least that's > how I feel about it. 
I have no problem with that. What I *do* have a problem with is the use of volatile to fix bugs that really need to be corrected with proper barriers. > Now one might argue that by using C++11 atomics that are > standardized, all these problems would go away as we would rely in > standardized primitives and then just trust the compiler. And I absolutely do argue that. In fact, it is the only correct way to go with C++11 compilers. IMO. > But then there could arise problems when the C++ compiler decides to > be less conservative than we want, e.g. by not doing fence in > sequentially consistent loads to optimize for non-multiple copy > atomic CPUs arguing that IRIW issues that violate sequential > consistency are non-issues in practice. A C++ compiler will not decide to do that. C++ compiler authors know well enough what sequential consistency means. Besides, if there is any idiom in the JVM that actually requires IRIW we should remove it as soon as possible. > That makes those loads "almost" sequentially consistent, which might > be good enough. But it feels good to have a choice here to be more > conservative. To have the synchronization helmet on. I have no real problem with that. Using volatile has the problem, from my point of view, that it might conceal bugs that would be revealed on a weakly-ordered machine that you or I then have to fix, but I can live with it. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue May 30 09:04:58 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 30 May 2017 10:04:58 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: <0a79603e-6c6d-8eaf-6fe9-0774317f0560@redhat.com> On 29/05/17 18:02, Volker Simonis wrote: > "Add memory to the list of clobbered registers if assembler > instructions can change a memory location in an unpredictable fashion. > The memory clobber ensures that the data used after the completion of > the assembly statement is valid and synchronized. > However, the memory clobber can result in many unnecessary reloads, > reducing the benefits of hardware prefetching. Thus, the memory > clobber can impose a performance penalty and should be used with > caution." > > We haven't used it until now, so I can not say if it really does what > it is supposed to do. I'm also concerned about the performance > warning. It seems like the "unnecessary reloads" can really hurt on > architectures like ppc which have much more registers than x86. > Declaring a memory location 'volatile' seems much more simple and > light-weight in order to achieve the desired effect. So I tend to > agree with you and David that we should proceed to mark things with > 'volatile'. If volatile is what is needed, yes. The problem that we're discussing is that on x86, OrderAccess was actually incorrect: it should work with all accesses, not just volatile ones. The addition of volatile was potentially papering over a bug. 
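A small illustration of that "papering over" point, assuming a strongly ordered machine and made-up variable names (data, flag, publish):

    #include <stdint.h>

    int32_t data;            // plain field being published
    volatile int32_t flag;   // only the flag is volatile

    void publish(int32_t v) {
      data = v;   // volatile on 'flag' orders it only against other
      flag = 1;   // volatile accesses; the plain store to 'data' may in
                  // principle still be sunk below the store to 'flag'.
    }

In practice most compilers are conservative here, which is why sprinkling volatile often appears to fix such bugs; the ordering is only actually guaranteed once a compiler barrier (or a real atomic release store) sits between the two stores.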
> Sorry for constantly "spamming" this thread with another problem (i.e. > JDK-8129440 [2]) but I still think that it is related and important. > In its current state, the way how "load_heap_oop()" and its > application works is broken. And this is not because of a problem in > OrderAccess, but because of missing compiler barriers: > > static inline oop load_heap_oop(oop* p) { return *p; } > ... > template > inline void G1RootRegionScanClosure::do_oop_nv(T* p) { > // 1. load 'heap_oop' from 'p' > T heap_oop = oopDesc::load_heap_oop(p); > if (!oopDesc::is_null(heap_oop)) { > // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! > oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); > HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); > _cm->grayRoot(obj, hr); > } > } > > Notice that we don't need memory barriers here - all we need is to > prevent the compiler from loading the oop (i.e. 'heap_oop') a second > time. After Andrews explanation (thanks for that!) and Martin's > examples from Google, I think we could fix this by rewriting > 'load_heap_oop()' (and friends) as follows: > > static inline oop load_heap_oop(oop* p) { > oop o = *p; > __asm__ volatile ("" : : : "memory"); > return o; > } I wouldn't do that: it's much too violent an action because it clobbers all of memory. You don't want to do it every time anyone reads an oop from the heap. > In order to make this consistent across all platforms, we would > probably have to introduce a new, public "compiler barrier" function > in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we > don't currently seem to have a cross-platform concept for > "compiler-only barriers"). But I'm still not convinced that it would > be better than simply writing (and that's the way how we've actually > solved it internally): > > static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } That looks better. It's still UB post-C++11, but it should be OK. > Declaring that single memory location to be 'volatile' seems to be a > much more local change compared to globally "clobbering" all the > memory. And it doesn't rely on a the compilers providing a compiler > barrier. It does however rely on the compiler doing the "right thing" > for volatile - but after all what has been said here so far, that > seems more likely? It does. The problem here is that the compiler is not being told what is going on, and as the saying goes 'If you lie to the compiler, it will get its revenge.' > Any thoughts? Should we introduce a cross-platform, "compiler-only > barrier" or should we stick to using "volatile" for such cases? Eventually it will have to be C++11 atomics, which give you exactly the language you need to express this stuff. The above would be a relaxed atomic load. Andrew. 
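For comparison, a self-contained sketch of what such a relaxed atomic load looks like in C++11; 'slot' and 'scan' are hypothetical stand-ins, not HotSpot types:

    #include <atomic>
    #include <cstdio>

    // Hypothetical field, declared atomic from the start.
    static std::atomic<int*> slot{nullptr};

    void scan() {
      // Exactly one load is emitted, with no fences, and the compiler is
      // not allowed to silently re-read 'slot' in place of 'p' later on.
      int* p = slot.load(std::memory_order_relaxed);
      if (p != nullptr) {
        // 'p' is still non-null here even if another thread has cleared
        // the slot in the meantime - the null check and the use operate on
        // the same loaded value, which is the property the load_heap_oop()
        // discussion above is after.
        std::printf("%d\n", *p);
      }
    }

    int main() {
      static int value = 42;
      slot.store(&value, std::memory_order_relaxed);
      scan();
      return 0;
    }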
From aph at redhat.com Tue May 30 09:59:21 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 30 May 2017 10:59:21 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <0a79603e-6c6d-8eaf-6fe9-0774317f0560@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <0a79603e-6c6d-8eaf-6fe9-0774317f0560@redhat.com> Message-ID: Just to be clear on my position: we want to constrain the compiler as little as we can, while maintaining correctness. x86 OrderAccess was incorrect, so compiler barriers had to be added. Where volatile is sufficient we should use that today, but bear in mind that racy accesses are now undefined behaviour in C++. Andrew. From erik.osterlund at oracle.com Tue May 30 10:57:52 2017 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 30 May 2017 12:57:52 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> Message-ID: <592D5030.2020904@oracle.com> On 2017-05-30 10:50, Andrew Haley wrote: > On 29/05/17 13:20, Erik ?sterlund wrote: > >> As a conclusion, by using volatile in addition to OrderAccess you >> rely on standardized compiler semantics (at least for >> volatile-to-volatile re-orderings and re-loading, but not for >> volatile-to-nonvolatile, but that's another can of worms), and >> regrettably if you rely on OrderAccess memory model doing what it >> says it will do, then it should indeed work without volatile, but to >> make that work, OrderAccess relies on non-standardized >> compiler-specific barriers. In practice it should work well on all >> our supported compilers without volatile. And if it didn't, it would >> indeed be a bug in OrderAccess that needs to be solved in >> OrderAccess. > Right. And, target by target, we can rework OrderAccess to use real > C++11 atomics, and everything will be better. Eventually. I do not completely disagree, but I see drawbacks with that too. I am not convinced C++11 is a silver bullet. Note that we lose some explicit control that might end up biting us. And when it does, it will be even harder to detect as we have sold ourselves to the C++11 atomic silver bullet, abstracting away the generated code. For example, C++11 atomic accesses were designed to play nicely with other C++11 atomic accesses. Both the load-side and store-side have to look in very specific ways for e.g. the seq_cst semantics to hold. For example depending if you want seq_cst to have IRIW constraints or not, some PPC compiler could choose to have the sync instruction on either the load side or the store side. 
Since all seq_cst accesses are controlled by C++11 and go through their compiler, they can make that choice as long as the accesses stay inside of their "ABI". But the choice needs to be consistent with the choice we make in the JVM and our hand crafted assembly. That is, our hand crafted assembly code has to go by the same "ABI". And we can no longer guarantee it does as we have lost control over what instructions are generated.

One concrete example that comes to mind is the JNIFastGetField optimization on ARMv7. The memory model of ARMv7 does not respect causality between loads and stores. Therefore, in theory (and maybe in practice), problems can arise when three threads are involved in a synchronization dance where consistent causality chains are assumed. In the JNIFastGetField optimization we do the following with hand coded assembly:

1) load safepoint counter (written by VM thread)
2) speculatively load primitive value from object (possibly clobbered by a GC thread)
3) load safepoint counter again (written by VM thread) and check it did not change

These loads are all normal loads in hand coded assembly. Now for this synchronization to work, it is assumed that the store to the safepoint counter observed at 1) happens-before the store observed by the speculative load of the primitive value at 2). Due to the lack of causality in the memory model, this is explicitly not guaranteed to hold with normal loads and stores on ARMv7, and hence unless we had proper synchronization in the runtime, we could observe clobbered values from these optimized JNI getters and think they are okay. But since our OrderAccess::fence translates to dmb sy specifically (which is conservative), the store will bubble up to the top level of the hierarchical memory model of ARMv7, and therefore we can break the pathological causality chain issue in the JVM by issuing OrderAccess::fence when storing the safepoint counter values. That way, our hand crafted assembly will work with normal loads in the fast path. If OrderAccess::fence translated to anything other than dmb sy, this would break. So if we went with the proposed C++11 mappings from https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html then e.g. a seq_cst store on ARMv7 would translate into dmb ish; str; dmb ish, and that would not suffice to break the causality chain.

Of course, it might be that the OS layer saved us from the above pathology anyway, either intentionally or unintentionally, but that is beside the point I am trying to make. The point is that the hand coded assembly and other dynamically generated code have to emit accesses that are compatible with the machine code generated by OrderAccess in the runtime. And when we give that control away to C++11 and abstract away the generated machine code, things might go horribly wrong in the most unexpected and obscure ways that I think would be a nightmare to debug.

Having said that, I am not convinced C++11 is not a good idea either. I would just like to balance out the view that C++11 is a synchronization silver bullet for the JVM that is simply a superior solution without any pitfalls and that doing anything else is wrong. There are things to be considered there as well, like the extent of possible ABI incompatibilities.

> It's important that we do so because racy accesses are undefined
> behaviour in C++11. (And, arguably, before that, but I'm not going to
> go there.)

What paragraph are we referring to here that would break OrderAccess in C++11?
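To make the speculative-read protocol described above concrete, here is a rough C++ rendering (the real JNIFastGetField path is hand-coded assembly; the type and function names here are invented, and volatile only pins the compiler's ordering of the three loads, while the hardware-side ordering on ARMv7 is exactly the dmb sy point made above):

struct SafepointCounter {
  volatile int value;                          // updated by the VM thread around safepoints
};

// Returns true and fills *result only if no safepoint intervened between the
// two counter reads; otherwise the caller must fall back to the slow path.
inline bool speculative_get_int(const SafepointCounter* sc,
                                const int* field,
                                int* result) {
  int before = sc->value;                      // 1) load safepoint counter
  int value  = *(const volatile int*) field;   // 2) speculative field load
  int after  = sc->value;                      // 3) reload counter and compare
  if (before != after) {
    return false;                              //    possibly clobbered by GC
  }
  *result = value;
  return true;
}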
>> Personally though, I am a helmet-on-synchronization kind of person,
>> so I would take precaution anyway and use volatile whenever
>> possible, because 1) it makes the code more readable, and 2) it
>> provides one extra layer of safety that is more standardized. It
>> seems that over the years it has happened multiple times that we
>> assumed OrderAccess is bullet proof, and then realized that it
>> wasn't and observed a crash that would never have happened if the
>> code was written in a helmet-on-synchronization way. At least that's
>> how I feel about it.
> I have no problem with that. What I *do* have a problem with is the
> use of volatile to fix bugs that really need to be corrected with
> proper barriers.

I think you misunderstood me here. I did not propose to use volatile so we don't have to fix bugs in OrderAccess. Conversely, I said if we find such issues, we should definitely fix them in OrderAccess. But despite that, I personally like the pragmatic safety approach, and would use volatile in my lock-free code anyway to make it a) more readable, and b) provide an extra level of safety against our increasingly aggressive compilers. It's like wearing a helmet when biking. You don't expect to fall and should not fall, but why take risks if you don't have to and there is an easy way of preventing disaster if that happens. At least that's how I think about it myself.

>> Now one might argue that by using C++11 atomics that are
>> standardized, all these problems would go away as we would rely in
>> standardized primitives and then just trust the compiler.
> And I absolutely do argue that. In fact, it is the only correct way
> to go with C++11 compilers. IMO.

Not entirely convinced that statement is true, as I think I mentioned previously.

>> But then there could arise problems when the C++ compiler decides to
>> be less conservative than we want, e.g. by not doing fence in
>> sequentially consistent loads to optimize for non-multiple copy
>> atomic CPUs arguing that IRIW issues that violate sequential
>> consistency are non-issues in practice.
> A C++ compiler will not decide to do that. C++ compiler authors know
> well enough what sequential consistency means. Besides, if there is
> any idiom in the JVM that actually requires IRIW we should remove it
> as soon as possible.

At the risk of going slightly off topic... C++ compiler authors have indeed done that in the past. And I have a huge problem with this. I think the exposed model semantics need to be respected. If the model says seq_cst, then the generated code should be seq_cst and not "almost" seq_cst. You can't expect users of a memory model to have to know that it intentionally (rather than accidentally) violates the guarantees because it was considered close enough and that nobody should be able to observe the difference in practice. (don't get me started on that one)

The issue is not whether an algorithm depends on IRIW or not. The issue is that we have to explicitly reason about IRIW to prove that it works. The lack of IRIW violates seq_cst, and by extension linearization points that rely on seq_cst, and by extension algorithms that rely on linearization points. By breaking the very building blocks that were used to reason about algorithms and their correctness, we rely on chance for it to work. The algorithm may or may not work. It probably does work without IRIW constraints in the vast majority of cases.
But we have to explicitly reason about that expanded state machine of possible races caused by IRIW issues to actually know that it works rather than leaving it to chance. Reasoning about this extended state machine can take a lot of work and puts the bar unreasonably high for writing synchronized code in my opinion. And I think the alternative of leaving it to chance (albeit with good odds) seems like an unfortunate choice. >> That makes those loads "almost" sequentially consistent, which might >> be good enough. But it feels good to have a choice here to be more >> conservative. To have the synchronization helmet on. > I have no real problem with that. Using volatile has the problem, > from my point of view, that it might conceal bugs that would be > revealed on a weakly-ordered machine that you or I then have to fix, > but I can live with it. I do not see how using volatile has anything to do with weakly ordered machines. We use it where it is compiler reorderings specifically that need to be prevented. If it is not just a compiler reordering that needed to be prevented, then of course the use of volatile is incorrect and a bug. Either way, relying on C++11 atomics might also conceal bugs that would be revealed on a weakly-ordered machine due to conflicting ABIs between the statically generated C++ code and the dynamically generated code, as previously mentioned. Thanks, /Erik From aph at redhat.com Tue May 30 12:45:41 2017 From: aph at redhat.com (Andrew Haley) Date: Tue, 30 May 2017 13:45:41 +0100 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <592D5030.2020904@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> <592D5030.2020904@oracle.com> Message-ID: <64fda788-4ff6-978e-4476-a5af36bd708a@redhat.com> On 30/05/17 11:57, Erik ?sterlund wrote: > > Having said that, I am not convinced C++11 is not a good idea either. I > would just like to balance out the view that C++11 is a synchronization > silver bullet for the JVM that is simply a superior solution without any > pitfalls and that doing anything else is wrong. There are things to be > considered there as well, like the extent of possible ABI incompatibilities. Sure, and I appreciate your comment, but you never got the idea that using C++11 atomics is a synchronization silver bullet from me: IMO it's necessary but (probably) not sufficient. >> It's important that we do so because racy accesses are undefined >> behaviour in C++11. (And, arguably, before that, but I'm not going to >> go there.) > > What paragraph are we referring to here that would break OrderAccess > in C++11? Nowhere: it's racy accesses *without* synchronization that are UB. >>> Personally though, I am a helmet-on-synchronization kind of person, >>> so I would take precaution anyway and use volatile whenever >>> possible, because 1) it makes the code more readable, and 2) it >>> provides one extra layer of safety that is more standardized. 
It >>> seems that over the years it has happened multiple times that we >>> assumed OrderAccess is bullet proof, and then realized that it >>> wasn't and observed a crash that would never have happened if the >>> code was written in a helmet-on-synchronization way. At least that's >>> how I feel about it. >> >> I have no problem with that. What I *do* have a problem with is the >> use of volatile to fix bugs that really need to be corrected with >> proper barriers. > > I think you misunderstood me here. I did not propose to use volatile so > we don't have to fix bugs in OrderAccess. Conversely, I said if we find > such issues, we should definitely fix them in OrderAccess. But despite > that, I personally like the pragmatic safety approach, and would use > volatile in my lock-free code anyway to make it a) more readable, and b) > provide an extra level of safety against our increasingly aggressive > compilers. It's like wearing a helmet when biking. You don't expect to > fall and should not fall, but why take risks if you don't have to and > there is an easy way of preventing disaster if that happens. At least > that's how I think about it myself. As I said, I have no problem with that. I'm happy with that justification for volatile, even when not strictly necessary, as long as it's not done in places that would significantly impede performance. I'm sure you would agree with that anyway. You have to remember where this discussion started, which was a proposed use of volatile to fix a bug where a barrier was needed. > The issue is not whether an algorithm depends on IRIW or not. The issue > is that we have to explicitly reason about IRIW to prove that it works. > The lack of IRIW violates seq_cst and by extension linearizaiton points > that rely in seq_cst, and by extension algorithms that rely on > linearization points. By breaking the very building blocks that were > used to reason about algorithms and their correctness, we rely on chance > for it to work. The algorithm may or may not work. It probably does work > without IRIW constraints in the vast majority of cases. But we have to > explicitly reason about that expanded state machine of possible races > caused by IRIW issues to actually know that it works rather than leaving > it to chance. Reasoning about this extended state machine can take a lot > of work and puts the bar unreasonably high for writing synchronized code > in my opinion. And I think the alternative of leaving it to chance > (albeit with good odds) seems like an unfortunate choice. Sure, and I know all of that, and it sounds like you are arguing with a point that someone else made. Where sequential consistency really is required in the VM, seq_cst in the C++ compiler must really be seq_cst. But whoever thought otherwise? No-one, as far as I know. > Either way, relying on C++11 atomics might also conceal bugs that would > be revealed on a weakly-ordered machine due to conflicting ABIs between > the statically generated C++ code and the dynamically generated code, as > previously mentioned. We have to make sure that the generated code is ABI-compatible, of course. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From volker.simonis at gmail.com Tue May 30 15:37:18 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 30 May 2017 17:37:18 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: On Mon, May 29, 2017 at 7:56 PM, Erik Osterlund wrote: > Hi Volker, > > Thank you for filling in more compiler info. > > If there is a choice between providing a new compiler barrier interface and defining its semantics vs using existing volatile semantics, then volatile semantics seems better to me. > > Also, my new Access API allows you to do BasicAccess::load_oop(addr) to perform load_heap_oop and load_decode_heap_oop with volatile semantics. Sounds like that would help here. Sorry for my ignorance, but what is the "new Access API" and "BasicAccess"? It actually sounds quite interesting :) > > Thanks, > /Erik > >> On 29 May 2017, at 19:02, Volker Simonis wrote: >> >> Hi Erik, >> >> thanks for the nice summary. Just for the sake of completeness, here's >> the corresponding documentation for the xlc compiler barrier [1]. It >> kind of implements the gcc syntax, but the wording is slightly >> different: >> >> "Add memory to the list of clobbered registers if assembler >> instructions can change a memory location in an unpredictable fashion. >> The memory clobber ensures that the data used after the completion of >> the assembly statement is valid and synchronized. >> However, the memory clobber can result in many unnecessary reloads, >> reducing the benefits of hardware prefetching. Thus, the memory >> clobber can impose a performance penalty and should be used with >> caution." >> >> We haven't used it until now, so I can not say if it really does what >> it is supposed to do. I'm also concerned about the performance >> warning. It seems like the "unnecessary reloads" can really hurt on >> architectures like ppc which have much more registers than x86. >> Declaring a memory location 'volatile' seems much more simple and >> light-weight in order to achieve the desired effect. So I tend to >> agree with you and David that we should proceed to mark things with >> 'volatile'. >> >> Sorry for constantly "spamming" this thread with another problem (i.e. >> JDK-8129440 [2]) but I still think that it is related and important. >> In its current state, the way how "load_heap_oop()" and its >> application works is broken. And this is not because of a problem in >> OrderAccess, but because of missing compiler barriers: >> >> static inline oop load_heap_oop(oop* p) { return *p; } >> ... >> template >> inline void G1RootRegionScanClosure::do_oop_nv(T* p) { >> // 1. load 'heap_oop' from 'p' >> T heap_oop = oopDesc::load_heap_oop(p); >> if (!oopDesc::is_null(heap_oop)) { >> // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! 
>> oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); >> HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); >> _cm->grayRoot(obj, hr); >> } >> } >> >> Notice that we don't need memory barriers here - all we need is to >> prevent the compiler from loading the oop (i.e. 'heap_oop') a second >> time. After Andrews explanation (thanks for that!) and Martin's >> examples from Google, I think we could fix this by rewriting >> 'load_heap_oop()' (and friends) as follows: >> >> static inline oop load_heap_oop(oop* p) { >> oop o = *p; >> __asm__ volatile ("" : : : "memory"); >> return o; >> } >> >> In order to make this consistent across all platforms, we would >> probably have to introduce a new, public "compiler barrier" function >> in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we >> don't currently seem to have a cross-platform concept for >> "compiler-only barriers"). But I'm still not convinced that it would >> be better than simply writing (and that's the way how we've actually >> solved it internally): >> >> static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } >> >> Declaring that single memory location to be 'volatile' seems to be a >> much more local change compared to globally "clobbering" all the >> memory. And it doesn't rely on a the compilers providing a compiler >> barrier. It does however rely on the compiler doing the "right thing" >> for volatile - but after all what has been said here so far, that >> seems more likely? >> >> The problem may also depend on the specific compiler/cpu combination. >> For ppc64, both gcc (on linux) and xlc (on aix), do the right thing >> for volatile variables - they don't insert any memory barriers (i.e. >> no instructions) but just access the corresponding variables as if >> there was a compiler barrier. This is exactly what we currently want >> in HotSpot, because fine-grained control of memory barriers is >> controlled by the use of OrderAccess (and OrderAccess implies >> "compiler barrier", at least after the latest fixes). >> >> Any thoughts? Should we introduce a cross-platform, "compiler-only >> barrier" or should we stick to using "volatile" for such cases? >> >> Regards, >> Volker >> >> [1] https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html >> [2] https://bugs.openjdk.java.net/browse/JDK-8129440 >> >> On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund >> wrote: >>> Hi Andrew, >>> >>> I just thought I'd put my opinions in here as I see I have been mentioned a >>> few times already. >>> >>> First of all, I find using the volatile keyword on things that are involved >>> in lock-free protocols meaningful from a readability point of view. It >>> allows the reader of the code to see care is needed here. >>> >>> About the compiler barriers - you are right. Volatile should indeed not be >>> necessary if the compiler barriers do everything right. The compiler should >>> not reorder things and it should not prevent reloading. >>> >>> On windows we rely on the deprecated _ReadWriteBarrier(). According to MSDN, >>> it guarantees: >>> >>> "The _ReadWriteBarrier intrinsic limits the compiler optimizations that can >>> remove or reorder memory accesses across the point of the call." >>> >>> This should cut it. 
>>> >>> The GCC memory clobber is defined as: >>> >>> "The "memory" clobber tells the compiler that the assembly code performs >>> memory reads or writes to items other than those listed in the input and >>> output operands (for example, accessing the memory pointed to by one of the >>> input parameters). To ensure memory contains correct values, GCC may need to >>> flush specific register values to memory before executing the asm. Further, >>> the compiler does not assume that any values read from memory before an asm >>> remain unchanged after that asm; it reloads them as needed. Using the >>> "memory" clobber effectively forms a read/write memory barrier for the >>> compiler." >>> >>> This seems to only guarantee values will not be re-ordered. But in the >>> documentation for ExtendedAsm it also states: >>> >>> "You will also want to add the volatile keyword if the memory affected is >>> not listed in the inputs or outputs of the asm, as the `memory' clobber does >>> not count as a side-effect of the asm." >>> >>> and >>> >>> "The volatile keyword indicates that the instruction has important >>> side-effects. GCC will not delete a volatile asm if it is reachable. (The >>> instruction can still be deleted if GCC can prove that control-flow will >>> never reach the location of the instruction.) Note that even a volatile asm >>> instruction can be moved relative to other code, including across jump >>> instructions." >>> >>> This is a bit vague, but seems to suggest that by making the asm statement >>> volatile and having a memory clobber, it definitely will not reload >>> variables. About not re-ordering non-volatile accesses, it shouldn't but it >>> is not quite clearly stated. I have never observed such a re-ordering across >>> a volatile memory clobber. But the semantics seem a bit vague. >>> >>> As for clang, the closest to a definition of what it does I have seen is: >>> >>> "A clobber constraint is indicated by a ?~? prefix. A clobber does not >>> consume an input operand, nor generate an output. Clobbers cannot use any of >>> the general constraint code letters ? they may use only explicit register >>> constraints, e.g. ?~{eax}?. The one exception is that a clobber string of >>> ?~{memory}? indicates that the assembly writes to arbitrary undeclared >>> memory locations ? not only the memory pointed to by a declared indirect >>> output." >>> >>> Apart from sweeping statements saying clang inline assembly is largely >>> compatible and working similar to GCC, I have not seen clear guarantees. And >>> then there are more compilers. >>> >>> As a conclusion, by using volatile in addition to OrderAccess you rely on >>> standardized compiler semantics (at least for volatile-to-volatile >>> re-orderings and re-loading, but not for volatile-to-nonvolatile, but that's >>> another can of worms), and regrettably if you rely on OrderAccess memory >>> model doing what it says it will do, then it should indeed work without >>> volatile, but to make that work, OrderAccess relies on non-standardized >>> compiler-specific barriers. In practice it should work well on all our >>> supported compilers without volatile. And if it didn't, it would indeed be a >>> bug in OrderAccess that needs to be solved in OrderAccess. >>> >>> Personally though, I am a helmet-on-synchronization kind of person, so I >>> would take precaution anyway and use volatile whenever possible, because 1) >>> it makes the code more readable, and 2) it provides one extra layer of >>> safety that is more standardized. 
It seems that over the years it has >>> happened multiple times that we assumed OrderAccess is bullet proof, and >>> then realized that it wasn't and observed a crash that would never have >>> happened if the code was written in a helmet-on-synchronization way. At >>> least that's how I feel about it. >>> >>> Now one might argue that by using C++11 atomics that are standardized, all >>> these problems would go away as we would rely in standardized primitives and >>> then just trust the compiler. But then there could arise problems when the >>> C++ compiler decides to be less conservative than we want, e.g. by not doing >>> fence in sequentially consistent loads to optimize for non-multiple copy >>> atomic CPUs arguing that IRIW issues that violate sequential consistency are >>> non-issues in practice. That makes those loads "almost" sequentially >>> consistent, which might be good enough. But it feels good to have a choice >>> here to be more conservative. To have the synchronization helmet on. >>> >>> Meta summary: >>> 1) Current OrderAccess without volatile: >>> - should work, but relies on compiler-specific not standardized and >>> sometimes poorly documented compiler barriers. >>> >>> 2) Current OrderAccess with volatile: >>> - relies on standardized volatile semantics to guarantee compiler >>> reordering and reloading issues do not occur. >>> >>> 3) C++11 Atomic backend for OrderAccess >>> - relies on standardized semantics to guarantee compiler and hardware >>> reordering issues >>> - nevertheless isn't always flawless, and when it isn't, it gets painful >>> >>> Hope this sheds some light on the trade-offs. >>> >>> Thanks, >>> /Erik >>> >>> >>>> On 2017-05-28 10:45, Andrew Haley wrote: >>>> >>>>> On 27/05/17 10:10, Volker Simonis wrote: >>>>> >>>>>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>>>>> >>>>>>> On 26/05/17 17:03, Volker Simonis wrote: >>>>>>> >>>>>>> Volatile not only prevents reordering by the compiler. It also >>>>>>> prevents other, otherwise legal transformations/optimizations (like >>>>>>> for example reloading a variable [1]) which have to be prevented in >>>>>>> order to write correct, lock free programs. >>>>>> >>>>>> Yes, but so do compiler barriers. >>>>> >>>>> Please correct me if I'm wrong, but I thought "compiler barriers" are >>>>> to prevent reordering by the compiler. However, this is a question of >>>>> optimization. If you have two subsequent loads from the same address, >>>>> the compiler is free to do only the first load and keep the value in a >>>>> register if the address is not pointing to a volatile value. >>>> >>>> No it isn't: that is precisely what a compiler barrier prevents. A >>>> compiler barrier (from the POV of the compiler) clobbers all of >>>> the memory state. Neither reads nor writes may move past a compiler >>>> barrier. 
>>>> >>> > From volker.simonis at gmail.com Tue May 30 15:40:36 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 30 May 2017 17:40:36 +0200 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <561661eb-b463-726a-1d32-84ef5f32af13@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <561661eb-b463-726a-1d32-84ef5f32af13@oracle.com> Message-ID: On Mon, May 29, 2017 at 10:55 PM, David Holmes wrote: > > > On 30/05/2017 3:02 AM, Volker Simonis wrote: >> >> Sorry for constantly "spamming" this thread with another problem (i.e. >> JDK-8129440 [2]) but I still think that it is related and important. >> In its current state, the way how "load_heap_oop()" and its >> application works is broken. And this is not because of a problem in >> OrderAccess, but because of missing compiler barriers: >> >> static inline oop load_heap_oop(oop* p) { return *p; } >> ... >> template >> inline void G1RootRegionScanClosure::do_oop_nv(T* p) { >> // 1. load 'heap_oop' from 'p' >> T heap_oop = oopDesc::load_heap_oop(p); >> if (!oopDesc::is_null(heap_oop)) { >> // 2. Compiler reloads 'heap_oop' from 'p' which may now be null! >> oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); > > > Do you mean that the compiler has not stashed heap_oop somewhere and > re-executes oopDesc::load_heap_oop(p) again? That would be quite nasty I > think in general as it breaks any logic that wants to read a non-local > variable once to get it into a local and reuse that knowing that it won't > change even if the real variable does! > Yes, that's exactly what I mean and that's exactly what we've observed on AIX with xlc. Notice that the compiler is free to do such transformations if the load is not from a volatile field. That's why we've opened the bug and fixed out internal version. But we still think this fix needs to go into OpenJDK as well. Regards, Volker > David > ----- > > >> HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); >> _cm->grayRoot(obj, hr); >> } >> } >> >> Notice that we don't need memory barriers here - all we need is to >> prevent the compiler from loading the oop (i.e. 'heap_oop') a second >> time. After Andrews explanation (thanks for that!) and Martin's >> examples from Google, I think we could fix this by rewriting >> 'load_heap_oop()' (and friends) as follows: >> >> static inline oop load_heap_oop(oop* p) { >> oop o = *p; >> __asm__ volatile ("" : : : "memory"); >> return o; >> } >> >> In order to make this consistent across all platforms, we would >> probably have to introduce a new, public "compiler barrier" function >> in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we >> don't currently seem to have a cross-platform concept for >> "compiler-only barriers"). 
But I'm still not convinced that it would >> be better than simply writing (and that's the way how we've actually >> solved it internally): >> >> static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } >> >> Declaring that single memory location to be 'volatile' seems to be a >> much more local change compared to globally "clobbering" all the >> memory. And it doesn't rely on a the compilers providing a compiler >> barrier. It does however rely on the compiler doing the "right thing" >> for volatile - but after all what has been said here so far, that >> seems more likely? >> >> The problem may also depend on the specific compiler/cpu combination. >> For ppc64, both gcc (on linux) and xlc (on aix), do the right thing >> for volatile variables - they don't insert any memory barriers (i.e. >> no instructions) but just access the corresponding variables as if >> there was a compiler barrier. This is exactly what we currently want >> in HotSpot, because fine-grained control of memory barriers is >> controlled by the use of OrderAccess (and OrderAccess implies >> "compiler barrier", at least after the latest fixes). >> >> Any thoughts? Should we introduce a cross-platform, "compiler-only >> barrier" or should we stick to using "volatile" for such cases? >> >> Regards, >> Volker >> >> [1] >> https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html >> [2] https://bugs.openjdk.java.net/browse/JDK-8129440 >> >> On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund >> wrote: >>> >>> Hi Andrew, >>> >>> I just thought I'd put my opinions in here as I see I have been mentioned >>> a >>> few times already. >>> >>> First of all, I find using the volatile keyword on things that are >>> involved >>> in lock-free protocols meaningful from a readability point of view. It >>> allows the reader of the code to see care is needed here. >>> >>> About the compiler barriers - you are right. Volatile should indeed not >>> be >>> necessary if the compiler barriers do everything right. The compiler >>> should >>> not reorder things and it should not prevent reloading. >>> >>> On windows we rely on the deprecated _ReadWriteBarrier(). According to >>> MSDN, >>> it guarantees: >>> >>> "The _ReadWriteBarrier intrinsic limits the compiler optimizations that >>> can >>> remove or reorder memory accesses across the point of the call." >>> >>> This should cut it. >>> >>> The GCC memory clobber is defined as: >>> >>> "The "memory" clobber tells the compiler that the assembly code performs >>> memory reads or writes to items other than those listed in the input and >>> output operands (for example, accessing the memory pointed to by one of >>> the >>> input parameters). To ensure memory contains correct values, GCC may need >>> to >>> flush specific register values to memory before executing the asm. >>> Further, >>> the compiler does not assume that any values read from memory before an >>> asm >>> remain unchanged after that asm; it reloads them as needed. Using the >>> "memory" clobber effectively forms a read/write memory barrier for the >>> compiler." >>> >>> This seems to only guarantee values will not be re-ordered. But in the >>> documentation for ExtendedAsm it also states: >>> >>> "You will also want to add the volatile keyword if the memory affected is >>> not listed in the inputs or outputs of the asm, as the `memory' clobber >>> does >>> not count as a side-effect of the asm." 
>>> >>> and >>> >>> "The volatile keyword indicates that the instruction has important >>> side-effects. GCC will not delete a volatile asm if it is reachable. (The >>> instruction can still be deleted if GCC can prove that control-flow will >>> never reach the location of the instruction.) Note that even a volatile >>> asm >>> instruction can be moved relative to other code, including across jump >>> instructions." >>> >>> This is a bit vague, but seems to suggest that by making the asm >>> statement >>> volatile and having a memory clobber, it definitely will not reload >>> variables. About not re-ordering non-volatile accesses, it shouldn't but >>> it >>> is not quite clearly stated. I have never observed such a re-ordering >>> across >>> a volatile memory clobber. But the semantics seem a bit vague. >>> >>> As for clang, the closest to a definition of what it does I have seen is: >>> >>> "A clobber constraint is indicated by a ?~? prefix. A clobber does not >>> consume an input operand, nor generate an output. Clobbers cannot use any >>> of >>> the general constraint code letters ? they may use only explicit register >>> constraints, e.g. ?~{eax}?. The one exception is that a clobber string of >>> ?~{memory}? indicates that the assembly writes to arbitrary undeclared >>> memory locations ? not only the memory pointed to by a declared indirect >>> output." >>> >>> Apart from sweeping statements saying clang inline assembly is largely >>> compatible and working similar to GCC, I have not seen clear guarantees. >>> And >>> then there are more compilers. >>> >>> As a conclusion, by using volatile in addition to OrderAccess you rely on >>> standardized compiler semantics (at least for volatile-to-volatile >>> re-orderings and re-loading, but not for volatile-to-nonvolatile, but >>> that's >>> another can of worms), and regrettably if you rely on OrderAccess memory >>> model doing what it says it will do, then it should indeed work without >>> volatile, but to make that work, OrderAccess relies on non-standardized >>> compiler-specific barriers. In practice it should work well on all our >>> supported compilers without volatile. And if it didn't, it would indeed >>> be a >>> bug in OrderAccess that needs to be solved in OrderAccess. >>> >>> Personally though, I am a helmet-on-synchronization kind of person, so I >>> would take precaution anyway and use volatile whenever possible, because >>> 1) >>> it makes the code more readable, and 2) it provides one extra layer of >>> safety that is more standardized. It seems that over the years it has >>> happened multiple times that we assumed OrderAccess is bullet proof, and >>> then realized that it wasn't and observed a crash that would never have >>> happened if the code was written in a helmet-on-synchronization way. At >>> least that's how I feel about it. >>> >>> Now one might argue that by using C++11 atomics that are standardized, >>> all >>> these problems would go away as we would rely in standardized primitives >>> and >>> then just trust the compiler. But then there could arise problems when >>> the >>> C++ compiler decides to be less conservative than we want, e.g. by not >>> doing >>> fence in sequentially consistent loads to optimize for non-multiple copy >>> atomic CPUs arguing that IRIW issues that violate sequential consistency >>> are >>> non-issues in practice. That makes those loads "almost" sequentially >>> consistent, which might be good enough. But it feels good to have a >>> choice >>> here to be more conservative. 
To have the synchronization helmet on. >>> >>> Meta summary: >>> 1) Current OrderAccess without volatile: >>> - should work, but relies on compiler-specific not standardized and >>> sometimes poorly documented compiler barriers. >>> >>> 2) Current OrderAccess with volatile: >>> - relies on standardized volatile semantics to guarantee compiler >>> reordering and reloading issues do not occur. >>> >>> 3) C++11 Atomic backend for OrderAccess >>> - relies on standardized semantics to guarantee compiler and hardware >>> reordering issues >>> - nevertheless isn't always flawless, and when it isn't, it gets >>> painful >>> >>> Hope this sheds some light on the trade-offs. >>> >>> Thanks, >>> /Erik >>> >>> >>> On 2017-05-28 10:45, Andrew Haley wrote: >>>> >>>> >>>> On 27/05/17 10:10, Volker Simonis wrote: >>>>> >>>>> >>>>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley wrote: >>>>>> >>>>>> >>>>>> On 26/05/17 17:03, Volker Simonis wrote: >>>>>> >>>>>>> Volatile not only prevents reordering by the compiler. It also >>>>>>> prevents other, otherwise legal transformations/optimizations (like >>>>>>> for example reloading a variable [1]) which have to be prevented in >>>>>>> order to write correct, lock free programs. >>>>>> >>>>>> >>>>>> Yes, but so do compiler barriers. >>>>> >>>>> >>>>> Please correct me if I'm wrong, but I thought "compiler barriers" are >>>>> to prevent reordering by the compiler. However, this is a question of >>>>> optimization. If you have two subsequent loads from the same address, >>>>> the compiler is free to do only the first load and keep the value in a >>>>> register if the address is not pointing to a volatile value. >>>> >>>> >>>> No it isn't: that is precisely what a compiler barrier prevents. A >>>> compiler barrier (from the POV of the compiler) clobbers all of >>>> the memory state. Neither reads nor writes may move past a compiler >>>> barrier. >>>> >>> > From david.holmes at oracle.com Tue May 30 21:33:47 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 May 2017 07:33:47 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <64fda788-4ff6-978e-4476-a5af36bd708a@redhat.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> <592D5030.2020904@oracle.com> <64fda788-4ff6-978e-4476-a5af36bd708a@redhat.com> Message-ID: <35858f93-1625-af37-6d1a-773fddd73654@oracle.com> On 30/05/2017 10:45 PM, Andrew Haley wrote: > You have to remember where this discussion started, which was a > proposed use of volatile to fix a bug where a barrier was needed. No that was not the case, as has been pointed out numerous times. There were two bugs: 1. Incorrect placement of volatile in a declaration 2. Need to backport the compiler_barrier changes for OrderAccess. No one suggested doing #1 in lieu of #2. We wanted #1 as well as #2. 
David From volker.simonis at gmail.com Tue May 30 22:24:39 2017 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 30 May 2017 22:24:39 +0000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <67244460-ac75-44a9-4b2b-0bd5f36bd74f@redhat.com> <88a3d832-9875-8a3a-529c-a3a3dfe295de@redhat.com> <69ad2e6a-74fb-e0bb-07df-3b25c143f81d@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> Message-ID: Volker Simonis schrieb am Di. 30. Mai 2017 um 17:37: > On Mon, May 29, 2017 at 7:56 PM, Erik Osterlund > wrote: > > Hi Volker, > > > > Thank you for filling in more compiler info. > > > > If there is a choice between providing a new compiler barrier interface > and defining its semantics vs using existing volatile semantics, then > volatile semantics seems better to me. > > > > Also, my new Access API allows you to do > BasicAccess::load_oop(addr) to perform load_heap_oop and > load_decode_heap_oop with volatile semantics. Sounds like that would help > here. > > Sorry for my ignorance, but what is the "new Access API" and > "BasicAccess"? It actually sounds quite interesting :) > Sorry, my bad:( Please ignore this mail, I totally forgot about the new GC interface... > > > > > Thanks, > > /Erik > > > >> On 29 May 2017, at 19:02, Volker Simonis > wrote: > >> > >> Hi Erik, > >> > >> thanks for the nice summary. Just for the sake of completeness, here's > >> the corresponding documentation for the xlc compiler barrier [1]. It > >> kind of implements the gcc syntax, but the wording is slightly > >> different: > >> > >> "Add memory to the list of clobbered registers if assembler > >> instructions can change a memory location in an unpredictable fashion. > >> The memory clobber ensures that the data used after the completion of > >> the assembly statement is valid and synchronized. > >> However, the memory clobber can result in many unnecessary reloads, > >> reducing the benefits of hardware prefetching. Thus, the memory > >> clobber can impose a performance penalty and should be used with > >> caution." > >> > >> We haven't used it until now, so I can not say if it really does what > >> it is supposed to do. I'm also concerned about the performance > >> warning. It seems like the "unnecessary reloads" can really hurt on > >> architectures like ppc which have much more registers than x86. > >> Declaring a memory location 'volatile' seems much more simple and > >> light-weight in order to achieve the desired effect. So I tend to > >> agree with you and David that we should proceed to mark things with > >> 'volatile'. > >> > >> Sorry for constantly "spamming" this thread with another problem (i.e. > >> JDK-8129440 [2]) but I still think that it is related and important. > >> In its current state, the way how "load_heap_oop()" and its > >> application works is broken. And this is not because of a problem in > >> OrderAccess, but because of missing compiler barriers: > >> > >> static inline oop load_heap_oop(oop* p) { return *p; } > >> ... > >> template > >> inline void G1RootRegionScanClosure::do_oop_nv(T* p) { > >> // 1. load 'heap_oop' from 'p' > >> T heap_oop = oopDesc::load_heap_oop(p); > >> if (!oopDesc::is_null(heap_oop)) { > >> // 2. 
Compiler reloads 'heap_oop' from 'p' which may now be null! > >> oop obj = oopDesc::decode_heap_oop_not_null(heap_oop); > >> HeapRegion* hr = _g1h->heap_region_containing((HeapWord*) obj); > >> _cm->grayRoot(obj, hr); > >> } > >> } > >> > >> Notice that we don't need memory barriers here - all we need is to > >> prevent the compiler from loading the oop (i.e. 'heap_oop') a second > >> time. After Andrews explanation (thanks for that!) and Martin's > >> examples from Google, I think we could fix this by rewriting > >> 'load_heap_oop()' (and friends) as follows: > >> > >> static inline oop load_heap_oop(oop* p) { > >> oop o = *p; > >> __asm__ volatile ("" : : : "memory"); > >> return o; > >> } > >> > >> In order to make this consistent across all platforms, we would > >> probably have to introduce a new, public "compiler barrier" function > >> in OrderAccess (e.g. 'OrderAccess::compiler_barrier()' because we > >> don't currently seem to have a cross-platform concept for > >> "compiler-only barriers"). But I'm still not convinced that it would > >> be better than simply writing (and that's the way how we've actually > >> solved it internally): > >> > >> static inline oop load_heap_oop(oop* p) { return * (volatile oop*) p; } > >> > >> Declaring that single memory location to be 'volatile' seems to be a > >> much more local change compared to globally "clobbering" all the > >> memory. And it doesn't rely on a the compilers providing a compiler > >> barrier. It does however rely on the compiler doing the "right thing" > >> for volatile - but after all what has been said here so far, that > >> seems more likely? > >> > >> The problem may also depend on the specific compiler/cpu combination. > >> For ppc64, both gcc (on linux) and xlc (on aix), do the right thing > >> for volatile variables - they don't insert any memory barriers (i.e. > >> no instructions) but just access the corresponding variables as if > >> there was a compiler barrier. This is exactly what we currently want > >> in HotSpot, because fine-grained control of memory barriers is > >> controlled by the use of OrderAccess (and OrderAccess implies > >> "compiler barrier", at least after the latest fixes). > >> > >> Any thoughts? Should we introduce a cross-platform, "compiler-only > >> barrier" or should we stick to using "volatile" for such cases? > >> > >> Regards, > >> Volker > >> > >> [1] > https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/language_ref/asm.html > >> [2] https://bugs.openjdk.java.net/browse/JDK-8129440 > >> > >> On Mon, May 29, 2017 at 2:20 PM, Erik ?sterlund > >> wrote: > >>> Hi Andrew, > >>> > >>> I just thought I'd put my opinions in here as I see I have been > mentioned a > >>> few times already. > >>> > >>> First of all, I find using the volatile keyword on things that are > involved > >>> in lock-free protocols meaningful from a readability point of view. It > >>> allows the reader of the code to see care is needed here. > >>> > >>> About the compiler barriers - you are right. Volatile should indeed > not be > >>> necessary if the compiler barriers do everything right. The compiler > should > >>> not reorder things and it should not prevent reloading. > >>> > >>> On windows we rely on the deprecated _ReadWriteBarrier(). According to > MSDN, > >>> it guarantees: > >>> > >>> "The _ReadWriteBarrier intrinsic limits the compiler optimizations > that can > >>> remove or reorder memory accesses across the point of the call." > >>> > >>> This should cut it. 
> >>> > >>> The GCC memory clobber is defined as: > >>> > >>> "The "memory" clobber tells the compiler that the assembly code > performs > >>> memory reads or writes to items other than those listed in the input > and > >>> output operands (for example, accessing the memory pointed to by one > of the > >>> input parameters). To ensure memory contains correct values, GCC may > need to > >>> flush specific register values to memory before executing the asm. > Further, > >>> the compiler does not assume that any values read from memory before > an asm > >>> remain unchanged after that asm; it reloads them as needed. Using the > >>> "memory" clobber effectively forms a read/write memory barrier for the > >>> compiler." > >>> > >>> This seems to only guarantee values will not be re-ordered. But in the > >>> documentation for ExtendedAsm it also states: > >>> > >>> "You will also want to add the volatile keyword if the memory affected > is > >>> not listed in the inputs or outputs of the asm, as the `memory' > clobber does > >>> not count as a side-effect of the asm." > >>> > >>> and > >>> > >>> "The volatile keyword indicates that the instruction has important > >>> side-effects. GCC will not delete a volatile asm if it is reachable. > (The > >>> instruction can still be deleted if GCC can prove that control-flow > will > >>> never reach the location of the instruction.) Note that even a > volatile asm > >>> instruction can be moved relative to other code, including across jump > >>> instructions." > >>> > >>> This is a bit vague, but seems to suggest that by making the asm > statement > >>> volatile and having a memory clobber, it definitely will not reload > >>> variables. About not re-ordering non-volatile accesses, it shouldn't > but it > >>> is not quite clearly stated. I have never observed such a re-ordering > across > >>> a volatile memory clobber. But the semantics seem a bit vague. > >>> > >>> As for clang, the closest to a definition of what it does I have seen > is: > >>> > >>> "A clobber constraint is indicated by a ?~? prefix. A clobber does not > >>> consume an input operand, nor generate an output. Clobbers cannot use > any of > >>> the general constraint code letters ? they may use only explicit > register > >>> constraints, e.g. ?~{eax}?. The one exception is that a clobber string > of > >>> ?~{memory}? indicates that the assembly writes to arbitrary undeclared > >>> memory locations ? not only the memory pointed to by a declared > indirect > >>> output." > >>> > >>> Apart from sweeping statements saying clang inline assembly is largely > >>> compatible and working similar to GCC, I have not seen clear > guarantees. And > >>> then there are more compilers. > >>> > >>> As a conclusion, by using volatile in addition to OrderAccess you rely > on > >>> standardized compiler semantics (at least for volatile-to-volatile > >>> re-orderings and re-loading, but not for volatile-to-nonvolatile, but > that's > >>> another can of worms), and regrettably if you rely on OrderAccess > memory > >>> model doing what it says it will do, then it should indeed work without > >>> volatile, but to make that work, OrderAccess relies on non-standardized > >>> compiler-specific barriers. In practice it should work well on all our > >>> supported compilers without volatile. And if it didn't, it would > indeed be a > >>> bug in OrderAccess that needs to be solved in OrderAccess. 
> >>> > >>> Personally though, I am a helmet-on-synchronization kind of person, so > I > >>> would take precaution anyway and use volatile whenever possible, > because 1) > >>> it makes the code more readable, and 2) it provides one extra layer of > >>> safety that is more standardized. It seems that over the years it has > >>> happened multiple times that we assumed OrderAccess is bullet proof, > and > >>> then realized that it wasn't and observed a crash that would never have > >>> happened if the code was written in a helmet-on-synchronization way. At > >>> least that's how I feel about it. > >>> > >>> Now one might argue that by using C++11 atomics that are standardized, > all > >>> these problems would go away as we would rely in standardized > primitives and > >>> then just trust the compiler. But then there could arise problems when > the > >>> C++ compiler decides to be less conservative than we want, e.g. by not > doing > >>> fence in sequentially consistent loads to optimize for non-multiple > copy > >>> atomic CPUs arguing that IRIW issues that violate sequential > consistency are > >>> non-issues in practice. That makes those loads "almost" sequentially > >>> consistent, which might be good enough. But it feels good to have a > choice > >>> here to be more conservative. To have the synchronization helmet on. > >>> > >>> Meta summary: > >>> 1) Current OrderAccess without volatile: > >>> - should work, but relies on compiler-specific not standardized and > >>> sometimes poorly documented compiler barriers. > >>> > >>> 2) Current OrderAccess with volatile: > >>> - relies on standardized volatile semantics to guarantee compiler > >>> reordering and reloading issues do not occur. > >>> > >>> 3) C++11 Atomic backend for OrderAccess > >>> - relies on standardized semantics to guarantee compiler and hardware > >>> reordering issues > >>> - nevertheless isn't always flawless, and when it isn't, it gets > painful > >>> > >>> Hope this sheds some light on the trade-offs. > >>> > >>> Thanks, > >>> /Erik > >>> > >>> > >>>> On 2017-05-28 10:45, Andrew Haley wrote: > >>>> > >>>>> On 27/05/17 10:10, Volker Simonis wrote: > >>>>> > >>>>>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley > wrote: > >>>>>> > >>>>>>> On 26/05/17 17:03, Volker Simonis wrote: > >>>>>>> > >>>>>>> Volatile not only prevents reordering by the compiler. It also > >>>>>>> prevents other, otherwise legal transformations/optimizations (like > >>>>>>> for example reloading a variable [1]) which have to be prevented in > >>>>>>> order to write correct, lock free programs. > >>>>>> > >>>>>> Yes, but so do compiler barriers. > >>>>> > >>>>> Please correct me if I'm wrong, but I thought "compiler barriers" are > >>>>> to prevent reordering by the compiler. However, this is a question of > >>>>> optimization. If you have two subsequent loads from the same address, > >>>>> the compiler is free to do only the first load and keep the value in > a > >>>>> register if the address is not pointing to a volatile value. > >>>> > >>>> No it isn't: that is precisely what a compiler barrier prevents. A > >>>> compiler barrier (from the POV of the compiler) clobbers all of > >>>> the memory state. Neither reads nor writes may move past a compiler > >>>> barrier. 
> >>>> > >>> > > >

From david.holmes at oracle.com Wed May 31 00:49:07 2017 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 May 2017 10:49:07 +1000 Subject: RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException In-Reply-To: <35858f93-1625-af37-6d1a-773fddd73654@oracle.com> References: <092b320b-3230-cb69-99d3-1d7777723578@redhat.com> <648fc2bf-34d0-2796-dd79-9e498536b23b@redhat.com> <806cff0e-a48f-9a31-c384-caa83c423e8b@redhat.com> <2062698c-838b-ad17-0d35-acfc7706878d@oracle.com> <3d963255-2006-de03-4b81-87a3dd64ef72@redhat.com> <45c4ea7f-3137-76ae-d20a-df35922f4266@redhat.com> <592C120A.1080908@oracle.com> <6e340bdf-da1a-5a6b-6db4-8d040cdf0b54@redhat.com> <592D5030.2020904@oracle.com> <64fda788-4ff6-978e-4476-a5af36bd708a@redhat.com> <35858f93-1625-af37-6d1a-773fddd73654@oracle.com> Message-ID:

Correction and apology ...

On 31/05/2017 7:33 AM, David Holmes wrote:
>
>
> On 30/05/2017 10:45 PM, Andrew Haley wrote:
>> You have to remember where this discussion started, which was a
>> proposed use of volatile to fix a bug where a barrier was needed.
>
> No that was not the case, as has been pointed out numerous times. There
> were two bugs:
>
> 1. Incorrect placement of volatile in a declaration
> 2. Need to backport the compiler_barrier changes for OrderAccess.
>
> No one suggested doing #1 in lieu of #2. We wanted #1 as well as #2.

My apologies, the very original proposal was just to fix #1 without any apparent knowledge of #2. When #2 was pointed out it was then proposed to drop #1. Paul and I then chimed in that #1 still needed to be fixed to follow, let's say, "hotspot style", even if, in the presence of correct (and correctly used) compiler barriers, the volatile should not be needed.

David

> David
>
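For readers following along, the placement distinction behind #1 is the usual C++ one (a generic illustration with invented member names, not the actual patch):

class Method;

struct Example {
  volatile Method* _m_pointee_volatile;  // "pointer to volatile Method": the
                                         // object pointed to is treated as
                                         // volatile, but reads of the pointer
                                         // field itself may still be cached
  Method* volatile _m_pointer_volatile;  // "volatile pointer to Method": every
                                         // read of the field itself is forced
                                         // back to memory, which is what a
                                         // racily-updated field requires
};

Only the second form tells the compiler that the pointer value can change underneath the reader, which is the kind of distinction that gets lost when volatile ends up in the wrong position in a declaration.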